Guardrails — How we ship AI responsibly

We do not ship what we do not evaluate.

Every AI engagement at Orion includes an evaluation harness as a delivered artifact. Your team owns it, can re-run it, and can extend it when the domain evolves or a model version changes. The harness is the contract — not the prompt, not the model, not the marketing demo.

If we cannot agree on a measurable definition of "the system works," we will tell you the engagement is not ready to start.
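One way to picture a measurable definition of "the system works" is a list of agreed cases plus a pass-rate bar. The sketch below is illustrative only: `EvalCase`, `run_harness`, `toy_system`, and the 0.9 threshold are hypothetical names and values, not a description of any delivered harness.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class EvalCase:
    """One agreed example: an input and a check on the system's output."""
    name: str
    prompt: str
    passes: Callable[[str], bool]


def run_harness(system: Callable[[str], str], cases: List[EvalCase],
                min_pass_rate: float) -> dict:
    """Run every case and compare the observed pass rate to the agreed bar."""
    if not cases:
        raise ValueError("an empty harness cannot define 'works'")
    failed = [c.name for c in cases if not c.passes(system(c.prompt))]
    pass_rate = (len(cases) - len(failed)) / len(cases)
    return {"pass_rate": pass_rate, "failed": failed,
            "works": pass_rate >= min_pass_rate}


# Hypothetical stand-in for a model call, used only for illustration.
def toy_system(prompt: str) -> str:
    return prompt.upper()


cases = [
    EvalCase("uppercases", "refund", lambda out: out == "REFUND"),
    EvalCase("keeps digits", "order 42", lambda out: "42" in out),
    EvalCase("adds greeting", "hi", lambda out: out.startswith("Hello")),
]
report = run_harness(toy_system, cases, min_pass_rate=0.9)
print(report["works"], report["failed"])
```

Because the cases and the threshold are plain data, the owning team can re-run the same harness against a new model version or extend the case list as the domain changes.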

Models do not ship without humans in the loop somewhere.

We do not deploy fully autonomous agentic systems into production paths that affect real people, real money, or real records without an explicit human review boundary in the loop.

That boundary moves over time as confidence is earned. It does not move until the evaluation harness says it should.
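A review boundary that moves only when the harness says so could be expressed as a gate like the one below. This is a minimal sketch under assumptions: `ReviewPolicy`, `needs_human_review`, the action-type names, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ReviewPolicy:
    """Hypothetical policy for when an action type may skip human review."""
    required_pass_rate: float  # bar agreed with the client (assumed value)
    min_evaluated_cases: int   # do not trust a pass rate from a tiny sample


def needs_human_review(action_type: str,
                       harness_results: Dict[str, Tuple[int, int]],
                       policy: ReviewPolicy) -> bool:
    """Return True unless the harness has earned autonomy for this action type."""
    result = harness_results.get(action_type)
    if result is None:
        return True  # never evaluated, so a human reviews it
    passed, total = result
    if total < policy.min_evaluated_cases:
        return True  # too few cases to justify moving the boundary
    return passed / total < policy.required_pass_rate


# (passed, total) per action type, as a harness run might report them.
harness_results = {"draft_reply": (198, 200), "issue_refund": (45, 50)}
policy = ReviewPolicy(required_pass_rate=0.98, min_evaluated_cases=100)

for action in ("draft_reply", "issue_refund", "close_account"):
    print(action, needs_human_review(action, harness_results, policy))
```

The default is review: an action type with no harness evidence, or too little of it, stays behind the human boundary.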

Data stays where it started.

We architect AI systems so that your training data, retrieval corpora, embeddings, and inference traces remain in your accounts and under your identity boundaries. That means Bedrock-on-VPC, OpenSearch in your account, and prompts and completions logged where your auditors can see them.

Nothing leaves your boundary without your written authorization. Not anonymized samples. Not for "training improvements." Not for our marketing.

We refuse work that obfuscates accountability.

We will not build a system whose primary purpose is to make a human decision look automated when it is not. We will not build a system whose primary purpose is to make an automated decision look human-reviewed when it is not. We will tell you which one a proposed build is, on the first call.

We will turn down work whose intended use we would not defend in writing.

We name our model choices and we name when we change them.

Every deliverable identifies, in writing, which models are in use, at what versions, with what parameters, and how those choices were made. When a model version changes, the evaluation harness re-runs and the report goes to you before any production deploy.
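A written model record can double as the trigger for a harness re-run. The sketch below assumes a hypothetical `ModelRecord` structure and placeholder model identifiers; it fingerprints the model, version, and parameters so that any change to them is detected before a deploy.

```python
import hashlib
import json
from dataclasses import dataclass, field, replace
from typing import Dict


@dataclass(frozen=True)
class ModelRecord:
    """Hypothetical manifest entry: what is in use and why it was chosen."""
    role: str
    model_id: str
    version: str
    parameters: Dict[str, float] = field(default_factory=dict)
    rationale: str = ""


def fingerprint(record: ModelRecord) -> str:
    """Hash only the fields that can change model behavior."""
    payload = {"model_id": record.model_id, "version": record.version,
               "parameters": record.parameters}
    canonical = json.dumps(payload, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def requires_rerun(deployed: ModelRecord, candidate: ModelRecord) -> bool:
    """True when the candidate differs in model, version, or parameters."""
    return fingerprint(deployed) != fingerprint(candidate)


# Placeholder identifiers, not real model names.
deployed = ModelRecord("summarizer", "example-model", "1.0",
                       {"temperature": 0.2}, "best harness score at kickoff")
candidate = replace(deployed, version="1.1")
reworded = replace(deployed, rationale="same choice, clearer wording")

print(requires_rerun(deployed, candidate))  # version changed
print(requires_rerun(deployed, reworded))   # only the rationale changed
```

Keeping the rationale out of the fingerprint is a design choice in this sketch: editing the explanation does not force a re-run, while changing a version or a parameter does.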

"We use the best AI model" is not an answer we are willing to give.

Observability covers cost, not just latency.

We track token spend, refusal rate, retrieval hit rate, and downstream error rate, not just p95 latency. The dashboard your finance team sees and the dashboard your engineering team sees are the same dashboard.

A system that works but doubles your monthly bill in silence is not a system that works.
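The metrics named above can be computed from per-request traces in one place, so cost and latency come from the same data. This is a rough sketch: `RequestTrace`, `summarize`, `spend_alert`, the token price, and the 1.5x alert ratio are assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RequestTrace:
    """Hypothetical per-request record written by the serving path."""
    tokens: int
    latency_ms: float
    refused: bool
    retrieval_hit: bool
    downstream_error: bool


def summarize(traces: List[RequestTrace], usd_per_1k_tokens: float) -> dict:
    """Compute cost and quality metrics alongside p95 latency."""
    if not traces:
        raise ValueError("no traces to summarize")
    n = len(traces)
    latencies = sorted(t.latency_ms for t in traces)
    p95_index = max(0, math.ceil(0.95 * n) - 1)  # nearest-rank percentile
    return {
        "token_spend_usd": sum(t.tokens for t in traces) / 1000 * usd_per_1k_tokens,
        "refusal_rate": sum(t.refused for t in traces) / n,
        "retrieval_hit_rate": sum(t.retrieval_hit for t in traces) / n,
        "downstream_error_rate": sum(t.downstream_error for t in traces) / n,
        "p95_latency_ms": latencies[p95_index],
    }


def spend_alert(current_usd: float, baseline_usd: float,
                max_ratio: float = 1.5) -> bool:
    """Flag spend that has grown past an agreed multiple of the baseline."""
    return baseline_usd > 0 and current_usd > baseline_usd * max_ratio


traces = [
    RequestTrace(500, 120.0, False, True, False),
    RequestTrace(1500, 340.0, False, True, False),
    RequestTrace(1000, 95.0, True, False, False),
    RequestTrace(1000, 410.0, False, True, True),
]
summary = summarize(traces, usd_per_1k_tokens=0.01)
print(summary)
print(spend_alert(current_usd=200.0, baseline_usd=100.0))
```

An alert ratio below 2.0 is deliberate here: the point is to surface a growing bill well before it has doubled.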

We tell you when to kill the project.

The two-week Quantum Labs spike has a defined exit. If the spike does not clear the bar we set with you at the start, we recommend killing the project, and we explain why.

We do not collect retainer fees off momentum. The healthiest engagements we run end on time, on scope, and with you in a position to keep operating without us.

How we ship AI we are willing to put our name on.

Seven commitments we keep.

We do not ship what we do not evaluate.

Models do not ship without humans in the loop somewhere.

Data stays where it started.

We refuse work that obfuscates accountability.

We name our model choices and we name when we change them.

Observability covers cost, not just latency.

We tell you when to kill the project.

The principles above only work because the engagement is shaped to enforce them.

One named engineer scopes, builds, and operates to handoff.

Weekly written check-in. Demo on every cycle.

Written, measurable, agreed before kickoff.

A second engineer on call from week one.

You own everything we produce. We keep nothing.

Runbooks, on-call playbooks, and a kill switch.

Fixed-price spike. Time-and-materials after.

We will name work we will not take.

Hold us to all of it.

