// Guardrails

How we ship AI we are willing to put our name on.

These are the operating principles we hold ourselves to on AI engagements. We publish them so you can hold us to them too.

// Seven principles

Seven commitments we keep.

01

We do not ship what we do not evaluate.

Every AI engagement at Orion includes an evaluation harness as a delivered artifact. Your team owns it, can re-run it, and can extend it when the domain evolves or a model version changes. The harness is the contract — not the prompt, not the model, not the marketing demo.

If we cannot agree on a measurable definition of "the system works," we will tell you the engagement is not ready to start.
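To make "a measurable definition" concrete, here is a minimal sketch of a harness that scores a system against labelled cases and compares the pass rate to an agreed threshold. The names (`EvalCase`, `EvalReport`, `run_harness`), the exact-match scoring rule, the 0.9 threshold, and the toy system are all assumptions for illustration; a real harness would use whatever scoring the criteria document specifies.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class EvalCase:
    """One labelled example: an input and the output we expect."""
    prompt: str
    expected: str


@dataclass(frozen=True)
class EvalReport:
    """Result of one harness run against an agreed threshold."""
    passed: int
    total: int
    threshold: float

    @property
    def pass_rate(self) -> float:
        return self.passed / self.total if self.total else 0.0

    @property
    def meets_bar(self) -> bool:
        return self.pass_rate >= self.threshold


def run_harness(
    system: Callable[[str], str],
    cases: List[EvalCase],
    threshold: float,
) -> EvalReport:
    """Score `system` by case-insensitive exact match (an assumed scoring rule)."""
    passed = sum(
        1
        for case in cases
        if system(case.prompt).strip().lower() == case.expected.strip().lower()
    )
    return EvalReport(passed=passed, total=len(cases), threshold=threshold)


def toy_system(prompt: str) -> str:
    """Hypothetical stand-in for the real model call."""
    return "refund" if "money back" in prompt else "other"


cases = [
    EvalCase("I want my money back", "refund"),
    EvalCase("Where is my order?", "other"),
    EvalCase("Please return my payment", "refund"),
]
report = run_harness(toy_system, cases, threshold=0.9)
```

Because the harness takes the system as a plain callable, the same cases can be re-run unchanged when the prompt or the model version changes.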

02

Models do not ship without humans in the loop somewhere.

We do not deploy fully autonomous agentic systems into production paths that affect real people, real money, or real records without an explicit human review boundary in the loop.

That boundary moves over time as confidence is earned. It does not move until the evaluation harness says it should.
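One way to picture a review boundary that only moves on evidence is sketched below. Everything here is hypothetical: the `ReviewGate` class, the action kinds, and the pass-rate figures are illustrative assumptions, not a description of a specific deployed system.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class ProposedAction:
    kind: str        # hypothetical categories, e.g. "refund" or "draft_reply"
    description: str


@dataclass
class ReviewGate:
    """Routes every action to human review unless its kind has been cleared."""
    auto_approved: Set[str] = field(default_factory=set)  # empty by default
    review_queue: List[ProposedAction] = field(default_factory=list)
    executed: List[ProposedAction] = field(default_factory=list)

    def submit(self, action: ProposedAction) -> str:
        if action.kind in self.auto_approved:
            self.executed.append(action)
            return "executed"
        self.review_queue.append(action)
        return "needs_review"

    def move_boundary(self, kind: str, pass_rate: float, required: float) -> bool:
        """Clear one action kind for autonomy only if the eval result meets the bar."""
        if pass_rate >= required:
            self.auto_approved.add(kind)
            return True
        return False


gate = ReviewGate()
first = gate.submit(ProposedAction("refund", "Refund order 1"))

# Assumed harness results: drafting replies clears its bar, refunds do not.
moved_reply = gate.move_boundary("draft_reply", pass_rate=0.97, required=0.95)
moved_refund = gate.move_boundary("refund", pass_rate=0.80, required=0.99)

second = gate.submit(ProposedAction("draft_reply", "Reply to ticket 2"))
third = gate.submit(ProposedAction("refund", "Refund order 3"))
```

The default is the safe one: an action kind that nobody has explicitly cleared always lands in the review queue.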

03

Data stays where it started.

We architect AI systems so that your training data, retrieval corpora, embeddings, and inference traces remain in your accounts and under your identity boundaries. Bedrock-on-VPC, OpenSearch in your account, prompts and completions logged where your auditors can see them.

Nothing leaves your boundary without your written authorization. Not anonymized samples. Not for "training improvements." Not for our marketing.

04

We refuse work that obfuscates accountability.

We will not build a system whose primary purpose is to make a human decision look automated when it is not. We will not build a system whose primary purpose is to make an automated decision look human-reviewed when it is not. We will tell you which one a proposed build is, on the first call.

We will turn down work whose intended use we would not defend in writing.

05

We name our model choices and we name when we change them.

Every deliverable identifies, in writing, which models are in use, at what versions, with what parameters, and how those choices were made. When a model version changes, the evaluation harness re-runs and the report goes to you before any production deploy.

"We use the best AI model" is not an answer we are willing to give.
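A written model record can also be machine-checkable. The sketch below assumes a hypothetical `ModelChoice` record and a `needs_reevaluation` helper that flags any change to a behaviour-affecting field; the model identifier, version strings, and rationale text are placeholders, not real model names.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace
from typing import List


@dataclass(frozen=True)
class ModelChoice:
    role: str          # what the model does in the system
    model_id: str      # placeholder identifier, not a real model name
    version: str
    temperature: float
    rationale: str     # how the choice was made (documentation only)


def fingerprint(choices: List[ModelChoice]) -> str:
    """Hash the fields that can change behaviour; the rationale is excluded."""
    rows = [
        {k: v for k, v in asdict(c).items() if k != "rationale"}
        for c in sorted(choices, key=lambda c: c.role)
    ]
    canonical = json.dumps(rows, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_reevaluation(deployed: List[ModelChoice], proposed: List[ModelChoice]) -> bool:
    """True when the proposed manifest differs in model, version, or parameters."""
    return fingerprint(deployed) != fingerprint(proposed)


deployed = [
    ModelChoice(
        role="answer_generation",
        model_id="example-model",
        version="v1",
        temperature=0.2,
        rationale="Best harness score at the agreed cost ceiling",
    )
]
reworded = [replace(deployed[0], rationale="Same choice, clearer wording")]
bumped = [replace(deployed[0], version="v2")]
```

A deploy pipeline could call `needs_reevaluation` and refuse to proceed until a fresh harness report exists for the proposed manifest.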

06

Observability covers cost, not just latency.

We track token spend, refusal rate, retrieval hit rate, and downstream error rate, not just p95 latency. The dashboard your finance team sees and the dashboard your engineering team sees are the same dashboard.

A system that works but doubles your monthly bill in silence is not a system that works.
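As an illustration of putting cost and quality signals in one summary, the sketch below aggregates per-request records into the metrics named above. The record fields, the `summarize` function, and the per-token prices are assumptions made up for this example; real prices depend on the provider and model in use.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RequestRecord:
    input_tokens: int
    output_tokens: int
    latency_ms: float
    refused: bool
    retrieval_hit: bool
    downstream_error: bool


# Hypothetical prices in USD per 1,000 tokens, for illustration only.
PRICE_IN_PER_1K = 0.003
PRICE_OUT_PER_1K = 0.015


def summarize(records: List[RequestRecord]) -> Dict[str, float]:
    """Aggregate cost and quality signals alongside p95 latency."""
    n = len(records)
    if n == 0:
        raise ValueError("no records to summarize")
    latencies = sorted(r.latency_ms for r in records)
    # Nearest-rank percentile: the smallest value covering 95% of requests.
    p95 = latencies[math.ceil(0.95 * n) - 1]
    cost = sum(
        r.input_tokens * PRICE_IN_PER_1K + r.output_tokens * PRICE_OUT_PER_1K
        for r in records
    ) / 1000
    return {
        "token_spend_usd": cost,
        "refusal_rate": sum(r.refused for r in records) / n,
        "retrieval_hit_rate": sum(r.retrieval_hit for r in records) / n,
        "downstream_error_rate": sum(r.downstream_error for r in records) / n,
        "p95_latency_ms": p95,
    }


records = [
    RequestRecord(1000, 1000, 120.0, False, True, False),
    RequestRecord(1000, 1000, 340.0, True, False, False),
    RequestRecord(1000, 1000, 95.0, False, True, False),
    RequestRecord(1000, 1000, 210.0, False, True, True),
]
summary = summarize(records)
```

Emitting one summary for both audiences is the point: finance and engineering read the same numbers from the same records.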

07

We tell you when to kill the project.

The two-week Quantum Labs spike has a defined exit. If the spike does not clear the bar we set with you at the start, we recommend killing the project, and we explain why.

We do not collect retainer fees off momentum. The healthiest engagements we run end on time, on scope, and with you in a position to keep operating without us.

// Operating model

The principles above only work because the engagement is shaped to enforce them.

The seven principles are commitments. The operating model is the mechanics that keep them from drifting under deadline pressure. Every Quantum Leap engagement runs against the same skeleton.

Staffing

One named engineer scopes, builds, and operates to handoff.

No partner-led pitch followed by junior-staffed delivery. The engineer on the first call is the engineer on the last call. We will name them in writing.

Cadence

Weekly written check-in. Demo on every cycle.

No status decks. Every week we ship a working build and a written note covering what landed, what is blocked, and what the next week looks like. Slack threads, not standing meetings.

Success bar

Written, measurable, agreed before kickoff.

Every engagement starts with a one-page criteria document: what the system must do, on what inputs, scored how. The evaluation harness reports against it. There is no other definition of "done."

Escalation

A second engineer on call from week one.

If the named engineer is unavailable, escalation goes to a second engineer who has been read into the engagement from day one. No mid-project handoffs to strangers.

IP & data

You own everything we produce. We keep nothing.

Prompts, agent definitions, evaluation harnesses, integration code, infrastructure templates — all yours, on your accounts, under your IP. We retain no copies after handoff and we will sign the paperwork that says so.

Handoff

Runbooks, on-call playbooks, and a kill switch.

Every engagement ships with operational runbooks, an on-call playbook, and a documented way to disable the system fast if something is wrong. You are in a position to keep operating without us — that is the test of a clean handoff.
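A kill switch can be as simple as a flag checked before every model call. The sketch below is a minimal in-memory version under stated assumptions: `KillSwitch`, `guarded`, and the fallback behaviour are hypothetical names, and a production system would more likely keep the flag in a shared configuration or feature-flag store so that every instance sees it.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class KillSwitch:
    """Holds the reason the system was disabled, or None when it is running."""
    disabled_reason: Optional[str] = None

    def disable(self, reason: str) -> None:
        self.disabled_reason = reason

    def enable(self) -> None:
        self.disabled_reason = None

    @property
    def active(self) -> bool:
        return self.disabled_reason is not None


def guarded(
    switch: KillSwitch,
    model_call: Callable[[str], str],
    fallback: Callable[[str], str],
) -> Callable[[str], str]:
    """Wrap a model call so it is bypassed whenever the switch is active."""
    def wrapper(prompt: str) -> str:
        if switch.active:
            return fallback(prompt)
        return model_call(prompt)
    return wrapper


switch = KillSwitch()
answer = guarded(
    switch,
    model_call=lambda p: f"model answer to: {p}",            # stand-in for the real call
    fallback=lambda p: "This feature is temporarily unavailable.",
)

before = answer("status of order 7")
switch.disable("refusal rate spiked after model change")
during = answer("status of order 7")
switch.enable()
after = answer("status of order 7")
```

Recording the reason alongside the flag gives whoever is on call the context they need to decide when it is safe to re-enable.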

Pricing

Fixed-price spike. Time-and-materials after.

The two-week spike has a fixed price quoted on the first call. Build engagements after a spike are time-and-materials with a written cap. No surprise renewals, no auto-extend clauses.

Refusal

We will name work we will not take.

On the first call we will tell you if a proposed build crosses a line we will not cross — accountability obfuscation, surveillance-by-AI, decisions that affect real people without a human review boundary. We will say so in writing.

Hold us to all of it.

If we ever appear to be drifting from one of these, tell us.