AI Agent Credential Security: The Broker Pattern We Shipped

Never Give an AI Agent a Credential: A Broker, and the Process We Trusted to Build One

June 23, 2026

5 min

Never Give an AI Agent a Credential: A Broker, and the Process We Trusted to Build One

When an AI agent holds a secret, the sandbox doesn't matter. Here's the credential broker abstraction that keeps secrets out of agents' hands — and the four-part process we used to ship security-critical code with confidence.

[

Preet Bhinder,

]

You can run an AI agent under gVisor, lock down its egress, harden the container it lives in, and screen every request it tries to make — all good work you should do it. But none of it matters the moment the agent can read an API key.

An agent that holds a credential is just one prompt injection, confused tool call, or model mistake away from doing whatever that credential allows — and it will do it from inside your perimeter, with your key, and appear exactly like legitimate traffic. Hardening the sandbox around the agent raises the cost of escape, but does nothing about the fact that the valuable thing was sitting inside the box the whole time.

So this is the principle we work from: assume the sandbox eventually fails, and make sure the credential was never inside it. We’re not trying to build a sandbox so good that holding secrets becomes safe; rather, we’re taking the secrets out of the agent's hands entirely.

Finding a fix through abstraction (not re-architecture)

Making this structural change doesn’t require rebuilding your infrastructure completely, but rather, a small piece of indirection: a credential broker that sits in the agent's request path. The agent makes its outbound API call as usual, except it carries a placeholder — never the real secret. The broker swaps the placeholder for the real credential as the request leaves, and forwards it on. This means the upstream service sees a valid key (while the agent never does), and secrets stay exactly where they already live.

Our secrets live in HashiCorp Vault, and the broker reads them from there. Nothing about Vault, our rotation, or our access policies has to change. The agent's view of the world shrinks to a placeholder, and the blast radius of a compromised agent shrinks with it.

The Infisical team built an open-source tool for precisely this, called Agent Vault. It injects secrets into an agent's outbound calls so the agent never holds them, and it is an extremely useful piece of software: the abstraction is right and the implementation is clean. We wanted to use it to broker credentials for our own internal AI agents.

The HashiCorp gap, and contributing the fix

Agent Vault could read secrets from its own built-in store and from Infisical. But it couldn’t read them from HashiCorp Vault, which is where our secrets live. Without a HashiCorp Vault backend, the tool was a non-starter for us.

But because Agent Vault is open source, the path to resolving this issue was straightforward: we forked it, built HashiCorp Vault as a first-class credential store, and opened a pull request to send the feature back upstream. The new store supports token and AppRole authentication and KV versions one and two, and is wired through the same paths every other store uses: a database migration, the server's sync worker, the CLI, and the web UI.

The thing we deliberately avoided was a brittle wrapper bolted onto the side: a shim that translated between HashiCorp Vault and Agent Vault's existing stores (and broke the first time the tool changed). A shim around the outside is the shortcut that becomes someone's incident later. Forking, building a proper tested feature, and contributing it back is the responsible move.

The last code you’ll ship on vibes

The entire point of this work is to keep secrets away from agents, and to be the necessary obstacle and gatekeeper standing between a compromised agent and a real credential. It’s as security-critical as code gets: a bug in a credential broker leaks the key, injects the wrong secret, or fails open.

This means the broker is the last code you should ship on vibes. High stakes demand a high-trust process. An agent that confidently writes plausible-looking code is fine when the cost of a mistake is simply a failed test. But it’s not fine when the cost of a mistake is a leaked credential.

The same instinct that says "never give the agent the secret" says "never trust the agent's word that the broker works." So we built the feature with an AI agent — Claude Opus 4.8 — then wrapped that agent in the discipline you would demand of a strong but green engineer. The process had four parts.

1. Acceptance criteria first, from real user journeys

Before the agent ever wrote a line, I wrote a manual test plan grounded in user journeys — the kind you would hand a junior engineer so that "done" is not a matter of opinion.

One entry read, roughly: "As a developer, I want to use the UI to see credentials synced from a HashiCorp Vault namespace." Others covered token and AppRole authentication, and KV versions one and two. These were all things a human could do and observe, and each had to be true before the work counted as finished.

When you tell an agent, "Build HashiCorp Vault support," you get the agent's idea of “done.” But when you hand it the journeys, you get yours. This is the cheapest, highest-leverage step in this process — and often the one people skip.

2. Demo-driven verification, on every change

Agents are confident. They’ll tell you the integration works in the same even tone regardless of whether it actually works or not. So I didn’t let the agent assert that anything worked. Rather, it had to show me.

The agent built a runnable demo and a demo script alongside the code, and I stood up a real HashiCorp Vault in Docker so the demo ran end to end on every meaningful change:

Reveal a secret in HashiCorp Vault
Reveal the same secret synced into Agent Vault
curl through the proxy and watch the injected credential reach the upstream call
Rotate the secret in Vault
Watch the rotation sync through

If any step failed, the run was loud about it and we fixed it before moving on.

On top of the demo, there were unit tests, an integration test that ran the real Vault client against a stubbed Vault, and an end-to-end test that injected a synced credential through the man-in-the-middle proxy. A demo you re-run constantly turns "the agent says it works" into "I watched it work" — a critical distinction on security-critical code.

3. Adversarial review by agents that didn't write the code

Only after the work passed its own demo did I bring in review. I ran a panel of specialized review agents — reviewers for architecture reviewer, code, and security; plus open-source-maintainer and principal-engineer perspectives. Each agent was told to be skeptical and to assume the work was wrong.

This adversarial review step was crucial. Any agent that grades its own work does so kindly, making the agent that wrote the code the worst possible judge of it. But this collection of reviewers had not written it and were not rewarded for it passing. Among the issues they surfaced:

A database migration-number collision that would have broken the service at startup
A token that was set at boot but never validated, so a bad credential would have failed silently later instead of loudly at startup
A misleading code comment that described behavior the code did not have
A dependency added without justification

I fixed each one or documented why it could stay.

4. A human owned every decision

I set the acceptance criteria and made the architectural calls — including the decision to mirror the existing Infisical credential store rather than invent a new shape, so that the new store looked like code the maintainers already understood. I overrode review suggestions I disagreed with and caught agent mistakes; and the review agents I built also caught mistakes I would have missed.

In the same way you don’t stop reviewing a junior engineer's work because they’re talented, I didn’t stop reviewing the agents’ work. The cost of a look is low — but the cost of a silent failure is high.

The outcomes our credential broker enables

The result of this entire process is a HashiCorp Vault backend for an open-source credential broker, with the full test suite green and a pull request open to upstream it. For us, it means our internal AI agents can run against real APIs without ever holding a real key. The secret stays in HashiCorp Vault. The broker injects it at the edge of the request. The agent sees a placeholder. And if the agent is ever compromised, a placeholder is all an attacker gets.

Our key takeaway: Don’t work harder to make it safe for agents to hold secrets. Instead, make sure they never do.

‍

If working through these kinds of problems and thoughtfully shaping the future of AI workflows excites you, check out our open roles. We’re hiring and would love to hear from you.

[

Preet Bhinder,

]