Meet the Agent: How the Solutions Architecture Team Built Mycroft

An AI agent to reduce vendor API documentation decoding from 15 hours to one hour per week
Todd Soderstrom

As a senior integrations engineer on the solutions architecture team, my mandate is straightforward to describe and painful to scale: when we’re asked to pull data from a new source, we make it happen. The range is wide — different industries, different scales, different generations of API design — and the queue doesn’t stop.

The hard part isn’t the engineering. It’s the decoding. Every vendor documents its API differently, authenticates differently, and has its own opinions about pagination, errors, and what a “result” looks like. Each new integration starts with me sitting down to figure all of that out from scratch. On average, our team was spending 15 hours a week manually decoding vendor API documentation and writing integration code. So I decided to build an agent that does it in one hour instead.

I named it Mycroft, after Sherlock Holmes’ older brother. Here’s how it works, what went wrong along the way, and what we learned about building automation that improves over time.

15 hours a week on API documentation

Each time a new data source arrives with fresh API documentation, someone on my team has to manually decode what it actually means — not just read it, but understand it deeply enough to generate working code. It’s important work — repetitive but requiring real intelligence — but hard to scale with humans alone. Plus, the documentation quality we receive can vary wildly.

With Mycroft, we decided to turn each vendor into a callable tool that our AI agents could use to query that vendor's data, so getting the wiring right mattered.

What Mycroft actually delivers

Mycroft turns a vendor's API documentation into a working tool — a callable, tested, documented capability that our investigative agents can use to query that vendor's data. Each run produces real code merged into the product, not a script someone has to clean up. A vendor goes in and a pull request comes out.

A generator paired with a critic, at every step

The reason Mycroft is reliable is that nothing it produces ships unchallenged. Every generative step — drafting the spec, writing the code, generating tests, reviewing against the docs — is paired with a separate critic agent whose only job is to grade that output and fail it if it falls short. When the critic flags a problem, a focused fix loop runs (up to three attempts) before Mycroft is allowed to advance to the next stage.

That adversarial gating is the actual mechanism behind Mycroft’s reliability. Most automation pipelines hit a ceiling because generation and validation happen in the same step, run by the same agent, with the same blind spots. Splitting them — generator on one side, skeptic on the other — means errors get caught when they're cheap to fix, instead of after they've propagated through three more stages of a complex process.

By the time a developer sees Mycroft's output, every stage has already been written, critiqued, fixed, and re-critiqued. The reviewer is reading a finished pull request with the critic reports attached, not steering a half-built draft.
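
The generator-and-critic gate described above can be sketched in a few lines of Python. This is a minimal illustration, not Mycroft's actual code: the names run_stage, Critique, and StageFailed, and the toy generator, critic, and fixer, are all hypothetical stand-ins. The only detail taken from the description is the cap of three fix attempts per stage.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_FIX_ATTEMPTS = 3  # "up to three attempts", per the description above


@dataclass
class Critique:
    """Hypothetical critic report: a pass/fail grade plus the issues found."""
    passed: bool
    issues: List[str] = field(default_factory=list)


class StageFailed(Exception):
    """Raised when a stage still fails its critic after every fix attempt."""


def run_stage(
    generate: Callable[[], str],
    critique: Callable[[str], Critique],
    fix: Callable[[str, List[str]], str],
    max_fix_attempts: int = MAX_FIX_ATTEMPTS,
) -> str:
    """Generate a draft, then loop critic -> fix until it passes or attempts run out."""
    draft = generate()
    report = critique(draft)
    attempts = 0
    while not report.passed and attempts < max_fix_attempts:
        draft = fix(draft, report.issues)  # focused fix, driven by the critic's issues
        report = critique(draft)           # the critic re-grades every fix
        attempts += 1
    if not report.passed:
        # The pipeline does not advance to the next stage.
        raise StageFailed(f"still failing after {attempts} fix attempts: {report.issues}")
    return draft


# Toy stand-ins (assumptions for illustration only).
def toy_generate() -> str:
    return "def fetch(): ..."


def toy_critique(draft: str) -> Critique:
    if "auth" in draft:
        return Critique(passed=True)
    return Critique(passed=False, issues=["no authentication handling"])


def toy_fix(draft: str, issues: List[str]) -> str:
    return draft + "  # auth added"


if __name__ == "__main__":
    print(run_stage(toy_generate, toy_critique, toy_fix))
```

Keeping the generator, critic, and fixer as separate callables mirrors the split the post argues for: the code that grades a draft shares nothing with the code that wrote it.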

Mycroft rewrites itself between runs

The continuous improvement story isn't that Mycroft "learns from failure" in some abstract sense. The mechanism is concrete: after every run, Mycroft analyzes what its critics caught, what slipped through to code review, and where the fix loops spent their attempts. It then edits the prompts and instructions of its own agents in the source repo and opens a pull request to update itself.

The agents literally rewrite themselves between runs. The same human review gate that catches bugs in vendor tools also catches bad self-modifications, so the loop stays supervised — but institutional knowledge compounds in the codebase, not just in someone's head. Mycroft tomorrow is meaningfully better at this than Mycroft yesterday, and the diff is auditable.
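
One way to picture the analysis step is as a summary over per-stage run records. The sketch below is an assumption about what such a summary might look like, not Mycroft's implementation: StageRecord and its fields are hypothetical, and the actual prompt editing and pull-request creation are not shown.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StageRecord:
    """Hypothetical record of one pipeline stage in a single run."""
    stage: str
    fix_attempts: int                                        # fix-loop attempts used
    critic_issues: List[str] = field(default_factory=list)   # caught by the critic
    review_issues: List[str] = field(default_factory=list)   # slipped through to code review


def summarize_run(records: List[StageRecord]) -> Dict[str, Dict[str, int]]:
    """Total fix attempts and escaped issues per stage."""
    attempts: Counter = Counter()
    escaped: Counter = Counter()
    for record in records:
        attempts[record.stage] += record.fix_attempts
        escaped[record.stage] += len(record.review_issues)
    return {"fix_attempts": dict(attempts), "escaped_to_review": dict(escaped)}


def stages_to_revise(records: List[StageRecord]) -> List[str]:
    """Stages whose prompts are candidates for revision: they either burned
    fix attempts or let an issue escape to human review."""
    summary = summarize_run(records)
    return sorted(
        stage
        for stage in summary["fix_attempts"]
        if summary["fix_attempts"][stage] > 0 or summary["escaped_to_review"][stage] > 0
    )


if __name__ == "__main__":
    run = [
        StageRecord("spec", 0),
        StageRecord("code", 2, critic_issues=["bad pagination"]),
        StageRecord("tests", 0, review_issues=["missing error case"]),
    ]
    print(stages_to_revise(run))
```

In this toy run, the code stage is flagged because its fix loop ran twice, and the tests stage is flagged because a reviewer found something its critic missed.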

What we got wrong

Bringing Mycroft to life wasn’t without its challenges, and we made a few critical observations along the way:

Code generation without validation fails

The first iteration assumed Mycroft just needed to generate code. Early versions produced syntactically correct output that had zero chance of actually connecting to a vendor's API. Live validation at every stage — including hitting real endpoints with real credentials — turned out to be essential.
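
As a rough sketch of what a live check adds beyond "the code parses", the function below calls an endpoint and reports concrete problems. It is an illustration under stated assumptions: validate_live and its parameters are hypothetical, and the HTTP call is injected as a callable so the example runs offline. In real use that callable would hit the vendor's endpoint with real credentials.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

# (HTTP status code, parsed JSON body) returned by the injected caller.
Response = Tuple[int, Dict[str, Any]]


def validate_live(
    call_endpoint: Callable[[], Response],
    required_fields: Iterable[str],
) -> List[str]:
    """Return a list of problems found when calling the endpoint; empty means it passed."""
    try:
        status, body = call_endpoint()
    except Exception as exc:  # connection errors, timeouts, bad JSON, and so on
        return [f"request failed: {exc}"]

    problems: List[str] = []
    if not 200 <= status < 300:
        problems.append(f"unexpected status {status}")
    missing = [name for name in required_fields if name not in body]
    if missing:
        problems.append(f"missing fields: {missing}")
    return problems


if __name__ == "__main__":
    # Fake callers standing in for real requests (assumptions for illustration).
    print(validate_live(lambda: (200, {"results": []}), ["results"]))
    print(validate_live(lambda: (401, {}), ["results"]))
```

Syntactically correct generated code would sail past a linter but fail the second call here, which is the kind of failure the early versions could not see.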

API documentation quality varies wildly

Some vendors ship clean, well-structured OpenAPI specs. Others ship creative interpretations of what a spec could be. We had to build significant preprocessing logic to normalize inputs before generation could work.
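
To give a flavor of that normalization, here is a small sketch that reduces two input shapes to one list of (method, path) pairs. It assumes only two cases: an OpenAPI-style document whose paths object maps each path to its HTTP methods, and a loose list of "METHOD path" strings. The function names and the second input format are hypothetical, and real preprocessing would need to cover far more variation.

```python
from typing import Any, Dict, List, Tuple, Union

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}

Endpoint = Tuple[str, str]  # (METHOD, /path)


def _clean_path(path: str) -> str:
    """Force a single leading slash and drop any trailing slash."""
    return "/" + path.strip().strip("/")


def normalize_endpoints(raw: Union[Dict[str, Any], List[str]]) -> List[Endpoint]:
    """Normalize either input shape into a sorted, de-duplicated endpoint list."""
    endpoints = set()
    if isinstance(raw, dict):
        # OpenAPI-style: {"paths": {"/users": {"get": {...}, "post": {...}}}}
        for path, item in raw.get("paths", {}).items():
            for key in item or {}:
                if key.lower() in HTTP_METHODS:  # skip keys such as "parameters"
                    endpoints.add((key.upper(), _clean_path(path)))
    else:
        # Loose list of lines such as "GET /users"; anything else is ignored.
        for line in raw:
            parts = line.split()
            if len(parts) == 2 and parts[0].lower() in HTTP_METHODS:
                endpoints.add((parts[0].upper(), _clean_path(parts[1])))
    return sorted(endpoints)


if __name__ == "__main__":
    print(normalize_endpoints({"paths": {"/users/": {"get": {}, "parameters": []}}}))
    print(normalize_endpoints(["get users", "POST /orders/", "see appendix B"]))
```

Whatever the input looked like, the generation stages downstream only ever see the one normalized shape.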

Full autonomy is slower than human-in-the-loop

The original goal was a fully autonomous bot. In practice, a developer reviewing Mycroft's PR takes about 30 minutes instead of the original 4–5 hours of building. Keeping a human at the merge gate makes the system faster overall.

These iterations weren't failures — they were ultimately the work that made Mycroft more useful.

Where Mycroft still needs humans

Mycroft works best when it has semantic structure to grab onto — clean specs, structured schemas, well-defined endpoints. Unstructured prose without any schema information is harder.

An AI agent like Mycroft has multiplied my team's capacity, but it doesn't replace technical judgment. We still carry out security reviews and compliance checks before integration.

From 15 hours to one

With Mycroft, we brought 15 hours a week of manual API integration work down to one hour. That's the difference between scaling vendor integrations and being bottlenecked by documentation. Mycroft handles the volume; we focus on the decisions that require human expertise.

If building AI agents that do real work and produce real outcomes sounds like your kind of challenge, check out our open roles. We’re hiring across all teams.
