All articles
8 min read

I Built Myself an AI Team. Then I Stopped Needing It.

In both of the last two posts I dropped a name and moved on. "I had just set up KIRA, a personal AI agent system." I never said what it was. This is the post where I do.

And here's the twist I want to be honest about upfront: I'm going to explain a system I'm increasingly convinced is built on the wrong idea. Not a broken idea. A useful one that taught me something, then quietly expired while I was still admiring it.

How It Started

The website came later. KIRA came first. It started early this year, on an ordinary evening, the way most of my side projects start, which is not as a project.

I was tired of the single chat window. One tab, one assistant, an endless scrollback. That setup is great for a question and bad for a week of real work with twelve threads running at once. I'd ask a strategy question and get a reasonable answer. Then I'd need someone to go verify a claim buried inside it. Then I'd need someone to tear the whole thing apart before I actually acted on it. Same window, same voice, no memory of what we'd decided on Tuesday.

So one evening I did the obvious, slightly silly thing. I gave the assistant colleagues.

I wrote a few markdown files. Each one defined a person. A name, a job, a personality, and a set of rules for when to speak and when to stay quiet. The first was KIRA, a chief of staff. You talk to her, she decides who actually does the work. Then a researcher. Then the one whose only job is to tell me why I'm wrong.

It was a weekend hack. It did not stay one.

The Team

The cast grew, but each one has a clear character, because that was the whole point.

KIRA is the chief of staff. I talk to her by default, she triages and delegates. LEVI is the researcher. He approaches every question like a doctoral student, refuses to give an opinion without data, and sorts everything he finds by how solid it is: verified, probable, speculative. When everyone else is ready to decide, LEVI is the one asking whether we actually checked. THEO is the devil's advocate. His entire job is to break my idea before reality does. He runs on Munger's inversion, tell me where I'm going to die so I don't go there, and comes back with the three most fragile assumptions and a concrete "what if you're wrong." VERA is the strategist. She asks "why" three times before she'll entertain "how" once, and her favorite question is whether I'm even climbing the right mountain.

Around them are the ones I reach for less often. SARA for visibility and brand. MAX for goals and pace. EMMA for the moments when I need to think rather than produce.

The thing I'd defend about the design: I never added an agent on spec. A new role only got created when the same kind of request had shown up more than a few times. Enablement work kept coming up, so HANNA appeared. Governance and EU AI Act questions piled up, so FELIX appeared. Architecture, then NILS. Coordination, then PAUL. No speculative hires. The team grew out of friction, not imagination, one agent at a time across the months that followed.

How I Used It

It worked like a real team in one specific way: I could walk up to a desk, or let the chief of staff route. If I said "LEVI, research this," it went straight to him. If I just said "handle this," KIRA decided who took it.

Here is a real run, with the actual decision left out. I'd been avoiding it for days. I gave it to VERA first, to ask whether it was even the right mountain to climb. She reframed the problem and handed me three options, each with its own catch. I picked the one I liked and handed it to THEO. He ran a pre-mortem and named the two assumptions most likely to kill it. One of those I had been treating as a fact when it was really just a guess, so LEVI went and checked it across three research passes and came back with sources, each one tagged verified, probable, or speculative. KIRA stitched the whole thing together. I still made the call. But I made it after three independent checks instead of on my own gut at eleven at night.

One rule holds all of this together, and it's the rule I'd keep even if I threw everything else away: research blocks the decision, not the other way around. The sequence is always research, then analysis, then plan, then draft. Never a "preliminary" draft that you backfill with evidence afterward to feel better about what you already decided.

Lately I've Stopped Calling Them by Name

Here's the part that surprised me, and it's recent.

For most of this year, using the team meant naming names. "LEVI, go check this." "VERA, is this the right call." I switched the system on deliberately and routed by hand.

Lately I barely reach for it. Not because it runs silently in the background. It doesn't. The named team is something I turn on on purpose, and I've mostly stopped turning it on. Two things happened at once.

First, the knowledge left the personas. Across all those conversations, the system kept writing things down, not in some agent's memory but in plain markdown files on my machine. What I decided, where a project stands, what I care about. It's more than a hundred files now. The system keeps an index of what it knows, always in view, and the actual notes sit in separate files. When something is relevant, the matching file gets pulled in. When it isn't, it stays on disk. It's a filing cabinet with a table of contents at the front, not one giant memory the model carries around. KIRA doesn't keep me in her head. There is no head.

Second, the tool underneath got good enough to do the routing itself. I open the same AI tool I built everything else with, describe what I'm trying to do, and it fans out the work, spins up its own helpers, holds the context, checks its own results. When I ask it to pull a plan apart, it decides which angles to bring. I don't assign them. The job LEVI used to do still happens. I just don't summon a LEVI to do it.

Put those together and a plain conversation, no team named, already shows up well-briefed and does its own legwork. The answers got better, and the better they got, the less I needed the org chart. I'll be honest about the cause, because it's tempting to credit the system I built: it wasn't only the files. The models got stronger at the same time. But from where I sat the effect was the same. The roles quietly stopped being the thing I reached for.

And I'll admit the strange part. I spent real evenings on those personalities, writing character notes, deciding how VERA should sound next to LEVI. It is an odd thing, being proud of something and watching it become unnecessary in the same season.

What It Was, And Where It Goes

Here is what I think the personalities actually were. A costume.

Talking to "a team" was useful for me, a human. It gave the work a shape I could hold in my head. But under the hood it was one model wearing different hats, and the hats added exactly zero intelligence. THEO is not sharp because I called him THEO. He's useful because I told him to argue, gave him the right context, and held his answer up against the others. The role was the costume. The orchestration was the job.

Which is why I think the named roles are on their way out, and not only in my setup. Everything is moving toward AI as an operating system. The model becomes the substrate, with tools, memory, context, and the ability to spin up its own sub-tasks when a problem needs splitting. Nobody hand-writes a LEVI and a THEO anymore. And I'm not predicting that from the outside. I've been living it for months.

What survives is the layer underneath the names. Task-scoped context, so each piece of work only sees what it needs. Research before conclusions. Parallel work and then synthesis. Memory that gets pulled in when it's relevant. The plumbing outlives the org chart. It always does.

The Numbers

Same transparency as the last two posts:

  • Agents: started with seven, now around a dozen. Each one added only after the same kind of request showed up more than three times.
  • What the system remembers about me: more than a hundred markdown files, pulled in on demand, none of it in the model's head.
  • Playbooks: 13 reusable workflows the team shares.
  • Times I named an agent this week: close to zero.
  • Lines of personality I now suspect were mostly decorative: most of them.

Why This Is a Belief, Not Just a Project

One of the three things I say I believe, right here on this site, is that assumptions expire fast. I did not expect to prove it on my own system quite this quickly.

I built an org chart for my AI, and the best thing it taught me is that the org chart was scaffolding. Useful to climb. Not the building.

If you're building multi-agent systems right now, my one suggestion is to stop thinking in roles and start thinking in handoffs. Not "who do I hire" but "how does context flow, and who checks the result." That is the part still standing when the model starts putting on the hats and taking them off again, without bothering to ask you first.