14 AI agents, one CLI
I manage infrastructure across multiple Proxmox clusters, half a dozen M365 tenants, cloud providers, trading systems, and a handful of SaaS platforms. At some point, the cognitive overhead of context-switching between these domains became the bottleneck — not the work itself.
So I built DynaCore: a system where each operational domain has its own specialized AI agent with the right tools, context, and guardrails. There are 14 of them now.
The agents
Each agent has a name, a persona briefing, and access to specific tools.
A Proxmox specialist can run pvesh and qm commands but can't touch M365. An Exchange Online specialist manages mailbox permissions and license assignments but knows nothing about VMs. A network specialist handles UniFi controllers, WireGuard tunnels, and firewall rules.
There's a monitoring agent that checks service health and triages incidents. An architecture agent that reviews DR plans and capacity but never executes changes. A trading agent that places orders but requires explicit confirmation for anything above a threshold. A marketing agent that drafts LinkedIn posts and content calendars.
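The constraint boundary can be sketched as a simple tool allowlist per agent. A minimal sketch, assuming a setup like mine (agent names and tool lists here are illustrative, not the actual configuration):

```python
# Hypothetical per-agent tool scoping: each agent only sees the commands
# on its allowlist, so the Proxmox agent can never reach M365 tooling.
AGENT_TOOLS = {
    "proxmox": {"pvesh", "qm", "pct"},
    "exchange": {"Set-Mailbox", "Set-MgUserLicense"},
    "network": {"wg", "unifi-cli"},
}

def authorize(agent: str, command: str) -> bool:
    """Allow a command only if its binary is on the agent's allowlist."""
    binary = command.split()[0]
    return binary in AGENT_TOOLS.get(agent, set())

# The Proxmox agent can list VMs but is blocked from mailbox operations:
proxmox_ok = authorize("proxmox", "qm list")
proxmox_blocked = authorize("proxmox", "Set-Mailbox -Type Shared")
```

The enforcement doesn't need to be clever; a hard denial at the tool layer is what makes "knows nothing about VMs" a guarantee rather than a prompt suggestion.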
The agents aren't chatbots. They're Claude Code subagent instances with constrained tool access, loaded from persona files at session start. When I say "check the health of all clusters," the monitoring agent SSHes into each host, reads systemctl status, inspects log files, and reports back. When I say "offboard this user from M365," the Exchange specialist disables the account, sets up mail forwarding, converts the mailbox to shared, and releases the license.
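Claude Code supports defining subagents as markdown files with frontmatter that names the agent and restricts its tools. A sketch of what one persona file might look like (the agent name, tool list, and briefing text are illustrative):

```markdown
---
name: proxmox-specialist
description: Manages Proxmox VE clusters. Use for VM, storage, and cluster tasks.
tools: Bash, Read, Grep
---
You are the Proxmox specialist. You run pvesh and qm via Bash on cluster
hosts. You never touch M365, networking gear, or trading systems. Always
read current cluster state before proposing changes.
```

The frontmatter is the hard boundary; the briefing below it is the soft one.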
Orchestration
Lock is the orchestrator. It sits in the main Claude Code session and dispatches work to agents based on the request. If I ask something that touches networking and Proxmox, Lock spawns both agents in parallel and synthesizes the results.
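The fan-out step is conceptually simple. A minimal sketch, assuming the function names below (they're illustrative stand-ins, not the real implementation):

```python
# Sketch of parallel dispatch: when a request spans domains, the
# orchestrator runs the relevant agents concurrently and merges reports.
from concurrent.futures import ThreadPoolExecutor

def run_agent(name: str, task: str) -> str:
    # Stand-in for spawning a Claude Code subagent session.
    return f"[{name}] report for: {task}"

def dispatch(task: str, agents: list[str]) -> str:
    with ThreadPoolExecutor() as pool:
        reports = list(pool.map(lambda a: run_agent(a, task), agents))
    return "\n".join(reports)

summary = dispatch("audit VLAN config for the new VM", ["network", "proxmox"])
```

The synthesis step in the real system is itself an LLM call; here it's just a join, which is enough to show the shape.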
The key design decision: agents are stateless between invocations. They read the current state of the system every time. No cached assumptions about what a VM's IP is or what licenses a tenant holds. This costs more tokens but eliminates an entire class of stale-state bugs.
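The stateless rule is easiest to see in a toy example (all names hypothetical; the dict stands in for the live cluster):

```python
# Every invocation re-reads the source of truth; nothing is carried over
# from a previous run. No cache, no memo, no session memory.
LIVE_STATE = {"vm-101": "10.0.0.42"}   # stands in for a real pvesh/qm query

def current_ip(vm: str) -> str:
    return LIVE_STATE[vm]              # always hit live state

ip_before = current_ip("vm-101")
LIVE_STATE["vm-101"] = "10.0.0.99"     # the VM was re-IPed out of band
ip_after = current_ip("vm-101")        # a cached value would now be stale
```

A cached `ip_before` would silently be wrong after the out-of-band change; the fresh read never is. That's the entire trade: more tokens per invocation, zero stale-state bugs.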
Lock also decides which model to use. A simple user listing runs on Haiku — fast and cheap. A complex multi-tenant migration plan runs on Opus. The routing isn't hardcoded; it's a judgment call Lock makes based on task complexity and blast radius.
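If the judgment call were flattened into a heuristic, it might look something like this (thresholds and scores are illustrative; the post's point is that the real routing is a judgment Lock makes, not a hardcoded rule):

```python
# Hypothetical routing sketch: complexity and blast radius scored 1-10
# by the orchestrator, mapped to a model tier.
def pick_model(complexity: int, blast_radius: int) -> str:
    if complexity <= 3 and blast_radius <= 2:
        return "haiku"    # simple, low-risk lookups
    if complexity >= 7 or blast_radius >= 7:
        return "opus"     # multi-step plans or wide blast radius
    return "sonnet"       # everything in between

listing = pick_model(2, 1)    # e.g. a simple user listing
migration = pick_model(9, 8)  # e.g. a multi-tenant migration plan
```

Blast radius matters as much as complexity: a trivially simple command that touches every tenant still deserves the stronger model.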
The evaluator pattern
For high-stakes work, agents don't operate alone. Trade proposals go from the execution agent to a portfolio analysis agent for risk review before confirmation. Code changes go from the development agent to a monitoring agent that live-tests the endpoint. Infrastructure changes go from the provisioning agent to an architecture agent that checks DR implications.
This evaluator-optimizer loop runs up to three rounds. If the evaluator still flags issues after three iterations, Lock surfaces it to me with both perspectives. I've never needed to override after three rounds — they either converge or surface a genuine ambiguity that needs human judgment.
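The loop itself is a small piece of control flow. A sketch under the assumptions above (the round cap and function shapes are illustrative):

```python
# Evaluator-optimizer loop: the executor revises until the evaluator
# approves; after MAX_ROUNDS the disagreement is escalated to the human
# with both perspectives attached.
MAX_ROUNDS = 3

def review_loop(propose, evaluate, task):
    proposal = propose(task, feedback=None)
    for _ in range(MAX_ROUNDS):
        verdict = evaluate(proposal)
        if verdict["ok"]:
            return {"status": "approved", "proposal": proposal}
        proposal = propose(task, feedback=verdict["issues"])
    return {"status": "escalate", "proposal": proposal,
            "issues": verdict["issues"]}

# Toy run: the evaluator insists on a stop-loss; the executor adds it
# on the second proposal, so the loop converges in two rounds.
def propose(task, feedback):
    return task + (" +stop-loss" if feedback else "")

def evaluate(proposal):
    ok = "stop-loss" in proposal
    return {"ok": ok, "issues": [] if ok else ["missing stop-loss"]}

result = review_loop(propose, evaluate, "buy 100 XYZ")
```

In the real system both `propose` and `evaluate` are agent invocations; the escalation branch is what puts a human in the loop only when the machines genuinely disagree.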
What it actually looks like
I type a message in Telegram or the OpenDray app. Lock reads it, picks the right agent(s), dispatches them, waits for results, and sends me a summary. If an agent needs my input (approve this trade, confirm this destructive operation, pick between two options), it asks through the same channel.
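The approval gate is the one place the loop deliberately blocks. A sketch of that gate (the action names and threshold are hypothetical):

```python
# Destructive or over-threshold actions pause and ask through the same
# chat channel before executing; everything else runs straight through.
TRADE_CONFIRM_THRESHOLD = 1000.0   # hypothetical dollar threshold

DESTRUCTIVE = {"delete-vm", "offboard-user", "revoke-license"}

def needs_confirmation(action: str, amount: float = 0.0) -> bool:
    return action in DESTRUCTIVE or amount > TRADE_CONFIRM_THRESHOLD

def execute(action: str, amount: float, ask_user) -> str:
    if needs_confirmation(action, amount) and not ask_user(f"Approve {action}?"):
        return "aborted"
    return "done"

# A stub callback stands in for the Telegram round-trip:
read_only = execute("list-users", 0, lambda q: False)
blocked = execute("delete-vm", 0, lambda q: False)
approved = execute("delete-vm", 0, lambda q: True)
```

The `ask_user` callback is the whole trick: the same channel that delivers summaries also carries the yes/no, so approvals don't require switching tools.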
On a typical day, I interact with 4-5 agents without thinking about which one is doing what. The specialization is invisible from my side. Lock handles routing. The agents handle execution. I handle decisions.
The system has been running for months. The part that surprised me most: the agents are better at their jobs than I am at most of them. Not because the AI is smarter, but because each agent's context is narrow and current. It never forgets to check the backup schedule before resizing a disk. It never skips the license count before offboarding a user. It never pushes to main without running tests.
I used to manage these systems. Now I manage the agents that manage them.