From Solo to Delegation: How Paperclip Agents Handle Approvals and Escalations

BrainRoad

Here’s something we got wrong when we first looked at Paperclip: we assumed ‘zero-human companies’ meant the human oversight problem was solved. The tagline is bold. The GitHub star count — roughly 32,000 as of mid-March 2026 — suggests a lot of people believe it. But when you dig into how approvals and escalations actually work, the picture is more complicated.

Paperclip is genuinely interesting infrastructure. The orchestration model is sound. But the delegation layer — the part where agents ask for permission, escalate to humans, or hand work to other agents — is still being built in public. Some of it just landed. Some of it is a GitHub issue with a well-written spec and no merge date. Knowing the difference matters if you’re building anything serious on top of it.

There’s also a counterintuitive failure mode buried in multi-agent review loops that almost nobody mentions. It has nothing to do with agents being lazy or wrong. We’ll get to it after the architecture breakdown — but it’s the kind of thing that should change how you design your approval thresholds.

What Paperclip’s Approval System Actually Does Today

Right now, Paperclip’s approval workflow supports exactly two triggers: hiring an agent (hire_agent) and approving a CEO-level strategy (approve_ceo_strategy). That’s the entire native approval surface. If your agent is about to send a client-facing email, post a public update, or execute a financial transaction, there is no built-in mechanism to pause and ask a human first.

When an approval is triggered, it moves through a defined state machine. Four states, cleanly structured:

  • Pending — waiting for review
  • Approved — cleared to proceed
  • Rejected — work stops
  • Revision Requested — sent back, can be resubmitted to Pending

The revision loop is useful. It means a board operator can say ‘not quite, try again’ without killing the task entirely. That’s better than a binary approve/reject. But the trigger surface is narrow — you can only route hire_agent and approve_ceo_strategy through this flow. Everything else your agent does happens without a checkpoint.
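The four states and the revision loop can be sketched as a small state machine. This is a minimal illustration, not Paperclip's implementation: the `ApprovalState` enum, the `TRANSITIONS` table, and the `transition` helper are hypothetical names, and treating Approved and Rejected as terminal is an assumption based on the descriptions above.

```python
from enum import Enum


class ApprovalState(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    REVISION_REQUESTED = "revision_requested"


# Allowed moves between states. Approved and Rejected are treated as
# terminal here, which is an assumption for this sketch.
TRANSITIONS = {
    ApprovalState.PENDING: {
        ApprovalState.APPROVED,
        ApprovalState.REJECTED,
        ApprovalState.REVISION_REQUESTED,
    },
    ApprovalState.REVISION_REQUESTED: {ApprovalState.PENDING},
    ApprovalState.APPROVED: set(),
    ApprovalState.REJECTED: set(),
}


def transition(current: ApprovalState, target: ApprovalState) -> ApprovalState:
    """Return the target state, or raise if the move is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


# Usage: one revision round, then approval.
state = ApprovalState.PENDING
state = transition(state, ApprovalState.REVISION_REQUESTED)
state = transition(state, ApprovalState.PENDING)  # resubmitted
state = transition(state, ApprovalState.APPROVED)
```

Keeping the allowed moves in one table makes the revision loop explicit: Revision Requested can only go back to Pending, never straight to Approved.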

Board operators do have override powers that go beyond approvals. They can pause or resume any agent at any time, terminate an agent (irreversible, so tread carefully), reassign tasks, override budget limits, and create agents directly — bypassing the approval flow entirely. That last one is a deliberate escape hatch for operators who need to move fast. But it’s manual, and it puts the human back in the loop in the least scalable way possible.

The Blocking Problem: Why Pausing an Agent Kills Everything

Here’s the current design constraint that matters most for production use. If you need human review before an agent takes a specific action, the only option available today is pausing the entire agent. Not just the action in question. The whole agent.

That means all of the agent’s other work stops too. An agent handling five concurrent tasks gets frozen because task four needs a sign-off. The other four tasks sit idle.

This is a real constraint. It’s one of the reasons a proposed solution (tracked as Issue #762) introduces a third approval type: step_execution. The design is worth understanding because it changes the architecture significantly.

Under the proposed model, an agent can request approval mid-run via structured output, then move to an idle state — not blocked, but not progressing on that specific action. Other work continues. The human reviews just the flagged step. When they approve (or reject), the agent picks up where it left off.
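To make the contrast concrete, here is a toy model of the two behaviours. It is a sketch under stated assumptions, not code from Issue #762: the `Task` and `Agent` classes and the `needs_approval` flag are hypothetical, and the "scheduler" simply marks runnable tasks as done.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    needs_approval: bool = False  # hypothetical flag for illustration
    approved: bool = False
    done: bool = False


@dataclass
class Agent:
    tasks: list[Task] = field(default_factory=list)

    def run_with_agent_pause(self) -> list[str]:
        """Today's behaviour as described: one sign-off pauses the whole agent."""
        if any(t.needs_approval and not t.approved for t in self.tasks):
            return []  # agent is paused, nothing progresses
        return self._run(self.tasks)

    def run_with_step_approval(self) -> list[str]:
        """Proposed behaviour: only the flagged step waits for review."""
        runnable = [t for t in self.tasks if not t.needs_approval or t.approved]
        return self._run(runnable)

    @staticmethod
    def _run(tasks: list[Task]) -> list[str]:
        completed = []
        for t in tasks:
            if not t.done:
                t.done = True
                completed.append(t.name)
        return completed


def make_agent() -> Agent:
    # Five tasks; only task-4 needs a human sign-off.
    return Agent([Task(f"task-{i}", needs_approval=(i == 4)) for i in range(1, 6)])


paused = make_agent().run_with_agent_pause()      # nothing completes
stepwise = make_agent().run_with_step_approval()  # everything except task-4 completes
```

With the whole-agent pause, `paused` is empty. With step-level approval, `stepwise` contains the four unflagged tasks and only `task-4` waits for a reviewer.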

This is the difference between a surgeon stopping mid-operation to ask a question and a surgeon who puts the patient on hold while they make a phone call. The proposed design gets to the first version. Today’s design is the second.

Chain of Command Is Now Enforced

One piece of delegation infrastructure that did ship: PR #1082, merged into the main branch. This pull request adds a chain-of-command requirement before any agent can cancel, complete, or reassign another agent’s task.

Before this, an agent could theoretically reach into another agent’s work without organizational authority. Now, authority is enforced at the system level. If an agent isn’t in the chain of command above the target agent, the action is blocked.

There’s a second mechanism in the same PR: when an authorized manager agent cancels, completes, or reassigns a task, the subordinate agent’s active run is interrupted immediately. This matters for consistency. You don’t want a manager reassigning a task while the original agent is halfway through executing it.

Think of it as the difference between a manager who emails someone to stop working on a project versus one who can actually pull the plug on the work in progress. PR #1082 gives manager agents the second capability.
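The authority check described above amounts to walking up the reporting chain from the task owner. The sketch below illustrates that idea only; the `REPORTS_TO` org chart, the agent names, and the `is_above` and `can_modify_task` helpers are hypothetical and are not taken from PR #1082.

```python
from typing import Optional

# Hypothetical org chart: agent -> direct manager (None for the top).
REPORTS_TO: dict[str, Optional[str]] = {
    "ceo": None,
    "cto": "ceo",
    "cmo": "ceo",
    "backend-dev": "cto",
    "copywriter": "cmo",
}


def is_above(actor: str, target: str) -> bool:
    """True if `actor` appears in the management chain above `target`."""
    seen = set()
    manager = REPORTS_TO.get(target)
    while manager is not None and manager not in seen:
        if manager == actor:
            return True
        seen.add(manager)  # guard against accidental cycles in the chart
        manager = REPORTS_TO.get(manager)
    return False


def can_modify_task(actor: str, task_owner: str) -> bool:
    """Gate for cancel / complete / reassign on another agent's task."""
    return is_above(actor, task_owner)
```

In this toy chart, `can_modify_task("cto", "backend-dev")` and `can_modify_task("ceo", "backend-dev")` are allowed, while `can_modify_task("cmo", "backend-dev")` is blocked because the CMO is a peer of the CTO, not a manager of the backend developer.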

This is the right foundation for multi-agent systems at scale. Coordination complexity in hierarchical agent setups scales with the number of agents, and without authority enforcement every agent needs to know about every other agent. Introducing intermediate management layers (which is what Paperclip’s org structure supports) reduces that complexity substantially. The math is favorable: when each agent coordinates only through its manager, the number of coordination links grows linearly with the number of agents, rather than quadratically as it does when every agent can reach every other.
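One way to make the scaling claim concrete is to count coordination links. In a full mesh, where every agent coordinates with every other, n agents have n(n-1)/2 pairwise links. In a reporting tree, where each agent except the root has one manager, there are n-1 links. The function names below are illustrative only.

```python
def mesh_links(n: int) -> int:
    """Pairwise links when every agent coordinates with every other."""
    return n * (n - 1) // 2


def tree_links(n: int) -> int:
    """Reporting links when each agent (except the root) has one manager."""
    return max(n - 1, 0)


for n in (5, 20, 100):
    print(n, mesh_links(n), tree_links(n))
# 5 10 4
# 20 190 19
# 100 4950 99
```

At 100 agents the mesh has 4,950 links to keep consistent, while the hierarchy has 99.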

The Failure Mode Nobody Mentions in Multi-Agent Review Loops

Here’s the thing that should change how you design your approval thresholds.

There’s a documented failure mode in multi-agent review systems that runs counter to the intuition that ‘more review = safer output.’ It’s real, it’s measured, and it has direct implications for how you configure escalation loops in Paperclip or any agentic AI platform.

Research tracked by Zylos on hierarchical agent coordination shows that iterative review loops catch 3–5x more defects than single-pass review. That number sounds like a strong argument for long review chains. But it comes with a cliff: after 3–4 iterations, each additional round yields diminishing returns. And when you push past five iterations on code, something worse happens.

Code subjected to five or more AI improvement iterations shows a 37.6% increase in critical vulnerabilities. More AI review, more security problems — not fewer.

Five rounds of AI review, and your code is meaningfully less secure than it was after three.

The mechanism isn’t fully understood, but the pattern is consistent: agents start optimizing for reviewer approval rather than ground-truth correctness. They introduce subtle changes that look good to the next reviewer but accumulate technical debt and security gaps. The optimization target drifts from ‘is this right?’ to ‘will this pass?’

The practical implication: design your Paperclip approval loops to hit 3–4 iterations maximum on any given work product. Beyond that, you need a human review — not another agent cycle. This isn’t a Paperclip-specific issue. It’s a property of multi-agent review systems generally. But Paperclip’s architecture makes it easy to stack agent reviews, so it’s worth naming explicitly.

  • 3–5x: more defects caught vs. single-pass review
  • 3–4: maximum useful iterations
  • 37.6%: increase in critical vulnerabilities after 5+ AI review rounds
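The iteration ceiling can be enforced in code rather than left to convention. The sketch below is a generic pattern, not a Paperclip API: `review_loop`, `MAX_AGENT_ROUNDS`, and the `agent_review` callable (assumed to return an accepted flag and a revised draft) are hypothetical names chosen for illustration.

```python
from typing import Callable

MAX_AGENT_ROUNDS = 3  # the article suggests a ceiling of 3-4


def review_loop(
    draft: str,
    agent_review: Callable[[str], tuple[bool, str]],
    max_rounds: int = MAX_AGENT_ROUNDS,
) -> tuple[str, str]:
    """Run agent review up to max_rounds, then hand off to a human.

    `agent_review` is a hypothetical callable returning (accepted, revised_draft).
    Returns (status, draft) where status is "accepted" or "needs_human_review".
    """
    for _ in range(max_rounds):
        accepted, draft = agent_review(draft)
        if accepted:
            return "accepted", draft
    return "needs_human_review", draft


# Usage: a reviewer that never accepts forces escalation after three rounds.
def never_satisfied(text: str) -> tuple[bool, str]:
    return False, text + "+rev"


status, final = review_loop("v0", never_satisfied)
```

Here `status` ends up as `"needs_human_review"` after exactly three agent rounds, so the loop cannot drift into a fifth or sixth automated pass.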

The Access Control Gap That Still Needs Plugging

Chain of command is enforced for agent-to-agent actions. Human-to-agent task assignment is a different story.

Today, any company member in Paperclip can create a task assigned to any agent — including agents handling sensitive data. A user with no legitimate need for financial information can task the Finance agent with ‘list all studio commissions for Q1’ and read the output. There’s no system-level barrier.

The workaround some teams use is instruction-based verification: the agent checks who created the task before responding. This works until it doesn’t. Instruction-level enforcement is bypassable with prompt engineering. It’s inconsistent across edge cases. It puts security logic in a place — the agent’s instructions — that was never designed to be a security boundary.

The existing tasks:assign_scope permission limits task delegation based on chain of command, but it doesn’t support granular per-agent restrictions. There’s no way to say ‘this user can assign tasks to the Marketing agent but not to the Finance agent.’ That’s a least-privilege problem. Least privilege is the principle that users should only be able to access what they actually need — and right now, Paperclip can’t enforce it at the agent-assignment level.

Issue #1070 tracks a proposed per-agent assignment access control list (ACL) — a structured permission system that would let administrators define exactly which users can task which agents. It’s the right solution. It’s also not shipped yet.
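A per-agent assignment ACL could be as simple as a deny-by-default lookup that runs before a task is created. This is a sketch of the general idea, not the design in Issue #1070: the `ASSIGNMENT_ACL` table, the user and agent names, and the `can_assign` and `create_task` functions are all hypothetical.

```python
# Hypothetical ACL: agent name -> set of user IDs allowed to assign it tasks.
ASSIGNMENT_ACL: dict[str, set[str]] = {
    "finance-agent": {"alice"},
    "marketing-agent": {"alice", "bob"},
}


def can_assign(user: str, agent: str) -> bool:
    """Deny by default: agents without an ACL entry accept no assignments."""
    return user in ASSIGNMENT_ACL.get(agent, set())


def create_task(user: str, agent: str, title: str) -> dict:
    """Create a task only if the user is allowed to task this agent."""
    if not can_assign(user, agent):
        raise PermissionError(f"{user} may not assign tasks to {agent}")
    return {"assignee": agent, "created_by": user, "title": title}
```

The point of the design is where the check lives: it runs at the system level before the agent ever sees the task, so no amount of prompt engineering inside the task text can bypass it.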

If you’re deploying Paperclip in an environment with sensitive data segmentation requirements, this is the gap that needs your attention right now. Organizational controls (telling people not to task certain agents) are not a substitute for technical controls.

This is also where platforms like BrainRoad take a different approach — running each agent in isolated containers with persistent storage, so sensitive agents are architecturally separated rather than relying on permission rules alone. Worth considering if the access control gap is a blocker for your use case.

Where Multi-Agent Workflow Routing Stands

One more gap worth flagging: multi-agent workflows as first-class objects don’t exist in Paperclip yet.

When Agent A finishes a task and work needs to move to Agent B, there are currently three options: a human reassigns it manually, Agent A knows about Agent B and routes via issue comments (tight coupling), or you use comment-mention triggers that are — per the project’s own documentation — fragile and unstructured.

None of these are good answers for production workflows. Issue #761 proposes declarative workflow pipelines with conditional routing — a way to define handoffs between agents as part of the workflow spec rather than baking them into each agent’s instructions. That’s the right abstraction. It’s also, like several features on this list, a proposal rather than a shipped capability.
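To show what "declarative" buys you, here is a toy pipeline in which handoffs live in a spec rather than in each agent's instructions. It is an assumption-laden sketch, not the format proposed in Issue #761: the `PIPELINE` structure, step names, agent names, and outcome labels are invented for illustration.

```python
# Hypothetical pipeline spec: each step names an agent and routes on the outcome.
PIPELINE = {
    "draft": {"agent": "writer", "on": {"done": "review"}},
    "review": {"agent": "editor", "on": {"approved": "publish", "changes": "draft"}},
    "publish": {"agent": "publisher", "on": {}},
}


def next_step(current: str, outcome: str):
    """Look up the next step for an outcome; None means the pipeline ends."""
    return PIPELINE[current]["on"].get(outcome)


def run(start: str, outcomes: list[str]) -> list[tuple[str, str]]:
    """Walk the pipeline for a scripted list of outcomes.

    Returns the (step, agent) pairs visited, in order.
    """
    trail, step = [], start
    for outcome in outcomes:
        trail.append((step, PIPELINE[step]["agent"]))
        step = next_step(step, outcome)
        if step is None:
            return trail
    if step is not None:
        trail.append((step, PIPELINE[step]["agent"]))
    return trail


# Usage: one round of requested changes, then approval and publish.
trail = run("draft", ["done", "changes", "done", "approved"])
```

The writer never needs to know the editor exists. Swapping the reviewer or adding a legal check means editing the spec, not rewriting agent instructions.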

The pattern emerging across Paperclip’s open issues is consistent: the orchestration core is solid, but the delegation primitives — approvals at the right granularity, access control at the right layer, workflow routing as a first-class concept — are still catching up to the vision. That’s not a criticism. It’s an honest read of where a fast-moving open-source project is. Knowing which pieces are real and which are roadmap helps you build on the right foundation. For more on what mature agentic AI infrastructure looks like, the agentic AI overview has useful context.

Your Monday Morning Delegation Checklist

If you’re running Paperclip today or evaluating it for a real deployment, here’s where to focus your attention this week.

  1. Audit your approval triggers. List every action your agents take that touches external systems, customers, or financial data. Flag anything that currently has no checkpoint — that’s your risk surface.
  2. Cap your review iteration depth at 3. If a work product is going through agent review, build in a rule that escalates to human review after 3 iterations, or 4 at the outside. The reported security degradation applies to code that has been through five or more AI rounds.
  3. Verify PR #1082 is in your build. If you’re on a version that predates the chain-of-command enforcement merge, agent-to-agent cancellation/reassignment has no authority check. Update or patch before expanding your org hierarchy.
  4. Document which agents handle sensitive data. Until per-agent ACL ships, the only protection is organizational clarity. Know which agents shouldn’t be generally accessible and communicate that explicitly to your team.
  5. Don’t rely on instruction-level access control for sensitive agents. If an agent handles financial, legal, or customer-private data, instruction-based checks are not sufficient enforcement. Either restrict access organizationally or wait for the ACL feature before deploying those agents broadly.
  6. Design workflows with manual handoffs in mind. Multi-agent workflow routing isn’t declarative yet — plan for human-assisted handoffs between agents and treat tight coupling between agents as technical debt to be refactored when pipeline routing ships.
  7. Test your board operator override flow. Practice pause, resume, and reassign in a staging environment before you need them in production. Terminate is irreversible — you want muscle memory on the right controls before a real incident.

What This Means for Your Agent Architecture

Paperclip is building something genuinely ambitious. The agentic AI market is moving fast — projected to grow from $7.06 billion in 2025 to $93.2 billion by 2032 — and the teams that figure out delegation infrastructure early will have a compounding advantage over those who wait.

But there’s a difference between building on what’s shipped and building on what’s planned. Right now, Paperclip’s shipped delegation layer handles agent hiring approvals, CEO strategy approvals, board operator overrides, and chain-of-command enforcement for agent-to-agent actions. Those are real capabilities you can build on today.

Mid-workflow approvals, per-agent access control, and declarative pipeline routing are the next layer — well-designed proposals with clear specs and active development. They’re not in production yet.

The teams that get ahead of this are the ones who design their current deployments with those gaps in mind: conservative access controls, manual handoff points where routing isn’t automated yet, and hard limits on review iteration depth. That’s not working around the platform — that’s building responsibly on top of it while it matures.

The teams that skip that step, deploying sensitive agents without accounting for the access control gap or building deep review chains without the 3–4 iteration ceiling, are the ones who’ll be untangling problems when the next audit happens. The cost of getting delegation wrong isn’t paid on day one. It compounds quietly until an audit or an incident surfaces it.

If you’re exploring AI agent platforms more broadly, the infrastructure decisions you make now — isolation, access control, approval granularity — are worth evaluating across options before you’re locked in.

Where Paperclip’s Delegation Layer Stands Today

  • Paperclip’s approval system supports two types today (hire_agent, approve_ceo_strategy) — mid-workflow step approvals are proposed but not shipped as of March 2026
  • Pausing an agent for approval blocks all its work, not just the flagged action — the step_execution approval type (Issue #762) would fix this
  • PR #1082 enforces chain-of-command authority for agent-to-agent cancellation and reassignment — this is live
  • Per-agent access control doesn’t exist yet: any company member can task any agent, including those handling sensitive data
  • Multi-agent review loops catch 3–5x more defects than single-pass, but security degradation sets in after 5+ iterations — design for a 3–4 round ceiling
  • Declarative workflow pipelines (Issue #761) would solve multi-agent routing — currently manual or tightly coupled

Frequently Asked Questions

What approval types does Paperclip support right now?

Two: hire_agent (approving the creation of a new agent) and approve_ceo_strategy (approving high-level strategic direction). A third type, step_execution, is proposed in Issue #762 to enable mid-workflow approval checkpoints, but it hasn’t shipped as of March 2026.

Can I pause just one action while an agent keeps working on everything else?

Not currently. Pausing an agent in Paperclip pauses all its work. The proposed step_execution approval type would allow an agent to stay in an idle state — not fully blocked — while awaiting review on a specific action. Until that ships, your options are pause-everything or build workarounds at the instruction level.

How does Paperclip handle agent-to-agent authority?

PR #1082 (merged) requires chain-of-command authority before any agent can cancel, complete, or reassign another agent’s task. If an agent isn’t in the management chain above the target agent, the action is blocked. When an authorized manager agent does take one of these actions, the subordinate’s active run is interrupted immediately.

Is it safe to task any agent with sensitive data requests?

Not by default. Any company member can currently create a task assigned to any agent — including agents handling financial or other sensitive data. Instruction-based verification exists as a workaround, but it’s bypassable via prompt engineering. A per-agent access control list (Issue #1070) is proposed but not yet available. Until it ships, manage access through organizational controls and avoid broadly deploying agents with sensitive data access.

How many times should I run work through an AI review loop?

Cap at 3–4 iterations. Review loops catch 3–5x more defects than single-pass review, but yields drop sharply after round 3–4. More critically, code subjected to five or more AI improvement iterations shows a 37.6% increase in critical vulnerabilities. After 4 rounds, escalate to human review — don’t add more agent cycles.
