# The Multi-Agent Illusion
You ask your AI-powered IDE to "have the security auditor review this code." A specialized agent springs into action—separate from the main assistant, purpose-built for security analysis. Right? Not quite. How "agents" actually work is more nuanced than the terminology suggests, and understanding the mechanics changes how you should design with them.
## The Persona Lens Model series
- The Multi-Agent Illusion (this post) — What really happens when you "spawn" an agent
- Anatomy of a Persona Lens — Inside an agent definition file
- Designing with the Persona Lens Model — Practical patterns for persona-based systems
## The Common Assumption
Most developers have a mental model that looks something like this:
```mermaid
flowchart TB
    User[You] --> Main[Main AI]
    Main --> Security[Security Agent]
    Main --> Reviewer[Code Reviewer]
    Main --> Writer[Doc Writer]
```
The assumption: when you invoke an "agent," the system creates a new AI instance—a separate worker with its own context, running in parallel. The word "spawn" reinforces this.
The reality is more nuanced. There are actually two distinct models at play.
## Model 1: The Persona Lens (Same Context)
When you mention an agent in conversation or the main agent "thinks like" a specialist, you get the Persona Lens Model:
```mermaid
flowchart TB
    subgraph SAME["SAME CONTEXT WINDOW"]
        User["Your Message"]
        Context["Conversation History"]
        Lens["Persona Lens Applied"]
        LLM["Claude (Base Model)"]
        Output["Specialized Output"]
    end
    User --> Context --> Lens --> LLM --> Output
```
| Component | What Happens |
|---|---|
| Your Message | "Have the security auditor review this" |
| Context | Full conversation history stays loaded |
| Lens Applied | Agent file's prompt injected as instructions |
| LLM | Same Claude instance, same context window |
| Output | Shaped by persona, but shares memory with main conversation |
Key characteristics:
- No context isolation—agent sees everything from the conversation
- No fresh perspective—anchored on previous discussion
- Fast—no startup overhead
- Stateful within the conversation
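The mechanics can be sketched in a few lines. This is a hypothetical sketch, not Cursor's actual implementation; `llm`, `history`, and `agent_prompt` are assumed names:

```python
# Hypothetical sketch of the Persona Lens Model (not a real API):
# the agent file's prompt is injected into the SAME conversation, so
# the "specialist" answers with the full history still in view.

def apply_persona_lens(agent_prompt, history, user_message, llm):
    """Run one turn with a persona lens applied in the shared context."""
    messages = (
        [{"role": "system", "content": agent_prompt}]  # lens injected
        + history                                      # full history retained
        + [{"role": "user", "content": user_message}]  # same context window
    )
    return llm(messages)  # same model instance, output shaped by the persona
```

Note that nothing is spawned: the only change between "main assistant" and "security auditor" is the prompt prepended to the same message list.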
## Model 2: True Subagents (Isolated Context)
When Cursor delegates a task via the Task tool, you get True Subagents:
```mermaid
flowchart TB
    subgraph PARENT["PARENT CONTEXT"]
        Main["Main Agent"]
    end
    subgraph SUB1["SUBAGENT 1 (Fresh Context)"]
        P1["Prompt + Agent Definition"]
        C1["Claude Instance"]
        R1["Result"]
        P1 --> C1 --> R1
    end
    subgraph SUB2["SUBAGENT 2 (Fresh Context)"]
        P2["Prompt + Agent Definition"]
        C2["Claude Instance"]
        R2["Result"]
        P2 --> C2 --> R2
    end
    Main -->|"Task"| SUB1
    Main -->|"Task"| SUB2
    R1 -->|"Summary"| Main
    R2 -->|"Summary"| Main
```
| Component | What Happens |
|---|---|
| Task Delegation | Parent explicitly spawns subagent via Task tool |
| Fresh Context | Subagent starts clean—no conversation history |
| Isolated Window | Own context window, doesn't pollute parent |
| Parallel Execution | Multiple subagents can run simultaneously |
| Result Summary | Only final output returns to parent |
Key characteristics:
- True context isolation—subagent starts fresh
- Fresh perspective—no anchoring on failed attempts
- Parallel capable—multiple subagents run concurrently
- Higher overhead—separate context window startup
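Delegation can be sketched the same way (again with hypothetical names; this is not the Task tool's real interface):

```python
# Hypothetical sketch of true subagent delegation: each subagent gets a
# FRESH message list containing only its agent definition and its task,
# and only the final result returns to the parent.
from concurrent.futures import ThreadPoolExecutor

def run_subagent(agent_prompt, task, llm):
    """Spawn an isolated context: no parent conversation history."""
    messages = [
        {"role": "system", "content": agent_prompt},  # agent definition
        {"role": "user", "content": task},            # the task, nothing else
    ]
    return llm(messages)  # result summary returned to the parent

def delegate_parallel(agent_prompt, tasks, llm):
    """Run several subagents concurrently; the parent sees only results."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda t: run_subagent(agent_prompt, t, llm), tasks))
```

The isolation and the parallelism come from the same design choice: because each subagent's message list is built from scratch, none of them can see the parent's history or each other's work.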
## The "Apply Lens" Mechanism
Both models use the same agent definition file, but apply it differently. Here's what drives the lens:
```mermaid
flowchart TB
    subgraph FILE["AGENT DEFINITION FILE"]
        subgraph FM["Frontmatter (YAML)"]
            Name["name: security-auditor"]
            Model["model: inherit"]
            Desc["description: Security specialist..."]
            RO["readonly: true"]
            BG["is_background: false"]
        end
        subgraph BODY["Prompt Body (Markdown)"]
            Role["## Role<br/>You are a security auditor..."]
            Expertise["## Expertise<br/>OWASP, auth, crypto..."]
            Process["## Process<br/>1. Scope → 2. Model → 3. Analyze..."]
            Output["## Output Format<br/>### Findings..."]
            Constraints["## Constraints<br/>Never ignore vulnerabilities..."]
        end
    end
    FM --> CONFIG["Configuration"]
    BODY --> SYSTEM["System Prompt"]
    CONFIG --> APPLY["Apply Lens"]
    SYSTEM --> APPLY
```
### Frontmatter Fields (Configuration)
| Field | Purpose | Example |
|---|---|---|
| `name` | Identifier for invocation | `security-auditor` |
| `description` | When to use (the agent reads this to decide) | "Use for auth, payments, sensitive data" |
| `model` | Which model to use | `inherit`, `fast`, or a specific model |
| `readonly` | Restrict write operations | `true` for auditors |
| `is_background` | Run without blocking | `true` for long research |
### Prompt Body (System Instructions)
| Section | What It Does |
|---|---|
| Role | Establishes identity and perspective ("You are a skeptical security auditor") |
| Expertise | Defines knowledge boundaries (OWASP, STRIDE, crypto best practices) |
| Process | Step-by-step methodology (scope → model threats → analyze → report) |
| Output Format | Exact structure for responses (findings by severity, remediation) |
| Constraints | Explicit boundaries ("Never ignore potential vulnerabilities") |
## Real Example: Security Auditor
```markdown
---
name: security-auditor
description: |
  Security specialist. Use when implementing auth, payments,
  handling sensitive data, or reviewing code for vulnerabilities.
model: inherit
readonly: true
---

You are a **security auditor**. You think like an attacker
to protect like a defender.

## Core Philosophy

- Assume breach - Design with the assumption attackers will get in
- Defense in depth - Multiple layers of protection
- Least privilege - Minimum access needed for the task

## Process

1. **Scope**: Identify security-sensitive code paths
2. **Model threats**: Apply STRIDE framework
3. **Analyze**: Check for common vulnerabilities
4. **Validate**: Verify findings are exploitable
5. **Report**: Prioritize by severity with remediation

## Output Format

### Findings

#### [CRITICAL] Vulnerability Name
**Location**: file:line
**Impact**: What an attacker could do
**Remediation**: How to fix
```
## When You Get Which Model
| Trigger | Model Used | Context |
|---|---|---|
| Mention agent in chat ("ask security auditor") | Persona Lens | Shared |
| Agent auto-selected by routing | Persona Lens | Shared |
| Explicit Task delegation | True Subagent | Isolated |
| Built-in explore, bash, browser | True Subagent | Isolated |
| Background research tasks | True Subagent | Isolated |
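The routing above reduces to one small decision. A sketch with hypothetical trigger names (the actual routing logic is internal to the IDE):

```python
# Hypothetical routing sketch: explicit delegation, built-in tools, and
# background tasks get isolated contexts; everything else is a persona
# lens applied in the shared conversation.
ISOLATED_TRIGGERS = {"task", "explore", "bash", "browser", "background"}

def model_for(trigger):
    """Return (model, context) for a given trigger kind."""
    if trigger in ISOLATED_TRIGGERS:
        return ("true-subagent", "isolated")
    return ("persona-lens", "shared")
```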
## Cursor's Built-in Subagents
Cursor includes three built-in true subagents:
| Subagent | Purpose | Why Isolated |
|---|---|---|
| Explore | Codebase search | Generates noisy intermediate output |
| Bash | Shell commands | Command output is verbose |
| Browser | Web automation | DOM snapshots are large |
These always run in isolated context to prevent polluting the main conversation.
## The Mental Model Shift
| Persona Lens | True Subagent |
|---|---|
| Same context window | Fresh context window |
| Sees conversation history | Starts clean |
| Fast (no startup overhead) | Higher latency (new context) |
| Sequential only | Parallel capable |
| Anchored on prior discussion | Fresh perspective |
| Good for quick consultations | Good for verification, deep research |
Both produce specialized behavior. The difference is context isolation.
## Why Fresh Context Matters
After extended debugging, your main conversation has:
- 50 failed approaches in context
- Anchoring on initial hypothesis
- Context cluttered with error messages
A true subagent starts fresh:
- No knowledge of failed attempts
- No anchoring bias
- Approaches the problem from first principles
- Might spot what you've been staring past
This is why verification subagents work—they haven't been part of the journey, so they question everything.
## Practical Implications
For users:
- Use persona lens for quick consultations within a conversation
- Use explicit Task delegation when you need fresh eyes
- Understand that "agents" in the same conversation share context
For builders:
- "Multi-agent coordination" means different things depending on model
- Persona lens: sequencing prompt applications, merging outputs
- True subagents: managing isolated contexts, aggregating results
For evaluators:
- When a vendor claims "autonomous agents working together," ask:
  - Is it one context or many?
  - How do agents share state?
  - Can they run in parallel?
## Common Misconceptions
- **"This means multi-agent systems are fake"** — No. Both models produce real, differentiated behavior. The architecture just differs from what the terminology implies.
- **"Persona lens is inferior"** — Not at all. For quick consultations where you want context preserved, the persona lens is faster and more coherent.
- **"I should always use true subagents"** — No. Subagents have startup overhead and lose conversation context. Use them when you specifically need a fresh perspective or parallel execution.
## What to Do Next
- Know which model you're triggering: Mention in chat = persona lens. Explicit delegation = subagent.
- Use subagents for verification: Fresh eyes catch what familiarity misses.
- Read the next post: We'll dissect what's inside a persona file and how models apply it.
> "Agents are either persona lenses (same context) or true subagents (isolated context). Know which you're using."
Next: Anatomy of a Persona Lens — What's inside an agent definition file and how models apply it.
Related: Subagents: Fresh Eyes on Demand — Deep dive on context isolation and parallel execution.