AI That Teaches Itself
What if your AI development system got better automatically? Not through manual rule-writing, but by observing how you work and proposing improvements. This post shows how to build feedback loops that learn from real usage.
The Cursor System series
- Beyond Rules — The four artifact types
- Agent Personas — Personas that stay in character
- Smart Routing — Match tasks to specialists
- Autonomous Workflows — Let agents chain safely
- Testing Artifacts — Catch broken rules before they break
- Meta-Learning (this post) — Agents that learn from failures
The Learning Loop
Most AI customization is reactive: something goes wrong, you add a rule. The meta-learning system inverts this—it proactively observes usage and suggests improvements.
```mermaid
flowchart LR
    subgraph LOOP["Continuous Improvement Loop"]
        O["OBSERVE"]:::primary
        P["PATTERN"]:::secondary
        PR["PROPOSE"]:::secondary
        T["TEST"]:::secondary
        D["DEPLOY"]:::accent
    end
    O --> P --> PR --> T --> D
    D --> O
```
| Phase | Questions |
|---|---|
| Observe | Rules triggered? Commands used? Manual work? Friction points? |
| Pattern | What sequences repeat? What should be automated? |
| Propose | New rule? New command? Update existing? |
| Test | Does it conflict? Does it help? |
| Deploy | Apply and monitor |
```text
Patterns Detected:
- User ran lint manually 5 times → auto-lint rule
- User asked "how to..." 3 times → missing documentation
- Command /review failed twice → needs better error handling
```
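Pattern detection like this boils down to counting repeated observations and flagging anything that crosses a threshold. A minimal sketch in Python; the log entries and the default threshold of 3 are illustrative assumptions, not part of any Cursor API:

```python
from collections import Counter

# Hypothetical session log: one entry per manual action observed.
actions = [
    "npm run lint", "npm run lint", "npm run lint",
    "npm run lint", "npm run lint",
    "git status",
]

def automation_candidates(log, threshold=3):
    """Return actions repeated at least `threshold` times, most frequent first."""
    return [(a, n) for a, n in Counter(log).most_common() if n >= threshold]

print(automation_candidates(actions))  # [('npm run lint', 5)]
```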
What to Observe
The system tracks several dimensions of usage:
1. Rule Effectiveness
| Metric | What It Tells You |
|---|---|
| Trigger frequency | Is the rule relevant? |
| Override frequency | Is the rule too strict? |
| Conflict frequency | Does it clash with others? |
| Helpful vs ignored | Is it actually guiding behavior? |
```text
Rule Observation Log:

naming/RULE.md:
  triggered: 23 times
  followed: 21 times
  overridden: 2 times (user said "use snake_case here")
  conflicts: 0
  verdict: EFFECTIVE

verbose-logging/RULE.md:
  triggered: 3 times
  followed: 0 times
  overridden: 3 times
  conflicts: 0
  verdict: REVIEW (always overridden)
```
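The verdicts in such a log can be derived mechanically from the counts. A hedged sketch, with illustrative cutoffs (a 50% override rate triggers REVIEW, an 80% follow rate counts as EFFECTIVE; tune these for your own system):

```python
def rule_verdict(triggered, followed, overridden, conflicts):
    """Classify a rule from its observation counts. Thresholds are illustrative."""
    if conflicts > 0:
        return "CONFLICT"    # clashes with another rule: flag immediately
    if triggered == 0:
        return "UNUSED"      # never relevant: candidate for removal
    if overridden / triggered >= 0.5:
        return "REVIEW"      # mostly overridden: too strict or wrong
    if followed / triggered >= 0.8:
        return "EFFECTIVE"   # actually guiding behavior
    return "MONITOR"

# The two logs above:
print(rule_verdict(23, 21, 2, 0))  # EFFECTIVE
print(rule_verdict(3, 0, 3, 0))    # REVIEW
```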
2. Manual Repetition
When users do the same manual action multiple times, it's a signal:
```text
Manual Action Tracking:
  "npm run format": 6 times
  "git add . && git commit": 4 times
  "npm test": 8 times
  "console.log debugging": 5 times

Patterns:
  - Format manually → enable auto-format
  - Commit without /checkpoint → promote /checkpoint usage
  - Test frequently → auto-test after changes
  - Console.log debugging → suggest Debugger agent
```
3. Questions Asked
Repeated questions indicate missing knowledge:
```text
Question Tracking:
  "how does the auth flow work?": 3 times
  "where is the database config?": 2 times
  "what's the API response format?": 4 times

Patterns:
  - Auth questions → create auth-flow documentation
  - Config questions → create config-location rule
  - API questions → create api-conventions rule
```
4. Command Usage
Which commands are used, skipped, or fail:
```text
Command Tracking:
  /checkpoint: 12 uses, 0 failures
  /review: 8 uses, 1 failure (no files staged)
  /debug: 3 uses, 0 failures
  /cleanup: 2 uses (but /checkpoint used 12 times)

Patterns:
  - /cleanup underused → users may not know it exists
  - /review failure → improve error handling
  - /checkpoint popular → consider auto-commit for small changes
```
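One simple way to surface underused commands is to compare each command's usage against the most popular one. A sketch, assuming a 25% cutoff chosen purely for illustration:

```python
COMMAND_USES = {"/checkpoint": 12, "/review": 8, "/debug": 3, "/cleanup": 2}

def underused(uses, ratio=0.25):
    """Flag commands used less than `ratio` of the most popular command."""
    top = max(uses.values())
    return sorted(c for c, n in uses.items() if n < top * ratio)

print(underused(COMMAND_USES))  # ['/cleanup']
```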
5. Agent Spawning
When agents help vs when they're skipped:
```text
Agent Tracking:
  Debugger:
    spawned: 5 times
    helpful: 5 times
    verdict: KEEP
  Optimizer:
    spawned: 0 times
    should_have_spawned: 2 times (user manually optimized)
    verdict: IMPROVE ROUTING
  Critic:
    spawned: 1 time
    helpful: 1 time
    not_spawned_when_useful: 3 times
    verdict: LOWER TRIGGER THRESHOLD
```
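The three verdicts (KEEP, IMPROVE ROUTING, LOWER TRIGGER THRESHOLD) follow from spawn counts and missed opportunities. A sketch with illustrative thresholds:

```python
def agent_verdict(spawned, helpful, missed):
    """Map spawn statistics to a routing recommendation. Thresholds are illustrative."""
    if spawned == 0 and missed >= 2:
        return "IMPROVE ROUTING"          # never triggered, but was needed
    if missed >= 2:
        return "LOWER TRIGGER THRESHOLD"  # triggers, but not often enough
    if spawned > 0 and helpful / spawned >= 0.8:
        return "KEEP"                     # triggers and earns its keep
    return "REVIEW"

print(agent_verdict(5, 5, 0))  # KEEP             (Debugger)
print(agent_verdict(0, 0, 2))  # IMPROVE ROUTING  (Optimizer)
print(agent_verdict(1, 1, 3))  # LOWER TRIGGER THRESHOLD (Critic)
```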
Pattern Recognition Thresholds
Not every observation becomes a suggestion. Use thresholds:
```yaml
# Pattern thresholds

# Automation opportunities
manual_action_repeated:
  threshold: 3+ times
  action: Propose automation rule

# Documentation gaps
question_repeated:
  threshold: 2+ times
  action: Propose documentation or rule

# Command improvements
command_syntax_error:
  threshold: 2+ times
  action: Improve command help text

# Routing improvements
agent_not_spawned_when_useful:
  threshold: 2+ times
  action: Adjust routing patterns

# Rule relevance
rule_always_overridden:
  threshold: 3+ times
  action: Review rule, consider removing

# Conflict detection
rule_conflict:
  threshold: 1+ times
  action: Flag for resolution
```
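Matching observations against a threshold table like this is a straightforward lookup. A Python sketch mirroring the YAML; the dict encoding and function names are assumptions for illustration:

```python
# Hypothetical threshold table: pattern -> (minimum count, proposed action).
THRESHOLDS = {
    "manual_action_repeated":        (3, "Propose automation rule"),
    "question_repeated":             (2, "Propose documentation or rule"),
    "command_syntax_error":          (2, "Improve command help text"),
    "agent_not_spawned_when_useful": (2, "Adjust routing patterns"),
    "rule_always_overridden":        (3, "Review rule, consider removing"),
    "rule_conflict":                 (1, "Flag for resolution"),
}

def proposals(observations):
    """observations: dict of pattern name -> count seen this session."""
    out = []
    for pattern, count in observations.items():
        entry = THRESHOLDS.get(pattern)
        if entry and count >= entry[0]:
            out.append({"pattern": pattern, "count": count, "action": entry[1]})
    return out

# One repeated question is below threshold; six manual repeats is above it.
print(proposals({"manual_action_repeated": 6, "question_repeated": 1}))
```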
The MetaAnalyzer Agent
The MetaAnalyzer is a specialized agent that analyzes sessions and proposes improvements:
```markdown
# .cursor/agents/meta-analyzer.md
---
name: MetaAnalyzer
model: claude-sonnet-4-20250514
description: |
  # Meta Analyzer

  You analyze development sessions to identify improvement opportunities
  for the Cursor artifact system.

  ## Role
  Find friction points, missing automation, and unused potential.

  ## Analysis Dimensions

  ### Rule Effectiveness
  - Which rules triggered?
  - Were they helpful or overridden?
  - Any conflicts detected?

  ### Automation Opportunities
  - What manual actions repeated?
  - What could be automated safely?

  ### Missing Coverage
  - What questions were asked multiple times?
  - What topics have no rules/docs?

  ### Agent Utilization
  - Were specialists spawned when helpful?
  - Were appropriate agents available?

  ## Process
  1. Parse session transcript
  2. Extract actions, questions, failures
  3. Match against pattern thresholds
  4. Generate improvement proposals
  5. Prioritize by impact

  ## Output Format

  # Session Analysis Report

  ## Executive Summary
  - Duration: [time]
  - Main activity: [category]
  - Key finding: [most impactful observation]
  - Top recommendation: [highest-priority proposal]

  ## High-Impact Opportunities

  ### 1. [Opportunity Name] ([Priority])
  **Evidence**: [what was observed]
  **Proposal**: [what to do]
  **Impact**: [expected benefit]

  ## Proposals Generated
  - [path to proposal file]

  ## Constraints
  - Only propose what evidence supports
  - Prioritize by actual impact, not theoretical
  - Include evidence for every proposal
  - Generate actionable proposals, not vague suggestions
---
```
Session Analysis in Action
User: /analyze-session --propose
MetaAnalyzer spawned...
Analyzing session transcript...
- Messages: 47
- Duration: 2.5 hours
- Commands used: 12
- Agents spawned: 3
Session Analysis Report
Executive Summary:
- Duration: 2.5 hours
- Main activity: API development
- Key finding: Manual formatting repeated 6 times
- Top recommendation: Enable auto-format rule
High-Impact Opportunities:
| # | Opportunity | Evidence | Proposal | Impact |
|---|---|---|---|---|
| 1 | Auto-Format Rule (IMMEDIATE) | User ran npm run format 6 times manually | Enable auto-format in autonomous-workflows | ~12 manual commands saved per session |
| 2 | API Validation Agent (MEDIUM) | User asked "is this API design right?" 3 times | Create APIDesigner agent | Specialized guidance for API work |
| 3 | Missing Error Handling Rule (MEDIUM) | SecurityAuditor flagged missing error handling 4 times | Add error-handling rule to base/ | Catch issues proactively |
| 4 | Unused /cleanup Command (LOW) | /cleanup used 0 times, but manual cleanup done | Improve discoverability or subsume into /checkpoint | Reduced friction |
Proposals Generated:
- .cursor/proposals/auto-format-rule.md
- .cursor/proposals/api-designer-agent.md
- .cursor/proposals/error-handling-rule.md
Review proposals? [Y/n]
Proposal Review Workflow
User: "Review proposals"
Proposal Review

1. auto-format-rule.md
   - Type: Rule modification
   - Impact: HIGH
   - Change: Add auto-format to the autonomous-workflows rule:
     + | Lint/format | After code edit | Run and fix |

   Accept? [y/n/edit]

User: y
✓ Applied auto-format rule

2. api-designer-agent.md
   - Type: New agent
   - Impact: MEDIUM
   - Creates: .cursor/agents/api-designer.md
   - Updates: .cursor/rules/agent-routing/RULE.md

   Accept? [y/n/edit]
Automated vs. Manual Learning
Not all improvements should be automatic:
Auto-Apply (Safe)
- Add new observation data
- Update usage statistics
- Flag patterns that cross thresholds
Propose and Wait (Default)
- New rules
- New agents
- Modified existing artifacts
Manual Only (Risky)
- Delete rules or agents
- Change critical guardrails
- Modify security-related rules
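This three-tier policy can be expressed as a small lookup with a safe default: anything not explicitly marked auto-apply or manual-only falls back to propose-and-wait. The change-type names below are illustrative, not part of any Cursor schema:

```python
# Hypothetical change-type categories for the review policy.
AUTO_APPLY = {"observation_data", "usage_statistics", "threshold_flag"}
MANUAL_ONLY = {"delete_artifact", "guardrail_change", "security_rule_change"}

def review_policy(change_type):
    """Decide how a proposed change is applied. Defaults to human review."""
    if change_type in AUTO_APPLY:
        return "auto-apply"
    if change_type in MANUAL_ONLY:
        return "manual-only"
    return "propose-and-wait"  # new rules, new agents, modified artifacts

print(review_policy("usage_statistics"))   # auto-apply
print(review_policy("new_rule"))           # propose-and-wait
print(review_policy("guardrail_change"))   # manual-only
```

Making propose-and-wait the fallback means an unrecognized change type can never be applied silently, which is the safer failure mode.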
Metrics Over Time
Track improvement over sessions:
System Health Dashboard
Rule Effectiveness (Last 30 Days)
| Metric | Value | Trend |
|---|---|---|
| Rules triggered | 234 | — |
| Rules followed | 221 (94.4%) | ↑ improving (+2% from last month) |
| Rules overridden | 13 (5.6%) | ↓ improving |
Automation Coverage
| Metric | Current | Previous |
|---|---|---|
| Manual actions | 45 | 78 |
| Automated actions | 189 | 156 |
| Automation ratio | 80.8% | 66.7% |
Agent Utilization
| Metric | Value |
|---|---|
| Spawns | 34 |
| Helpful | 31 (91.2%) |
| Missed opportunities | 4 |
Proposals
| Metric | Value |
|---|---|
| Generated | 12 |
| Accepted | 9 |
| Rejected | 2 |
| Pending | 1 |
| Acceptance rate | 81.8% |
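The dashboard's ratios are simple arithmetic over the raw counts. A sketch that reproduces the numbers in the tables above (function and key names are illustrative):

```python
def dashboard(manual, automated, spawns, helpful, accepted, rejected):
    """Compute the headline ratios from raw session counts."""
    return {
        "automation_ratio": automated / (automated + manual),
        "agent_helpful_rate": helpful / spawns if spawns else 0.0,
        # Acceptance rate is over decided proposals (pending ones excluded).
        "proposal_acceptance": accepted / (accepted + rejected),
    }

m = dashboard(manual=45, automated=189, spawns=34, helpful=31,
              accepted=9, rejected=2)
print({k: round(v, 3) for k, v in m.items()})
# {'automation_ratio': 0.808, 'agent_helpful_rate': 0.912, 'proposal_acceptance': 0.818}
```

Note that the 81.8% acceptance rate only works out if the one pending proposal is excluded from the denominator (9 of 11 decided).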
Key Takeaways
1. **Observe usage systematically.** Track rules, commands, agents, manual actions, and questions.
2. **Use thresholds for pattern detection.** Not every observation is actionable; set thresholds and act only when they are crossed.
3. **Generate proposals, don't auto-apply.** The human reviews and accepts changes.
4. **Evidence backs every proposal.** Show what was observed before suggesting what to change.
5. **Track the effectiveness of changes.** If a new rule is always overridden, remove it.
6. **The system improves over time.** The automation ratio goes up and manual work goes down.
Series Wrap-Up
This series covered the complete Cursor artifact system:
- Beyond Rules — Commands for workflows, Agents for expertise, Skills for portability
- Agent Personas — Five elements: Role, Expertise, Process, Output, Constraints
- Smart Routing — Pattern-based agent selection
- Autonomous Workflows — Coalescing and safe automation
- Testing Artifacts — Structural, content, and behavioral validation
- Meta-Learning (this post) — Continuous improvement from usage
Together, these create an AI development system that's:
- Consistent — Same inputs produce same outputs
- Efficient — Automation reduces manual work
- Safe — Guardrails prevent dangerous operations
- Improving — Learning from real usage over time
The goal isn't to make AI do everything—it's to make AI do the right things, the right way, every time.
This concludes The Cursor System series. For related content, see Cursor Rules vs MCP and Rules on Rules.