Goals Matter More Than Code
How misaligned objectives between compilers, LLMs, and humans created 14 failures—and what we learned about designing for intent.
The Question Nobody Asked
When we started building the persona generation system, we jumped straight to implementation. Parse the input. Build the spec. Compile to JSON. Deploy.
We never stopped to ask: what is each component actually trying to accomplish?
- What does the compiler want?
- What does the LLM want?
- What does the API want?
- What does the user want?
These goals aren't the same. And when they conflict, systems break in ways that are hard to debug — because the code is "correct" by each component's definition of correct.
Four Actors, Four Goals
The Compiler's Goal: Structural Validity
The compiler has one job: ensure the output conforms to its type definitions.
interface Widget {
name: string;
type: number;
[key: string]: unknown;
}
The compiler asks: "Does this object have a name that's a string and a type that's a number?"
If yes, the compiler is satisfied. It doesn't care if:
- The `name` is empty (`""`)
- The `type` is a valid widget type ID
- The nested config is what the API expects
- The output will actually work at runtime
The compiler's goal is syntactic correctness, not semantic correctness.
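The gap is easy to demonstrate with the Widget interface above (repeated here so the snippet is self-contained): both of the following objects typecheck, but only one would do anything useful at runtime.

```typescript
interface Widget {
  name: string;
  type: number;
  [key: string]: unknown;
}

// Structurally valid AND semantically useful
const good: Widget = {
  name: "conversationSettings",
  type: 39,
  conversationSettings: { welcomeMessage: "Hello!" },
};

// Structurally valid but semantically broken. The compiler cannot tell
// the difference: an empty name and a nonsense type ID still match the shape.
const bad: Widget = {
  name: "",   // API will reject an empty name
  type: -1,   // not a real widget type ID
};
```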
The API's Goal: Contract Enforcement
The API has a different goal: reject anything that doesn't match its internal expectations.
The API's contract (partially documented, mostly implicit):
- Widget `name` must be non-empty and match a known widget type
- Widget config must be nested under a key matching the `name`
- Actions must have `version`, `displaySettings`, `typeArguments`, and `tools`
- Workflow name must match the persona's existing workflow (if updating)
The API doesn't care about TypeScript types. It cares about runtime invariants.
The API's goal is runtime safety, enforced through validation.
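A sketch of the kind of runtime check the API performs on the widget invariants listed above. `validateWidget` and `KNOWN_WIDGET_NAMES` are illustrative names, not the platform's actual code:

```typescript
// Illustrative stand-in for the API's internal widget registry.
const KNOWN_WIDGET_NAMES = new Set(["conversationSettings", "voiceSettings"]);

interface Widget {
  name: string;
  type: number;
  [key: string]: unknown;
}

// Runtime invariants the type system never sees: non-empty name, known name,
// config nested under a key matching the name.
function validateWidget(w: Widget): string[] {
  const errors: string[] = [];
  if (w.name.length === 0) errors.push("widget name must be non-empty");
  if (!KNOWN_WIDGET_NAMES.has(w.name)) errors.push(`unknown widget name: "${w.name}"`);
  if (w.name && w[w.name] === undefined) {
    errors.push(`config must be nested under "${w.name}"`);
  }
  return errors;
}
```

Every one of these checks passes the compiler but can fail at the API boundary, which is exactly where the two goals diverge.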
The LLM's Goal: Pattern Completion
When an LLM generates code or JSON, its goal is: produce output that looks like what it's seen before.
Given a prompt like "generate a voice AI persona config," the LLM:
- Recalls patterns from training data
- Completes the most likely structure
- Fills in plausible values
The LLM doesn't validate against a spec. It doesn't know the API contract. It produces what seems right based on statistical likelihood.
The LLM's goal is plausibility, not correctness.
The User's Goal: Outcome Achievement
The user doesn't care about any of this. Their goal: "I want a working Voice AI that handles sales calls."
They don't care if:
- The widget format is `widget_config` vs. `[widgetName]`
- The action namespace is `["search", ...]` vs. `["actions", ...]`
- The workflow name matches an internal identifier
They care: does it work when I call it?
The user's goal is functional outcome, not structural correctness.
How Goal Misalignment Caused Failures
Failure 1: Empty Widget Configs
What happened:
// Compiler produced this
const widget = {
name: "conversationSettings",
type: 39,
conversationSettings: {}, // Empty
};
Goal analysis:
| Actor | Was their goal met? | Why? |
|---|---|---|
| Compiler | Yes | Valid TypeScript object |
| API | Yes | Non-empty name, valid type |
| LLM | Yes | Plausible structure |
| User | No | Voice AI has no greeting, no identity |
The user's persona worked technically but said nothing useful. The welcome message was undefined. The identity was blank.
Root cause: The compiler's goal (valid structure) was met. The user's goal (functional outcome) was not. Nobody's goal included "populate sensible defaults."
Failure 2: Wrong Namespace
What happened:
// Compiler's namespace mapping
const ACTION_NAMESPACES = {
search: ["search", "emainternal"], // Wrong
};
// Produced
{ "action": { "name": { "namespaces": ["search", "emainternal"], "name": "search" } } }
Goal analysis:
| Actor | Was their goal met? | Why? |
|---|---|---|
| Compiler | Yes | String array is valid |
| API | No | Namespace doesn't exist in action registry |
| LLM | N/A | Didn't generate this |
| User | No | Workflow fails to deploy |
Root cause: The compiler's type system can't express "this string array must be one of these specific allowed values." The API enforces this at runtime. The goals diverged at the boundary.
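TypeScript can, in fact, express "one of these specific allowed values" when the allowed set is known at build time, via a `const` assertion and a literal union. A sketch (the namespace values here are illustrative; the real allowed set lives in the API's action registry):

```typescript
// Encode the registry's allowed namespaces as literal types so the compiler
// can reject unknown values. Values are illustrative, not the real registry.
const ALLOWED_NAMESPACES = ["ema", "actions", "workflows"] as const;
type AllowedNamespace = (typeof ALLOWED_NAMESPACES)[number]; // "ema" | "actions" | "workflows"

interface ActionName {
  namespaces: AllowedNamespace[];
  name: string;
}

const ok: ActionName = { namespaces: ["actions"], name: "search" };

// const bad: ActionName = { namespaces: ["emainternal"], name: "search" };
//   ^ compile error: "emainternal" is not assignable to AllowedNamespace
```

The caveat is the root cause above in miniature: this only works when the allowed set is known at compile time. If the registry lives on the server, a runtime check against it is the only real bridge.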
Failure 3: Workflow Name Mismatch
What happened:
// Compiler generated
const workflowDef = {
workflowName: {
name: { namespaces: ["ema", "workflows"], name: "sales_assistant" },
},
};
// API expected
// "workflowName must match the persona's existing workflow name"
Goal analysis:
| Actor | Was their goal met? | Why? |
|---|---|---|
| Compiler | Yes | Generated valid workflow structure |
| API | No | Name doesn't match existing workflow |
| LLM | N/A | Didn't generate this |
| User | No | Deploy rejected with cryptic error |
Root cause: The compiler's goal was "generate a workflow." The API's goal was "update an existing workflow." These are different operations with different requirements. The compiler had no concept of "existing state."
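One way to close this gap is to make existing state an explicit input to compilation instead of always generating a fresh name. A minimal sketch, assuming a `Persona` shape with a nullable `workflowName` (both are illustrative, not the platform's types):

```typescript
// Illustrative persona shape: workflowName is null for personas that
// have never been deployed with a workflow.
interface Persona {
  id: string;
  workflowName: string | null;
}

// Updating an existing workflow and creating a new one are different
// operations; the name resolution makes that distinction explicit.
function resolveWorkflowName(existing: Persona | null, generated: string): string {
  // Update path: reuse the persona's current workflow name, as the API requires.
  if (existing?.workflowName) return existing.workflowName;
  // Create path: the freshly generated name is fine.
  return generated;
}
```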
Failure 4: Confident Wrong (LLM Hallucination)
What happened:
When we asked the LLM to generate widget config:
{
"widget_name": "voiceSettings",
"widget_type_id": 38,
"widget_config": {
"voice_model": "eleven_labs_v2",
"language": "en-US"
}
}
Goal analysis:
| Actor | Was their goal met? | Why? |
|---|---|---|
| Compiler | Yes | Valid object shape |
| API | No | Wrong field names, hallucinated voice model |
| LLM | Yes | Output looks plausible |
| User | No | Config rejected |
Root cause: The LLM's goal is pattern completion. It completed a pattern that looked like widget config. The field names were wrong. The voice model doesn't exist. But statistically, the output was plausible.
Intent vs. Implementation: The Core Problem
Every failure shared a common pattern: the implementation was correct by its own definition, but incorrect by the user's definition.
flowchart TB
A["User Intent:<br/>'Create a Voice AI for sales'"]:::primary
A --> B["Implementation:<br/>parse → build → compile → deploy"]:::secondary
B --> C["Actual Outcome:<br/>'Empty persona, API rejects deploy'"]:::warning
The pipeline never asked: "What does the user actually need to happen?"
It asked: "What's the next valid transformation?"
Designing for Intent: The Fix
Step 1: Define Success at the User Level
Before writing any code, we defined what success looks like from the user's perspective:
Success criteria for "create Voice AI for sales":
1. Persona exists in the platform
2. Persona has type=voice
3. Welcome message greets the caller appropriately
4. Identity describes the sales purpose
5. Workflow can receive calls (has voice trigger)
6. Workflow can respond (has response generation)
Notice: none of these mention widget_config format or namespace arrays. These are outcome requirements, not structural requirements.
Step 2: Work Backward from Outcomes
For each outcome, we identified what must be true:
| Outcome | Structural Requirement | Who Enforces |
|---|---|---|
| Persona exists | `createAiEmployee()` succeeds | API |
| Has welcome message | `conversationSettings.welcomeMessage` populated | Our code |
| Valid workflow | All required action fields present | API |
| Callable | `voice_trigger` node exists | Our code |
Then we asked: who currently ensures each requirement?
For "valid workflow," the answer was: nobody. The compiler didn't know the required fields. The LLM didn't know the exact format. The API only rejected after the fact.
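The fix for "nobody" is a check that runs before the API sees anything. A sketch of that missing check for the required action fields named earlier (`missingActionFields` is an illustrative name):

```typescript
// Field names come from the API contract: every action must carry these.
const REQUIRED_ACTION_FIELDS = ["version", "displaySettings", "typeArguments", "tools"] as const;

// Return the required fields an action is missing, so the error message
// can say exactly what to fix instead of waiting for an API rejection.
function missingActionFields(action: Record<string, unknown>): string[] {
  return REQUIRED_ACTION_FIELDS.filter((f) => action[f] === undefined);
}
```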
Step 3: Assign Goals to Components
We explicitly assigned goals to each component:
flowchart TB
subgraph IP["INTENT PARSER"]
IP1["Goal: Extract requirements"]:::primary
IP2["Input: 'Voice AI for sales'"]:::secondary
IP3["Output: type, purpose"]:::secondary
end
subgraph TS["TEMPLATE SELECTOR"]
TS1["Goal: Find valid starting point"]:::secondary
TS2["Input: type"]:::secondary
TS3["Output: template_id"]:::secondary
end
subgraph CG["CONFIG GENERATOR"]
CG1["Goal: Produce settings"]:::primary
CG2["Input: purpose"]:::secondary
CG3["Output: welcomeMessage"]:::secondary
end
subgraph MG["MERGER"]
MG1["Goal: Combine configs"]:::secondary
MG2["Input: template + generated"]:::secondary
MG3["Output: merged config"]:::secondary
end
subgraph VL["VALIDATOR"]
VL1["Goal: Catch errors"]:::warning
VL2["Input: final config"]:::secondary
VL3["Output: valid or errors"]:::secondary
end
subgraph DP["DEPLOYER"]
DP1["Goal: Persist to platform"]:::accent
DP2["Input: validated config"]:::secondary
DP3["Output: persona_id"]:::secondary
end
IP --> TS --> CG --> MG --> VL --> DP
Each component has one goal. No component tries to do everything.
Step 4: Define Contracts Between Components
Instead of implicit assumptions, we made contracts explicit:
// Contract: Intent Parser → Template Selector
interface ParsedIntent {
personaType: "voice" | "chat" | "dashboard";
purpose: string;
requirements: string[];
}
// Contract: Template Selector → Config Generator
interface TemplateSelection {
templateId: string;
templateConfig: Record<string, unknown>; // Known-valid structure
}
// Contract: Config Generator → Merger
interface GeneratedConfig {
conversationSettings?: {
welcomeMessage: string;
identityAndPurpose: string;
};
// Only fields we explicitly generate
}
// Contract: Merger → Validator
interface MergedConfig {
widgets: Array<{
name: string;
type: number;
[key: string]: unknown;
}>;
// Must have all required fields
}
Now each component knows exactly what it receives and what it must produce.
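The VALIDATOR component then enforces the Merger → Validator contract before anything reaches the API. A minimal sketch against the `MergedConfig` shape above (`validateMerged` is an illustrative name):

```typescript
interface MergedConfig {
  widgets: Array<{
    name: string;
    type: number;
    [key: string]: unknown;
  }>;
}

// Catch the failures we actually hit: empty names and empty nested configs.
// Returning a list of errors (rather than throwing on the first) keeps the
// full context the API would otherwise discard.
function validateMerged(cfg: MergedConfig): string[] {
  const errors: string[] = [];
  for (const w of cfg.widgets) {
    if (w.name.length === 0) errors.push("widget has empty name");
    const nested = w[w.name];
    if (nested === undefined || Object.keys(nested as object).length === 0) {
      errors.push(`widget "${w.name}" has empty config`);
    }
  }
  return errors;
}
```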
The LLM's Role: Intent Interpretation, Not Structure Generation
The key insight: LLMs are good at understanding intent, not at producing exact structures.
Before: LLM Generates Structure
Prompt: "Generate a widget config for voice settings"
LLM Output: {
"widget_name": "voiceSettings",
"widget_type_id": 38,
"widget_config": { ... } // Wrong format, hallucinated values
}
Problem: The LLM is guessing at structure. It doesn't know the API contract.
After: LLM Interprets Intent, Template Provides Structure
Prompt: "What voice settings does a sales AI need?"
LLM Output: {
"welcomeMessage": "Hello, thank you for calling. I'm here to help with your sales inquiry.",
"identityAndPurpose": "You are a professional sales assistant focused on understanding customer needs.",
"speechCharacteristics": "Friendly, professional, patient"
}
Then: Merge these values into template's valid structure.
The LLM produces semantic content. The template provides structural correctness. Each component does what it's good at.
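The merge step itself can be sketched as a recursive deep merge: the template supplies the valid structure, and the LLM's generated values fill its slots without replacing that structure wholesale. A minimal version, assuming plain-object configs (`mergeIntoTemplate` is an illustrative name):

```typescript
type Config = Record<string, unknown>;

// Deep-merge generated values into the template. Objects are merged
// recursively so template structure survives; scalars and arrays from the
// generated side overwrite the template's placeholders.
function mergeIntoTemplate(template: Config, generated: Config): Config {
  const merged: Config = { ...template };
  for (const [key, value] of Object.entries(generated)) {
    const existing = merged[key];
    const bothObjects =
      existing !== null && typeof existing === "object" && !Array.isArray(existing) &&
      value !== null && typeof value === "object" && !Array.isArray(value);
    merged[key] = bothObjects
      ? mergeIntoTemplate(existing as Config, value as Config)
      : value;
  }
  return merged;
}
```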
Brownfield: Intent-Driven Transformation
For existing personas, the LLM's role is different: understand what change the user wants, then apply it.
Before: Keyword Matching
// Old approach: parse keywords
if (input.includes("add") && input.includes("search")) {
workflow.nodes.push(createSearchNode());
}
Problem: Brittle. "Include a lookup step" wouldn't match. Neither would "add knowledge base retrieval."
After: Intent Interpretation + Schema-Guided Transformation
// New approach: LLM understands intent, works with typed schema
const prompt = `
Current workflow has these nodes: ${JSON.stringify(currentSpec.nodes)}
User request: "${userInput}"
What changes are needed? Output a WorkflowSpec with the modifications.
The schema is:
${WORKFLOW_SCHEMA_FOR_LLM}
`;
const transformedSpec = await llm.generate(prompt);
const workflowDef = compileWorkflow(transformedSpec);
The LLM understands:
- "add search" = add a search node
- "include a lookup step" = add a search node
- "add knowledge base retrieval" = add a search node
All three map to the same outcome because the LLM interprets intent, not keywords.
Goal Alignment Checklist
When designing a pipeline with multiple actors (compiler, LLM, API, user), ask:
1. What is each actor's goal?
| Actor | Goal | How It's Achieved |
|---|---|---|
| User | Working outcome | Entire pipeline succeeds |
| LLM | Plausible output | Pattern completion |
| Compiler | Valid types | Type checking |
| API | Safe operations | Runtime validation |
2. Where do goals conflict?
- Compiler says "valid" but API says "invalid" → structural mismatch
- LLM says "plausible" but API says "doesn't exist" → hallucination
- User says "working" but system says "deployed" → outcome vs. status
3. Who ensures each requirement?
For every success criterion:
- Who checks this?
- When is it checked?
- What happens on failure?
If the answer is "nobody" or "too late," add a component.
4. What does each component NOT know?
- Compiler doesn't know: runtime contract, existing state, semantic meaning
- LLM doesn't know: exact API format, valid enum values, version requirements
- API doesn't know: user intent, what would fix the error, why you sent this
Design components to fill each other's gaps.
The Resulting Architecture
flowchart TB
UI["USER INTENT<br/>'Create a Voice AI for sales'"]:::primary
IP["INTENT PARSER (LLM)<br/>Understand what the user wants"]:::agent
TS["TEMPLATE SELECTION<br/>Get valid starting structure"]:::secondary
VG["VALUE GENERATION (LLM)<br/>Create persona-specific content"]:::agent
SM["STRUCTURE MERGE<br/>Combine template + values"]:::secondary
VL["VALIDATION<br/>Catch errors before API"]:::warning
DP["DEPLOYMENT<br/>Persist to platform"]:::secondary
UO["USER OUTCOME<br/>Working Voice AI"]:::accent
UI --> IP --> TS --> VG --> SM --> VL --> DP --> UO
- LLM handles: intent interpretation, content generation (semantic tasks)
- Templates handle: structural correctness (known-valid starting point)
- Deterministic code handles: merging, validation, deployment (mechanical tasks)
- User gets: working outcome, not just "deployed successfully"
Key Takeaways
1. Define success at the user level first
Don't start with "what does the compiler need?" Start with "what does the user need to see happen?"
2. Each component should have one goal
If a component is trying to "understand intent AND generate valid structure AND validate format," it will fail at something.
3. LLMs interpret intent; templates provide structure
Let each do what it's good at. LLMs understand "sales AI" → friendly greeting. Templates provide { name: "conversationSettings", type: 39 }.
4. Validate at boundaries, not just at the end
The API is the final validator. But if you only find errors there, you've lost all context about what went wrong and why.
5. Goals conflict — design for the conflicts
The compiler's goal (valid types) and the API's goal (valid runtime structure) are different. Acknowledge this. Build bridges between them.
Try This
Next time you're building a pipeline with multiple components:
- Write down each component's goal in one sentence
- Identify where goals conflict — these are your failure points
- Assign explicit contracts between components
- Ask "who ensures this?" for every success criterion
- Let each component do one thing well
The 14 failures we experienced weren't bugs in any single component. They were goal misalignments that nobody designed for.
Once we designed for intent — user intent as the top goal, component goals as subgoals that must align — everything worked.
This post is part of a series on AI-assisted development workflows. See Compiler-Generated vs. LLM-Orchestrated Code for the technical deep dive, and From Compiler Workflows to LLM Workflows for the conceptual foundation.