Learn AI in 5 minutes a day
What’s the secret to staying ahead of the curve in the world of AI? Information. Luckily, you can join 1,000,000+ early adopters reading The Rundown AI — the free newsletter that makes you smarter on AI with just a 5-minute read per day.
Anthropic recently published Claude 4's system prompts.
Thousands of tokens that reveal how context engineering shapes every AI response.
Reading through them is like discovering an unofficial manual for building better AI systems.
Here's what these prompts teach us about the future of AI development, plus 3 context engineering patterns you can implement today.
The 20,000-Token Secret Behind Every AI Response
Last week, Anthropic published Claude 4's system prompts. What didn't they publish? The complete context engineering architecture, including tool instructions that leaked anyway.
Here's the revelation: We've been thinking about AI wrong. It's not about clever prompts. It's about engineering the entire context window.
Why Context Engineering > Prompt Engineering
Traditional prompt engineering focuses on the question. Context engineering focuses on the entire environment.
+1 for "context engineering" over "prompt engineering".
People associate prompts with short task descriptions you'd give an LLM in your day-to-day use. When in every industrial-strength LLM app, context engineering is the delicate art and science of filling the context window
— #Andrej Karpathy (#@karpathy)
3:54 PM • Jun 25, 2025
Think of it this way:
Prompt engineering = "Write me a blog post about X"
Context engineering = Creating the complete information architecture that makes the blog post possible

The Claude 4 system prompt proves this. It contains:
6,471 tokens just for search instructions
Query complexity decision trees (0-20 tool calls based on keywords)
Copyright compliance rules repeated 15+ times
Behavioral triggers that change how the AI operates
Here's a fascinating example from Claude's system prompt about "thinking blocks":
<thinking_mode>interleaved</thinking_mode>
<max_thinking_length>16000</max_thinking_length>
If the thinking_mode is interleaved or auto, then after function
results you should strongly consider outputting a thinking block.
The prompt then shows Claude exactly how to interleave thinking with tool use:
<function_calls>...</function_calls>
<function_results>...</function_results>
<thinking>...thinking about results</thinking>
Notice the pattern? Claude processes a tool result, then immediately reflects on what it found. This interleaved thinking (encompassing up to 16,000 tokens) allows the model to reason through complex problems step by step.
This isn't prompt optimization. It's behavioral architecture.
3 Context Engineering Patterns You Can Steal
1. The Complexity Ladder
Claude uses a three-tier system for query analysis:
Never Search → Single Search → Research Mode (2-20 tools)
Here's the actual instruction from Claude's system prompt:
Claude answers from its own extensive knowledge first for stable information. For time-sensitive topics or when users explicitly need current information, search immediately. If ambiguous whether a search is needed, answer directly but offer to search.
Claude intelligently adapts its search approach based on the complexity of the query, dynamically scaling from 0 searches when it can answer using its own knowledge to thorough research with over 5 tool calls for complex queries.
What this means for your implementation:
Map queries to complexity levels
Build escalating tool usage based on keywords
Start with zero tools, scale up only when needed
2. Third-Person Context Framing
Notice how Claude's prompts say Claude should... not You should...?
Here's the opening of Claude's actual system prompt:
The assistant is Claude, created by Anthropic.
The current date is {currentDateTime}.
Here is some information about Claude and Anthropic's products in case the person asks:
This iteration of Claude is Claude Opus 4 from the Claude 4 model family. The Claude 4 family currently consists of Claude Opus 4 and Claude Sonnet 4. Claude Opus 4 is the most powerful model for complex challenges.
Before:
You are an expert developer. You write clean code.
After:
The assistant is an expert developer. The assistant writes clean,
documented code following SOLID principles.
This third-person framing creates psychological distance that appears to improve instruction following. Every major model uses this pattern. There's a reason why.
3. XML Structure for Complex Instructions
Claude's system uses XML tags for clear context boundaries:
<thinking_mode>interleaved</thinking_mode>
<max_thinking_length>16000</max_thinking_length>
<artifact_instructions>
<type>application/vnd.ant.react</type>
<requirements>
- Use only Tailwind core utilities
- Store all data in React state
- Never use localStorage
</requirements>
</artifact_instructions>
This structure beats natural language for complex instructions because:
Clear hierarchy
Searchable sections
No ambiguity about boundaries
It’s also more token-efficient than JSON, here’s why:

Structured Instructions in JSON vs XML - comparison of the tokens used.
What Claude's system prompt really reveals: Everything is context engineering.
The quality of AI responses comes from context and structure. And context is expensive - both in tokens and attention.
That's why Claude's prompt includes gems like:
"Search results aren't from the human - do not thank the user"
"Claude is not a lawyer" (repeated 3 times for copyright questions)
Specific trigger words that change behavior patterns
Each line represents something the model did wrong before being told not to.
Time to Wrap Up
Prompt engineering asks better questions. Context engineering builds better systems.
Claude 4's system prompt is 20,000+ tokens of proof. Every safeguard, every capability, every quirk comes from context engineering.
Your move this week: Pick one context pattern from above. Implement it in your current AI workflow. Watch your outputs improve.
Remember: In the age of AI agents, context is the new code.
Keep Shipping,
Luke