Tuesday, May 12, 2026

Beyond the Prompt: Context Engineering


TL;DR

Context Engineering is the new discipline replacing traditional prompt engineering. Instead of massive, static prompts that lead to "context rot" and high costs, Context Engineering architects dynamic systems to feed Large Language Models (LLMs) only the necessary information at the right time. This is achieved through techniques like Query Rewriting, Active Memory Management (for key facts), and standardized tools like the Model Context Protocol (MCP) for connecting to external APIs. The focus shifts from talking to a model to building the world it lives in.

Apologies for the absence of a post last week, the day job and family holidays got in the way! In my previous post a couple of weeks ago I waffled on about Vibe Coding, just one aspect of AI that seems to be consigning "prompt engineering" to the past. If vibe coding is how we interact with the output of AI, Context Engineering is how we manage the input.

Context Engineering is the discipline of designing the architecture that feeds an LLM the right information at the right time. It is not about changing the model itself, but about building the bridges that connect it to the outside world, retrieving external data, connecting it to live tools, and giving it a memory.

From Prompts to Context

A line I've seen in a few articles on this topic: "if your prompt is a recipe, the model is your kitchen".

In traditional prompt engineering, you tried to cram everything into the recipe. You would write a massive prompt containing the persona, the task, the rules, and all the reference text. But models have a limited "context window" (i.e. their working memory). Overloading this window increases costs, slows down response times, and causes models to suffer from "context rot," where they forget important instructions.

Context engineering solves this by treating the prompt as a dynamic, living ecosystem. It acts like the mise en place for a chef, gathering only the exact ingredients and tools needed for the immediate task before cooking.

A Real Example

The Old Way (Static Prompting)

From yesteryear, as far back as 2024! We would employ a workflow that tried to solve the AI's lack of knowledge by cramming everything into a single, massive text box.

  • The Process: You build a 5,000-word system prompt that includes the persona instructions, the entire 50-page company return policy, and the complete transcript of the user's last 20 messages.

  • The Bottleneck: This approach relies on a static "retrieve, then generate" pipeline. As the conversation grows, the "context window" (the AI's active working memory) becomes overloaded. The model suffers from "context rot" or "context distraction": it begins to forget instructions buried in the middle of the prompt, hallucinations increase, and your API costs skyrocket because you are paying to process thousands of irrelevant tokens on every single turn.
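To make the cost problem concrete, here is a minimal sketch of the static approach. The function name `build_static_prompt` and the placeholder policy text are illustrative, not a real API; the point is that the entire policy and history are re-sent on every turn.

```python
# A sketch of static prompting: concatenate everything, every turn.
def build_static_prompt(persona: str, policy: str,
                        history: list, question: str) -> str:
    """Cram persona, full policy, and full chat history into one prompt."""
    return "\n\n".join([persona, policy, *history, f"User: {question}"])

persona = "You are a helpful support agent."
policy = "RETURN POLICY: " + ("... " * 2000)      # stands in for a 50-page document
history = [f"Turn {i}: ..." for i in range(20)]   # the last 20 messages, verbatim

prompt = build_static_prompt(persona, policy, history, "Where is my refund?")

# Rough token estimate (~4 characters per token). The whole policy and
# history are paid for again on every single turn of the conversation.
print(len(prompt) // 4)
```

Note that the token count here is dominated by the policy document, even though the user's question needs only a fraction of it: that gap is exactly what context engineering attacks.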

The New Way (Context Engineering Ecosystems)

In this new workflow, instead of a single prompt, we architect a dynamic ecosystem:

  • Query Rewriting: A frustrated user types, "How do I make this work when it keeps failing?" Instead of feeding this vague complaint to your main AI, a background "Query Rewriter" agent intercepts it. It analyzes the session and rewrites the hidden search to: "API call failure, troubleshooting authentication headers, rate limiting". This ensures the database retrieves the exact technical manual needed.

  • Active Memory Management: Instead of passing the entire chat history back to the model, an automated "Memory Manager" runs an ETL (Extract, Transform, Load) pipeline in the background. It extracts key facts (e.g., the fact {"shoe_size": 10} from a long conversation), consolidates them by deleting the user's old size 9 preference to avoid conflicting data, and stores the result in a Vector Database. On the next turn, the system only injects that single relevant fact into the prompt.

  • Standardized Tools (MCP): Instead of writing custom integration code for every API your agent needs to touch, you use the Model Context Protocol (MCP). Dubbed the "USB-C for AI," MCP allows your agent to seamlessly connect to standardized servers. The agent uses a tool like process_refund(order_id) by outputting structured JSON, observing the result, and adjusting its plan without human intervention.
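The three techniques above can be sketched together in a toy pipeline. Everything here is a stand-in under stated assumptions: the keyword map substitutes for an LLM-based query rewriter, a plain dict substitutes for a vector database, and `process_refund` is the hypothetical tool from the bullet above, not a real MCP server.

```python
import json

# 1) Query rewriting: turn a vague complaint into a precise search query.
#    A real system would use a small background LLM; a keyword map stands in.
REWRITE_RULES = {
    "keeps failing": "API call failure, troubleshooting authentication headers, rate limiting",
}

def rewrite_query(user_text: str) -> str:
    for vague, precise in REWRITE_RULES.items():
        if vague in user_text:
            return precise
    return user_text

# 2) Active memory management: keep consolidated key facts instead of
#    replaying the whole chat history. A dict stands in for a vector DB.
memory = {"shoe_size": 9}              # stale fact from an earlier turn

def update_memory(new_facts: dict) -> None:
    memory.update(new_facts)           # consolidation: new value replaces old

# 3) Standardized tool use: the agent emits structured JSON naming a tool;
#    a dispatcher executes it and returns the observation.
TOOLS = {
    "process_refund": lambda order_id: {"status": "refunded", "order_id": order_id},
}

def dispatch(tool_call: str) -> dict:
    call = json.loads(tool_call)
    return TOOLS[call["tool"]](**call["arguments"])

# Assemble a minimal prompt: only the rewritten query and the one relevant fact.
def build_context(user_text: str) -> str:
    return (f"SEARCH: {rewrite_query(user_text)}\n"
            f"KNOWN FACTS: {json.dumps(memory)}\n"
            f"USER: {user_text}")

update_memory({"shoe_size": 10})       # ETL step resolves the size-9/size-10 conflict
prompt = build_context("How do I make this work when it keeps failing?")
result = dispatch('{"tool": "process_refund", "arguments": {"order_id": "A123"}}')
print(prompt)
```

Compare this with the static version: the model now sees a three-line prompt containing one rewritten query and one consolidated fact, rather than thousands of tokens of policy text and stale history.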

In summary…

Prompt engineering hasn't disappeared; it has just been absorbed into something much bigger.

We have transitioned from being "prompters" who talk to a model, to architects who build the world the model lives in. Whether you are vibe coding a new application into existence with natural language, or context engineering a sophisticated retrieval pipeline for an enterprise AI agent, the focus is no longer on hacking the AI with clever words. It is about orchestrating intent, memory, and data to create truly autonomous systems.

