Source: Thoughtworks Tech
Context engineering: How to give AI exactly what it needs
If you've spent any time around AI development circles recently, you've probably noticed a shift from simple prompting to something people are calling 'context engineering.'
Whether you're working with Claude 4, GPT-4o or Gemini 2.5, the difference between mediocre and exceptional results often hinges on one thing: how intelligently you design your context window.
What is context engineering?
Context engineering isn't about stuffing more into the prompt; it's about curating smarter. Think of it as the art of structuring, optimizing and trimming information so large language models respond faster, cheaper and better.
Instead of dumping an entire codebase or dataset into the context window, context engineering strategically gives the model only what matters: the right information, in the right format, at the right time.
Tools like the Model Context Protocol (MCP), code-as-context (CaC) and context engines are making this easier by bridging data sources to AI, turning codebases into living docs and creating structured summaries from complex projects.
The context paradox: Why more isn't always better
Overwhelming your AI with information often makes it perform worse. That might sound counterintuitive but it's the reality of building AI systems and applications.
Imagine you're asking an LLM to generate a UserService in Spring Boot. A rookie move is to dump the entire codebase (controllers, full repository logic, configuration classes, utility layers, even README files) into the prompt.
But what happens?
- Token burn: 10,000+ tokens for mostly irrelevant data
- Slower inference: Long inputs increase latency
- Confused output: Model struggles to find signal in the noise
A smarter alternative
It's not about less context; it's about the right context. Instead of dumping everything, context engineering focuses on strategic inclusion.
If you're generating a UserService, you might only need:
- The user model (fields only)
- The repository interface signature
- The controller endpoint patterns
That's just 300 tokens instead of 10,000, with superior results.
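To get a feel for numbers like these before sending a prompt, a rough estimate is usually enough. The sketch below assumes the common rule of thumb of about four characters per token; real tokenizers differ by model, so treat the result as an approximation. The class name, method names and sample snippets are hypothetical.

```java
import java.util.List;

/** Rough token budgeting sketch; ~4 characters per token is an assumed heuristic, not a real tokenizer. */
final class TokenEstimator {
    private static final int CHARS_PER_TOKEN = 4; // assumption: rule of thumb only

    /** Estimates the tokens for one snippet, rounding up. */
    static int estimate(String text) {
        return (text.length() + CHARS_PER_TOKEN - 1) / CHARS_PER_TOKEN;
    }

    /** Sums the estimates for every snippet that would go into the prompt. */
    static int estimateAll(List<String> snippets) {
        return snippets.stream().mapToInt(TokenEstimator::estimate).sum();
    }

    public static void main(String[] args) {
        // Hypothetical trimmed inputs for a UserService task.
        String userModel = "class User { Long id; String name; String email; }";
        String repository = "interface UserRepository { Optional<User> findByEmail(String email); }";
        System.out.println("Estimated tokens: " + estimateAll(List.of(userModel, repository)));
    }
}
```

Comparing the estimate for the trimmed inputs against the estimate for the full codebase gives a quick sanity check on how much budget the extra files would consume.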
Three context engineering techniques
There are a number of context engineering techniques that can help us get more from AI. In this blog post I'll look at three of them:
- Skeleton trimming: you keep only the essential structure, like method signatures, class declarations and annotations, and strip away implementation details.
- Relevance-first file selection: you start by identifying only the files that are directly relevant to the task when preparing LLM inputs.
- Context phasing: instead of providing all context in one go, you deliver it in stages. Each step contains only the information relevant for that point in the process.
It's worth noting that these three techniques aren't official standards; they come from hands-on experimentation in backend code generation (mainly Java/Spring Boot). Think of them as adaptable patterns.
Let's now look at each one in more detail.
Skeleton trimming
Here's what our code looks like before we implement skeleton trimming (full controller):
...And here's what it looks like after:
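As a minimal sketch of the idea, the class below strips method bodies from a Java source string and keeps class declarations, annotations and signatures. It is an illustration under stated assumptions, not a definitive tool: it assumes a single top-level class with no nested types, no array initializers on fields, and no braces inside comments or character literals. The OrderController used in main is hypothetical, and a real pipeline would more likely use a proper Java parser.

```java
/** Minimal skeleton-trimming sketch: keeps declarations and annotations, drops method bodies. */
final class SkeletonTrimmer {

    /**
     * Assumes one top-level class without nested types; braces inside
     * double-quoted strings (e.g. "/{id}") are ignored by the depth counter.
     */
    static String trim(String source) {
        StringBuilder out = new StringBuilder();
        int depth = 0;             // 0 = outside class, 1 = class body, 2+ = method body
        boolean inString = false;  // true while inside a double-quoted literal
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c == '"' && (i == 0 || source.charAt(i - 1) != '\\')) {
                inString = !inString;
            }
            if (!inString && c == '{') {
                depth++;
                if (depth == 1) {
                    out.append(c); // class body opens: keep the brace
                } else if (depth == 2) {
                    // method body opens: drop trailing spaces and end the signature
                    while (out.length() > 0 && out.charAt(out.length() - 1) == ' ') {
                        out.setLength(out.length() - 1);
                    }
                    out.append(';');
                }
            } else if (!inString && c == '}') {
                depth--;
                if (depth == 0) {
                    out.append(c); // class body closes: keep the brace
                }
            } else if (depth <= 1) {
                out.append(c);     // outside any method body: keep as-is
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical controller, used only to illustrate the before and after.
        String full = String.join("\n",
            "@RestController",
            "@RequestMapping(\"/orders\")",
            "public class OrderController {",
            "    @GetMapping(\"/{id}\")",
            "    public Order getOrder(@PathVariable Long id) {",
            "        return orderService.findById(id);",
            "    }",
            "",
            "    @PostMapping",
            "    public Order createOrder(@RequestBody CreateOrderRequest request) {",
            "        return orderService.create(request);",
            "    }",
            "}");
        System.out.println(trim(full));
    }
}
```

Running main prints the controller with its annotations and endpoint mappings intact and each method reduced to its signature. The output is a skeleton for the model to read rather than compilable Java.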
Relevance-first file selection
As mentioned above, this is where you start by identifying the files that are relevant to the task. Here's how this is done...
i) Must-have inputs:
These are files that define the task directly.
For instance, if you're generating OrderService, you'll always want:
- OrderRepository interface (just the method declarations)
- Order model (fields and annotations only)
- OrderController (only endpoint mappings)
ii) Conditional extras:
Include these only when your task depends on them:
- If OrderService throws OrderNotFoundException, include that class.
- If it uses a CreateOrderRequest DTO, include that too.
iii) Irrelevant files:
Files that don't affect the current task should be ignored.
- You don't need SecurityConfig, Application.java, or unrelated service classes unless your logic touches them.
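The three categories above can be sketched as a small selection routine. This is one possible shape, not a standard API: the class name, the file names and the simple "does the task text mention this type?" check are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Relevance-first selection sketch: must-haves always, extras only when referenced. */
final class ContextSelector {

    /**
     * @param mustHave    files that define the task directly; always included
     * @param conditional type name to file name; included only if the task text mentions the type
     * @param taskText    the task description or draft code being generated
     * @return must-haves first, then matching extras in the map's iteration order
     */
    static List<String> select(List<String> mustHave,
                               Map<String, String> conditional,
                               String taskText) {
        List<String> selected = new ArrayList<>(mustHave);
        for (Map.Entry<String, String> extra : conditional.entrySet()) {
            if (taskText.contains(extra.getKey())) {
                selected.add(extra.getValue());
            }
        }
        // Irrelevant files (e.g. SecurityConfig) are never listed, so they are never added.
        return selected;
    }

    public static void main(String[] args) {
        // Hypothetical file names for an OrderService task.
        List<String> mustHave = List.of("OrderRepository.java", "Order.java", "OrderController.java");
        Map<String, String> conditional = new LinkedHashMap<>();
        conditional.put("OrderNotFoundException", "OrderNotFoundException.java");
        conditional.put("CreateOrderRequest", "CreateOrderRequest.java");

        String task = "Implement OrderService.findById and throw OrderNotFoundException if missing";
        System.out.println(select(mustHave, conditional, task));
    }
}
```

A plain substring match keeps the sketch short; in practice you might resolve imports or type references instead, so that a conditional extra is included because the code actually depends on it.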
Context phasing
With context phasing, each step contains only the information relevant for that point in the process; we don't need to provide all the context at once.
Here's how it works...
- Setup: Define the high-level goal and constraints, e.g., "implement CRUD operations for OrderService."
- Structure: Provide just the file structures, models and interfaces needed to outline the solution, e.g., the order model fields, repository interface and DTOs.
- Detail: Add specific implementation elements such as exception classes, constants and edge cases.
By pacing the flow of information, you guide the model's focus step by step, leading to cleaner and more coherent outputs.
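The Setup, Structure and Detail stages can be sketched as an ordered list of messages, each carrying only the context for its step. The class and its methods are hypothetical, and how each message is sent to a model is deliberately left out, since that depends on the client library you use.

```java
import java.util.ArrayList;
import java.util.List;

/** Context-phasing sketch: each phase adds only the context needed at that step. */
final class ContextPhaser {
    private final List<String> phases = new ArrayList<>();

    /** Appends a phase; phases are numbered in the order they are added. */
    ContextPhaser add(String name, String context) {
        int step = phases.size() + 1;
        phases.add("Step " + step + " (" + name + "):\n" + context);
        return this;
    }

    /** Returns one message per phase; sending them to a model is left to the caller. */
    List<String> messages() {
        return new ArrayList<>(phases);
    }

    public static void main(String[] args) {
        // Hypothetical phase contents for an OrderService task.
        ContextPhaser phaser = new ContextPhaser()
            .add("Setup", "Implement CRUD operations for OrderService.")
            .add("Structure", "Order model fields, OrderRepository interface, CreateOrderRequest DTO")
            .add("Detail", "OrderNotFoundException, constants, edge cases");

        for (String message : phaser.messages()) {
            System.out.println(message);
            System.out.println("---");
        }
    }
}
```

Keeping the phases as separate messages, rather than one concatenated prompt, is what lets you review the model's outline after the Structure step before adding the Detail context.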
Final thoughts
Effective context management is a cornerstone of building high-performance AI systems. By delivering only the most relevant information, you can reduce latency, control costs and improve the precision of model outputs.
By combining standards like MCP with approaches such as code-as-context and context engines (CTX), you give your LLMs exactly the information they need and nothing more. Simple tricks like skeleton trimming can keep token counts low, responses fast and quality high.
Your LLM's output is only as good as the context you give it. Treat your token budget like premium real estate and fill it with high-value content, not clutter.