Components of Context Engineering: Part 1


Reintroducing Context Engineering

Until a few months ago, prompt engineering received most of the attention among AI enthusiasts. Prompts still matter, but attention is now shifting toward context engineering. Context engineering broadens the design of AI agents to consider every component at every step of the process. It's all about optimizing what goes into a context: the state, tools, LLM, and every other component involved in a single iteration of an AI agent's workflow.

Assessing Problems with AI Agents

Imagine you design an agent to find the best components for your car. You carefully craft the prompt using solid prompt engineering principles, provide details of your car, and equip the agent with tools, like a search engine, so it can find other relevant information to aid the task. When you start the agent, it assembles the best components based on its LLM's output, together with extra data about your car supplied through the RAG technique. It then searches online for the availability of these components and decides to use results from the first page, assuming they're the most relevant. But it turns out the search results have ads at the top. Since these ads aren't necessarily the most relevant results for the query, the output from this stage reduces the quality of the tokens passed to the next step.

Explaining Terminologies

Context Poisoning is when a hallucination or some other error makes it into a context window, where it is repeatedly referenced. This may quickly saturate large portions of the context with highly irrelevant data. These errors may happen not just because an output is irrelevant, but also because there is too much of it.
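One illustrative defense is to validate a model's output before appending it to the running context, and to cap how much any single step can contribute. The checks below are hypothetical stand-ins, a minimal sketch rather than a production validator:

```python
# Guard against context poisoning: only append outputs that mention the
# task's key terms and stay within a per-step size budget.
MAX_STEP_CHARS = 500

def safe_append(context: list[str], output: str, required_terms: set[str]) -> bool:
    """Append only if the output looks relevant and fits the budget."""
    relevant = any(term in output.lower() for term in required_terms)
    if relevant and len(output) <= MAX_STEP_CHARS:
        context.append(output)
        return True
    return False

context: list[str] = []
safe_append(context, "Found brake pads compatible with the Civic.", {"brake", "civic"})
# The irrelevant ad-like output is rejected and never enters the context:
safe_append(context, "Buy now! Limited offer on kitchen blenders!", {"brake", "civic"})
```

Real agents would use an LLM or classifier for the relevance check, but the principle is the same: filter before you append, so one bad output can't saturate the context.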

Exploring Techniques in Context Engineering

As LLMs improve, some of these issues may diminish. But managing context is also about managing computing resources and time. This is why context engineering is so important: it provides concepts that let you design agents that use the right tokens throughout the process by keeping everything relevant within context. Note, however, that much of applied AI is as much art as science. Many of the techniques used in context engineering aren't unique to it; many are borrowed from concepts applied in RAG, prompt engineering, and other branches of AI.

RAG

One way of solving these problems is by incorporating Retrieval-Augmented Generation (RAG). With RAG, you provide the exact information you want your agent to use. You know exactly what's contained in the additional data and can therefore tune your prompts to fit better within the context window.
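The retrieval step can be sketched with a toy example. Here, documents are scored by simple word overlap with the query and only the top match is injected into the prompt; real systems use embedding similarity and a vector store, but the shape is the same. The documents and helper names are hypothetical:

```python
# Toy RAG retrieval: pick the document most relevant to the query
# and build a prompt containing only that context.
def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    query_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query: str, documents: list[str]) -> str:
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "The 2015 Civic uses 205/55R16 tires on the base trim.",
    "Oil changes are recommended every 7,500 miles.",
    "The cabin air filter sits behind the glove box.",
]
prompt = build_prompt("What tire size fits a 2015 Civic?", docs)
```

Because only the top-scoring document reaches the prompt, the agent's context stays small and focused instead of carrying every document on every call.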

Tool Loadout

Tool Loadout is the act of selecting only the relevant tool definitions to add to your context. In a bid to ensure your agent has the best tools, you may want to provide it with many options. For instance, you may provide five different credible news websites for reference, or thirty Node.js frameworks, letting it choose the best. However, it's better to limit this to a select few trusted tools, typically around ten, depending on the model you're using. Anything beyond this may result in a higher error rate, where the LLM calls the wrong tools.
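A simple way to picture this is tagging each tool definition with keywords and passing the model only the tools that match the task. The tool names and tags below are hypothetical, and real systems often score tools with embeddings instead of tags:

```python
# Minimal tool loadout: keep only the tools relevant to the task,
# capped at a small maximum, instead of exposing every tool.
TOOLS = {
    "web_search": {"tags": {"search", "news", "lookup"}},
    "parts_catalog": {"tags": {"car", "parts", "components"}},
    "calculator": {"tags": {"math", "price", "compare"}},
    "weather": {"tags": {"forecast", "temperature"}},
}

def select_tools(task: str, tools: dict, max_tools: int = 10) -> list[str]:
    task_words = set(task.lower().split())
    scored = [
        (len(task_words & spec["tags"]), name)
        for name, spec in tools.items()
    ]
    # Keep only tools that matched at least one tag, capped at max_tools.
    return [name for score, name in sorted(scored, reverse=True) if score > 0][:max_tools]

loadout = select_tools("find the best car parts and compare price", TOOLS)
```

Only the matching tool definitions then go into the context, keeping the loadout small and the error rate down.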

Context Quarantine

Context Quarantine is the act of isolating contexts in their own dedicated threads, each used separately by one or more LLMs. You’ll get better results when your contexts are short and focused. Separation of concerns greatly improves efficiency and accuracy. In practice, you may want to use a separate LLM for tool calling and another for reasoning.
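In code, quarantine can be as simple as keeping a separate message history per subtask and passing only final results across the boundary. The class and prompts below are an illustrative sketch, not a specific framework's API:

```python
# Context quarantine: each subtask gets its own isolated message history,
# so tool-calling chatter never pollutes the reasoning thread.
class ContextThread:
    def __init__(self, system_prompt: str):
        self.messages = [{"role": "system", "content": system_prompt}]

    def add(self, role: str, content: str):
        self.messages.append({"role": role, "content": content})

tool_thread = ContextThread("You select and call tools. Reply tersely.")
reasoning_thread = ContextThread("You reason about car parts in depth.")

tool_thread.add("user", "Search for brake pads for a 2015 Civic.")
# Only the distilled result crosses into the reasoning context:
reasoning_thread.add("user", "Search result: Brembo pads, $45, in stock.")
```

Each thread would be sent to its own LLM call; because the histories never mix, each context stays short and focused on one concern.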

Context Pruning

Context Pruning is the act of removing irrelevant or unnecessary information from the context. While this sounds simple, you must be careful not to remove any critical information in the process. It's somewhat similar to summarization, and there are models specifically designed for it. One popular example is Provence, a pruning model from Naver that's widely recognized as effective for context pruning in RAG systems.

from transformers import AutoModel

# Load Provence, a model that prunes a long context down to the
# sentences relevant to a given question
provence = AutoModel.from_pretrained("naver/provence-reranker-debertav3-v1", trust_remote_code=True)

# Read an article on climate change
with open('climate_change.md', 'r', encoding='utf-8') as f:
    climate_change_wiki = f.read()

# Prune the article down to what's relevant to the question
question = 'What are the biggest causes of climate change?'
provence_output = provence.process(question, climate_change_wiki)

# The pruned context keeps only the sentences relevant to the question
print(provence_output['pruned_context'])