Building Effective Agents - Lessons from Anthropic | Biweekly Engineering - Episode 37
What Anthropic learned from its customers on building AI agent-based architecture
At the end of 2024, a lot of discussion was sparked about what’s next for AI.
A common narrative in the AI world over the past year, including from Anthropic’s CEO, was that data for training AI models is “limited”.
It might feel extremely weird to claim that the massive amount of data all the models have access to is actually limited. But if you think about it, it kind of makes sense. For example, assume you train a model on StackOverflow data. If LLMs eventually replace StackOverflow, where would the new data come from? Extrapolating further, if LLMs eventually replace the knowledge found on the internet, where will new knowledge come from?
In this brand new and long-overdue episode of Biweekly Engineering, we will not discuss the data issue. Let’s shift our focus to one of the most-discussed topics on what’s next and ongoing for AI: agentic systems.
The majestic Rest and Be Thankful Viewpoint in upper Scotland, UK
Building Effective AI Agents: A Guide from Anthropic
Anthropic draws an explicit distinction between agents and workflows:
Workflows are systems where LLMs and tools are orchestrated through predefined code paths.
Agents, on the other hand, are systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks.
While both can be considered agentic systems, think of workflows as deterministic in their control flow, because they follow a fixed path through the code, whereas agents are more dynamic and non-deterministic, making decisions based on their interactions with the environment.
Should you build agentic systems? Yes, but only if you need them. Just as you shouldn’t use an LLM when a problem can be solved with something much simpler, you shouldn’t use an agentic system when a simple LLM-based approach is enough. The reason is simple: such systems add complexity to your architecture.
This piece of advice is actually universal in software engineering: always go for the simple, stupid solution if it works. Sophisticated systems should be built if and only if they are required.
A note on non-determinism
LLMs are inherently non-deterministic. There is no guarantee you will get the same output every single time you ask the same question to an LLM. And this is in fact not a bug, but a feature.
But non-determinism is not always easy to deal with in software systems. We prefer determinism: infrastructure, error handling, tests, responses from other systems, basically everything is much easier to handle when your systems are deterministic.
Is non-determinism bad? Not always. There are many use-cases where non-deterministic behaviour gives a much better experience to end users. But at the same time, it makes building systems with LLMs more difficult.
So to reiterate what Anthropic suggested—use agentic systems (workflows and agents) only when you absolutely need them.
When to use frameworks
To build LLM-based systems, there are already quite a few frameworks, and one of the most popular is LangChain. The LangChain ecosystem includes LangGraph, a framework for building agentic systems.
Anthropic suggests avoiding frameworks in the early stages of development if the underlying mechanism is not well understood. The main issue with agentic frameworks, like any other framework in software engineering, is that they hide a lot of detail behind layers and layers of abstraction. This is a disadvantage when you want to understand what’s going on under the hood, or when you need to debug.
So start simple, build on top of it, and go for more complicated approaches based on needs.
Patterns of agentic systems
The basic building block for an agentic system is an LLM enhanced with a few capabilities: retrieval, tools, and memory.
From Anthropic
As the diagram shows, in a basic setting an LLM, upon receiving a request, can query or search for some data (retrieval), call external or internal entities to further process the response (tools), and store data if needed (memory).
Based on this basic building block, Anthropic outlines a handful of workflow patterns:
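To make the building block a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `fake_llm` stands in for a real model call, `AugmentedLLM` is a made-up class name, retrieval is a naive keyword match, and the caller picks the tool (in a real system the LLM would usually choose which tool to call).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call."""
    return f"answer based on {len(prompt.splitlines())} lines of context"


@dataclass
class AugmentedLLM:
    llm: Callable[[str], str]
    documents: List[str]                      # source for retrieval
    tools: Dict[str, Callable[[str], str]]    # named tools
    memory: List[str] = field(default_factory=list)  # past exchanges

    def retrieve(self, query: str) -> List[str]:
        # Naive keyword match; a real system would likely use a search index.
        words = query.lower().split()
        return [d for d in self.documents if any(w in d.lower() for w in words)]

    def ask(self, query: str, tool: Optional[str] = None) -> str:
        context = self.retrieve(query) + self.memory   # retrieval + memory
        if tool is not None:
            context.append(self.tools[tool](query))    # tool output
        answer = self.llm("\n".join(context + [query]))
        self.memory.append(f"Q: {query} A: {answer}")  # remember the exchange
        return answer


bot = AugmentedLLM(
    llm=fake_llm,
    documents=["Cats purr.", "Dogs bark."],
    tools={"shout": str.upper},
)
print(bot.ask("cats", tool="shout"))
```

The point is only the shape: one call site where retrieved data, tool output, and memory are gathered before the model is asked.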
Prompt chaining: Useful when a task is a sequence of steps where each output from one LLM call is fed into the next one.
Prompt chaining
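A minimal sketch of prompt chaining, assuming a hypothetical `call_llm` function in place of a real model call. Each step's output becomes the next step's input.

```python
from typing import Callable, List


def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; wraps the prompt in brackets."""
    return f"[{prompt}]"


def run_chain(steps: List[str], user_input: str,
              llm: Callable[[str], str] = call_llm) -> str:
    """Run the steps in order, feeding each output into the next prompt."""
    result = user_input
    for instruction in steps:
        result = llm(f"{instruction}: {result}")
    return result


print(run_chain(["outline", "draft"], "topic"))
# prints: [draft: [outline: topic]]
```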
Routing: When requests vary, different LLMs can handle different kinds of tasks based on their strengths, and a router decides which LLM to call in each case. For example, smaller models can be suitable for easy or common knowledge-based questions, while bigger models can be suitable for harder tasks like reasoning.
Routing
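Routing could look roughly like this. The two model functions and the keyword-based `classify` router are assumptions for illustration only; in practice the router might itself be an LLM call or a trained classifier.

```python
def small_model(question: str) -> str:
    """Hypothetical cheap model for easy questions."""
    return f"small: {question}"


def large_model(question: str) -> str:
    """Hypothetical bigger model for harder questions."""
    return f"large: {question}"


def classify(question: str) -> str:
    """Toy router: a keyword heuristic standing in for a real classifier."""
    hard_markers = ("prove", "derive", "step by step")
    text = question.lower()
    return "hard" if any(marker in text for marker in hard_markers) else "easy"


ROUTES = {"easy": small_model, "hard": large_model}


def route(question: str) -> str:
    """Send the question to whichever model the router picks."""
    return ROUTES[classify(question)](question)


print(route("What is the capital of France?"))
print(route("Prove that the square root of 2 is irrational"))
```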
Parallelization: A task can be broken down into parallelizable subtasks, and multiple LLMs can be invoked at the same time to execute them. An aggregator then combines the results or decides the final output.
Parallelization
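A small sketch of parallelization using Python's standard `concurrent.futures` module. `call_llm` and `aggregate` are hypothetical placeholders; threads are used on the assumption that real model calls are network-bound.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real (typically network-bound) model call."""
    return prompt.upper()


def aggregate(results: List[str]) -> str:
    """Toy aggregator: join the partial results into one output."""
    return " | ".join(results)


def run_parallel(subtasks: List[str]) -> str:
    """Run the predefined subtasks concurrently, then aggregate."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map returns results in the same order as the inputs.
        results = list(pool.map(call_llm, subtasks))
    return aggregate(results)


print(run_parallel(["check tone", "check facts"]))
# prints: CHECK TONE | CHECK FACTS
```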
Orchestrator-workers: For more unpredictable tasks, an orchestrator LLM figures out what the subtasks should be and dispatches them to worker LLMs. The difference from parallelization is that the subtasks in the orchestrator-workers pattern are not predefined.
Orchestrator-worker
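Here is one way the orchestrator-workers idea could be sketched. Both `orchestrator_llm` and `worker_llm` are hypothetical stubs: a real orchestrator would ask a model to plan the subtasks, whereas this toy version just splits the task text.

```python
from typing import List


def orchestrator_llm(task: str) -> List[str]:
    """Hypothetical planner: splits on ' and ' instead of asking a model."""
    return [part.strip() for part in task.split(" and ")]


def worker_llm(subtask: str) -> str:
    """Hypothetical worker that handles one subtask."""
    return f"done({subtask})"


def orchestrate(task: str) -> str:
    subtasks = orchestrator_llm(task)            # decided at run time
    results = [worker_llm(s) for s in subtasks]  # dispatch to workers
    return "; ".join(results)                    # combine the results


print(orchestrate("update the parser and fix the tests"))
# prints: done(update the parser); done(fix the tests)
```

Unlike the parallelization sketch, the list of subtasks here is not known until the orchestrator has looked at the task.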
Evaluator-optimizer: When we have clear evaluation criteria, one LLM generates the output and another either accepts it or rejects it with feedback. This feedback loop continues until the evaluator accepts the generator’s output.
Evaluator-optimizer
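A rough sketch of the evaluator-optimizer loop. `generate` and `evaluate` are hypothetical stubs for the two LLMs, and the `max_rounds` cap is my own assumption, added so the loop cannot run forever if the evaluator never accepts.

```python
from typing import Optional, Tuple


def generate(task: str, draft: str, feedback: Optional[str]) -> str:
    """Hypothetical generator: revises the draft when feedback is given."""
    if feedback is None:
        return f"summary of {task}"
    return f"{draft} ({feedback} addressed)"


def evaluate(draft: str) -> Tuple[bool, str]:
    """Hypothetical evaluator with one clear criterion: mention sources."""
    if "sources" in draft:
        return True, ""
    return False, "cite sources"


def refine(task: str, max_rounds: int = 3) -> str:
    draft, feedback = "", None
    for _ in range(max_rounds):
        draft = generate(task, draft, feedback)
        accepted, feedback = evaluate(draft)
        if accepted:
            break  # evaluator is satisfied, stop iterating
    return draft


print(refine("report"))
# prints: summary of report (cite sources addressed)
```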
But when to use agents?
The patterns above are basically workflows, where we can decide in advance exactly which steps to take. With agents, we do not know what steps might be needed. It is up to the LLM to decide and act; it should be autonomous.
Agent in action
Agents are useful for open-ended problems where we can’t hardcode a fixed path for the LLM to take. Here, we rely on the LLM’s own judgement, based on its interaction with the environment and the available tools.
One use-case where agents could be incredibly useful is coding. An agent can read the code, decide what to do, run it, take the feedback, and iterate. Given that different codebases have different criteria, it’s difficult to define a fixed path for an agent to take.
And that’s a wrap for today. The article discusses the patterns in great detail, with appendices on agents and tools. It is a highly recommended read for anyone interested in AI-driven architecture.
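To contrast with the workflow patterns, here is a toy agent loop for the coding use-case. It is a sketch under heavy assumptions: `fake_agent_llm` is a hard-coded policy standing in for a real model, the two tools only flip a flag in memory, and `max_steps` is a safety cap I added. What matters is that the loop itself does not fix the order of steps; the "LLM" picks the next action from the feedback it has seen so far.

```python
from typing import Callable, Dict, List


def fake_agent_llm(history: List[str]) -> str:
    """Hypothetical policy: picks the next action from the latest feedback."""
    last = history[-1]
    if last.startswith("task:"):
        return "run_tests"
    if "FAIL" in last:
        return "edit_code"
    if "patched" in last:
        return "run_tests"
    return "finish"


def make_tools() -> Dict[str, Callable[[], str]]:
    """Toy environment: a codebase with a single bug."""
    state = {"bug": True}

    def run_tests() -> str:
        return "tests: FAIL" if state["bug"] else "tests: PASS"

    def edit_code() -> str:
        state["bug"] = False
        return "patched"

    return {"run_tests": run_tests, "edit_code": edit_code}


def run_agent(task: str, llm: Callable[[List[str]], str],
              tools: Dict[str, Callable[[], str]], max_steps: int = 10) -> List[str]:
    history = [f"task: {task}"]
    for _ in range(max_steps):
        action = llm(history)             # the LLM decides what to do next
        if action == "finish":
            break
        history.append(tools[action]())   # act, then record the feedback
    return history


print(run_agent("fix the bug", fake_agent_llm, make_tools()))
# prints: ['task: fix the bug', 'tests: FAIL', 'patched', 'tests: PASS']
```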
See you (hopefully) soon in the next one! 🐋