| title |
|---|
| Chapter 2: ChatModelAgent, Runner, AgentEvent (Console Multi-turn) |
This chapter introduces the ADK's execution abstractions (Agent and Runner) and implements a multi-turn conversation in a console program.
Code Location
- Entry code: cmd/ch02/main.go
Prerequisites
Same as Chapter 1: you need to configure an available ChatModel (OpenAI or Ark).
Running
In the examples/quickstart/chatwitheino directory, run:
go run ./cmd/ch02
After seeing the prompt, enter your question (empty line to exit):
you> Hello, explain what an Agent is in Eino?
...
you> Summarize that in one sentence
...
Key Concepts
From Component to Agent
In Chapter 1, we learned about Components, which are replaceable, composable capability units in Eino:
- ChatModel: Call large language models
- Tool: Execute specific tasks
- Retriever: Retrieve information
- Loader: Load data
The relationship between Component and Agent:
- Components don't form a complete AI application: They are just capability units that need to be organized, orchestrated, and executed
- An Agent is a complete AI application: It encapsulates complete business logic and can run directly
- Agents use Components internally: The most essential ones are ChatModel (conversation capability) and Tool (execution capability)
Why do we need Agents?
With only Components, you would have to handle the following yourself:
- Managing conversation history
- Orchestrating the call flow (when to call the model, when to call tools)
- Handling streaming output
- Implementing interruption and recovery
- ...
What does an Agent provide?
- A complete runtime framework: Unified execution management through Runner
- Standardized event stream output: Run() -> AsyncIterator[*AgentEvent], supporting streaming, interruption, and recovery
- Extensible capabilities: You can add tools, middleware, interrupt, etc.
- Ready to use out of the box: Once an Agent is created, you can run it directly without worrying about internal details
This chapter's example:
ChatModelAgent is the simplest Agent. It only uses a ChatModel internally, but already has the complete Agent capability framework. Subsequent chapters will show how to add Tool and other capabilities.
Agent Interface
Agent is the core interface in ADK, defining the basic behavior of an intelligent agent:
type Agent interface {
	Name(ctx context.Context) string
	Description(ctx context.Context) string
	// Run executes the Agent and returns an event stream
	Run(ctx context.Context, input *AgentInput, options ...AgentRunOption) *AsyncIterator[*AgentEvent]
}
Interface responsibilities:
- Name() / Description(): Identify the Agent's name and description
- Run(): The core method for executing the Agent, receiving input messages and returning an event stream
Design philosophy:
- Unified abstraction: All Agents (ChatModelAgent, WorkflowAgent, SupervisorAgent, etc.) implement this interface
- Event-driven: Outputs the execution process through an event stream (AsyncIterator[*AgentEvent]), supporting streaming responses
- Extensibility: When adding tools, middleware, interrupt, and other capabilities later, the interface remains unchanged
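To make the "unified abstraction" point concrete, here is a minimal sketch in plain Go. It does not use the ADK: miniAgent, miniEvent, and sliceIter are hypothetical stand-ins that mirror only the shape of the interface above (context parameters and run options are omitted, and events are replayed rather than streamed), so the snippet runs with the standard library alone.

```go
package main

import (
	"fmt"
	"strings"
)

// miniEvent is a hypothetical, stripped-down stand-in for *AgentEvent.
type miniEvent struct {
	AgentName string
	Text      string
}

// sliceIter is a hypothetical stand-in for AsyncIterator: it replays
// pre-computed events instead of streaming them.
type sliceIter struct {
	events []miniEvent
	pos    int
}

// Next returns the next event, or ok == false once all events are consumed.
func (it *sliceIter) Next() (miniEvent, bool) {
	if it.pos >= len(it.events) {
		return miniEvent{}, false
	}
	ev := it.events[it.pos]
	it.pos++
	return ev, true
}

// miniAgent mirrors the shape of the Agent interface (ctx and options omitted).
type miniAgent interface {
	Name() string
	Description() string
	Run(input string) *sliceIter
}

// echoAgent repeats its input; upperAgent upper-cases it.
type echoAgent struct{}
type upperAgent struct{}

func (echoAgent) Name() string        { return "echo" }
func (echoAgent) Description() string { return "repeats the input" }
func (a echoAgent) Run(input string) *sliceIter {
	return &sliceIter{events: []miniEvent{{AgentName: a.Name(), Text: input}}}
}

func (upperAgent) Name() string        { return "upper" }
func (upperAgent) Description() string { return "upper-cases the input" }
func (a upperAgent) Run(input string) *sliceIter {
	return &sliceIter{events: []miniEvent{{AgentName: a.Name(), Text: strings.ToUpper(input)}}}
}

// runAll drives every agent through the same loop and returns one line per event.
func runAll(agents []miniAgent, input string) []string {
	var lines []string
	for _, a := range agents {
		it := a.Run(input)
		for {
			ev, ok := it.Next()
			if !ok {
				break
			}
			lines = append(lines, ev.AgentName+": "+ev.Text)
		}
	}
	return lines
}

func main() {
	for _, line := range runAll([]miniAgent{echoAgent{}, upperAgent{}}, "hello") {
		fmt.Println(line)
	}
}
```

The calling code in runAll never needs to know which concrete agent it is driving, which is the property the design philosophy above relies on.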
ChatModelAgent
ChatModelAgent is an implementation of the Agent interface, built on top of ChatModel:
agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
	Name:        "Ch02ChatModelAgent",
	Description: "A minimal ChatModelAgent with in-memory multi-turn history.",
	Instruction: instruction,
	Model:       cm,
})
ChatModel vs ChatModelAgent: The Essential Difference
| Dimension | ChatModel | ChatModelAgent |
|---|---|---|
| Role | Component | Agent |
| Interface | Generate() / Stream() | Run() -> AsyncIterator[*AgentEvent] |
| Output | Directly returns message content | Returns an event stream (containing messages, control actions, etc.) |
| Capabilities | Pure model invocation | Extensible with tools, middleware, interrupt, etc. |
| Use case | Simple conversation scenarios | Complex agent applications |
Why do we need ChatModelAgent?
- Unified abstraction: ChatModel is just one type of Component, while Agent is a higher-level abstraction that can compose multiple Components
- Event-driven: Agent outputs an event stream, supporting streaming responses, interruption recovery, state transitions, and other complex scenarios
- Extensibility: ChatModelAgent can have tools, middleware, interrupt, and other capabilities added, while ChatModel can only call the model
- Orchestration-friendly: Agents can be uniformly managed by Runner, supporting checkpoint, recovery, and other runtime capabilities
In simple terms:
- ChatModel = "The component responsible for communicating with the large language model, abstracting away differences between model providers (OpenAI, Ark, Claude, etc.)"
- ChatModelAgent = "An intelligent agent built on top of the model; it can call the model, but can also do much more"
Analogy:
- ChatModel is like a "database driver": responsible for communicating with the database, abstracting away MySQL/PostgreSQL differences
- ChatModelAgent is like the "business logic layer": built on top of the database driver, but also includes business rules, transaction management, etc.
Features:
- Encapsulates the ChatModel invocation logic
- Provides a unified Run() -> AgentEvent output format
- Can have tools, middleware, and other capabilities added later
Runner
Runner is the entry point for executing an Agent, responsible for managing the Agent's lifecycle:
type Runner struct {
	a               Agent           // The Agent to execute
	enableStreaming bool
	store           CheckPointStore // State storage for interruption recovery
}
Why do we need Runner?
Although Agent provides a Run() method, calling it directly would lack many runtime capabilities:
- Lifecycle management: Runner manages the Agent's startup, recovery, interruption, and other states
- Checkpoint support: Works with CheckPointStore to implement interruption recovery (covered in later chapters)
- Unified entry point: Provides convenient methods like Run() and Query()
- Event stream encapsulation: Converts the Agent's event stream into a consumable AsyncIterator[*AgentEvent]
Usage:
runner := adk.NewRunner(ctx, adk.RunnerConfig{
	Agent:           agent,
	EnableStreaming: true,
})
// Method 1: Pass in a message list
events := runner.Run(ctx, history)
// Method 2: Convenience method, pass in a single query string
// (a separate variable name avoids redeclaring events in the same scope)
queryEvents := runner.Query(ctx, "Hello")
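The relationship between the two entry points can be sketched without the ADK. The snippet below assumes that a query-style method is a thin convenience that wraps one string in a user message and delegates to the list-based method; the real Runner may do more than this. All names here (message, agent, miniRunner, countingAgent) are hypothetical stand-ins, and the agent returns plain strings rather than an event stream.

```go
package main

import "fmt"

// message is a hypothetical stand-in for *schema.Message.
type message struct {
	Role    string
	Content string
}

// agent is a hypothetical stand-in for the Agent interface: it maps a
// message list to output strings (no streaming, no context).
type agent interface {
	Run(history []message) []string
}

// miniRunner is a hypothetical stand-in for Runner; it only holds the agent.
type miniRunner struct {
	a agent
}

// Run forwards the full message list to the agent.
func (r *miniRunner) Run(history []message) []string {
	return r.a.Run(history)
}

// Query wraps a single string in a user message and delegates to Run.
func (r *miniRunner) Query(q string) []string {
	return r.Run([]message{{Role: "user", Content: q}})
}

// countingAgent reports how many messages it received and echoes the last one.
type countingAgent struct{}

func (countingAgent) Run(history []message) []string {
	if len(history) == 0 {
		return []string{"no messages"}
	}
	last := history[len(history)-1]
	return []string{fmt.Sprintf("%d message(s), last from %s: %s", len(history), last.Role, last.Content)}
}

func main() {
	r := &miniRunner{a: countingAgent{}}
	fmt.Println(r.Query("Hello"))
	fmt.Println(r.Run([]message{
		{Role: "user", Content: "Hi"},
		{Role: "assistant", Content: "Hello!"},
		{Role: "user", Content: "Summarize"},
	}))
}
```

Under this assumption, the query form suits one-shot questions, while the list form is what a multi-turn program needs because it carries the accumulated history.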
AgentEvent
AgentEvent is the event unit returned by Runner:
type AgentEvent struct {
	AgentName string
	RunPath   []RunStep
	Output    *AgentOutput // Output content
	Action    *AgentAction // Control action
	Err       error        // Execution error
}
Main fields:
- event.Err: Execution error
- event.Output.MessageOutput: Message or message stream (streaming)
- event.Action: Control actions like interrupt/transfer/exit (used in later chapters)
AsyncIterator: How to Consume the Event Stream
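Runner.Run() returns *AsyncIterator[*AgentEvent], a streaming iterator. Run() itself does not wait for the Agent to finish; events are then pulled one at a time with Next(), which blocks until the next event is available or the stream ends.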
Why use AsyncIterator instead of returning results directly?
Agent execution is inherently streaming: the model generates its reply token by token, and Tool calls may be interspersed. If Run() waited for everything to complete before returning, the user would see nothing until the entire reply was ready. AsyncIterator lets you consume each event as soon as it is produced.
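The idea of "return immediately, then pull events as they arrive" can be illustrated with a channel-backed iterator. This is a minimal sketch, not the ADK's AsyncIterator: chanIter, generate, and drain are hypothetical names, and the artificial delay stands in for model latency.

```go
package main

import (
	"fmt"
	"time"
)

// chanIter is a hypothetical, channel-backed stand-in for AsyncIterator.
type chanIter[T any] struct {
	ch <-chan T
}

// Next blocks until a value arrives; ok is false once the producer has
// closed the channel and all values have been received.
func (it *chanIter[T]) Next() (T, bool) {
	v, ok := <-it.ch
	return v, ok
}

// generate starts a goroutine that emits tokens with a delay between them
// and returns the iterator right away, before any token is produced.
func generate(tokens []string, delay time.Duration) *chanIter[string] {
	ch := make(chan string)
	go func() {
		defer close(ch)
		for _, t := range tokens {
			time.Sleep(delay)
			ch <- t
		}
	}()
	return &chanIter[string]{ch: ch}
}

// drain consumes the iterator and returns the tokens in arrival order.
func drain(it *chanIter[string]) []string {
	var got []string
	for {
		tok, ok := it.Next()
		if !ok {
			return got
		}
		got = append(got, tok)
	}
}

func main() {
	start := time.Now()
	it := generate([]string{"An ", "Agent ", "runs."}, 50*time.Millisecond)
	fmt.Printf("iterator returned after %v\n", time.Since(start).Round(time.Millisecond))
	for {
		tok, ok := it.Next()
		if !ok {
			break
		}
		// Each token is printed as soon as it arrives, not at the end.
		fmt.Printf("+%v %q\n", time.Since(start).Round(10*time.Millisecond), tok)
	}
}
```

Running it shows the iterator being handed back almost instantly, with tokens printed at increasing offsets afterwards, which is the behavior a console UI needs in order to display a reply while it is still being generated.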
How to consume:
// events is *AsyncIterator[*AgentEvent], returned by runner.Run()
events := runner.Run(ctx, history)
for {
	event, ok := events.Next() // Get the next event, blocks until an event is available or the stream ends
	if !ok {
		break // Iterator closed, all events consumed
	}
	if event.Err != nil {
		// Handle error
	}
	if event.Output != nil && event.Output.MessageOutput != nil {
		// Handle message output (may be streaming)
	}
}
Note: Each runner.Run() creates a new iterator; it cannot be reused after being consumed once.
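The consumption loop above leaves the "handle" steps open. One possible policy, sketched here with hypothetical stand-in types (event, iter, fromEvents, collect, and errModel are not ADK names), is to concatenate text chunks and stop at the first error while still returning whatever text was gathered.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errModel is a sample error used to simulate a failed run.
var errModel = errors.New("model unavailable")

// event is a hypothetical stand-in for *AgentEvent with only two fields:
// an error, or a chunk of assistant text.
type event struct {
	Err   error
	Chunk string
}

// iter is a hypothetical stand-in for AsyncIterator backed by a channel.
type iter struct{ ch <-chan event }

// Next returns the next event, or ok == false once the stream has ended.
func (it *iter) Next() (event, bool) {
	ev, ok := <-it.ch
	return ev, ok
}

// fromEvents builds an iterator that replays the given events and then ends.
func fromEvents(evs ...event) *iter {
	ch := make(chan event, len(evs))
	for _, ev := range evs {
		ch <- ev
	}
	close(ch)
	return &iter{ch: ch}
}

// collect drains the iterator, concatenating text chunks. It returns the
// text gathered so far together with the first error it encounters.
func collect(it *iter) (string, error) {
	var sb strings.Builder
	for {
		ev, ok := it.Next()
		if !ok {
			return sb.String(), nil
		}
		if ev.Err != nil {
			return sb.String(), ev.Err
		}
		sb.WriteString(ev.Chunk)
	}
}

func main() {
	text, err := collect(fromEvents(event{Chunk: "Hello, "}, event{Chunk: "world"}))
	fmt.Printf("%q %v\n", text, err)

	text, err = collect(fromEvents(event{Chunk: "partial"}, event{Err: errModel}))
	fmt.Printf("%q %v\n", text, err)
}
```

Returning the partial text alongside the error lets the caller decide whether to show an incomplete reply or discard it; other policies, such as skipping failed events and continuing, are equally valid.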
Implementing Multi-turn Conversation
This chapter implements simple multi-turn conversation: user input -> model reply -> user continues input -> ...
Implementation approach:
Without tools, ChatModelAgent only completes one round of model invocation per Run() call. Multi-turn conversation is achieved by maintaining history on the caller side:
- Use history []*schema.Message to store the accumulated conversation
- On each user input: append the UserMessage to history
- Call runner.Run(ctx, history) to get the event stream, consume it to get the assistant text
- Append the assistant text back to history, then proceed to the next round
Key code snippet (Note: this is a simplified code snippet that cannot be run directly. For the complete code, please refer to cmd/ch02/main.go):
history := make([]*schema.Message, 0, 16)
for {
	// 1. Read user input
	line := readUserInput()
	if line == "" {
		break
	}
	// 2. Append user message to history
	history = append(history, schema.UserMessage(line))
	// 3. Call Runner to execute Agent
	events := runner.Run(ctx, history)
	// 4. Consume event stream, collect assistant reply
	content := collectAssistantFromEvents(events)
	// 5. Append assistant message to history
	history = append(history, schema.AssistantMessage(content, nil))
}
Flow diagram:
+------------------------------------------+
| Initialize history = [] |
+------------------------------------------+
|
+------------------------+
| User inputs UserMessage |
+------------------------+
|
+------------------------+
| Append to history |
+------------------------+
|
+------------------------+
| runner.Run(history) |
+------------------------+
|
+------------------------+
| Consume event stream |
+------------------------+
|
+------------------------+
| Append AssistantMessage |
+------------------------+
|
(loop continues)
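The loop in the flow diagram can be exercised without a model. In the sketch below, stubReply is a hypothetical replacement for the "run the Agent and collect the reply" step, and message stands in for *schema.Message; scripted inputs replace the console. It is meant only to show how the history grows by one user message and one assistant message per turn.

```go
package main

import "fmt"

// message is a hypothetical stand-in for *schema.Message.
type message struct {
	Role    string
	Content string
}

// stubReply is a hypothetical stand-in for runner.Run plus event collection.
// It answers with the number of user turns visible in the history, which
// demonstrates that the whole conversation is passed in on every call.
func stubReply(history []message) string {
	users := 0
	for _, m := range history {
		if m.Role == "user" {
			users++
		}
	}
	last := history[len(history)-1].Content
	return fmt.Sprintf("turn %d: you said %q", users, last)
}

// chat replays scripted inputs through the loop and returns the final history.
// An empty input ends the conversation, as in the console program.
func chat(inputs []string) []message {
	history := make([]message, 0, 16)
	for _, line := range inputs {
		if line == "" {
			break
		}
		// Append the user message, then the assistant reply built from the full history.
		history = append(history, message{Role: "user", Content: line})
		reply := stubReply(history)
		history = append(history, message{Role: "assistant", Content: reply})
	}
	return history
}

func main() {
	for _, m := range chat([]string{"Hello", "Summarize that", "", "ignored"}) {
		fmt.Printf("%-9s | %s\n", m.Role, m.Content)
	}
}
```

Because the caller owns the history slice, it can also trim, summarize, or persist it between turns without the Agent being aware of it.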
Chapter Summary
- Agent interface: Defines the basic behavior of an intelligent agent; the core is Run() -> AsyncIterator[*AgentEvent]
- ChatModelAgent: An Agent implementation based on ChatModel, providing a unified execution abstraction
- Runner: The execution entry point for Agents, managing lifecycle, checkpoint, event stream, and other runtime capabilities
- AgentEvent: An event-driven output unit, supporting streaming responses and control actions
- Multi-turn conversation: Implemented by maintaining history on the caller side; each Run() completes one round of conversation