Chapter 2: ChatModelAgent, Runner, AgentEvent (Console Multi-turn)

This chapter introduces the ADK's execution abstractions (Agent and Runner) and implements a multi-turn conversation in a console program.

Code Location

Prerequisites

Same as Chapter 1: you need to configure an available ChatModel (OpenAI or Ark).

Running

In the examples/quickstart/chatwitheino directory, run:

go run ./cmd/ch02

After seeing the prompt, enter your question (empty line to exit):

you> Hello, explain what an Agent is in Eino?
...
you> Summarize that in one sentence
...

Key Concepts

From Component to Agent

In Chapter 1, we learned about Components, which are replaceable, composable capability units in Eino:

  • ChatModel: Call large language models
  • Tool: Execute specific tasks
  • Retriever: Retrieve information
  • Loader: Load data

The relationship between Component and Agent:

  • Components don't form a complete AI application: They are just capability units that need to be organized, orchestrated, and executed
  • An Agent is a complete AI application: It encapsulates complete business logic and can run directly
  • Agents use Components internally: The most essential ones are ChatModel (conversation capability) and Tool (execution capability)

Why do we need Agents?

With only Components, you would have to handle the following yourself:

  • Managing conversation history
  • Orchestrating the call flow (when to call the model, when to call tools)
  • Handling streaming output
  • Implementing interruption and recovery
  • ...

What does an Agent provide?

  • A complete runtime framework: Unified execution management through Runner
  • Standardized event stream output: Run() -> AsyncIterator[*AgentEvent], supporting streaming, interruption, and recovery
  • Extensible capabilities: You can add tools, middleware, interrupts, and more
  • Ready to use out of the box: Once an Agent is created, you can run it directly without worrying about internal details

This chapter's example:

ChatModelAgent is the simplest Agent. It only uses a ChatModel internally, but already has the complete Agent capability framework. Subsequent chapters will show how to add Tool and other capabilities.

Agent Interface

Agent is the core interface in ADK, defining the basic behavior of an intelligent agent:

type Agent interface {
    Name(ctx context.Context) string
    Description(ctx context.Context) string

    // Run executes the Agent and returns an event stream
    Run(ctx context.Context, input *AgentInput, options ...AgentRunOption) *AsyncIterator[*AgentEvent]
}

Interface responsibilities:

  • Name() / Description(): Identify the Agent's name and description
  • Run(): The core method for executing the Agent, receiving input messages and returning an event stream

Design philosophy:

  • Unified abstraction: All Agents (ChatModelAgent, WorkflowAgent, SupervisorAgent, etc.) implement this interface
  • Event-driven: Outputs the execution process through an event stream (AsyncIterator[*AgentEvent]), supporting streaming responses
  • Extensibility: When adding tools, middleware, interrupt, and other capabilities later, the interface remains unchanged

ChatModelAgent

ChatModelAgent is an implementation of the Agent interface, built on top of ChatModel:

agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
    Name:        "Ch02ChatModelAgent",
    Description: "A minimal ChatModelAgent with in-memory multi-turn history.",
    Instruction: instruction,
    Model:       cm,
})

ChatModel vs ChatModelAgent: The Essential Difference

Dimension    | ChatModel                        | ChatModelAgent
-------------+----------------------------------+----------------------------------------------------------------------
Role         | Component                        | Agent
Interface    | Generate() / Stream()            | Run() -> AsyncIterator[*AgentEvent]
Output       | Directly returns message content | Returns an event stream (containing messages, control actions, etc.)
Capabilities | Pure model invocation            | Extensible with tools, middleware, interrupt, etc.
Use case     | Simple conversation scenarios    | Complex agent applications

Why do we need ChatModelAgent?

  1. Unified abstraction: ChatModel is just one type of Component, while Agent is a higher-level abstraction that can compose multiple Components
  2. Event-driven: Agent outputs an event stream, supporting streaming responses, interruption recovery, state transitions, and other complex scenarios
  3. Extensibility: ChatModelAgent can have tools, middleware, interrupt, and other capabilities added, while ChatModel can only call the model
  4. Orchestration-friendly: Agents can be uniformly managed by Runner, supporting checkpoint, recovery, and other runtime capabilities

In simple terms:

  • ChatModel = "The component responsible for communicating with the large language model, abstracting away differences between model providers (OpenAI, Ark, Claude, etc.)"
  • ChatModelAgent = "An intelligent agent built on top of the model; it can call the model, but can also do much more"

Analogy:

  • ChatModel is like a "database driver": responsible for communicating with the database, abstracting away MySQL/PostgreSQL differences
  • ChatModelAgent is like the "business logic layer": built on top of the database driver, but also includes business rules, transaction management, etc.

Features:

  • Encapsulates the ChatModel invocation logic
  • Provides a unified Run() -> AsyncIterator[*AgentEvent] output format
  • Can have tools, middleware, and other capabilities added later

Runner

Runner is the entry point for executing an Agent, responsible for managing the Agent's lifecycle:

type Runner struct {
    a               Agent           // The Agent to execute
    enableStreaming bool
    store           CheckPointStore // State storage for interruption recovery
}

Why do we need Runner?

Although Agent provides a Run() method, calling it directly would lack many runtime capabilities:

  1. Lifecycle management: Runner manages the Agent's startup, recovery, interruption, and other states
  2. Checkpoint support: Works with CheckPointStore to implement interruption recovery (covered in later chapters)
  3. Unified entry point: Provides convenient methods like Run() and Query()
  4. Event stream encapsulation: Converts the Agent's event stream into a consumable AsyncIterator[*AgentEvent]

Usage:

runner := adk.NewRunner(ctx, adk.RunnerConfig{
    Agent:           agent,
    EnableStreaming: true,
})

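// Method 1: Pass in a message list
events := runner.Run(ctx, history)

// Method 2: Convenience method, pass in a single query string
// (a separate variable name, because redeclaring events with := in the same scope does not compile)
queryEvents := runner.Query(ctx, "Hello")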
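
The choice between the two methods matters once a conversation has more than one turn. The sketch below assumes that Query is a thin wrapper that turns the string into a single user message and delegates to Run; that is an assumption about the ADK, not something this chapter states. The types msg and runner are hypothetical stand-ins, not ADK types, and Run here returns a message count instead of an event stream.

```go
package main

import "fmt"

// msg and runner are hypothetical stand-ins for *schema.Message and adk.Runner.
type msg struct{ Role, Content string }

type runner struct{}

// Run stands in for Runner.Run. Instead of returning an event stream it
// returns the number of messages the Agent would receive as context.
func (runner) Run(history []msg) int {
	return len(history)
}

// Query wraps one string in a single user message and delegates to Run
// (assumed behavior), so the Agent sees no earlier turns.
func (r runner) Query(q string) int {
	return r.Run([]msg{{Role: "user", Content: q}})
}

func main() {
	r := runner{}
	history := []msg{
		{Role: "user", Content: "Hello"},
		{Role: "assistant", Content: "Hi, how can I help?"},
		{Role: "user", Content: "Summarize that in one sentence"},
	}
	fmt.Println("Run sees", r.Run(history), "messages")
	fmt.Println("Query sees", r.Query("Summarize that in one sentence"), "message")
}
```

Under that assumption, Query fits one-shot questions, while a multi-turn console needs Run with the accumulated history.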

AgentEvent

AgentEvent is the event unit returned by Runner:

type AgentEvent struct {
    AgentName string
    RunPath   []RunStep

    Output *AgentOutput  // Output content
    Action *AgentAction  // Control action
    Err    error         // Execution error
}

Main fields:

  • event.Err: Execution error
  • event.Output.MessageOutput: Message or message stream (streaming)
  • event.Action: Control actions like interrupt/transfer/exit (used in later chapters)

AsyncIterator: How to Consume the Event Stream

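Runner.Run() returns *AsyncIterator[*AgentEvent], a streaming iterator. Run() hands back the iterator without waiting for the Agent to finish, and each call to Next() then blocks until the next event is available or the stream ends.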

Why use AsyncIterator instead of returning results directly?

Because Agent execution is streaming: the model generates replies token by token, with Tool calls interspersed. If Run() returned only after everything had completed, the user would see nothing until the entire reply was ready. AsyncIterator lets you consume each event as soon as it is produced.

How to consume:

// events is *AsyncIterator[*AgentEvent], returned by runner.Run()
events := runner.Run(ctx, history)

for {
    event, ok := events.Next()  // Get the next event, blocks until an event is available or the stream ends
    if !ok {
        break  // Iterator closed, all events consumed
    }
    if event.Err != nil {
        // Handle the error (typically report it and stop consuming)
    }
    if event.Output != nil && event.Output.MessageOutput != nil {
        // Handle message output (may be streaming)
    }
}

Note: Each runner.Run() creates a new iterator; it cannot be reused after being consumed once.
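
To make the Next() contract concrete, here is a minimal stand-in built on a Go channel. It is a sketch of the pattern, not the ADK's implementation: event, eventIterator, produce, and drain are hypothetical names, and the real AsyncIterator may buffer and close differently.

```go
package main

import "fmt"

// event is a hypothetical stand-in for *adk.AgentEvent.
type event struct {
	Text string
}

// eventIterator is a simplified, hypothetical stand-in for AsyncIterator[*AgentEvent].
type eventIterator struct {
	ch <-chan event
}

// Next blocks until the producer sends an event or closes the channel.
// ok is false once the channel is closed and drained.
func (it *eventIterator) Next() (event, bool) {
	ev, ok := <-it.ch
	return ev, ok
}

// produce returns immediately and emits one event per chunk from a
// goroutine. Closing the channel is what ends the consumer's loop.
func produce(chunks []string) *eventIterator {
	ch := make(chan event)
	go func() {
		defer close(ch)
		for _, c := range chunks {
			ch <- event{Text: c}
		}
	}()
	return &eventIterator{ch: ch}
}

// drain consumes the iterator to the end and concatenates the text.
func drain(it *eventIterator) string {
	var out string
	for {
		ev, ok := it.Next()
		if !ok {
			return out
		}
		out += ev.Text
	}
}

func main() {
	it := produce([]string{"Hel", "lo", "!"})
	fmt.Println(drain(it)) // prints "Hello!"

	_, ok := it.Next()
	fmt.Println(ok) // prints "false": a drained iterator yields nothing more
}
```

The second Next() call shows why an iterator cannot be reused: once the producer has closed the stream, every further call reports ok == false.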

Implementing Multi-turn Conversation

This chapter implements simple multi-turn conversation: user input -> model reply -> user continues input -> ...

Implementation approach:

Without tools, ChatModelAgent only completes one round of model invocation per Run() call. Multi-turn conversation is achieved by maintaining history on the caller side:

  1. Use history []*schema.Message to store the accumulated conversation
  2. On each user input: append the UserMessage to history
  3. Call runner.Run(ctx, history) to get the event stream, consume it to get the assistant text
  4. Append the assistant text back to history, then proceed to the next round

Key code snippet (Note: this is a simplified code snippet that cannot be run directly. For the complete code, please refer to cmd/ch02/main.go):

history := make([]*schema.Message, 0, 16)

for {
    // 1. Read user input
    line := readUserInput()
    if line == "" {
        break
    }

    // 2. Append user message to history
    history = append(history, schema.UserMessage(line))

    // 3. Call Runner to execute Agent
    events := runner.Run(ctx, history)

    // 4. Consume event stream, collect assistant reply
    content := collectAssistantFromEvents(events)

    // 5. Append assistant message to history
    history = append(history, schema.AssistantMessage(content, nil))
}
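
The snippet leaves collectAssistantFromEvents undefined. One way such a helper could look is sketched below with stand-in types so that it runs without the ADK. Everything here is hypothetical: message, chunkReader, messageOutput, agentEvent, and sliceIterator only imitate the shape described in this chapter (a message or a message stream per event, with the stream assumed to end at io.EOF), and the event's Output is flattened to a single level. Unlike the one-value call in the snippet, this version also returns an error.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// All types below are hypothetical stand-ins, not the ADK's definitions.
type message struct{ Content string }

// chunkReader yields message chunks and returns io.EOF when exhausted.
type chunkReader struct {
	chunks []string
	pos    int
}

func (r *chunkReader) Recv() (*message, error) {
	if r.pos >= len(r.chunks) {
		return nil, io.EOF
	}
	m := &message{Content: r.chunks[r.pos]}
	r.pos++
	return m, nil
}

// messageOutput holds either a complete message or a stream of chunks.
type messageOutput struct {
	IsStreaming bool
	Message     *message
	Stream      *chunkReader
}

type agentEvent struct {
	Output *messageOutput
	Err    error
}

// sliceIterator replays a fixed list of events through Next().
type sliceIterator struct {
	events []*agentEvent
	pos    int
}

func (it *sliceIterator) Next() (*agentEvent, bool) {
	if it.pos >= len(it.events) {
		return nil, false
	}
	ev := it.events[it.pos]
	it.pos++
	return ev, true
}

// collectAssistantFromEvents drains the iterator and concatenates all
// assistant text, stopping at the first error.
func collectAssistantFromEvents(it *sliceIterator) (string, error) {
	var sb strings.Builder
	for {
		ev, ok := it.Next()
		if !ok {
			return sb.String(), nil
		}
		if ev.Err != nil {
			return sb.String(), ev.Err
		}
		if ev.Output == nil {
			continue // for example, an event that only carries a control action
		}
		if !ev.Output.IsStreaming {
			if ev.Output.Message != nil {
				sb.WriteString(ev.Output.Message.Content)
			}
			continue
		}
		for ev.Output.Stream != nil {
			chunk, err := ev.Output.Stream.Recv()
			if errors.Is(err, io.EOF) {
				break
			}
			if err != nil {
				return sb.String(), err
			}
			sb.WriteString(chunk.Content)
		}
	}
}

func main() {
	it := &sliceIterator{events: []*agentEvent{
		{Output: &messageOutput{IsStreaming: true, Stream: &chunkReader{chunks: []string{"An Agent ", "runs."}}}},
	}}
	text, err := collectAssistantFromEvents(it)
	fmt.Println(text, err)
}
```

A console program would usually print each chunk as it arrives in addition to accumulating it, so the user sees the reply while it is being generated.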

Flow diagram:

+---------------------------+
| Initialize history = []   |
+---------------------------+
              |
+---------------------------+
| User inputs UserMessage   |
+---------------------------+
              |
+---------------------------+
| Append to history         |
+---------------------------+
              |
+---------------------------+
| runner.Run(history)       |
+---------------------------+
              |
+---------------------------+
| Consume event stream      |
+---------------------------+
              |
+---------------------------+
| Append AssistantMessage   |
+---------------------------+
              |
      (loop continues)
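
The loop in the diagram can be exercised without a model. The sketch below is a simulation under stated assumptions: msg, fakeRun, and turn are hypothetical names, fakeRun replaces both runner.Run and the event consumption, and a fixed script replaces stdin. What it demonstrates is only the bookkeeping: every turn sends the whole history and grows it by two messages.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// msg is a hypothetical stand-in for *schema.Message.
type msg struct {
	Role    string
	Content string
}

// fakeRun stands in for runner.Run plus event consumption. It receives the
// whole history and reports how many user turns it was given.
func fakeRun(history []msg) string {
	users := 0
	for _, m := range history {
		if m.Role == "user" {
			users++
		}
	}
	last := history[len(history)-1].Content
	return fmt.Sprintf("reply %d to %q", users, last)
}

// turn appends the user message, runs the fake Agent on the full history,
// appends the assistant reply, and returns the grown history.
func turn(history []msg, line string) []msg {
	history = append(history, msg{Role: "user", Content: line})
	reply := fakeRun(history)
	return append(history, msg{Role: "assistant", Content: reply})
}

func main() {
	// A fixed script replaces stdin so the sketch runs unattended.
	input := "Hello\nSummarize that in one sentence\n\n"
	sc := bufio.NewScanner(strings.NewReader(input))

	var history []msg
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			break // an empty line ends the conversation
		}
		history = turn(history, line)
		fmt.Println("assistant>", history[len(history)-1].Content)
	}
	fmt.Println(len(history), "messages in history")
}
```

Because the second reply is computed from a history that already contains the first exchange, a real model in place of fakeRun would be able to resolve "that" in the second question.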

Chapter Summary

  • Agent interface: Defines the basic behavior of an intelligent agent; the core is Run() -> AsyncIterator[*AgentEvent]
  • ChatModelAgent: An Agent implementation based on ChatModel, providing a unified execution abstraction
  • Runner: The execution entry point for Agents, managing lifecycle, checkpoint, event stream, and other runtime capabilities
  • AgentEvent: An event-driven output unit, supporting streaming responses and control actions
  • Multi-turn conversation: Implemented by maintaining history on the caller side; each Run() completes one round of conversation