Chapter 2: ChatModelAgent, Runner, AgentEvent (Console Multi-turn)

This chapter introduces the ADK's execution abstractions (Agent and Runner) and uses them to implement a multi-turn conversation in a console program.

Code Location

Entry code: cmd/ch02/main.go

Prerequisites

Same as Chapter 1: you need to configure an available ChatModel (OpenAI or Ark).

Running

In the examples/quickstart/chatwitheino directory, run:

go run ./cmd/ch02

After seeing the prompt, enter your question (empty line to exit):

you> Hello, explain what an Agent is in Eino?
...
you> Summarize that in one sentence
...

Key Concepts

From Component to Agent

In Chapter 1, we learned about Components, which are replaceable, composable capability units in Eino:

ChatModel: Call large language models
Tool: Execute specific tasks
Retriever: Retrieve information
Loader: Load data

The relationship between Component and Agent:

Components don't form a complete AI application: They are just capability units that need to be organized, orchestrated, and executed
An Agent is a complete AI application: It encapsulates complete business logic and can run directly
Agents use Components internally: The most essential ones are ChatModel (conversation capability) and Tool (execution capability)

Why do we need Agents?

With only Components, you would have to handle the following yourself:

Managing conversation history
Orchestrating the call flow (when to call the model, when to call tools)
Handling streaming output
Implementing interruption and recovery
...

What does an Agent provide?

A complete runtime framework: Unified execution management through Runner
Standardized event stream output: Run() -> AsyncIterator[*AgentEvent], supporting streaming, interruption, and recovery
Extensible capabilities: You can add tools, middleware, interrupt, etc.
Ready to use out of the box: Once an Agent is created, you can run it directly without worrying about internal details

This chapter's example:

ChatModelAgent is the simplest Agent. It only uses a ChatModel internally, but already has the complete Agent capability framework. Subsequent chapters will show how to add Tool and other capabilities.

Agent Interface

Agent is the core interface in ADK, defining the basic behavior of an intelligent agent:

type Agent interface {
    Name(ctx context.Context) string
    Description(ctx context.Context) string

    // Run executes the Agent and returns an event stream
    Run(ctx context.Context, input *AgentInput, options ...AgentRunOption) *AsyncIterator[*AgentEvent]
}

Interface responsibilities:

Name() / Description(): Identify the Agent's name and description
Run(): The core method for executing the Agent, receiving input messages and returning an event stream

Design philosophy:

Unified abstraction: All Agents (ChatModelAgent, WorkflowAgent, SupervisorAgent, etc.) implement this interface
Event-driven: Outputs the execution process through an event stream (AsyncIterator[*AgentEvent]), supporting streaming responses
Extensibility: When adding tools, middleware, interrupt, and other capabilities later, the interface remains unchanged
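
To make the contract concrete, here is a minimal, self-contained sketch that does not import the ADK. `Event`, `Iterator`, `EchoAgent`, `runAndCollect`, and `demo` are stand-ins invented for this illustration, and `Run` takes a plain string slice instead of `*AgentInput` and run options, so read it as a model of the idea rather than the library's API.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"strings"
)

// Event is a simplified stand-in for adk.AgentEvent (assumption: only the
// fields needed for this illustration).
type Event struct {
	AgentName string
	Text      string
	Err       error
}

// Iterator is a stand-in for the ADK's AsyncIterator, backed by a channel.
type Iterator struct{ ch chan Event }

// Next blocks until an event is available; ok is false once the stream ends.
func (it *Iterator) Next() (Event, bool) {
	ev, ok := <-it.ch
	return ev, ok
}

// Agent mirrors the shape of the interface above with simplified types.
type Agent interface {
	Name(ctx context.Context) string
	Description(ctx context.Context) string
	Run(ctx context.Context, input []string) *Iterator
}

// EchoAgent is a hypothetical agent that upper-cases the last input message.
type EchoAgent struct{}

func (EchoAgent) Name(context.Context) string        { return "echo" }
func (EchoAgent) Description(context.Context) string { return "Echoes the last message." }

// Run returns immediately; the event is produced in a separate goroutine.
func (a EchoAgent) Run(ctx context.Context, input []string) *Iterator {
	it := &Iterator{ch: make(chan Event, 1)}
	go func() {
		defer close(it.ch)
		if len(input) == 0 {
			it.ch <- Event{AgentName: a.Name(ctx), Err: errors.New("empty input")}
			return
		}
		it.ch <- Event{AgentName: a.Name(ctx), Text: strings.ToUpper(input[len(input)-1])}
	}()
	return it
}

// runAndCollect depends only on the interface, so any Agent can be passed in.
func runAndCollect(ctx context.Context, a Agent, input []string) (string, error) {
	var sb strings.Builder
	it := a.Run(ctx, input)
	for {
		ev, ok := it.Next()
		if !ok {
			return sb.String(), nil
		}
		if ev.Err != nil {
			return sb.String(), ev.Err
		}
		sb.WriteString(ev.Text)
	}
}

// demo runs the hypothetical EchoAgent with a background context.
func demo(input []string) (string, error) {
	return runAndCollect(context.Background(), EchoAgent{}, input)
}

func main() {
	out, err := demo([]string{"hello"})
	fmt.Println(out, err)
}
```

Because `runAndCollect` is written against the interface, swapping `EchoAgent` for another implementation would not change the calling code, which is the point of the unified abstraction.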

ChatModelAgent

ChatModelAgent is an implementation of the Agent interface, built on top of ChatModel:

agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
    Name:        "Ch02ChatModelAgent",
    Description: "A minimal ChatModelAgent with in-memory multi-turn history.",
    Instruction: instruction,
    Model:       cm,
})

ChatModel vs ChatModelAgent: The Essential Difference

| Dimension | ChatModel | ChatModelAgent |
| --- | --- | --- |
| Role | Component | Agent |
| Interface | `Generate() / Stream()` | `Run() -> AsyncIterator[*AgentEvent]` |
| Output | Directly returns message content | Returns an event stream (containing messages, control actions, etc.) |
| Capabilities | Pure model invocation | Extensible with tools, middleware, interrupt, etc. |
| Use case | Simple conversation scenarios | Complex agent applications |

Why do we need ChatModelAgent?

Unified abstraction: ChatModel is just one type of Component, while Agent is a higher-level abstraction that can compose multiple Components
Event-driven: Agent outputs an event stream, supporting streaming responses, interruption recovery, state transitions, and other complex scenarios
Extensibility: ChatModelAgent can have tools, middleware, interrupt, and other capabilities added, while ChatModel can only call the model
Orchestration-friendly: Agents can be uniformly managed by Runner, supporting checkpoint, recovery, and other runtime capabilities

In simple terms:

ChatModel = "The component responsible for communicating with the large language model, abstracting away differences between model providers (OpenAI, Ark, Claude, etc.)"
ChatModelAgent = "An intelligent agent built on top of the model; it can call the model, but can also do much more"

Analogy:

ChatModel is like a "database driver": responsible for communicating with the database, abstracting away MySQL/PostgreSQL differences
ChatModelAgent is like the "business logic layer": built on top of the database driver, but also includes business rules, transaction management, etc.

Features:

Encapsulates the ChatModel invocation logic
Provides a unified Run() -> AgentEvent output format
Can have tools, middleware, and other capabilities added later

Runner

Runner is the entry point for executing an Agent, responsible for managing the Agent's lifecycle:

type Runner struct {
    a Agent  // The Agent to execute
    enableStreaming bool
    store CheckPointStore  // State storage for interruption recovery
}

Why do we need Runner?

Although Agent provides a Run() method, calling it directly would lack many runtime capabilities:

Lifecycle management: Runner manages the Agent's startup, recovery, interruption, and other states
Checkpoint support: Works with CheckPointStore to implement interruption recovery (covered in later chapters)
Unified entry point: Provides convenient methods like Run() and Query()
Event stream encapsulation: Converts the Agent's event stream into a consumable AsyncIterator[*AgentEvent]

Usage:

runner := adk.NewRunner(ctx, adk.RunnerConfig{
    Agent:           agent,
    EnableStreaming: true,
})

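// Method 1: Pass in a message list
events := runner.Run(ctx, history)

// Method 2 (alternative to Method 1): Convenience method, pass in a single query string.
// A separate variable name is used so both calls can appear in the same scope.
queryEvents := runner.Query(ctx, "Hello")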
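
One way to picture the relationship between the two entry points is sketched below. It is self-contained and uses invented stand-ins (`Message`, `Agent`, `Runner`, `describe`), and it assumes that the convenience method simply wraps the query string in a single user message and delegates to `Run`. That assumption is for illustration and is not taken from the ADK source.

```go
package main

import "fmt"

// Message is a stand-in for a chat message (assumption: role and content only).
type Message struct{ Role, Content string }

// Agent is a stand-in: a function from a message list to a reply.
type Agent func(history []Message) string

// Runner is a stand-in wrapper that owns the agent and exposes two entry points.
type Runner struct{ agent Agent }

// Run passes the caller's message list to the agent unchanged.
func (r Runner) Run(history []Message) string { return r.agent(history) }

// Query wraps the string in a single user message and delegates to Run.
func (r Runner) Query(q string) string {
	return r.Run([]Message{{Role: "user", Content: q}})
}

// describe is a hypothetical agent that reports what it was given.
func describe(history []Message) string {
	if len(history) == 0 {
		return "no messages"
	}
	last := history[len(history)-1]
	return fmt.Sprintf("%d message(s), last from %s: %q", len(history), last.Role, last.Content)
}

func main() {
	r := Runner{agent: describe}
	fmt.Println(r.Query("Hello"))
	fmt.Println(r.Run([]Message{
		{Role: "user", Content: "Hello"},
		{Role: "assistant", Content: "Hi!"},
		{Role: "user", Content: "Summarize that"},
	}))
}
```

Under this reading, the string form suits one-shot questions, while the message-list form is the one to use once the caller maintains history.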

AgentEvent

AgentEvent is the event unit returned by Runner:

type AgentEvent struct {
    AgentName string
    RunPath   []RunStep

    Output *AgentOutput  // Output content
    Action *AgentAction  // Control action
    Err    error         // Execution error
}

Main fields:

event.Err: Execution error
event.Output.MessageOutput: Message or message stream (streaming)
event.Action: Control actions like interrupt/transfer/exit (used in later chapters)
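
A consumer usually inspects these fields in a fixed order. The sketch below shows one possible order (error, then action, then output) using simplified stand-in types. `Output`, `Action`, `Event`, and `classify` are invented for this example, and the real ADK structs carry more fields than shown here.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified stand-ins for the ADK types (assumption: only the nil-able
// pointer shape matters for this illustration).
type Output struct{ Message string }
type Action struct{ Exit bool }

type Event struct {
	AgentName string
	Output    *Output
	Action    *Action
	Err       error
}

// classify inspects an event in a fixed order: error first, then control
// action, then message output.
func classify(ev Event) string {
	switch {
	case ev.Err != nil:
		return "error: " + ev.Err.Error()
	case ev.Action != nil && ev.Action.Exit:
		return "action: exit"
	case ev.Action != nil:
		return "action: other"
	case ev.Output != nil:
		return "output: " + ev.Output.Message
	default:
		return "empty"
	}
}

func main() {
	events := []Event{
		{AgentName: "demo", Output: &Output{Message: "hi"}},
		{AgentName: "demo", Action: &Action{Exit: true}},
		{AgentName: "demo", Err: errors.New("model unavailable")},
	}
	for _, ev := range events {
		fmt.Println(classify(ev))
	}
}
```

Checking `Err` first keeps a failed event from being read as output, and the nil checks matter because any of the pointer fields may be unset on a given event.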

AsyncIterator: How to Consume the Event Stream

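Runner.Run() returns *AsyncIterator[*AgentEvent], a streaming iterator. The call to Run() does not wait for the Agent to finish: events are delivered through the iterator as they are produced, and each call to Next() blocks until the next event is available or the stream ends.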

Why use AsyncIterator instead of returning results directly?

Because Agent execution is streaming: the model generates replies token by token, with Tool calls interspersed. If Run() only returned after everything had completed, the user would see nothing until the whole reply was finished. AsyncIterator lets you consume each event as soon as it is produced.

How to consume:

// events is *AsyncIterator[*AgentEvent], returned by runner.Run()
events := runner.Run(ctx, history)

for {
    event, ok := events.Next()  // Get the next event, blocks until an event is available or the stream ends
    if !ok {
        break  // Iterator closed, all events consumed
    }
    if event.Err != nil {
        // Handle the error (for example, log it), then stop consuming this run
        break
    }
    if event.Output != nil && event.Output.MessageOutput != nil {
        // Handle message output (may be streaming)
    }
}

Note: Each call to runner.Run() returns a new iterator. Once an iterator has been consumed, it cannot be reused; call Run() again for the next round.
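
The blocking and single-use behavior described above can be modeled with a channel. The following sketch is self-contained and does not use the ADK. `Iterator`, `produce`, and `drain` are stand-ins invented for this example, under the assumption that a channel-backed wrapper is a fair mental model (the real AsyncIterator may be implemented differently).

```go
package main

import "fmt"

// Iterator is a stand-in for AsyncIterator[T]: a thin wrapper around a
// receive-only channel.
type Iterator[T any] struct{ ch <-chan T }

// Next blocks until a value arrives; ok is false after the producer closes.
func (it *Iterator[T]) Next() (T, bool) {
	v, ok := <-it.ch
	return v, ok
}

// produce starts a goroutine that emits the given chunks and then closes
// the stream. It returns the iterator immediately.
func produce[T any](chunks ...T) *Iterator[T] {
	ch := make(chan T)
	go func() {
		defer close(ch)
		for _, c := range chunks {
			ch <- c
		}
	}()
	return &Iterator[T]{ch: ch}
}

// drain consumes the iterator to the end and returns everything it saw.
func drain[T any](it *Iterator[T]) []T {
	var got []T
	for {
		v, ok := it.Next()
		if !ok {
			return got
		}
		got = append(got, v)
	}
}

func main() {
	it := produce("Hel", "lo", "!")
	for {
		chunk, ok := it.Next()
		if !ok {
			break
		}
		fmt.Print(chunk) // printed as soon as the chunk arrives
	}
	fmt.Println()

	// A consumed iterator yields nothing more, so each run needs a fresh one.
	fmt.Println("left over:", len(drain(it)))
}
```

The producer here uses an unbuffered channel, so it only advances when the consumer asks for the next value.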

Implementing Multi-turn Conversation

This chapter implements a simple multi-turn conversation: user input -> model reply -> user continues input -> ...

Implementation approach:

Without tools, ChatModelAgent only completes one round of model invocation per Run() call. Multi-turn conversation is achieved by maintaining history on the caller side:

Use history []*schema.Message to store the accumulated conversation
On each user input: append the UserMessage to history
Call runner.Run(ctx, history) to get the event stream, consume it to get the assistant text
Append the assistant text back to history, then proceed to the next round

Key code snippet (simplified; it will not compile on its own. For the complete code, see cmd/ch02/main.go):

history := make([]*schema.Message, 0, 16)

for {
    // 1. Read user input
    line := readUserInput()
    if line == "" {
        break
    }

    // 2. Append user message to history
    history = append(history, schema.UserMessage(line))

    // 3. Call Runner to execute Agent
    events := runner.Run(ctx, history)

    // 4. Consume event stream, collect assistant reply
    content := collectAssistantFromEvents(events)

    // 5. Append assistant message to history
    history = append(history, schema.AssistantMessage(content, nil))
}
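
For readers who want something they can run without a model, the same loop can be exercised against a fake. In the sketch below, `Message`, `fakeRun`, `collectAssistant`, and `chat` are stand-ins invented for this example and are not taken from cmd/ch02/main.go. The fake reply reports how many messages it received, which makes it visible that the whole history is sent on every turn.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a stand-in for schema.Message (assumption: role and content only).
type Message struct {
	Role    string
	Content string
}

// fakeRun plays the role of runner.Run: it streams a reply in chunks over a
// channel. The reply reports how many messages the "model" was given.
func fakeRun(history []Message) <-chan string {
	ch := make(chan string)
	n := len(history) // read the length before the goroutine starts
	go func() {
		defer close(ch)
		ch <- "seen "
		ch <- fmt.Sprint(n)
		ch <- " message(s)"
	}()
	return ch
}

// collectAssistant drains the stream and joins the chunks into one reply.
func collectAssistant(events <-chan string) string {
	var sb strings.Builder
	for chunk := range events {
		sb.WriteString(chunk)
	}
	return sb.String()
}

// chat runs one turn per input line, keeping the whole conversation in history.
func chat(lines []string) []Message {
	history := make([]Message, 0, 2*len(lines))
	for _, line := range lines {
		history = append(history, Message{Role: "user", Content: line})
		reply := collectAssistant(fakeRun(history))
		history = append(history, Message{Role: "assistant", Content: reply})
	}
	return history
}

func main() {
	for _, m := range chat([]string{"Hello", "Summarize that"}) {
		fmt.Printf("%s> %s\n", m.Role, m.Content)
	}
}
```

On the second turn the fake sees three messages (user, assistant, user), which is exactly the context a real model would need to resolve "that" in the follow-up question.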

Flow diagram:

+---------------------------+
|  Initialize history = []  |
+---------------------------+
              |
+---------------------------+
|  User inputs UserMessage  |
+---------------------------+
              |
+---------------------------+
|  Append to history        |
+---------------------------+
              |
+---------------------------+
|  runner.Run(history)      |
+---------------------------+
              |
+---------------------------+
|  Consume event stream     |
+---------------------------+
              |
+---------------------------+
|  Append AssistantMessage  |
+---------------------------+
              |
  (loop back to user input)

Chapter Summary

Agent interface: Defines the basic behavior of an intelligent agent; the core is Run() -> AsyncIterator[*AgentEvent]
ChatModelAgent: An Agent implementation based on ChatModel, providing a unified execution abstraction
Runner: The execution entry point for Agents, managing lifecycle, checkpoint, event stream, and other runtime capabilities
AgentEvent: An event-driven output unit, supporting streaming responses and control actions
Multi-turn conversation: Implemented by maintaining history on the caller side; each Run() completes one round of conversation
