Chapter 6: Callback and Trace (Observability)

The goal of this chapter is to understand the Callback mechanism and integrate CozeLoop to implement tracing and observability.

Code Location

examples/quickstart/chatwitheino (entry point: cmd/ch06/main.go)

Prerequisites

Same as Chapter 1: you need to configure an available ChatModel (OpenAI or Ark). Additionally, you need to set PROJECT_ROOT as in Chapter 4:

export PROJECT_ROOT=/path/to/eino  # Eino core library root directory (defaults to the current directory if not set)

Optional: Configure CozeLoop for tracing:

export COZELOOP_WORKSPACE_ID=your_workspace_id
export COZELOOP_API_TOKEN=your_token

Running

In the examples/quickstart/chatwitheino directory, run:

# Set the project root directory
export PROJECT_ROOT=/path/to/your/project

# Optional: Configure CozeLoop
export COZELOOP_WORKSPACE_ID=your_workspace_id
export COZELOOP_API_TOKEN=your_token

go run ./cmd/ch06

Output example:

[trace] starting session: 083d16da-6b13-4fe6-afb0-c45d8f490ce1
you> Hello
[trace] chat_model_generate: model=gpt-4.1-mini tokens=150
[trace] tool_call: name=list_files duration=23ms
[assistant] Hello! How can I help you?

From Black Box to White Box: Why We Need Callbacks

In the previous chapters, the Agent we implemented was a "black box": you put a question in and got an answer out, but what happened in between was invisible.

Problems with a black box:

  • Don't know how many times the model was called
  • Don't know how long Tool execution took
  • Don't know how many tokens were consumed
  • Difficult to locate the cause when something goes wrong

The role of Callbacks:

  • Callbacks are Eino's sidecar mechanism: Consistent from component to compose (discussed below) to adk
  • Callbacks trigger at fixed points: 5 key moments in a component's lifecycle
  • Callbacks extract real-time information: Input, output, errors, streaming data, etc.
  • Callbacks are versatile: Observation, logging, metrics, tracing, debugging, auditing, etc.

Simple analogy:

  • Agent = "business logic" (main path)
  • Callback = "sidecar hooks" (extract information at fixed points)

Key Concepts

Handler Interface

Handler is the core interface in Eino that defines callback handlers:

type Handler interface {
    // Non-streaming input (before the component starts processing)
    OnStart(ctx context.Context, info *RunInfo, input CallbackInput) context.Context

    // Non-streaming output (after the component returns successfully)
    OnEnd(ctx context.Context, info *RunInfo, output CallbackOutput) context.Context

    // Error (when the component returns an error)
    OnError(ctx context.Context, info *RunInfo, err error) context.Context

    // Streaming input (when the component receives streaming input)
    OnStartWithStreamInput(ctx context.Context, info *RunInfo,
        input *schema.StreamReader[CallbackInput]) context.Context

    // Streaming output (when the component returns streaming output)
    OnEndWithStreamOutput(ctx context.Context, info *RunInfo,
        output *schema.StreamReader[CallbackOutput]) context.Context
}

Design philosophy:

  • Sidecar mechanism: Does not interfere with the main flow, extracts information at fixed points
  • Full coverage: All components are supported, from component to compose to adk
  • State passing: OnStart -> OnEnd of the same Handler can pass state via context
  • Performance optimization: Implementing the TimingChecker interface allows skipping unnecessary timings

RunInfo structure:

type RunInfo struct {
    Name      string        // Business name (node name or user-specified)
    Type      string        // Implementation type (e.g., "OpenAI")
    Component string        // Component type (e.g., "ChatModel")
}

Important notes:

  • Streaming callbacks must close the StreamReader, otherwise goroutine leaks will occur
  • Do not modify Input/Output — they are shared by all downstream consumers
  • RunInfo may be nil — check before using

CozeLoop

CozeLoop is an open-source AI application observability platform by ByteDance, providing:

  • Tracing: Complete call chain visualization
  • Metrics monitoring: Latency, token consumption, error rates, etc.
  • Log aggregation: Centralized management of all logs
  • Debug support: Online viewing and debugging

Integration:

import (
    clc "github.com/cloudwego/eino-ext/callbacks/cozeloop"
    "github.com/cloudwego/eino/callbacks"
    "github.com/coze-dev/cozeloop-go"
)

// Create CozeLoop client
client, err := cozeloop.NewClient(
    cozeloop.WithAPIToken(apiToken),
    cozeloop.WithWorkspaceID(workspaceID),
)

// Register as a global Callback
callbacks.AppendGlobalHandlers(clc.NewLoopHandler(client))

Callback Trigger Timings

Callbacks are triggered at 5 key moments in a component's lifecycle. In the table below, Timing* are Eino internal constant names (used with the TimingChecker interface), and the corresponding Handler interface methods are shown on the right:

Timing Constant                 Handler Method            Trigger Point                                 Input/Output
TimingOnStart                   OnStart                   Before the component starts processing        CallbackInput
TimingOnEnd                     OnEnd                     After the component returns successfully      CallbackOutput
TimingOnError                   OnError                   When the component returns an error           error
TimingOnStartWithStreamInput    OnStartWithStreamInput    When the component receives streaming input   StreamReader[CallbackInput]
TimingOnEndWithStreamOutput     OnEndWithStreamOutput     When the component returns streaming output   StreamReader[CallbackOutput]

Example: ChatModel call flow

+------------------------------------------+
|  ChatModel.Generate(ctx, messages)       |
+------------------------------------------+
                   |
        +------------------------+
        |  OnStart               |  <- Input: CallbackInput (messages)
        +------------------------+
                   |
        +------------------------+
        |  Model processing      |
        +------------------------+
                   |
        +------------------------+
        |  OnEnd                 |  <- Output: CallbackOutput (response)
        +------------------------+

Example: Streaming output flow

+------------------------------------------+
|  ChatModel.Stream(ctx, messages)         |
+------------------------------------------+
                   |
        +------------------------+
        |  OnStart               |  <- Input: CallbackInput (messages)
        +------------------------+
                   |
        +------------------------+
        |  Model processing      |
        |  (streaming)           |
        +------------------------+
                   |
        +---------------------------+
        |  OnEndWithStreamOutput    |  <- Output: StreamReader[CallbackOutput]
        +---------------------------+
                   |
        +------------------------+
        |  Return chunks one     |
        |  by one                |
        +------------------------+

Notes:

  • Streaming errors (errors mid-stream) do not trigger OnError — they are returned within the StreamReader
  • OnStart -> OnEnd of the same Handler can pass state via context
  • There is no guaranteed execution order between different Handlers

Callback Implementation

1. Implement a Custom Callback Handler

Fully implementing the Handler interface means writing all 5 methods, which is verbose. Eino provides the callbacks.HandlerHelper builder to simplify the implementation:

import "github.com/cloudwego/eino/callbacks"

// Use NewHandlerHelper to register callbacks you're interested in
handler := callbacks.NewHandlerHelper().
    OnStart(func(ctx context.Context, info *callbacks.RunInfo, input callbacks.CallbackInput) context.Context {
        log.Printf("[trace] %s/%s start", info.Component, info.Name)
        return ctx
    }).
    OnEnd(func(ctx context.Context, info *callbacks.RunInfo, output callbacks.CallbackOutput) context.Context {
        log.Printf("[trace] %s/%s end", info.Component, info.Name)
        return ctx
    }).
    OnError(func(ctx context.Context, info *callbacks.RunInfo, err error) context.Context {
        log.Printf("[trace] %s/%s error: %v", info.Component, info.Name, err)
        return ctx
    }).
    Handler()

// Register as a global Callback
callbacks.AppendGlobalHandlers(handler)

Note: RunInfo may be nil (e.g., top-level calls without RunInfo) — check before using.

2. Integrate CozeLoop

func setupCozeLoop(ctx context.Context) (*cozeloop.Client, error) {
    apiToken := os.Getenv("COZELOOP_API_TOKEN")
    workspaceID := os.Getenv("COZELOOP_WORKSPACE_ID")

    if apiToken == "" || workspaceID == "" {
        return nil, nil  // Skip if not configured
    }

    client, err := cozeloop.NewClient(
        cozeloop.WithAPIToken(apiToken),
        cozeloop.WithWorkspaceID(workspaceID),
    )
    if err != nil {
        return nil, err
    }

    // Register as a global Callback
    callbacks.AppendGlobalHandlers(clc.NewLoopHandler(client))

    return client, nil
}

3. Use in main

func main() {
    ctx := context.Background()

    // Set up CozeLoop (optional)
    client, err := setupCozeLoop(ctx)
    if err != nil {
        log.Printf("cozeloop setup failed: %v", err)
    }
    if client != nil {
        defer func() {
            time.Sleep(5 * time.Second)  // Wait for data to be reported
            client.Close(ctx)
        }()
    }

    // Create Agent and run...
}

Key code snippet (Note: this is a simplified code snippet that cannot be run directly. For the complete code, please refer to cmd/ch06/main.go):

// Set up CozeLoop tracing
cozeloopApiToken := os.Getenv("COZELOOP_API_TOKEN")
cozeloopWorkspaceID := os.Getenv("COZELOOP_WORKSPACE_ID")
if cozeloopApiToken != "" && cozeloopWorkspaceID != "" {
    client, err := cozeloop.NewClient(
        cozeloop.WithAPIToken(cozeloopApiToken),
        cozeloop.WithWorkspaceID(cozeloopWorkspaceID),
    )
    if err != nil {
        log.Fatalf("cozeloop.NewClient failed: %v", err)
    }
    defer func() {
        time.Sleep(5 * time.Second)
        client.Close(ctx)
    }()
    callbacks.AppendGlobalHandlers(clc.NewLoopHandler(client))
}

The Value of Observability

1. Performance Analysis

With data collected through Callbacks, you can analyze:

  • Model call latency distribution
  • Tool execution time rankings
  • Token consumption trends

2. Error Tracking

When the Agent encounters problems:

  • View the complete call chain
  • Locate which step caused the error
  • Analyze the root cause

3. Cost Optimization

With token consumption data:

  • Identify high-consumption conversations
  • Optimize prompts to reduce tokens
  • Choose more cost-effective models

Chapter Summary

  • Callback: Eino's observation hooks that trigger callbacks at key points
  • CozeLoop: ByteDance's AI application observability platform
  • Global registration: Register global Callbacks via callbacks.AppendGlobalHandlers
  • Non-intrusive: Business code doesn't need modification — Callbacks trigger automatically
  • Observability value: Performance analysis, error tracking, cost optimization

Further Thinking

Other Callback implementations:

  • OpenTelemetry Callback: Connect to standard observability protocols
  • Custom logging Callback: Write to local files
  • Metrics Callback: Connect to monitoring systems like Prometheus

Advanced usage:

  • Implement sampling in Callbacks (only record some requests)
  • Implement rate limiting in Callbacks (based on token consumption)
  • Implement alerting in Callbacks (notify when error rate is too high)