---
title: "Chapter 6: Callback and Trace (Observability)"
---

The goal of this chapter is to understand the Callback mechanism and integrate CozeLoop to implement tracing and observability.

## Code Location

- Entry code: [cmd/ch06/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch06/main.go)

## Prerequisites

Same as Chapter 1: you need to configure an available ChatModel (OpenAI or Ark). Additionally, you need to set `PROJECT_ROOT` as in Chapter 4:

```bash
export PROJECT_ROOT=/path/to/eino # Eino core library root directory (defaults to the current directory if not set)
```

Optional: Configure CozeLoop for tracing:

```bash
export COZELOOP_WORKSPACE_ID=your_workspace_id
export COZELOOP_API_TOKEN=your_token
```

## Running

In the `examples/quickstart/chatwitheino` directory, run:

```bash
# Set the project root directory
export PROJECT_ROOT=/path/to/your/project

# Optional: Configure CozeLoop
export COZELOOP_WORKSPACE_ID=your_workspace_id
export COZELOOP_API_TOKEN=your_token

go run ./cmd/ch06
```

Output example:

```text
[trace] starting session: 083d16da-6b13-4fe6-afb0-c45d8f490ce1
you> Hello
[trace] chat_model_generate: model=gpt-4.1-mini tokens=150
[trace] tool_call: name=list_files duration=23ms
[assistant] Hello! How can I help you?
```

## From Black Box to White Box: Why We Need Callbacks

In the previous chapters, the Agent we implemented was a "black box": you ask a question and get an answer, but what happens in between is invisible.

**Problems with a black box:**
- You don't know how many times the model was called
- You don't know how long Tool execution took
- You don't know how many tokens were consumed
- It is difficult to locate the cause when something goes wrong

**The role of Callbacks:**
- **Callbacks are Eino's sidecar mechanism**: They work the same way from component to compose (discussed below) to adk
- **Callbacks trigger at fixed points**: 5 key moments in a component's lifecycle
- **Callbacks extract real-time information**: Input, output, errors, streaming data, etc.
- **Callbacks are versatile**: Observation, logging, metrics, tracing, debugging, auditing, etc.

**Simple analogy:**
- **Agent** = "business logic" (main path)
- **Callback** = "sidecar hooks" (extract information at fixed points)

## Key Concepts

### Handler Interface

`Handler` is the core interface in Eino that defines callback handlers:

```go
type Handler interface {
	// Non-streaming input (before the component starts processing)
	OnStart(ctx context.Context, info *RunInfo, input CallbackInput) context.Context

	// Non-streaming output (after the component returns successfully)
	OnEnd(ctx context.Context, info *RunInfo, output CallbackOutput) context.Context

	// Error (when the component returns an error)
	OnError(ctx context.Context, info *RunInfo, err error) context.Context

	// Streaming input (when the component receives streaming input)
	OnStartWithStreamInput(ctx context.Context, info *RunInfo,
		input *schema.StreamReader[CallbackInput]) context.Context

	// Streaming output (when the component returns streaming output)
	OnEndWithStreamOutput(ctx context.Context, info *RunInfo,
		output *schema.StreamReader[CallbackOutput]) context.Context
}
```

**Design philosophy:**
- **Sidecar mechanism**: Does not interfere with the main flow; extracts information at fixed points
- **Full coverage**: All components are supported, from component to compose to adk
- **State passing**: Within the same Handler, `OnStart` can pass state to `OnEnd` through the context it returns
- **Performance optimization**: A Handler that implements the `TimingChecker` interface can skip the timings it does not need

**RunInfo structure:**
```go
type RunInfo struct {
	Name      string // Business name (node name or user-specified)
	Type      string // Implementation type (e.g., "OpenAI")
	Component string // Component type (e.g., "ChatModel")
}
```

**Important notes:**
- Streaming callbacks must close the StreamReader; otherwise goroutines will leak
- Do not modify Input/Output: they are shared by all downstream consumers
- RunInfo may be nil, so check it before use

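The nil check can be folded into one small helper so that every log line goes through it. The sketch below uses a local `runInfo` stand-in with the same three fields as the struct above. `describe` is a hypothetical helper, and falling back to `Type` when `Name` is empty is an assumption made for illustration, not Eino behavior.

```go
package main

import "fmt"

// runInfo is a local stand-in with the same three fields as the RunInfo shown above.
type runInfo struct {
	Name      string
	Type      string
	Component string
}

// describe returns a log-friendly label and never dereferences a nil pointer.
func describe(info *runInfo) string {
	if info == nil {
		return "unknown"
	}
	name := info.Name
	if name == "" {
		name = info.Type // assumption: fall back to the implementation type when no name is set
	}
	return info.Component + "/" + name
}

func main() {
	fmt.Println(describe(nil))
	fmt.Println(describe(&runInfo{Name: "planner", Type: "OpenAI", Component: "ChatModel"}))
	fmt.Println(describe(&runInfo{Type: "OpenAI", Component: "ChatModel"}))
}
```
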
### CozeLoop

CozeLoop is an open-source AI application observability platform by ByteDance, providing:

- **Tracing**: Complete call chain visualization
- **Metrics monitoring**: Latency, token consumption, error rates, etc.
- **Log aggregation**: Centralized management of all logs
- **Debug support**: Online viewing and debugging

**Integration:**

```go
import (
	"log"

	clc "github.com/cloudwego/eino-ext/callbacks/cozeloop"
	"github.com/cloudwego/eino/callbacks"
	"github.com/coze-dev/cozeloop-go"
)

// Create CozeLoop client
client, err := cozeloop.NewClient(
	cozeloop.WithAPIToken(apiToken),
	cozeloop.WithWorkspaceID(workspaceID),
)
if err != nil {
	log.Fatalf("cozeloop.NewClient failed: %v", err)
}

// Register as a global Callback
callbacks.AppendGlobalHandlers(clc.NewLoopHandler(client))
```

### Callback Trigger Timings

Callbacks are triggered at 5 key moments in a component's lifecycle. In the table below, the `Timing*` names are Eino internal constants (used with the `TimingChecker` interface), and the second column shows the corresponding Handler interface method:

| Timing Constant | Handler Method | Trigger Point | Input/Output |
|-----------------|----------------|---------------|--------------|
| `TimingOnStart` | `OnStart` | Before the component starts processing | CallbackInput |
| `TimingOnEnd` | `OnEnd` | After the component returns successfully | CallbackOutput |
| `TimingOnError` | `OnError` | When the component returns an error | error |
| `TimingOnStartWithStreamInput` | `OnStartWithStreamInput` | When the component receives streaming input | StreamReader[CallbackInput] |
| `TimingOnEndWithStreamOutput` | `OnEndWithStreamOutput` | When the component returns streaming output | StreamReader[CallbackOutput] |

**Example: ChatModel call flow**

```
+------------------------------------------+
|    ChatModel.Generate(ctx, messages)     |
+------------------------------------------+
                     |
         +------------------------+
         |        OnStart         |  <- Input: CallbackInput (messages)
         +------------------------+
                     |
         +------------------------+
         |    Model processing    |
         +------------------------+
                     |
         +------------------------+
         |         OnEnd          |  <- Output: CallbackOutput (response)
         +------------------------+
```

**Example: Streaming output flow**

```
+------------------------------------------+
|     ChatModel.Stream(ctx, messages)      |
+------------------------------------------+
                     |
         +------------------------+
         |        OnStart         |  <- Input: CallbackInput (messages)
         +------------------------+
                     |
         +------------------------+
         |    Model processing    |
         |      (streaming)       |
         +------------------------+
                     |
       +---------------------------+
       |   OnEndWithStreamOutput   |  <- Output: StreamReader[CallbackOutput]
       +---------------------------+
                     |
         +------------------------+
         |   Return chunks one    |
         |         by one         |
         +------------------------+
```

**Notes:**
- Errors that occur mid-stream do not trigger `OnError`; they are returned within the StreamReader
- Within the same Handler, `OnStart` can pass state to `OnEnd` via the context
- There is no guaranteed execution order between different Handlers

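The first note, together with the earlier rule about closing streams, can be sketched as follows. `chunkStream` and `drain` are hypothetical stand-ins, not Eino types: the sketch assumes a reader whose `Recv` returns `io.EOF` at the normal end of the stream and any other error when the stream fails part-way. The consumer closes the stream on every path and reports the mid-stream error itself, since no separate error callback would fire for it.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// chunkStream is a hypothetical stand-in for a stream reader.
type chunkStream struct {
	chunks []string
	failAt int // index at which Recv fails; -1 means never
	pos    int
	closed bool
}

// Recv returns the next chunk, io.EOF at the normal end, or another error mid-stream.
func (s *chunkStream) Recv() (string, error) {
	if s.failAt >= 0 && s.pos == s.failAt {
		return "", errors.New("connection reset")
	}
	if s.pos >= len(s.chunks) {
		return "", io.EOF
	}
	c := s.chunks[s.pos]
	s.pos++
	return c, nil
}

// Close marks the stream as released.
func (s *chunkStream) Close() { s.closed = true }

// drain reads the whole stream the way a streaming-output callback would.
// It always closes the stream and surfaces a mid-stream error as a return value.
func drain(s *chunkStream) (text string, err error) {
	defer s.Close()
	for {
		chunk, recvErr := s.Recv()
		if errors.Is(recvErr, io.EOF) {
			return text, nil // normal end of stream
		}
		if recvErr != nil {
			return text, recvErr // failure part-way through the stream
		}
		text += chunk
	}
}

func main() {
	ok := &chunkStream{chunks: []string{"Hel", "lo"}, failAt: -1}
	text, err := drain(ok)
	fmt.Println(text, err, ok.closed)

	bad := &chunkStream{chunks: []string{"Hel", "lo"}, failAt: 1}
	text, err = drain(bad)
	fmt.Println(text, err, bad.closed)
}
```

In a real handler the draining would typically run in its own goroutine so that the main flow is not blocked while chunks arrive.
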
## Callback Implementation

### 1. Implement a Custom Callback Handler

Fully implementing the `Handler` interface requires implementing all 5 methods, which can be verbose. Eino provides the `callbacks.HandlerBuilder` helper to simplify this: register only the callbacks you need, then call `Build()`:

```go
import (
	"context"
	"log"

	"github.com/cloudwego/eino/callbacks"
)

// label returns "Component/Name" for logging. RunInfo may be nil, so check first.
func label(info *callbacks.RunInfo) string {
	if info == nil {
		return "unknown"
	}
	return string(info.Component) + "/" + info.Name
}

// Use NewHandlerBuilder to register only the callbacks you're interested in
handler := callbacks.NewHandlerBuilder().
	OnStartFn(func(ctx context.Context, info *callbacks.RunInfo, input callbacks.CallbackInput) context.Context {
		log.Printf("[trace] %s start", label(info))
		return ctx
	}).
	OnEndFn(func(ctx context.Context, info *callbacks.RunInfo, output callbacks.CallbackOutput) context.Context {
		log.Printf("[trace] %s end", label(info))
		return ctx
	}).
	OnErrorFn(func(ctx context.Context, info *callbacks.RunInfo, err error) context.Context {
		log.Printf("[trace] %s error: %v", label(info), err)
		return ctx
	}).
	Build()

// Register as a global Callback
callbacks.AppendGlobalHandlers(handler)
```

**Note**: `RunInfo` may be `nil` (e.g., top-level calls without RunInfo), so check it before use.

### 2. Integrate CozeLoop

```go
// setupCozeLoop returns a nil client when CozeLoop is not configured.
// cozeloop.Client is an interface, so it is returned by value rather than as a pointer.
func setupCozeLoop(ctx context.Context) (cozeloop.Client, error) {
	apiToken := os.Getenv("COZELOOP_API_TOKEN")
	workspaceID := os.Getenv("COZELOOP_WORKSPACE_ID")

	if apiToken == "" || workspaceID == "" {
		return nil, nil // Skip if not configured
	}

	client, err := cozeloop.NewClient(
		cozeloop.WithAPIToken(apiToken),
		cozeloop.WithWorkspaceID(workspaceID),
	)
	if err != nil {
		return nil, err
	}

	// Register as a global Callback
	callbacks.AppendGlobalHandlers(clc.NewLoopHandler(client))

	return client, nil
}
```

### 3. Use in main

```go
func main() {
	ctx := context.Background()

	// Set up CozeLoop (optional)
	client, err := setupCozeLoop(ctx)
	if err != nil {
		log.Printf("cozeloop setup failed: %v", err)
	}
	if client != nil {
		defer func() {
			time.Sleep(5 * time.Second) // Wait for data to be reported
			client.Close(ctx)
		}()
	}

	// Create Agent and run...
}
```

**Key code snippet** (simplified and not directly runnable; for the complete code, see [cmd/ch06/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch06/main.go)):

```go
// Set up CozeLoop tracing
cozeloopApiToken := os.Getenv("COZELOOP_API_TOKEN")
cozeloopWorkspaceID := os.Getenv("COZELOOP_WORKSPACE_ID")
if cozeloopApiToken != "" && cozeloopWorkspaceID != "" {
	client, err := cozeloop.NewClient(
		cozeloop.WithAPIToken(cozeloopApiToken),
		cozeloop.WithWorkspaceID(cozeloopWorkspaceID),
	)
	if err != nil {
		log.Fatalf("cozeloop.NewClient failed: %v", err)
	}
	defer func() {
		time.Sleep(5 * time.Second)
		client.Close(ctx)
	}()
	callbacks.AppendGlobalHandlers(clc.NewLoopHandler(client))
}
```

## The Value of Observability

### 1. Performance Analysis

With data collected through Callbacks, you can analyze:
- Model call latency distribution
- Tool execution time rankings
- Token consumption trends

### 2. Error Tracking

When the Agent encounters problems:
- View the complete call chain
- Locate which step caused the error
- Analyze the root cause

### 3. Cost Optimization

With token consumption data:
- Identify high-consumption conversations
- Optimize prompts to reduce tokens
- Choose more cost-effective models

## Chapter Summary

- **Callback**: Eino's observation hooks, which fire at fixed points in a component's lifecycle
- **CozeLoop**: ByteDance's open-source AI application observability platform
- **Global registration**: Register global Callbacks via `callbacks.AppendGlobalHandlers`
- **Non-intrusive**: Business code needs no modification; Callbacks trigger automatically
- **Observability value**: Performance analysis, error tracking, cost optimization

## Further Thinking

**Other Callback implementations:**
- OpenTelemetry Callback: Connect to standard observability protocols
- Custom logging Callback: Write to local files
- Metrics Callback: Connect to monitoring systems like Prometheus

**Advanced usage:**
- Implement sampling in Callbacks (only record some requests)
- Implement rate limiting in Callbacks (based on token consumption)
- Implement alerting in Callbacks (notify when error rate is too high)

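As a starting point for the sampling idea, here is a minimal sketch of a counter-based sampler that keeps one request out of every `n`. `sampler` and `keep` are hypothetical names, not part of Eino. The assumption is that a handler would call `keep` once in its start hook and carry the decision to the end hook through the context.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// sampler keeps one request out of every n. It is a hypothetical helper, not part of Eino.
type sampler struct {
	n       uint64
	counter atomic.Uint64 // atomic, so concurrent callbacks can share one sampler
}

// keep reports whether the current request should be recorded.
// The first request is always kept, then every n-th one after it.
func (s *sampler) keep() bool {
	if s.n <= 1 {
		return true
	}
	return (s.counter.Add(1)-1)%s.n == 0
}

func main() {
	s := &sampler{n: 3}
	for i := 1; i <= 6; i++ {
		fmt.Printf("request %d recorded=%v\n", i, s.keep())
	}
}
```

A counter gives an exact 1-in-n rate and is easy to test. Random sampling would be an equally valid choice when requests arrive in regular patterns that a fixed stride could miss.
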