---
title: "Chapter 5: Middleware"
---
The goal of this chapter is to understand the Middleware pattern and implement Tool error handling and ChatModel retry mechanisms.
## Why We Need Middleware
In Chapter 4, we added Tool capabilities to the Agent, enabling it to access the filesystem. But in real-world scenarios, **Tool errors and ChatModel errors are common**, for example:
- **Tool errors**: File not found, parameter errors, insufficient permissions, etc.
- **ChatModel errors**: API rate limiting (429), network timeouts, service unavailable, etc.
### Problem 1: Tool Errors Interrupt the Entire Flow
When a Tool execution fails, the error propagates directly to the Agent, causing the entire conversation to be interrupted:
```text
[tool call] read_file(file_path: "nonexistent.txt")
Error: open nonexistent.txt: no such file or directory
// Conversation interrupted, user needs to start over
```
### Problem 2: Model Calls May Fail Due to Rate Limiting
When the model API returns a 429 (Too Many Requests) error, the entire conversation is also interrupted:
```text
Error: rate limit exceeded (429)
// Conversation interrupted
```
### Expected Behavior
These errors **should not directly terminate the Agent flow**. Instead, the error information should be passed to the model so it can self-correct and proceed to the next round. For example:
```text
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist. Let me list the files in the current directory first...
[tool call] glob(pattern: "*")
```
### The Role of Middleware
The **Middleware pattern** can extend the behavior of Tools and ChatModel, making it ideal for solving this problem:
- **Middleware is an interceptor for the Agent**: Inserts custom logic before and after calls
- **Middleware can handle errors**: Converts errors into a format the model can understand
- **Middleware can implement retries**: Automatically retries failed operations
- **Middleware is composable**: Multiple Middlewares can be chained together

**Simple analogy:**
- **Agent** = "business logic"
- **Middleware** = "AOP aspects" (logging, retry, error handling, and other cross-cutting concerns)
## Code Location
- Entry code: [cmd/ch05/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch05/main.go)
## Prerequisites
Same as Chapter 1: you need to configure an available ChatModel (OpenAI or Ark). Additionally, you need to set `PROJECT_ROOT` as in Chapter 4:
```bash
export PROJECT_ROOT=/path/to/eino # Eino core library root directory
```
## Running
In the `examples/quickstart/chatwitheino` directory, run:
```bash
# Set the project root directory (the same value as in Prerequisites)
export PROJECT_ROOT=/path/to/eino
go run ./cmd/ch05
```
Output example:
```text
you> List the files in the current directory
[assistant] Let me list the files for you...
[tool call] list_files(directory: ".")
you> Read a nonexistent file
[assistant] Trying to read the file...
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist...
```
## Key Concepts
### Middleware Interface
`ChatModelAgentMiddleware` is the middleware interface for Agent:
```go
type ChatModelAgentMiddleware interface {
	// BeforeAgent is called before each agent run, allowing modification of
	// the agent's instruction and tools configuration.
	BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error)

	// BeforeModelRewriteState is called before each model invocation.
	// The returned state is persisted to the agent's internal state and passed to the model.
	BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)

	// AfterModelRewriteState is called after each model invocation.
	// The input state includes the model's response as the last message.
	AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)

	// WrapInvokableToolCall wraps a tool's synchronous execution with custom behavior.
	// This method is only called for tools that implement InvokableTool.
	WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error)

	// WrapStreamableToolCall wraps a tool's streaming execution with custom behavior.
	// This method is only called for tools that implement StreamableTool.
	WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error)

	// WrapEnhancedInvokableToolCall wraps an enhanced tool's synchronous execution.
	// This method is only called for tools that implement EnhancedInvokableTool.
	WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) (EnhancedInvokableToolCallEndpoint, error)

	// WrapEnhancedStreamableToolCall wraps an enhanced tool's streaming execution.
	// This method is only called for tools that implement EnhancedStreamableTool.
	WrapEnhancedStreamableToolCall(ctx context.Context, endpoint EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error)

	// WrapModel wraps a chat model with custom behavior.
	// This method is called at request time when the model is about to be invoked.
	WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error)
}
```
**Design philosophy:**
- **Decorator pattern**: Each Middleware wraps the original call, and can modify input, output, or errors
- **Onion model**: Requests pass through Middleware from outside to inside, responses return from inside to outside
- **Composable**: Multiple Middlewares execute in sequence
### Middleware Execution Order
`Handlers` (i.e., Middlewares) are wrapped in **array order**, forming an onion model:
```go
Handlers: []adk.ChatModelAgentMiddleware{
	&middlewareA{}, // Outermost: wrapped first and first to intercept a request (its WrapModel takes effect last)
	&middlewareB{}, // Middle layer
	&middlewareC{}, // Innermost: wrapped last, closest to the actual call
}
```
**Execution order for Tool calls:**
```text
Request -> A.Wrap -> B.Wrap -> C.Wrap -> Actual Tool Execution -> C returns -> B returns -> A returns -> Response
```
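The onion ordering above can be reproduced with plain functions. The following is a minimal, standard-library-only sketch, not Eino's implementation: `toolFunc`, `wrapper`, `named`, and `chain` are hypothetical names introduced for illustration, and the tool is reduced to a function that takes and returns a string.

```go
package main

import (
	"fmt"
	"strings"
)

// toolFunc is a hypothetical stand-in for a tool endpoint.
type toolFunc func(args string) string

// wrapper decorates a toolFunc, like one middleware layer.
type wrapper func(next toolFunc) toolFunc

// named returns a wrapper that records when it is entered and left.
func named(name string, trace *[]string) wrapper {
	return func(next toolFunc) toolFunc {
		return func(args string) string {
			*trace = append(*trace, name+" in")
			out := next(args)
			*trace = append(*trace, name+" out")
			return out
		}
	}
}

// chain applies the wrappers so that ws[0] ends up outermost.
func chain(tool toolFunc, ws ...wrapper) toolFunc {
	for i := len(ws) - 1; i >= 0; i-- {
		tool = ws[i](tool)
	}
	return tool
}

// runChain wraps a tool in A, B, C (array order) and returns the call trace.
func runChain() string {
	var trace []string
	tool := func(args string) string {
		trace = append(trace, "tool")
		return "ok"
	}
	wrapped := chain(tool, named("A", &trace), named("B", &trace), named("C", &trace))
	wrapped("{}")
	return strings.Join(trace, " -> ")
}

func main() {
	// Prints: A in -> B in -> C in -> tool -> C out -> B out -> A out
	fmt.Println(runChain())
}
```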
**Practical advice:** Place `safeToolMiddleware` (error capture) at the innermost layer (end of the array) to ensure that interrupt errors thrown by other Middlewares can propagate outward correctly.
### SafeToolMiddleware
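`SafeToolMiddleware` converts Tool errors into strings so the model can understand and handle them. The version below is simplified to show the core idea; the full implementation later in this chapter also lets interrupt errors pass through unchanged: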
```go
type safeToolMiddleware struct {
	*adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
	_ context.Context,
	endpoint adk.InvokableToolCallEndpoint,
	_ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
	return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
		result, err := endpoint(ctx, args, opts...)
		if err != nil {
			// Convert error to string instead of returning an error
			return fmt.Sprintf("[tool error] %v", err), nil
		}
		return result, nil
	}, nil
}
```
**Effect:**
```text
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist, please check the file path...
// Conversation continues, the model can adjust its strategy based on the error information
```
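`safeToolMiddleware` implements only one method yet is used as a full `ChatModelAgentMiddleware`, because it embeds `*adk.BaseChatModelAgentMiddleware`. The sketch below shows the general Go embedding technique with hypothetical stand-in types (`middleware`, `baseMiddleware`, `safeTool`, `failing`). It assumes the embedded base type supplies pass-through defaults for every method, which is how the pattern is normally used; it does not reproduce Eino's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// middleware is a hypothetical two-method interface standing in for the
// much larger ChatModelAgentMiddleware.
type middleware interface {
	WrapTool(next func(string) (string, error)) func(string) (string, error)
	WrapModel(next func(string) string) func(string) string
}

// baseMiddleware supplies pass-through defaults for every method.
// The methods never dereference the receiver, so a nil pointer is safe.
type baseMiddleware struct{}

func (*baseMiddleware) WrapTool(next func(string) (string, error)) func(string) (string, error) {
	return next
}

func (*baseMiddleware) WrapModel(next func(string) string) func(string) string {
	return next
}

// safeTool embeds the base and overrides only WrapTool.
type safeTool struct {
	*baseMiddleware
}

func (*safeTool) WrapTool(next func(string) (string, error)) func(string) (string, error) {
	return func(args string) (string, error) {
		out, err := next(args)
		if err != nil {
			return fmt.Sprintf("[tool error] %v", err), nil
		}
		return out, nil
	}
}

// Compile-time check that safeTool satisfies the whole interface.
var _ middleware = (*safeTool)(nil)

// failing is a tool that always returns an error.
func failing(string) (string, error) { return "", errors.New("boom") }

func main() {
	var mw middleware = &safeTool{}
	out, err := mw.WrapTool(failing)("{}")
	fmt.Println(out, err) // overridden method: the error became text
	echo := mw.WrapModel(func(s string) string { return s + "!" })
	fmt.Println(echo("hi")) // promoted default: behavior unchanged
}
```

Only the overridden method changes behavior; every other method falls through to the embedded defaults, so a middleware stays small even when the interface is large.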
### ModelRetryConfig
`ModelRetryConfig` configures automatic retries for ChatModel:
```go
type ModelRetryConfig struct {
	MaxRetries  int                                       // Maximum number of retries
	IsRetryAble func(ctx context.Context, err error) bool // Determines if an error is retryable
}
```
**Usage (using DeepAgent as an example):**
```go
agent, err := deep.New(ctx, &deep.Config{
	// ...
	ModelRetryConfig: &adk.ModelRetryConfig{
		MaxRetries: 5,
		IsRetryAble: func(_ context.Context, err error) bool {
			// 429 rate limiting errors are retryable
			return strings.Contains(err.Error(), "429") ||
				strings.Contains(err.Error(), "Too Many Requests") ||
				strings.Contains(err.Error(), "qpm limit")
		},
	},
})
```
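Substring matching on `err.Error()` is simple, but it can also match unrelated text that happens to contain "429". If the model client returns a typed error carrying the status code, a predicate can check that first and fall back to substring matching otherwise. The sketch below assumes such a type exists: `apiError` is hypothetical and defined here only for illustration, so check what your ChatModel implementation actually returns.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"strings"
)

// apiError is a hypothetical typed error carrying the HTTP status code.
type apiError struct {
	StatusCode int
	Message    string
}

func (e *apiError) Error() string {
	return fmt.Sprintf("api error %d: %s", e.StatusCode, e.Message)
}

// errPlain429 is an untyped error, as produced by a client that only
// reports the status in its message.
var errPlain429 = errors.New("rate limit exceeded (429)")

// isRetryable has the same signature as the IsRetryAble field. It prefers
// the typed status code and falls back to substring matching.
func isRetryable(_ context.Context, err error) bool {
	if err == nil {
		return false
	}
	var ae *apiError
	if errors.As(err, &ae) {
		return ae.StatusCode == http.StatusTooManyRequests
	}
	msg := err.Error()
	return strings.Contains(msg, "429") || strings.Contains(msg, "Too Many Requests")
}

func main() {
	ctx := context.Background()
	fmt.Println(isRetryable(ctx, &apiError{StatusCode: 429, Message: "slow down"}))
	// The message contains "429", but the typed status code decides.
	fmt.Println(isRetryable(ctx, &apiError{StatusCode: 400, Message: "request 4291 invalid"}))
	fmt.Println(isRetryable(ctx, errPlain429))
}
```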
**Retry strategy:**
- Exponential backoff: Retry intervals increase with each attempt
- Configurable conditions: Use `IsRetryAble` to determine which errors are retryable
- Automatic recovery: No user intervention needed
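
The retry behavior described above can be sketched as a small loop. This is a generic illustration of retrying with exponential backoff, not Eino's implementation: the base delay, the doubling factor, and the names `retryWithBackoff`, `errRateLimited`, and `demo` are assumptions made for the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retryWithBackoff calls fn until it succeeds, the error is not retryable,
// or maxRetries retries have been used. The delay doubles after each retry.
func retryWithBackoff(
	ctx context.Context,
	maxRetries int,
	base time.Duration,
	isRetryable func(context.Context, error) bool,
	fn func(context.Context) error,
) error {
	delay := base
	for attempt := 0; ; attempt++ {
		err := fn(ctx)
		if err == nil {
			return nil
		}
		if attempt >= maxRetries || !isRetryable(ctx, err) {
			return err
		}
		select {
		case <-time.After(delay):
			delay *= 2
		case <-ctx.Done():
			return ctx.Err() // stop waiting if the caller gives up
		}
	}
}

var errRateLimited = errors.New("rate limit exceeded (429)")

// demo fails twice with a retryable error, then succeeds.
// It returns how many times the operation was called.
func demo() (calls int, err error) {
	err = retryWithBackoff(context.Background(), 5, time.Millisecond,
		func(_ context.Context, e error) bool { return errors.Is(e, errRateLimited) },
		func(context.Context) error {
			calls++
			if calls < 3 {
				return errRateLimited
			}
			return nil
		})
	return calls, err
}

func main() {
	calls, err := demo()
	fmt.Println(calls, err) // 3 <nil>
}
```

With `MaxRetries: 5`, an operation is attempted at most six times in this sketch: the initial call plus five retries.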
## Middleware Implementation
### 1. Implement SafeToolMiddleware
```go
type safeToolMiddleware struct {
	*adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
	_ context.Context,
	endpoint adk.InvokableToolCallEndpoint,
	_ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
	return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
		result, err := endpoint(ctx, args, opts...)
		if err != nil {
			// Don't convert interrupt errors, they need to continue propagating
			if _, ok := compose.IsInterruptRerunError(err); ok {
				return "", err
			}
			// Convert other errors to strings
			return fmt.Sprintf("[tool error] %v", err), nil
		}
		return result, nil
	}, nil
}
```
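The distinction between ordinary failures and control-flow errors can be tried out without the framework. In this standard-library sketch, `errInterrupt` is a hypothetical sentinel standing in for the interrupt error that `compose.IsInterruptRerunError` detects, and `toolFunc`, `safe`, `missingFile`, and `needsApproval` are illustrative names.

```go
package main

import (
	"errors"
	"fmt"
)

// errInterrupt is a hypothetical sentinel standing in for a control-flow
// signal such as the interrupt error checked in the middleware above.
var errInterrupt = errors.New("interrupt: waiting for user approval")

type toolFunc func(args string) (string, error)

// safe converts ordinary failures to text but lets control-flow errors through.
func safe(next toolFunc) toolFunc {
	return func(args string) (string, error) {
		out, err := next(args)
		switch {
		case err == nil:
			return out, nil
		case errors.Is(err, errInterrupt):
			return "", err // must keep propagating
		default:
			return fmt.Sprintf("[tool error] %v", err), nil
		}
	}
}

// missingFile fails with an ordinary error.
func missingFile(string) (string, error) {
	return "", errors.New("open nonexistent.txt: no such file or directory")
}

// needsApproval fails with a wrapped interrupt.
func needsApproval(string) (string, error) {
	return "", fmt.Errorf("delete_file: %w", errInterrupt)
}

func main() {
	fmt.Println(safe(missingFile)("{}"))   // text result, nil error
	fmt.Println(safe(needsApproval)("{}")) // empty result, interrupt error
}
```

If the interrupt were converted to a string, the model would see it as an ordinary tool result and the pause it was meant to trigger would never happen.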
### 2. Implement Streaming Tool Error Handling
```go
func (m *safeToolMiddleware) WrapStreamableToolCall(
	_ context.Context,
	endpoint adk.StreamableToolCallEndpoint,
	_ *adk.ToolContext,
) (adk.StreamableToolCallEndpoint, error) {
	return func(ctx context.Context, args string, opts ...tool.Option) (*schema.StreamReader[string], error) {
		sr, err := endpoint(ctx, args, opts...)
		if err != nil {
			if _, ok := compose.IsInterruptRerunError(err); ok {
				return nil, err
			}
			// Return a single-frame stream containing the error message
			return singleChunkReader(fmt.Sprintf("[tool error] %v", err)), nil
		}
		// Wrap the stream to catch errors within it
		return safeWrapReader(sr), nil
	}, nil
}
```
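The helpers `singleChunkReader` and `safeWrapReader` are not shown in this chapter; see the linked `main.go` for the real versions. To illustrate the idea only, the sketch below models a stream as a Go channel of `chunk` values instead of `*schema.StreamReader[string]`. All names in it (`chunk`, `singleChunk`, `safeWrap`, `collect`, `failingStream`) are hypothetical, and the sketch ignores the interrupt case.

```go
package main

import (
	"errors"
	"fmt"
)

// chunk is a hypothetical stream element: either text or a terminal error.
type chunk struct {
	text string
	err  error
}

// singleChunk returns a closed stream holding exactly one text chunk.
func singleChunk(text string) <-chan chunk {
	ch := make(chan chunk, 1)
	ch <- chunk{text: text}
	close(ch)
	return ch
}

// safeWrap forwards chunks until the source ends or fails. A mid-stream
// error is replaced by a final "[tool error]" text chunk. A real
// implementation would also close the source when it stops early.
func safeWrap(src <-chan chunk) <-chan chunk {
	out := make(chan chunk)
	go func() {
		defer close(out)
		for c := range src {
			if c.err != nil {
				out <- chunk{text: fmt.Sprintf("[tool error] %v", c.err)}
				return
			}
			out <- c
		}
	}()
	return out
}

// collect drains a stream into a slice of texts.
func collect(src <-chan chunk) []string {
	var texts []string
	for c := range src {
		texts = append(texts, c.text)
	}
	return texts
}

// failingStream emits one good chunk and then an error.
func failingStream() <-chan chunk {
	ch := make(chan chunk, 2)
	ch <- chunk{text: "partial output"}
	ch <- chunk{err: errors.New("connection reset")}
	close(ch)
	return ch
}

func main() {
	fmt.Println(collect(safeWrap(failingStream())))
	fmt.Println(collect(singleChunk("[tool error] tool not found")))
}
```

The model still receives the partial output that arrived before the failure, followed by a readable error message, instead of the whole call failing.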
### 3. Configure the Agent to Use Middleware
This chapter continues using the `DeepAgent` introduced in Chapter 4, registering Middleware in its `Handlers` field:
```go
agent, err := deep.New(ctx, &deep.Config{
	Name:           "Ch05MiddlewareAgent",
	Description:    "ChatWithDoc agent with safe tool middleware and retry.",
	ChatModel:      cm,
	Instruction:    agentInstruction,
	Backend:        backend,
	StreamingShell: backend,
	MaxIteration:   50,
	Handlers: []adk.ChatModelAgentMiddleware{
		&safeToolMiddleware{}, // Converts Tool errors to strings
	},
	ModelRetryConfig: &adk.ModelRetryConfig{
		MaxRetries: 5,
		IsRetryAble: func(_ context.Context, err error) bool {
			return strings.Contains(err.Error(), "429") ||
				strings.Contains(err.Error(), "Too Many Requests")
		},
	},
})
```
**Note**: The `Handlers` config field and the "Middleware" concept discussed in the documentation are the same thing. `Handlers` is the config field name, while `ChatModelAgentMiddleware` is the interface name.

**Key code snippet** (simplified and not runnable as-is; for the complete code, see [cmd/ch05/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch05/main.go)):
```go
// SafeToolMiddleware catches Tool errors and converts them to strings
type safeToolMiddleware struct {
	*adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
	_ context.Context,
	endpoint adk.InvokableToolCallEndpoint,
	_ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
	return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
		result, err := endpoint(ctx, args, opts...)
		if err != nil {
			if _, ok := compose.IsInterruptRerunError(err); ok {
				return "", err
			}
			return fmt.Sprintf("[tool error] %v", err), nil
		}
		return result, nil
	}, nil
}

// Configure DeepAgent (same as Chapter 4, with Handlers and ModelRetryConfig added)
agent, _ := deep.New(ctx, &deep.Config{
	ChatModel:      cm,
	Backend:        backend,
	StreamingShell: backend,
	MaxIteration:   50,
	Handlers: []adk.ChatModelAgentMiddleware{
		&safeToolMiddleware{},
	},
	ModelRetryConfig: &adk.ModelRetryConfig{
		MaxRetries: 5,
		IsRetryAble: func(_ context.Context, err error) bool {
			return strings.Contains(err.Error(), "429")
		},
	},
})
```
## Middleware Execution Flow
```text
+------------------------------------------+
| User: Read a nonexistent file |
+------------------------------------------+
|
+------------------------+
| Agent analyzes intent |
| Decides to call |
| read_file |
+------------------------+
|
+------------------------+
| SafeToolMiddleware |
| Intercepts Tool call |
+------------------------+
|
+------------------------+
| Execute read_file |
| Returns error |
+------------------------+
|
+------------------------+
| SafeToolMiddleware |
| Converts error to |
| string |
+------------------------+
|
+------------------------+
| Return Tool Result |
| "[tool error] ..." |
+------------------------+
|
+------------------------+
| Agent generates reply |
| "Sorry, the file |
| doesn't exist..." |
+------------------------+
```
## Chapter Summary
- **Middleware**: An interceptor for the Agent that inserts custom logic before and after calls
- **SafeToolMiddleware**: Converts Tool errors to strings so the model can understand and handle them
- **ModelRetryConfig**: Configures automatic retries for ChatModel to handle temporary errors like rate limiting
- **Decorator pattern**: Middleware wraps the original call, and can modify input, output, or errors
- **Onion model**: Requests pass through Middleware from outside to inside, responses return from inside to outside
## Further Thinking
**Eino Built-in Middlewares:**
| Middleware | Description |
|------------|-------------|
| **reduction** | Tool output reduction: when tool output is too long, automatically truncates it and offloads it to the filesystem to prevent context overflow |
| **summarization** | Automatic conversation history summarization: when the token count exceeds a threshold, automatically generates summaries to compress history |
| **skill** | Skill loading middleware: enables the Agent to dynamically load and execute predefined skills |

**Middleware chain example:**
```go
import (
	"github.com/cloudwego/eino/adk/middlewares/reduction"
	"github.com/cloudwego/eino/adk/middlewares/summarization"
	"github.com/cloudwego/eino/adk/middlewares/skill"
)

// Create reduction middleware: manages tool output length
reductionMW, _ := reduction.New(ctx, &reduction.Config{
	Backend:           filesystemBackend, // Storage backend
	MaxLengthForTrunc: 50000,             // Max length for single tool output
	MaxTokensForClear: 30000,             // Token threshold to trigger cleanup
})

// Create summarization middleware: automatically compresses conversation history
summarizationMW, _ := summarization.New(ctx, &summarization.Config{
	Model: chatModel, // Model used to generate summaries
	Trigger: &summarization.TriggerCondition{
		ContextTokens: 190000, // Token threshold to trigger summarization
	},
})

// Combine multiple middlewares (conceptual example; when using DeepAgent, replace adk.NewChatModelAgent with deep.New)
agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
	Handlers: []adk.ChatModelAgentMiddleware{ // Note: config field name is Handlers, conceptually equivalent to Middlewares
		summarizationMW, // Outermost: conversation history summarization
		reductionMW,     // Middle layer: tool output reduction
	},
})
```