---
title: "Chapter 5: Middleware"
---

The goal of this chapter is to understand the Middleware pattern and implement Tool error handling and ChatModel retry mechanisms.

## Why We Need Middleware

In Chapter 4, we added Tool capabilities to the Agent, enabling it to access the filesystem. But in real-world scenarios, **Tool errors and ChatModel errors are common**, for example:

- **Tool errors**: File not found, invalid parameters, insufficient permissions, etc.
- **ChatModel errors**: API rate limiting (429), network timeouts, service unavailable, etc.

### Problem 1: Tool Errors Interrupt the Entire Flow

When a Tool execution fails, the error propagates directly to the Agent and interrupts the entire conversation:

```text
[tool call] read_file(file_path: "nonexistent.txt")
Error: open nonexistent.txt: no such file or directory
// Conversation interrupted, user needs to start over
```

### Problem 2: Model Calls May Fail Due to Rate Limiting

When the model API returns a 429 (Too Many Requests) error, the entire conversation is also interrupted:

```text
Error: rate limit exceeded (429)
// Conversation interrupted
```

### Expected Behavior

These errors **should not directly terminate the Agent flow**. A Tool error should instead be passed back to the model as a tool result so that it can self-correct in the next round, and a transient ChatModel error such as a 429 should be retried automatically. For example:

```text
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist. Let me list the files in the current directory first...
[tool call] glob(pattern: "*")
```

### The Role of Middleware

The **Middleware pattern** extends the behavior of Tools and the ChatModel, which makes it a good fit for both problems:

- **Middleware is an interceptor for the Agent**: It inserts custom logic before and after calls
- **Middleware can handle errors**: It converts errors into a format the model can understand
- **Middleware can implement retries**: It automatically retries failed operations
- **Middleware is composable**: Multiple Middlewares can be chained together

**Simple analogy:**
- **Agent** = "business logic"
- **Middleware** = "AOP aspects" (logging, retry, error handling, and other cross-cutting concerns)

## Code Location

- Entry code: [cmd/ch05/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch05/main.go)

## Prerequisites

Same as Chapter 1: you need to configure an available ChatModel (OpenAI or Ark). Additionally, you need to set `PROJECT_ROOT` as in Chapter 4:

```bash
export PROJECT_ROOT=/path/to/eino  # Eino core library root directory
```

## Running

In the `examples/quickstart/chatwitheino` directory, run:

```bash
# Set the project root directory
export PROJECT_ROOT=/path/to/your/project

go run ./cmd/ch05
```

Output example:

```text
you> List the files in the current directory
[assistant] Let me list the files for you...
[tool call] list_files(directory: ".")

you> Read a nonexistent file
[assistant] Trying to read the file...
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist...
```

## Key Concepts

### Middleware Interface

`ChatModelAgentMiddleware` is the middleware interface for the Agent:

```go
type ChatModelAgentMiddleware interface {
    // BeforeAgent is called before each agent run, allowing modification of
    // the agent's instruction and tools configuration.
    BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error)

    // BeforeModelRewriteState is called before each model invocation.
    // The returned state is persisted to the agent's internal state and passed to the model.
    BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)

    // AfterModelRewriteState is called after each model invocation.
    // The input state includes the model's response as the last message.
    AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)

    // WrapInvokableToolCall wraps a tool's synchronous execution with custom behavior.
    // This method is only called for tools that implement InvokableTool.
    WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error)

    // WrapStreamableToolCall wraps a tool's streaming execution with custom behavior.
    // This method is only called for tools that implement StreamableTool.
    WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error)

    // WrapEnhancedInvokableToolCall wraps an enhanced tool's synchronous execution.
    // This method is only called for tools that implement EnhancedInvokableTool.
    WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) (EnhancedInvokableToolCallEndpoint, error)

    // WrapEnhancedStreamableToolCall wraps an enhanced tool's streaming execution.
    // This method is only called for tools that implement EnhancedStreamableTool.
    WrapEnhancedStreamableToolCall(ctx context.Context, endpoint EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error)

    // WrapModel wraps a chat model with custom behavior.
    // This method is called at request time when the model is about to be invoked.
    WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error)
}
```

**Design philosophy:**
- **Decorator pattern**: Each Middleware wraps the original call, and can modify input, output, or errors
- **Onion model**: Requests pass through Middleware from outside to inside, responses return from inside to outside
- **Composable**: Multiple Middlewares execute in sequence

### Middleware Execution Order

`Handlers` (i.e., Middlewares) are wrapped in **array order**, forming an onion model:

```go
Handlers: []adk.ChatModelAgentMiddleware{
    &middlewareA{},  // Outermost: wrapped first, intercepts requests first, but WrapModel takes effect last
    &middlewareB{},  // Middle layer
    &middlewareC{},  // Innermost: wrapped last
}
```

**Execution order for Tool calls:**

```text
Request -> A.Wrap -> B.Wrap -> C.Wrap -> Actual Tool Execution -> C returns -> B returns -> A returns -> Response
```

**Practical advice:** Place `safeToolMiddleware` (error capture) at the innermost layer (end of the array) to ensure that interrupt errors thrown by other Middlewares can propagate outward correctly.

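
The wrapping order can be reproduced without Eino. The following is a minimal, self-contained sketch. The names `endpoint`, `middleware`, `tracer`, and `chain` are hypothetical and invented for this illustration; they are not part of the Eino API. It assumes, as described above, that the first element of the list ends up outermost.

```go
package main

import (
	"fmt"
	"strings"
)

// endpoint stands in for a tool call: it takes arguments and returns a result.
type endpoint func(args string) (string, error)

// middleware wraps an endpoint and returns a new endpoint (decorator pattern).
type middleware func(next endpoint) endpoint

// tracer returns a middleware that records when it is entered and left.
func tracer(name string, trace *[]string) middleware {
	return func(next endpoint) endpoint {
		return func(args string) (string, error) {
			*trace = append(*trace, name+" enter")
			out, err := next(args)
			*trace = append(*trace, name+" leave")
			return out, err
		}
	}
}

// chain wraps ep so that mws[0] ends up outermost and the last element innermost.
func chain(ep endpoint, mws ...middleware) endpoint {
	for i := len(mws) - 1; i >= 0; i-- {
		ep = mws[i](ep)
	}
	return ep
}

// runTrace executes a three-layer chain and returns the recorded call order.
func runTrace() string {
	var trace []string
	tool := func(args string) (string, error) {
		trace = append(trace, "tool")
		return "ok", nil
	}
	wrapped := chain(tool, tracer("A", &trace), tracer("B", &trace), tracer("C", &trace))
	_, _ = wrapped("{}")
	return strings.Join(trace, " -> ")
}

func main() {
	fmt.Println(runTrace())
}
```

Running the sketch prints `A enter -> B enter -> C enter -> tool -> C leave -> B leave -> A leave`, which matches the request and response order listed above.
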
### SafeToolMiddleware

`SafeToolMiddleware` converts Tool errors into strings so the model can understand and handle them. The version below is simplified; the full version in the Middleware Implementation section additionally lets interrupt errors propagate unchanged.

```go
type safeToolMiddleware struct {
    *adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
    _ context.Context,
    endpoint adk.InvokableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
        result, err := endpoint(ctx, args, opts...)
        if err != nil {
            // Convert error to string instead of returning an error
            return fmt.Sprintf("[tool error] %v", err), nil
        }
        return result, nil
    }, nil
}
```

**Effect:**

```text
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist, please check the file path...
// Conversation continues, the model can adjust its strategy based on the error information
```

### ModelRetryConfig

`ModelRetryConfig` configures automatic retries for ChatModel:

```go
type ModelRetryConfig struct {
    MaxRetries  int                                       // Maximum number of retries
    IsRetryAble func(ctx context.Context, err error) bool // Determines if an error is retryable
}
```

**Usage (using DeepAgent as an example):**

```go
agent, err := deep.New(ctx, &deep.Config{
    // ...
    ModelRetryConfig: &adk.ModelRetryConfig{
        MaxRetries: 5,
        IsRetryAble: func(_ context.Context, err error) bool {
            // 429 rate limiting errors are retryable
            return strings.Contains(err.Error(), "429") ||
                strings.Contains(err.Error(), "Too Many Requests") ||
                strings.Contains(err.Error(), "qpm limit")
        },
    },
})
```

**Retry strategy:**
- Exponential backoff: Retry intervals increase with each attempt
- Configurable conditions: Use `IsRetryAble` to determine which errors are retryable
- Automatic recovery: No user intervention needed

## Middleware Implementation

### 1. Implement SafeToolMiddleware

The full version returns interrupt errors unchanged so they can keep propagating, and converts every other error into a string:

```go
type safeToolMiddleware struct {
    *adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
    _ context.Context,
    endpoint adk.InvokableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
        result, err := endpoint(ctx, args, opts...)
        if err != nil {
            // Don't convert interrupt errors, they need to continue propagating
            if _, ok := compose.IsInterruptRerunError(err); ok {
                return "", err
            }
            // Convert other errors to strings
            return fmt.Sprintf("[tool error] %v", err), nil
        }
        return result, nil
    }, nil
}
```

### 2. Implement Streaming Tool Error Handling

Streaming tools need the same treatment, both when the call fails immediately and when an error occurs inside the stream. `singleChunkReader` and `safeWrapReader` are helper functions that are not shown in this snippet.

```go
func (m *safeToolMiddleware) WrapStreamableToolCall(
    _ context.Context,
    endpoint adk.StreamableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.StreamableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (*schema.StreamReader[string], error) {
        sr, err := endpoint(ctx, args, opts...)
        if err != nil {
            if _, ok := compose.IsInterruptRerunError(err); ok {
                return nil, err
            }
            // Return a single-frame stream containing the error message
            return singleChunkReader(fmt.Sprintf("[tool error] %v", err)), nil
        }
        // Wrap the stream to catch errors within it
        return safeWrapReader(sr), nil
    }, nil
}
```

### 3. Configure the Agent to Use Middleware

This chapter continues using the `DeepAgent` introduced in Chapter 4, registering Middleware in its `Handlers` field:

```go
agent, err := deep.New(ctx, &deep.Config{
    Name:           "Ch05MiddlewareAgent",
    Description:    "ChatWithDoc agent with safe tool middleware and retry.",
    ChatModel:      cm,
    Instruction:    agentInstruction,
    Backend:        backend,
    StreamingShell: backend,
    MaxIteration:   50,
    Handlers: []adk.ChatModelAgentMiddleware{
        &safeToolMiddleware{}, // Converts Tool errors to strings
    },
    ModelRetryConfig: &adk.ModelRetryConfig{
        MaxRetries: 5,
        IsRetryAble: func(_ context.Context, err error) bool {
            return strings.Contains(err.Error(), "429") ||
                strings.Contains(err.Error(), "Too Many Requests")
        },
    },
})
```

**Note**: The `Handlers` field (in the config) and "Middleware" (the concept discussed in documentation) are the same thing: `Handlers` is the config field name, while `ChatModelAgentMiddleware` is the interface name.

**Key code snippet** (note: this is a simplified snippet that cannot be run directly; for the complete code, see [cmd/ch05/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch05/main.go)):

```go
// SafeToolMiddleware catches Tool errors and converts them to strings
type safeToolMiddleware struct {
    *adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
    _ context.Context,
    endpoint adk.InvokableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
        result, err := endpoint(ctx, args, opts...)
        if err != nil {
            if _, ok := compose.IsInterruptRerunError(err); ok {
                return "", err
            }
            return fmt.Sprintf("[tool error] %v", err), nil
        }
        return result, nil
    }, nil
}

// Configure DeepAgent (same as Chapter 4, with Handlers and ModelRetryConfig added)
// The error is ignored here for brevity; check it in real code.
agent, _ := deep.New(ctx, &deep.Config{
    ChatModel:      cm,
    Backend:        backend,
    StreamingShell: backend,
    MaxIteration:   50,
    Handlers: []adk.ChatModelAgentMiddleware{
        &safeToolMiddleware{},
    },
    ModelRetryConfig: &adk.ModelRetryConfig{
        MaxRetries: 5,
        IsRetryAble: func(_ context.Context, err error) bool {
            return strings.Contains(err.Error(), "429")
        },
    },
})
```

## Middleware Execution Flow

```text
+------------------------------------------+
| User: Read a nonexistent file            |
+------------------------------------------+
                     |
         +------------------------+
         | Agent analyzes intent  |
         | Decides to call        |
         | read_file              |
         +------------------------+
                     |
         +------------------------+
         | SafeToolMiddleware     |
         | Intercepts Tool call   |
         +------------------------+
                     |
         +------------------------+
         | Execute read_file      |
         | Returns error          |
         +------------------------+
                     |
         +------------------------+
         | SafeToolMiddleware     |
         | Converts error to      |
         | string                 |
         +------------------------+
                     |
         +------------------------+
         | Return Tool Result     |
         | "[tool error] ..."     |
         +------------------------+
                     |
         +------------------------+
         | Agent generates reply  |
         | "Sorry, the file       |
         | doesn't exist..."      |
         +------------------------+
```

## Chapter Summary

- **Middleware**: An interceptor for the Agent that inserts custom logic before and after calls
- **SafeToolMiddleware**: Converts Tool errors to strings so the model can understand and handle them
- **ModelRetryConfig**: Configures automatic retries for ChatModel to handle temporary errors like rate limiting
- **Decorator pattern**: Middleware wraps the original call, and can modify input, output, or errors
- **Onion model**: Requests pass through Middleware from outside to inside, responses return from inside to outside

## Further Thinking

**Eino Built-in Middlewares:**

| Middleware | Description |
|------------|-------------|
| **reduction** | Tool output reduction: when tool output is too long, automatically truncates it and offloads it to the filesystem to prevent context overflow |
| **summarization** | Automatic conversation history summarization: when the token count exceeds a threshold, automatically generates summaries to compress history |
| **skill** | Skill loading middleware: enables the Agent to dynamically load and execute predefined skills |

**Middleware chain example:**

```go
import (
    "github.com/cloudwego/eino/adk/middlewares/reduction"
    "github.com/cloudwego/eino/adk/middlewares/summarization"
    "github.com/cloudwego/eino/adk/middlewares/skill"
)

// Errors are ignored below for brevity; check them in real code.

// Create reduction middleware: manages tool output length
reductionMW, _ := reduction.New(ctx, &reduction.Config{
    Backend:           filesystemBackend, // Storage backend
    MaxLengthForTrunc: 50000,             // Max length for single tool output
    MaxTokensForClear: 30000,             // Token threshold to trigger cleanup
})

// Create summarization middleware: automatically compresses conversation history
summarizationMW, _ := summarization.New(ctx, &summarization.Config{
    Model: chatModel, // Model used to generate summaries
    Trigger: &summarization.TriggerCondition{
        ContextTokens: 190000, // Token threshold to trigger summarization
    },
})

// Combine multiple middlewares (conceptual example; when using DeepAgent, replace adk.NewChatModelAgent with deep.New)
agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
    Handlers: []adk.ChatModelAgentMiddleware{ // Note: config field name is Handlers, conceptually equivalent to Middlewares
        summarizationMW, // Outermost: conversation history summarization
        reductionMW,     // Middle layer: tool output reduction
    },
})
```