---
title: "Chapter 5: Middleware"
---

The goal of this chapter is to understand the Middleware pattern and implement Tool error handling and ChatModel retry mechanisms.

## Why We Need Middleware

In Chapter 4, we added Tool capabilities to the Agent, enabling it to access the filesystem. But in real-world scenarios, **Tool errors and ChatModel errors are common**, for example:

- **Tool errors**: File not found, parameter errors, insufficient permissions, etc.
- **ChatModel errors**: API rate limiting (429), network timeouts, service unavailable, etc.

### Problem 1: Tool Errors Interrupt the Entire Flow

When a Tool execution fails, the error propagates directly to the Agent, causing the entire conversation to be interrupted:

```text
[tool call] read_file(file_path: "nonexistent.txt")
Error: open nonexistent.txt: no such file or directory
// Conversation interrupted, user needs to start over
```

### Problem 2: Model Calls May Fail Due to Rate Limiting

When the model API returns a 429 (Too Many Requests) error, the entire conversation is also interrupted:

```text
Error: rate limit exceeded (429)
// Conversation interrupted
```

### Expected Behavior

These errors **should not directly terminate the Agent flow**. Instead, the error information should be passed to the model so it can self-correct and proceed to the next round. For example:

```text
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist. Let me list the files in the current directory first...
[tool call] glob(pattern: "*")
```

### The Role of Middleware

The **Middleware pattern** can extend the behavior of Tools and ChatModel, making it ideal for solving this problem:

- **Middleware is an interceptor for the Agent**: Inserts custom logic before and after calls
- **Middleware can handle errors**: Converts errors into a format the model can understand
- **Middleware can implement retries**: Automatically retries failed operations
- **Middleware is composable**: Multiple Middlewares can be chained together

**Simple analogy:**

- **Agent** = "business logic"
- **Middleware** = "AOP aspects" (logging, retry, error handling, and other cross-cutting concerns)

## Code Location

- Entry code: [cmd/ch05/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch05/main.go)

## Prerequisites

Same as Chapter 1: you need to configure an available ChatModel (OpenAI or Ark). Additionally, you need to set `PROJECT_ROOT` as in Chapter 4:

```bash
export PROJECT_ROOT=/path/to/eino  # Eino core library root directory
```

## Running

In the `examples/quickstart/chatwitheino` directory, run:

```bash
# Set the project root directory
export PROJECT_ROOT=/path/to/your/project

go run ./cmd/ch05
```

Output example:

```text
you> List the files in the current directory
[assistant] Let me list the files for you...
[tool call] list_files(directory: ".")

you> Read a nonexistent file
[assistant] Trying to read the file...
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist...
```

## Key Concepts

### Middleware Interface

`ChatModelAgentMiddleware` is the middleware interface for Agent:

```go
type ChatModelAgentMiddleware interface {
    // BeforeAgent is called before each agent run, allowing modification of
    // the agent's instruction and tools configuration.
    BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error)

    // BeforeModelRewriteState is called before each model invocation.
    // The returned state is persisted to the agent's internal state and passed to the model.
    BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)

    // AfterModelRewriteState is called after each model invocation.
    // The input state includes the model's response as the last message.
    AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)

    // WrapInvokableToolCall wraps a tool's synchronous execution with custom behavior.
    // This method is only called for tools that implement InvokableTool.
    WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error)

    // WrapStreamableToolCall wraps a tool's streaming execution with custom behavior.
    // This method is only called for tools that implement StreamableTool.
    WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error)

    // WrapEnhancedInvokableToolCall wraps an enhanced tool's synchronous execution.
    // This method is only called for tools that implement EnhancedInvokableTool.
    WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) (EnhancedInvokableToolCallEndpoint, error)

    // WrapEnhancedStreamableToolCall wraps an enhanced tool's streaming execution.
    // This method is only called for tools that implement EnhancedStreamableTool.
    WrapEnhancedStreamableToolCall(ctx context.Context, endpoint EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error)

    // WrapModel wraps a chat model with custom behavior.
    // This method is called at request time when the model is about to be invoked.
    WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error)
}
```

**Design philosophy:**

- **Decorator pattern**: Each Middleware wraps the original call, and can modify input, output, or errors
- **Onion model**: Requests pass through Middleware from outside to inside, responses return from inside to outside
- **Composable**: Multiple Middlewares execute in sequence

### Middleware Execution Order

`Handlers` (i.e., Middlewares) are wrapped in **array order**, forming an onion model:

```go
Handlers: []adk.ChatModelAgentMiddleware{
    &middlewareA{}, // Outermost: wrapped first, intercepts requests first, but WrapModel takes effect last
    &middlewareB{}, // Middle layer
    &middlewareC{}, // Innermost: wrapped last
}
```

**Execution order for Tool calls:**

```
Request -> A.Wrap -> B.Wrap -> C.Wrap -> Actual Tool Execution -> C returns -> B returns -> A returns -> Response
```

**Practical advice:** Place `safeToolMiddleware` (error capture) at the innermost layer (end of the array) to ensure that interrupt errors thrown by other Middlewares can propagate outward correctly.

### SafeToolMiddleware

`SafeToolMiddleware` converts Tool errors into strings so the model can understand and handle them. The version here is simplified and omits the interrupt-error check that the implementation section later in this chapter adds:

```go
type safeToolMiddleware struct {
    *adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
    _ context.Context,
    endpoint adk.InvokableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
        result, err := endpoint(ctx, args, opts...)
        if err != nil {
            // Convert error to string instead of returning an error
            return fmt.Sprintf("[tool error] %v", err), nil
        }
        return result, nil
    }, nil
}
```

**Effect:**

```text
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist, please check the file path...
// Conversation continues, the model can adjust its strategy based on the error information
```

### ModelRetryConfig

`ModelRetryConfig` configures automatic retries for ChatModel:

```go
type ModelRetryConfig struct {
    MaxRetries  int                                        // Maximum number of retries
    IsRetryAble func(ctx context.Context, err error) bool // Determines if an error is retryable
}
```

**Usage (using DeepAgent as an example):**

```go
agent, err := deep.New(ctx, &deep.Config{
    // ...
    ModelRetryConfig: &adk.ModelRetryConfig{
        MaxRetries: 5,
        IsRetryAble: func(_ context.Context, err error) bool {
            // 429 rate limiting errors are retryable
            return strings.Contains(err.Error(), "429") ||
                strings.Contains(err.Error(), "Too Many Requests") ||
                strings.Contains(err.Error(), "qpm limit")
        },
    },
})
```

**Retry strategy:**

- Exponential backoff: Retry intervals increase with each attempt
- Configurable conditions: Use `IsRetryAble` to determine which errors are retryable
- Automatic recovery: No user intervention needed

## Middleware Implementation

### 1. Implement SafeToolMiddleware

```go
type safeToolMiddleware struct {
    *adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
    _ context.Context,
    endpoint adk.InvokableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
        result, err := endpoint(ctx, args, opts...)
        if err != nil {
            // Don't convert interrupt errors, they need to continue propagating
            if _, ok := compose.IsInterruptRerunError(err); ok {
                return "", err
            }
            // Convert other errors to strings
            return fmt.Sprintf("[tool error] %v", err), nil
        }
        return result, nil
    }, nil
}
```

### 2. Implement Streaming Tool Error Handling

```go
func (m *safeToolMiddleware) WrapStreamableToolCall(
    _ context.Context,
    endpoint adk.StreamableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.StreamableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (*schema.StreamReader[string], error) {
        sr, err := endpoint(ctx, args, opts...)
        if err != nil {
            if _, ok := compose.IsInterruptRerunError(err); ok {
                return nil, err
            }
            // Return a single-frame stream containing the error message
            return singleChunkReader(fmt.Sprintf("[tool error] %v", err)), nil
        }
        // Wrap the stream to catch errors within it
        return safeWrapReader(sr), nil
    }, nil
}
```

### 3. Configure the Agent to Use Middleware

This chapter continues using the `DeepAgent` introduced in Chapter 4, registering Middleware in its `Handlers` field:

```go
agent, err := deep.New(ctx, &deep.Config{
    Name:           "Ch05MiddlewareAgent",
    Description:    "ChatWithDoc agent with safe tool middleware and retry.",
    ChatModel:      cm,
    Instruction:    agentInstruction,
    Backend:        backend,
    StreamingShell: backend,
    MaxIteration:   50,
    Handlers: []adk.ChatModelAgentMiddleware{
        &safeToolMiddleware{}, // Converts Tool errors to strings
    },
    ModelRetryConfig: &adk.ModelRetryConfig{
        MaxRetries: 5,
        IsRetryAble: func(_ context.Context, err error) bool {
            return strings.Contains(err.Error(), "429") ||
                strings.Contains(err.Error(), "Too Many Requests")
        },
    },
})
```

**Note**: The `Handlers` field (in the config) and "Middleware" (the concept discussed in documentation) refer to the same thing: `Handlers` is the config field name, while `ChatModelAgentMiddleware` is the interface name.
**Key code snippet (Note: this is a simplified code snippet that cannot be run directly. For the complete code, please refer to** [cmd/ch05/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch05/main.go)**)**:

```go
// SafeToolMiddleware catches Tool errors and converts them to strings
type safeToolMiddleware struct {
    *adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
    _ context.Context,
    endpoint adk.InvokableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
        result, err := endpoint(ctx, args, opts...)
        if err != nil {
            if _, ok := compose.IsInterruptRerunError(err); ok {
                return "", err
            }
            return fmt.Sprintf("[tool error] %v", err), nil
        }
        return result, nil
    }, nil
}

// Configure DeepAgent (same as Chapter 4, with Handlers and ModelRetryConfig added)
agent, _ := deep.New(ctx, &deep.Config{
    ChatModel:      cm,
    Backend:        backend,
    StreamingShell: backend,
    MaxIteration:   50,
    Handlers: []adk.ChatModelAgentMiddleware{
        &safeToolMiddleware{},
    },
    ModelRetryConfig: &adk.ModelRetryConfig{
        MaxRetries: 5,
        IsRetryAble: func(_ context.Context, err error) bool {
            return strings.Contains(err.Error(), "429")
        },
    },
})
```

## Middleware Execution Flow

```
+------------------------------------------+
| User: Read a nonexistent file            |
+------------------------------------------+
                     |
         +------------------------+
         | Agent analyzes intent  |
         | Decides to call        |
         | read_file              |
         +------------------------+
                     |
         +------------------------+
         | SafeToolMiddleware     |
         | Intercepts Tool call   |
         +------------------------+
                     |
         +------------------------+
         | Execute read_file      |
         | Returns error          |
         +------------------------+
                     |
         +------------------------+
         | SafeToolMiddleware     |
         | Converts error to      |
         | string                 |
         +------------------------+
                     |
         +------------------------+
         | Return Tool Result     |
         | "[tool error] ..."     |
         +------------------------+
                     |
         +------------------------+
         | Agent generates reply  |
         | "Sorry, the file       |
         | doesn't exist..."      |
         +------------------------+
```

## Chapter Summary

- **Middleware**: An interceptor for the Agent that inserts custom logic before and after calls
- **SafeToolMiddleware**: Converts Tool errors to strings so the model can understand and handle them
- **ModelRetryConfig**: Configures automatic retries for ChatModel to handle temporary errors like rate limiting
- **Decorator pattern**: Middleware wraps the original call, and can modify input, output, or errors
- **Onion model**: Requests pass through Middleware from outside to inside, responses return from inside to outside

## Further Thinking

**Eino Built-in Middlewares:**

| Middleware | Description |
|------------|-------------|
| **reduction** | Tool output reduction: when tool output is too long, automatically truncates and offloads to the filesystem to prevent context overflow |
| **summarization** | Automatic conversation history summarization: when token count exceeds a threshold, automatically generates summaries to compress history |
| **skill** | Skill loading middleware: enables the Agent to dynamically load and execute predefined skills |

**Middleware chain example:**

```go
import (
    "github.com/cloudwego/eino/adk/middlewares/reduction"
    "github.com/cloudwego/eino/adk/middlewares/summarization"
    "github.com/cloudwego/eino/adk/middlewares/skill"
)

// Create reduction middleware: manages tool output length
reductionMW, _ := reduction.New(ctx, &reduction.Config{
    Backend:           filesystemBackend, // Storage backend
    MaxLengthForTrunc: 50000,             // Max length for single tool output
    MaxTokensForClear: 30000,             // Token threshold to trigger cleanup
})

// Create summarization middleware: automatically compresses conversation history
summarizationMW, _ := summarization.New(ctx, &summarization.Config{
    Model: chatModel, // Model used to generate summaries
    Trigger: &summarization.TriggerCondition{
        ContextTokens: 190000, // Token threshold to trigger summarization
    },
})

// Combine multiple middlewares (conceptual example; when using DeepAgent, replace adk.NewChatModelAgent with deep.New)
agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
    Handlers: []adk.ChatModelAgentMiddleware{ // Note: config field name is Handlers, conceptually equivalent to Middlewares
        summarizationMW, // Outermost: conversation history summarization
        reductionMW,     // Middle layer: tool output reduction
    },
})
```
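The outermost-to-innermost ordering noted in the comments above can be reproduced with plain function decorators. The sketch below does not use Eino; `endpoint`, `wrapper`, `chain`, `tracer`, and `runDemo` are hypothetical names introduced only for illustration, and it assumes the array-order wrapping this chapter describes, where the first element becomes the outermost layer.

```go
package main

import (
    "fmt"
    "strings"
)

// endpoint mimics the shape of a tool call: string arguments in, string result out.
type endpoint func(args string) (string, error)

// wrapper decorates an endpoint, similar in spirit to a Wrap* middleware method.
type wrapper func(next endpoint) endpoint

// chain wraps base so that ws[0] ends up as the outermost layer.
// It applies the wrappers from last to first to achieve that.
func chain(base endpoint, ws ...wrapper) endpoint {
    for i := len(ws) - 1; i >= 0; i-- {
        base = ws[i](base)
    }
    return base
}

// tracer returns a wrapper that records when its layer is entered and left.
func tracer(name string, log *[]string) wrapper {
    return func(next endpoint) endpoint {
        return func(args string) (string, error) {
            *log = append(*log, name+" in")
            out, err := next(args)
            *log = append(*log, name+" out")
            return out, err
        }
    }
}

// runDemo calls a three-layer chain once and returns the recorded order.
func runDemo() string {
    var log []string
    tool := func(args string) (string, error) {
        log = append(log, "tool")
        return "ok", nil
    }
    call := chain(tool, tracer("A", &log), tracer("B", &log), tracer("C", &log))
    if _, err := call("{}"); err != nil {
        return "error: " + err.Error()
    }
    return strings.Join(log, " -> ")
}

func main() {
    // Prints: A in -> B in -> C in -> tool -> C out -> B out -> A out
    fmt.Println(runDemo())
}
```

Because the innermost wrapper sees the raw tool error first, an error-converting layer placed last in the list handles the error before any outer layer observes it.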