---
title: "Chapter 5: Middleware"
---

This chapter explains the Middleware pattern and uses it to implement Tool error handling and ChatModel retries.

## Why We Need Middleware

In Chapter 4, we added Tool capabilities to the Agent, enabling it to access the filesystem. But in real-world scenarios, **Tool errors and ChatModel errors are common**. For example:

- **Tool errors**: File not found, parameter errors, insufficient permissions, etc.
- **ChatModel errors**: API rate limiting (429), network timeouts, service unavailable, etc.

### Problem 1: Tool Errors Interrupt the Entire Flow

When a Tool execution fails, the error propagates directly to the Agent, causing the entire conversation to be interrupted:

```text
[tool call] read_file(file_path: "nonexistent.txt")
Error: open nonexistent.txt: no such file or directory
// Conversation interrupted, user needs to start over
```

### Problem 2: Model Calls May Fail Due to Rate Limiting

When the model API returns a 429 (Too Many Requests) error, the entire conversation is also interrupted:

```text
Error: rate limit exceeded (429)
// Conversation interrupted
```

### Expected Behavior

These errors **should not directly terminate the Agent flow**. Instead, the error information should be passed to the model so it can self-correct and proceed to the next round. For example:

```text
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist. Let me list the files in the current directory first...
[tool call] glob(pattern: "*")
```

### The Role of Middleware

The **Middleware pattern** can extend the behavior of Tools and ChatModel, making it ideal for solving this problem:

- **Middleware is an interceptor for the Agent**: Inserts custom logic before and after calls
- **Middleware can handle errors**: Converts errors into a format the model can understand
- **Middleware can implement retries**: Automatically retries failed operations
- **Middleware is composable**: Multiple Middlewares can be chained together

**Simple analogy:**
- **Agent** = "business logic"
- **Middleware** = "AOP aspects" (logging, retry, error handling, and other cross-cutting concerns)

## Code Location

- Entry code: [cmd/ch05/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch05/main.go)

## Prerequisites

Same as Chapter 1: you need to configure an available ChatModel (OpenAI or Ark). Additionally, you need to set `PROJECT_ROOT` as in Chapter 4:

```bash
export PROJECT_ROOT=/path/to/eino  # Eino core library root directory
```

## Running

In the `quickstart/chatwitheino` directory of the `eino-examples` repository, run:

```bash
# Set the project root directory (same value as in Prerequisites)
export PROJECT_ROOT=/path/to/eino

go run ./cmd/ch05
```

Output example:

```text
you> List the files in the current directory
[assistant] Let me list the files for you...
[tool call] list_files(directory: ".")

you> Read a nonexistent file
[assistant] Trying to read the file...
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist...
```

## Key Concepts

### Middleware Interface

`ChatModelAgentMiddleware` is the middleware interface for the Agent:

```go
type ChatModelAgentMiddleware interface {
    // BeforeAgent is called before each agent run, allowing modification of
    // the agent's instruction and tools configuration.
    BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error)

    // BeforeModelRewriteState is called before each model invocation.
    // The returned state is persisted to the agent's internal state and passed to the model.
    BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)

    // AfterModelRewriteState is called after each model invocation.
    // The input state includes the model's response as the last message.
    AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)

    // WrapInvokableToolCall wraps a tool's synchronous execution with custom behavior.
    // This method is only called for tools that implement InvokableTool.
    WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error)

    // WrapStreamableToolCall wraps a tool's streaming execution with custom behavior.
    // This method is only called for tools that implement StreamableTool.
    WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error)

    // WrapEnhancedInvokableToolCall wraps an enhanced tool's synchronous execution.
    // This method is only called for tools that implement EnhancedInvokableTool.
    WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) (EnhancedInvokableToolCallEndpoint, error)

    // WrapEnhancedStreamableToolCall wraps an enhanced tool's streaming execution.
    // This method is only called for tools that implement EnhancedStreamableTool.
    WrapEnhancedStreamableToolCall(ctx context.Context, endpoint EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error)

    // WrapModel wraps a chat model with custom behavior.
    // This method is called at request time when the model is about to be invoked.
    WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error)
}
```

**Design philosophy:**
- **Decorator pattern**: Each Middleware wraps the original call, and can modify input, output, or errors
- **Onion model**: Requests pass through Middleware from outside to inside, responses return from inside to outside
- **Composable**: Multiple Middlewares execute in sequence
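The interface above has eight methods, yet the middleware in this chapter implements only the ones it needs by embedding `*adk.BaseChatModelAgentMiddleware`. The sketch below illustrates that Go embedding technique with a deliberately reduced, hypothetical interface. `Endpoint`, `Middleware`, `BaseMiddleware`, and `upperCaseMiddleware` are illustrative names, not Eino types, and the assumption that the base type supplies pass-through defaults is inferred from how the chapter's code uses it.

```go
package main

import (
	"fmt"
	"strings"
)

// Endpoint is a hypothetical stand-in for a call endpoint.
type Endpoint func(args string) (string, error)

// Middleware is a hypothetical two-method interface; the real one has more methods.
type Middleware interface {
	WrapTool(next Endpoint) Endpoint
	WrapModel(next Endpoint) Endpoint
}

// BaseMiddleware supplies pass-through defaults. Its methods never dereference
// the receiver, so they are safe to call on a nil *BaseMiddleware.
type BaseMiddleware struct{}

func (b *BaseMiddleware) WrapTool(next Endpoint) Endpoint  { return next }
func (b *BaseMiddleware) WrapModel(next Endpoint) Endpoint { return next }

// upperCaseMiddleware overrides only WrapTool; WrapModel is promoted
// from the embedded base type.
type upperCaseMiddleware struct {
	*BaseMiddleware
}

func (m *upperCaseMiddleware) WrapTool(next Endpoint) Endpoint {
	return func(args string) (string, error) {
		out, err := next(args)
		return strings.ToUpper(out), err
	}
}

// echo is a trivial endpoint used for the demonstration.
func echo(args string) (string, error) { return "echo:" + args, nil }

// demo wraps echo through both methods and returns the two results.
func demo() (string, string) {
	var mw Middleware = &upperCaseMiddleware{} // the embedded pointer is nil here
	toolOut, _ := mw.WrapTool(echo)("hi")
	modelOut, _ := mw.WrapModel(echo)("hi")
	return toolOut, modelOut
}

func main() {
	toolOut, modelOut := demo()
	fmt.Println(toolOut)  // ECHO:HI
	fmt.Println(modelOut) // echo:hi
}
```

Only the overridden method changes behavior; every other call falls through to the default untouched, which keeps a single-purpose middleware short.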

### Middleware Execution Order

`Handlers` (i.e., Middlewares) are wrapped in **array order**, forming an onion model:

```go
Handlers: []adk.ChatModelAgentMiddleware{
    &middlewareA{},  // Outermost: wrapped first, intercepts requests first, but WrapModel takes effect last
    &middlewareB{},  // Middle layer
    &middlewareC{},  // Innermost: wrapped last
}
```

**Execution order for Tool calls:**

```text
Request -> A.Wrap -> B.Wrap -> C.Wrap -> Actual Tool Execution -> C returns -> B returns -> A returns -> Response
```

**Practical advice:** Place `safeToolMiddleware` (error capture) at the innermost layer (end of the array). That way it converts only the errors returned by the Tool itself, and interrupt errors thrown by other Middlewares can still propagate outward correctly.
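The onion ordering described above can be reproduced with plain function decorators. The following is a minimal sketch using only the standard library; `Endpoint`, `Middleware`, `tracer`, and `chain` are hypothetical names, and how Eino builds its chain internally is not shown here, only the resulting call order.

```go
package main

import (
	"fmt"
	"strings"
)

// Endpoint is a simplified tool endpoint.
type Endpoint func(args string) string

// Middleware decorates an Endpoint with extra behavior.
type Middleware func(next Endpoint) Endpoint

// tracer returns a middleware that records when it is entered and exited.
func tracer(name string, log *[]string) Middleware {
	return func(next Endpoint) Endpoint {
		return func(args string) string {
			*log = append(*log, name+" in")
			out := next(args)
			*log = append(*log, name+" out")
			return out
		}
	}
}

// chain wraps the endpoint so that mws[0] ends up as the outermost layer.
// Iterating in reverse is one way to achieve that ordering.
func chain(endpoint Endpoint, mws ...Middleware) Endpoint {
	for i := len(mws) - 1; i >= 0; i-- {
		endpoint = mws[i](endpoint)
	}
	return endpoint
}

// traceOrder runs a three-layer chain and returns the recorded order.
func traceOrder() string {
	var log []string
	tool := func(args string) string {
		log = append(log, "tool")
		return "ok"
	}
	wrapped := chain(tool, tracer("A", &log), tracer("B", &log), tracer("C", &log))
	wrapped("{}")
	return strings.Join(log, " -> ")
}

func main() {
	fmt.Println(traceOrder())
	// A in -> B in -> C in -> tool -> C out -> B out -> A out
}
```

Because C sits next to the tool, it is the first layer to see the tool's result or error, which is why an error-converting middleware belongs in that position.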

### SafeToolMiddleware

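`safeToolMiddleware` converts Tool errors into strings so the model can read them and react. The version below is simplified and converts every error; the implementation in "Middleware Implementation" additionally lets interrupt errors pass through: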

```go
type safeToolMiddleware struct {
    *adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
    _ context.Context,
    endpoint adk.InvokableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
        result, err := endpoint(ctx, args, opts...)
        if err != nil {
            // Convert error to string instead of returning an error
            return fmt.Sprintf("[tool error] %v", err), nil
        }
        return result, nil
    }, nil
}
```

**Effect:**

```text
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist, please check the file path...
// Conversation continues, the model can adjust its strategy based on the error information
```

### ModelRetryConfig

`ModelRetryConfig` configures automatic retries for ChatModel:

```go
type ModelRetryConfig struct {
    MaxRetries int                          // Maximum number of retries
    IsRetryAble func(ctx context.Context, err error) bool  // Determines if an error is retryable
}
```

**Usage (using DeepAgent as an example):**

```go
agent, err := deep.New(ctx, &deep.Config{
    // ...
    ModelRetryConfig: &adk.ModelRetryConfig{
        MaxRetries: 5,
        IsRetryAble: func(_ context.Context, err error) bool {
            // 429 rate limiting errors are retryable
            return strings.Contains(err.Error(), "429") ||
                strings.Contains(err.Error(), "Too Many Requests") ||
                strings.Contains(err.Error(), "qpm limit")
        },
    },
})
```

**Retry strategy:**
- Exponential backoff: Retry intervals increase with each attempt
- Configurable conditions: Use `IsRetryAble` to determine which errors are retryable
- Automatic recovery: No user intervention needed
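To make the retry strategy concrete, here is a minimal standard-library sketch of a retry loop with a retryable-error predicate and a doubling delay. It is not Eino's implementation: `retry`, `isRetryable`, and `flaky` are hypothetical helpers, the exact backoff schedule Eino uses is not stated in this chapter, and context cancellation is left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// isRetryable mirrors the string-matching predicate used in this chapter.
func isRetryable(err error) bool {
	msg := err.Error()
	return strings.Contains(msg, "429") || strings.Contains(msg, "Too Many Requests")
}

// retry calls fn once and then up to maxRetries more times while the error is
// retryable, doubling the wait between attempts. It returns the number of calls made.
func retry(maxRetries int, base time.Duration, fn func() error) (int, error) {
	delay := base
	calls := 0
	for {
		calls++
		err := fn()
		if err == nil {
			return calls, nil
		}
		if !isRetryable(err) || calls > maxRetries {
			return calls, err
		}
		time.Sleep(delay)
		delay *= 2
	}
}

// errNotFound is an example of an error that should not be retried.
var errNotFound = errors.New("file not found")

// flaky returns a function that fails with a 429 error for its first n calls.
func flaky(n int) func() error {
	count := 0
	return func() error {
		count++
		if count <= n {
			return errors.New("rate limit exceeded (429)")
		}
		return nil
	}
}

func main() {
	calls, err := retry(5, time.Millisecond, flaky(2))
	fmt.Println(calls, err) // 3 <nil>
}
```

Matching on error text is fragile because it depends on the provider's wording; if the model client exposes typed errors or status codes, checking those in `IsRetryAble` would likely be more robust.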

## Middleware Implementation

### 1. Implement SafeToolMiddleware

```go
type safeToolMiddleware struct {
    *adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
    _ context.Context,
    endpoint adk.InvokableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
        result, err := endpoint(ctx, args, opts...)
        if err != nil {
            // Don't convert interrupt errors, they need to continue propagating
            if _, ok := compose.IsInterruptRerunError(err); ok {
                return "", err
            }
            // Convert other errors to strings
            return fmt.Sprintf("[tool error] %v", err), nil
        }
        return result, nil
    }, nil
}
```
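The distinction between ordinary errors and interrupt errors can be tried out without the framework. In the sketch below, `interruptError` is a hypothetical stand-in for Eino's interrupt error, `errors.As` takes the place of `compose.IsInterruptRerunError`, and `Endpoint` omits the context and options of the real endpoint type.

```go
package main

import (
	"errors"
	"fmt"
)

// interruptError is a hypothetical stand-in for a framework control-flow error.
type interruptError struct{ reason string }

func (e *interruptError) Error() string { return "interrupt: " + e.reason }

// Endpoint is a simplified tool endpoint (no context or options).
type Endpoint func(args string) (string, error)

// safe converts ordinary errors into result strings and lets interrupts through.
func safe(next Endpoint) Endpoint {
	return func(args string) (string, error) {
		out, err := next(args)
		if err == nil {
			return out, nil
		}
		var ie *interruptError
		if errors.As(err, &ie) {
			// A control-flow signal, not a failure: it must reach the caller unchanged.
			return "", err
		}
		return fmt.Sprintf("[tool error] %v", err), nil
	}
}

var errMissing = errors.New("open nonexistent.txt: no such file or directory")

// failing always returns an ordinary error.
func failing(string) (string, error) { return "", errMissing }

// interrupting returns an interrupt wrapped in another error.
func interrupting(string) (string, error) {
	return "", fmt.Errorf("approval needed: %w", &interruptError{reason: "user confirmation"})
}

func main() {
	out, err := safe(failing)("{}")
	fmt.Println(out, err) // [tool error] open nonexistent.txt: no such file or directory <nil>

	out, err = safe(interrupting)("{}")
	fmt.Printf("%q %v\n", out, err) // "" approval needed: interrupt: user confirmation
}
```

If the interrupt were flattened into a string as well, the model would see it as just another tool result and the pause-and-resume signal would be lost.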

### 2. Implement Streaming Tool Error Handling

```go
func (m *safeToolMiddleware) WrapStreamableToolCall(
    _ context.Context,
    endpoint adk.StreamableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.StreamableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (*schema.StreamReader[string], error) {
        sr, err := endpoint(ctx, args, opts...)
        if err != nil {
            if _, ok := compose.IsInterruptRerunError(err); ok {
                return nil, err
            }
            // Return a single-frame stream containing the error message
            return singleChunkReader(fmt.Sprintf("[tool error] %v", err)), nil
        }
        // Wrap the stream to catch errors within it
        return safeWrapReader(sr), nil
    }, nil
}
```
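The helpers `singleChunkReader` and `safeWrapReader` are referenced above but not shown in this chapter. The sketch below illustrates the idea behind the second one, under the assumption that it replaces a mid-stream error with a final text frame. It uses plain Go channels as a stand-in for `schema.StreamReader[string]`; `chunk`, `safeWrap`, `produce`, and `collect` are hypothetical names, and the real helpers may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// chunk is a hypothetical stream frame: either text or a terminal error.
type chunk struct {
	text string
	err  error
}

// safeWrap copies chunks from in to a new channel. If a chunk carries an error,
// it is replaced by a final text chunk and the stream ends cleanly.
func safeWrap(in <-chan chunk) <-chan chunk {
	out := make(chan chunk)
	go func() {
		defer close(out)
		for c := range in {
			if c.err != nil {
				out <- chunk{text: fmt.Sprintf("\n[tool error] %v", c.err)}
				// Drain the remainder so the producer is never left blocked.
				for range in {
				}
				return
			}
			out <- c
		}
	}()
	return out
}

// produce emits the given chunks on a channel and then closes it.
func produce(chunks ...chunk) <-chan chunk {
	ch := make(chan chunk)
	go func() {
		defer close(ch)
		for _, c := range chunks {
			ch <- c
		}
	}()
	return ch
}

// collect concatenates all text and reports whether any error chunk was seen.
func collect(in <-chan chunk) (string, bool) {
	var s string
	sawErr := false
	for c := range in {
		s += c.text
		if c.err != nil {
			sawErr = true
		}
	}
	return s, sawErr
}

var errDisk = errors.New("disk read failed")

// demo streams one good chunk followed by an error and returns what the consumer sees.
func demo() (string, bool) {
	src := produce(chunk{text: "line 1"}, chunk{err: errDisk}, chunk{text: "never shown"})
	return collect(safeWrap(src))
}

func main() {
	got, sawErr := demo()
	fmt.Printf("%q %v\n", got, sawErr) // "line 1\n[tool error] disk read failed" false
}
```

The consumer keeps the partial output that arrived before the failure and receives the error as ordinary text, so the model gets both pieces of information in one tool result.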

### 3. Configure the Agent to Use Middleware

This chapter continues using the `DeepAgent` introduced in Chapter 4, registering Middleware in its `Handlers` field:

```go
agent, err := deep.New(ctx, &deep.Config{
    Name:           "Ch05MiddlewareAgent",
    Description:    "ChatWithDoc agent with safe tool middleware and retry.",
    ChatModel:      cm,
    Instruction:    agentInstruction,
    Backend:        backend,
    StreamingShell: backend,
    MaxIteration:   50,
    Handlers: []adk.ChatModelAgentMiddleware{
        &safeToolMiddleware{},  // Converts Tool errors to strings
    },
    ModelRetryConfig: &adk.ModelRetryConfig{
        MaxRetries: 5,
        IsRetryAble: func(_ context.Context, err error) bool {
            return strings.Contains(err.Error(), "429") ||
                strings.Contains(err.Error(), "Too Many Requests")
        },
    },
})
```

**Note**: The `Handlers` config field and the "Middleware" concept discussed in this documentation refer to the same thing: `Handlers` is the config field name, and `ChatModelAgentMiddleware` is the interface name.

**Key code snippet** (simplified and not runnable as-is; for the complete code, see [cmd/ch05/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch05/main.go)):

```go
// SafeToolMiddleware catches Tool errors and converts them to strings
type safeToolMiddleware struct {
    *adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
    _ context.Context,
    endpoint adk.InvokableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
        result, err := endpoint(ctx, args, opts...)
        if err != nil {
            if _, ok := compose.IsInterruptRerunError(err); ok {
                return "", err
            }
            return fmt.Sprintf("[tool error] %v", err), nil
        }
        return result, nil
    }, nil
}

// Configure DeepAgent (same as Chapter 4, with Handlers and ModelRetryConfig added)
agent, _ := deep.New(ctx, &deep.Config{
    ChatModel:      cm,
    Backend:        backend,
    StreamingShell: backend,
    MaxIteration:   50,
    Handlers: []adk.ChatModelAgentMiddleware{
        &safeToolMiddleware{},
    },
    ModelRetryConfig: &adk.ModelRetryConfig{
        MaxRetries: 5,
        IsRetryAble: func(_ context.Context, err error) bool {
            return strings.Contains(err.Error(), "429")
        },
    },
})
```

## Middleware Execution Flow

```text
+------------------------------------------+
|  User: Read a nonexistent file            |
+------------------------------------------+
                   |
        +------------------------+
        |  Agent analyzes intent  |
        |  Decides to call        |
        |  read_file              |
        +------------------------+
                   |
        +------------------------+
        |  SafeToolMiddleware     |
        |  Intercepts Tool call   |
        +------------------------+
                   |
        +------------------------+
        |  Execute read_file      |
        |  Returns error          |
        +------------------------+
                   |
        +------------------------+
        |  SafeToolMiddleware     |
        |  Converts error to      |
        |  string                 |
        +------------------------+
                   |
        +------------------------+
        |  Return Tool Result     |
        |  "[tool error] ..."     |
        +------------------------+
                   |
        +------------------------+
        |  Agent generates reply  |
        |  "Sorry, the file       |
        |   doesn't exist..."     |
        +------------------------+
```

## Chapter Summary

- **Middleware**: An interceptor for the Agent that inserts custom logic before and after calls
- **SafeToolMiddleware**: Converts Tool errors to strings so the model can understand and handle them
- **ModelRetryConfig**: Configures automatic retries for ChatModel to handle temporary errors like rate limiting
- **Decorator pattern**: Middleware wraps the original call, and can modify input, output, or errors
- **Onion model**: Requests pass through Middleware from outside to inside, responses return from inside to outside

## Further Thinking

**Eino Built-in Middlewares:**

| Middleware | Description |
|------------|-------------|
| **reduction** | Tool output reduction — when tool output is too long, automatically truncates and offloads to the filesystem to prevent context overflow |
| **summarization** | Automatic conversation history summarization — when token count exceeds a threshold, automatically generates summaries to compress history |
| **skill** | Skill loading middleware — enables the Agent to dynamically load and execute predefined skills |

**Middleware chain example:**

```go
import (
    "github.com/cloudwego/eino/adk"
    "github.com/cloudwego/eino/adk/middlewares/reduction"
    "github.com/cloudwego/eino/adk/middlewares/summarization"
)

// Create reduction middleware: manages tool output length
reductionMW, _ := reduction.New(ctx, &reduction.Config{
    Backend:           filesystemBackend,     // Storage backend
    MaxLengthForTrunc: 50000,                  // Max length for single tool output
    MaxTokensForClear: 30000,                  // Token threshold to trigger cleanup
})

// Create summarization middleware: automatically compresses conversation history
summarizationMW, _ := summarization.New(ctx, &summarization.Config{
    Model: chatModel,                          // Model used to generate summaries
    Trigger: &summarization.TriggerCondition{
        ContextTokens: 190000,                 // Token threshold to trigger summarization
    },
})

// Combine multiple middlewares (conceptual example; when using DeepAgent, replace adk.NewChatModelAgent with deep.New)
agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
    Handlers: []adk.ChatModelAgentMiddleware{  // Note: config field name is Handlers, conceptually equivalent to Middlewares
        summarizationMW,   // Outermost: conversation history summarization
        reductionMW,       // Middle layer: tool output reduction
    },
})
```