Chapter 5: Middleware

The goal of this chapter is to understand the Middleware pattern and implement Tool error handling and ChatModel retry mechanisms.

Why We Need Middleware

In Chapter 4, we added Tool capabilities to the Agent, enabling it to access the filesystem. But in real-world scenarios, Tool errors and ChatModel errors are common, for example:

  • Tool errors: File not found, parameter errors, insufficient permissions, etc.
  • ChatModel errors: API rate limiting (429), network timeouts, service unavailable, etc.

Problem 1: Tool Errors Interrupt the Entire Flow

When a Tool execution fails, the error propagates directly to the Agent, causing the entire conversation to be interrupted:

[tool call] read_file(file_path: "nonexistent.txt")
Error: open nonexistent.txt: no such file or directory
// Conversation interrupted, user needs to start over

Problem 2: Model Calls May Fail Due to Rate Limiting

When the model API returns a 429 (Too Many Requests) error, the entire conversation is also interrupted:

Error: rate limit exceeded (429)
// Conversation interrupted

Expected Behavior

These errors should not directly terminate the Agent flow. Instead, the error information should be passed to the model so it can self-correct and proceed to the next round. For example:

[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist. Let me list the files in the current directory first...
[tool call] glob(pattern: "*")

The Role of Middleware

The Middleware pattern can extend the behavior of Tools and ChatModel, making it ideal for solving this problem:

  • Middleware is an interceptor for the Agent: Inserts custom logic before and after calls
  • Middleware can handle errors: Converts errors into a format the model can understand
  • Middleware can implement retries: Automatically retries failed operations
  • Middleware is composable: Multiple Middlewares can be chained together

Simple analogy:

  • Agent = "business logic"
  • Middleware = "AOP aspects" (logging, retry, error handling, and other cross-cutting concerns)
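The interceptor idea can be sketched with plain function types. This is a minimal illustration using only the standard library, not Eino's API: Endpoint, Middleware, trimArgs, tagResult and echo are hypothetical names introduced for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// Endpoint is a hypothetical stand-in for a tool call: arguments in, result out.
type Endpoint func(args string) (string, error)

// Middleware takes an Endpoint and returns a wrapped Endpoint.
type Middleware func(next Endpoint) Endpoint

// trimArgs runs logic before the call: it normalizes the input.
func trimArgs(next Endpoint) Endpoint {
	return func(args string) (string, error) {
		return next(strings.TrimSpace(args))
	}
}

// tagResult runs logic after the call: it annotates a successful output.
func tagResult(next Endpoint) Endpoint {
	return func(args string) (string, error) {
		out, err := next(args)
		if err != nil {
			return "", err
		}
		return "[checked] " + out, nil
	}
}

// echo plays the role of the "business logic" being wrapped.
func echo(args string) (string, error) {
	return "echo:" + args, nil
}

// demoIntercept wraps echo with both middlewares and calls the result.
func demoIntercept() string {
	wrapped := trimArgs(tagResult(echo))
	out, err := wrapped("  hello  ")
	if err != nil {
		return "error: " + err.Error()
	}
	return out
}

func main() {
	fmt.Println(demoIntercept()) // prints "[checked] echo:hello"
}
```

The business logic in echo stays unchanged; each cross-cutting concern lives in its own wrapper and can be added or removed independently.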

Code Location

  • Entry code: cmd/ch05/main.go (in the examples/quickstart/chatwitheino directory)

Prerequisites

Same as Chapter 1: you need to configure an available ChatModel (OpenAI or Ark). Additionally, you need to set PROJECT_ROOT as in Chapter 4:

export PROJECT_ROOT=/path/to/eino  # Eino core library root directory

Running

In the examples/quickstart/chatwitheino directory, run:

# Set the project root directory (the Eino core library root, as in Prerequisites)
export PROJECT_ROOT=/path/to/eino

go run ./cmd/ch05

Output example:

you> List the files in the current directory
[assistant] Let me list the files for you...
[tool call] list_files(directory: ".")

you> Read a nonexistent file
[assistant] Trying to read the file...
[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist...

Key Concepts

Middleware Interface

ChatModelAgentMiddleware is the middleware interface for Agent:

type ChatModelAgentMiddleware interface {
    // BeforeAgent is called before each agent run, allowing modification of
    // the agent's instruction and tools configuration.
    BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error)

    // BeforeModelRewriteState is called before each model invocation.
    // The returned state is persisted to the agent's internal state and passed to the model.
    BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)

    // AfterModelRewriteState is called after each model invocation.
    // The input state includes the model's response as the last message.
    AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error)

    // WrapInvokableToolCall wraps a tool's synchronous execution with custom behavior.
    // This method is only called for tools that implement InvokableTool.
    WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error)

    // WrapStreamableToolCall wraps a tool's streaming execution with custom behavior.
    // This method is only called for tools that implement StreamableTool.
    WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error)

    // WrapEnhancedInvokableToolCall wraps an enhanced tool's synchronous execution.
    // This method is only called for tools that implement EnhancedInvokableTool.
    WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) (EnhancedInvokableToolCallEndpoint, error)

    // WrapEnhancedStreamableToolCall wraps an enhanced tool's streaming execution.
    // This method is only called for tools that implement EnhancedStreamableTool.
    WrapEnhancedStreamableToolCall(ctx context.Context, endpoint EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error)

    // WrapModel wraps a chat model with custom behavior.
    // This method is called at request time when the model is about to be invoked.
    WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error)
}

Design philosophy:

  • Decorator pattern: Each Middleware wraps the original call, and can modify input, output, or errors
  • Onion model: Requests pass through Middleware from outside to inside, responses return from inside to outside
  • Composable: Multiple Middlewares execute in sequence

Middleware Execution Order

Handlers (i.e., Middlewares) are wrapped in array order, forming an onion model:

Handlers: []adk.ChatModelAgentMiddleware{
    &middlewareA{},  // Outermost: wrapped first, intercepts requests first, but WrapModel takes effect last
    &middlewareB{},  // Middle layer
    &middlewareC{},  // Innermost: wrapped last
}

Execution order for Tool calls:

Request -> A.Wrap -> B.Wrap -> C.Wrap -> Actual Tool Execution -> C returns -> B returns -> A returns -> Response

Practical advice: Place safeToolMiddleware (error capture) at the innermost layer (end of the array) to ensure that interrupt errors thrown by other Middlewares can propagate outward correctly.
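
The onion ordering described above can be reproduced with a small standard-library sketch. It does not use Eino: Endpoint, Middleware, named and chain are hypothetical names, and the reverse loop in chain is just one way to make the first array element the outermost layer, not necessarily how Eino builds its chain.

```go
package main

import "fmt"

// Endpoint is a hypothetical stand-in for a tool call.
type Endpoint func(args string) string

// Middleware wraps an Endpoint with extra behavior.
type Middleware func(next Endpoint) Endpoint

// named returns a middleware that records when it is entered and exited.
func named(name string, trace *[]string) Middleware {
	return func(next Endpoint) Endpoint {
		return func(args string) string {
			*trace = append(*trace, name+" enter")
			out := next(args)
			*trace = append(*trace, name+" exit")
			return out
		}
	}
}

// chain wraps endpoint so that handlers[0] becomes the outermost layer.
func chain(endpoint Endpoint, handlers ...Middleware) Endpoint {
	for i := len(handlers) - 1; i >= 0; i-- {
		endpoint = handlers[i](endpoint)
	}
	return endpoint
}

// demoOrder runs one call through A, B and C and returns the recorded steps.
func demoOrder() []string {
	var trace []string
	tool := func(args string) string {
		trace = append(trace, "tool runs")
		return "ok"
	}
	wrapped := chain(tool, named("A", &trace), named("B", &trace), named("C", &trace))
	wrapped("{}")
	return trace
}

func main() {
	for _, step := range demoOrder() {
		fmt.Println(step)
	}
}
```

Running it prints A enter, B enter, C enter, tool runs, C exit, B exit, A exit, one per line. An error produced by the tool therefore reaches C first, which is why an error-converting layer placed last in the array sees only the tool's own errors.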

SafeToolMiddleware

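SafeToolMiddleware converts Tool errors into strings so the model can understand and handle them. The simplified version below converts every error; the full implementation in the Middleware Implementation section additionally lets interrupt errors propagate: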

type safeToolMiddleware struct {
    *adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
    _ context.Context,
    endpoint adk.InvokableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
        result, err := endpoint(ctx, args, opts...)
        if err != nil {
            // Convert error to string instead of returning an error
            return fmt.Sprintf("[tool error] %v", err), nil
        }
        return result, nil
    }, nil
}

Effect:

[tool call] read_file(file_path: "nonexistent.txt")
[tool result] [tool error] open nonexistent.txt: no such file or directory
[assistant] Sorry, the file doesn't exist, please check the file path...
// Conversation continues, the model can adjust its strategy based on the error information

ModelRetryConfig

ModelRetryConfig configures automatic retries for ChatModel:

type ModelRetryConfig struct {
    MaxRetries  int                                       // Maximum number of retries
    IsRetryAble func(ctx context.Context, err error) bool // Determines if an error is retryable
}

Usage (using DeepAgent as an example):

agent, err := deep.New(ctx, &deep.Config{
    // ...
    ModelRetryConfig: &adk.ModelRetryConfig{
        MaxRetries: 5,
        IsRetryAble: func(_ context.Context, err error) bool {
            // 429 rate limiting errors are retryable
            return strings.Contains(err.Error(), "429") ||
                strings.Contains(err.Error(), "Too Many Requests") ||
                strings.Contains(err.Error(), "qpm limit")
        },
    },
})

Retry strategy:

  • Exponential backoff: Retry intervals increase with each attempt
  • Configurable conditions: Use IsRetryAble to determine which errors are retryable
  • Automatic recovery: No user intervention needed
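
To make the retry strategy concrete, here is a minimal sketch of a retry loop with exponential backoff, written against the standard library only. It is not Eino's implementation: the retry function, its parameters, the doubling factor and the simulated model are assumptions made for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"strings"
	"time"
)

// retry calls fn until it succeeds, the error is not retryable, or maxRetries
// retries have been used. The wait doubles after each failed attempt.
func retry(ctx context.Context, maxRetries int, base time.Duration,
	isRetryable func(error) bool, fn func() (string, error)) (string, error) {
	delay := base
	for attempt := 0; ; attempt++ {
		out, err := fn()
		if err == nil {
			return out, nil
		}
		if attempt >= maxRetries || !isRetryable(err) {
			return "", err
		}
		select {
		case <-time.After(delay):
			delay *= 2 // exponential backoff
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
}

// demoRetry simulates a model that is rate limited twice before succeeding.
// It returns the final reply and the number of calls made.
func demoRetry() (string, int) {
	calls := 0
	model := func() (string, error) {
		calls++
		if calls <= 2 {
			return "", errors.New("rate limit exceeded (429)")
		}
		return "hello", nil
	}
	is429 := func(err error) bool { return strings.Contains(err.Error(), "429") }
	out, err := retry(context.Background(), 5, time.Millisecond, is429, model)
	if err != nil {
		return err.Error(), calls
	}
	return out, calls
}

func main() {
	out, calls := demoRetry()
	fmt.Printf("%s after %d calls\n", out, calls) // prints "hello after 3 calls"
}
```

The predicate plays the same role as IsRetryAble: errors it rejects are returned immediately, so non-transient failures are not retried.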

Middleware Implementation

1. Implement SafeToolMiddleware

type safeToolMiddleware struct {
    *adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
    _ context.Context,
    endpoint adk.InvokableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
        result, err := endpoint(ctx, args, opts...)
        if err != nil {
            // Don't convert interrupt errors, they need to continue propagating
            if _, ok := compose.IsInterruptRerunError(err); ok {
                return "", err
            }
            // Convert other errors to strings
            return fmt.Sprintf("[tool error] %v", err), nil
        }
        return result, nil
    }, nil
}
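
The same decision logic can be tried without Eino. In the sketch below, errInterrupt is a hypothetical sentinel standing in for an interrupt error, and errors.Is stands in for compose.IsInterruptRerunError; Endpoint and safe are likewise illustrative names.

```go
package main

import (
	"errors"
	"fmt"
)

// errInterrupt is a hypothetical sentinel standing in for an interrupt error.
var errInterrupt = errors.New("interrupt: waiting for user approval")

// Endpoint is a hypothetical stand-in for a tool call.
type Endpoint func(args string) (string, error)

// safe converts ordinary errors into result text but lets interrupts through.
func safe(next Endpoint) Endpoint {
	return func(args string) (string, error) {
		out, err := next(args)
		if err == nil {
			return out, nil
		}
		if errors.Is(err, errInterrupt) {
			return "", err // must keep propagating
		}
		return fmt.Sprintf("[tool error] %v", err), nil
	}
}

// demoSafe returns the text produced for an ordinary failure and reports
// whether a wrapped interrupt error still reaches the caller as an error.
func demoSafe() (string, bool) {
	readFile := safe(func(string) (string, error) {
		return "", errors.New("open nonexistent.txt: no such file or directory")
	})
	approve := safe(func(string) (string, error) {
		return "", fmt.Errorf("approval step: %w", errInterrupt)
	})
	text, _ := readFile("{}") // the error has been converted, so it is nil here
	_, err := approve("{}")
	return text, errors.Is(err, errInterrupt)
}

func main() {
	text, passed := demoSafe()
	fmt.Println(text)
	fmt.Println("interrupt propagated:", passed)
}
```

The ordinary failure comes back as result text with a nil error, so the conversation can continue, while the interrupt is returned unchanged.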

2. Implement Streaming Tool Error Handling

func (m *safeToolMiddleware) WrapStreamableToolCall(
    _ context.Context,
    endpoint adk.StreamableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.StreamableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (*schema.StreamReader[string], error) {
        sr, err := endpoint(ctx, args, opts...)
        if err != nil {
            if _, ok := compose.IsInterruptRerunError(err); ok {
                return nil, err
            }
            // Return a single-frame stream containing the error message
            return singleChunkReader(fmt.Sprintf("[tool error] %v", err)), nil
        }
        // Wrap the stream to catch errors within it
        return safeWrapReader(sr), nil
    }, nil
}
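
The helpers singleChunkReader and safeWrapReader are not shown in this chapter. The sketch below illustrates the idea behind them with Go channels in place of schema.StreamReader; the chunk type, singleChunk and safeWrap are hypothetical stand-ins, and the real helpers in cmd/ch05/main.go may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// chunk is one frame of a hypothetical stream: either text or a terminal error.
type chunk struct {
	text string
	err  error
}

// singleChunk returns a closed stream holding exactly one text frame.
func singleChunk(msg string) <-chan chunk {
	ch := make(chan chunk, 1)
	ch <- chunk{text: msg}
	close(ch)
	return ch
}

// safeWrap forwards frames from src. If an error frame arrives, it is replaced
// by a text frame describing the error, and the output stream ends there.
func safeWrap(src <-chan chunk) <-chan chunk {
	out := make(chan chunk)
	go func() {
		defer close(out)
		for c := range src {
			if c.err != nil {
				out <- chunk{text: fmt.Sprintf("[tool error] %v", c.err)}
				for range src {
					// Drain the rest so the producer is never left blocked.
				}
				return
			}
			out <- c
		}
	}()
	return out
}

// demoStream feeds a stream that fails midway through safeWrap.
func demoStream() []string {
	src := make(chan chunk, 3)
	src <- chunk{text: "line 1"}
	src <- chunk{err: errors.New("connection reset")}
	src <- chunk{text: "never delivered"}
	close(src)

	var got []string
	for c := range safeWrap(src) {
		got = append(got, c.text)
	}
	return got
}

func main() {
	for c := range singleChunk("[tool error] open failed") {
		fmt.Println(c.text)
	}
	for _, s := range demoStream() {
		fmt.Println(s)
	}
}
```

The consumer only ever sees text frames: content received before the failure is kept, and the failure itself arrives as a final "[tool error] ..." frame.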

3. Configure the Agent to Use Middleware

This chapter continues using the DeepAgent introduced in Chapter 4, registering Middleware in its Handlers field:

agent, err := deep.New(ctx, &deep.Config{
    Name:           "Ch05MiddlewareAgent",
    Description:    "ChatWithDoc agent with safe tool middleware and retry.",
    ChatModel:      cm,
    Instruction:    agentInstruction,
    Backend:        backend,
    StreamingShell: backend,
    MaxIteration:   50,
    Handlers: []adk.ChatModelAgentMiddleware{
        &safeToolMiddleware{},  // Converts Tool errors to strings
    },
    ModelRetryConfig: &adk.ModelRetryConfig{
        MaxRetries: 5,
        IsRetryAble: func(_ context.Context, err error) bool {
            return strings.Contains(err.Error(), "429") ||
                strings.Contains(err.Error(), "Too Many Requests")
        },
    },
})

Note: The Handlers field (in the config) and "Middleware" (the concept discussed in the documentation) refer to the same thing: Handlers is the config field name, and ChatModelAgentMiddleware is the interface name.

Key code snippet (Note: this is a simplified code snippet that cannot be run directly. For the complete code, please refer to cmd/ch05/main.go):

// SafeToolMiddleware catches Tool errors and converts them to strings
type safeToolMiddleware struct {
    *adk.BaseChatModelAgentMiddleware
}

func (m *safeToolMiddleware) WrapInvokableToolCall(
    _ context.Context,
    endpoint adk.InvokableToolCallEndpoint,
    _ *adk.ToolContext,
) (adk.InvokableToolCallEndpoint, error) {
    return func(ctx context.Context, args string, opts ...tool.Option) (string, error) {
        result, err := endpoint(ctx, args, opts...)
        if err != nil {
            if _, ok := compose.IsInterruptRerunError(err); ok {
                return "", err
            }
            return fmt.Sprintf("[tool error] %v", err), nil
        }
        return result, nil
    }, nil
}

// Configure DeepAgent (same as Chapter 4, with Handlers and ModelRetryConfig added)
agent, err := deep.New(ctx, &deep.Config{
    ChatModel:      cm,
    Backend:        backend,
    StreamingShell: backend,
    MaxIteration:   50,
    Handlers: []adk.ChatModelAgentMiddleware{
        &safeToolMiddleware{},
    },
    ModelRetryConfig: &adk.ModelRetryConfig{
        MaxRetries: 5,
        IsRetryAble: func(_ context.Context, err error) bool {
            return strings.Contains(err.Error(), "429")
        },
    },
})
if err != nil {
    // Do not discard the construction error
    log.Fatalf("create agent: %v", err)
}

Middleware Execution Flow

+------------------------------------------+
|  User: Read a nonexistent file            |
+------------------------------------------+
                   |
        +------------------------+
        |  Agent analyzes intent  |
        |  Decides to call        |
        |  read_file              |
        +------------------------+
                   |
        +------------------------+
        |  SafeToolMiddleware     |
        |  Intercepts Tool call   |
        +------------------------+
                   |
        +------------------------+
        |  Execute read_file      |
        |  Returns error          |
        +------------------------+
                   |
        +------------------------+
        |  SafeToolMiddleware     |
        |  Converts error to      |
        |  string                 |
        +------------------------+
                   |
        +------------------------+
        |  Return Tool Result     |
        |  "[tool error] ..."     |
        +------------------------+
                   |
        +------------------------+
        |  Agent generates reply  |
        |  "Sorry, the file       |
        |   doesn't exist..."     |
        +------------------------+

Chapter Summary

  • Middleware: An interceptor for the Agent that inserts custom logic before and after calls
  • SafeToolMiddleware: Converts Tool errors to strings so the model can understand and handle them
  • ModelRetryConfig: Configures automatic retries for ChatModel to handle temporary errors like rate limiting
  • Decorator pattern: Middleware wraps the original call, and can modify input, output, or errors
  • Onion model: Requests pass through Middleware from outside to inside, responses return from inside to outside

Further Thinking

Eino Built-in Middlewares:

| Middleware    | Description |
|---------------|-------------|
| reduction     | Tool output reduction: when tool output is too long, automatically truncates it and offloads it to the filesystem to prevent context overflow |
| summarization | Automatic conversation history summarization: when the token count exceeds a threshold, automatically generates summaries to compress history |
| skill         | Skill loading middleware: enables the Agent to dynamically load and execute predefined skills |

Middleware chain example:

import (
    "github.com/cloudwego/eino/adk/middlewares/reduction"
    "github.com/cloudwego/eino/adk/middlewares/summarization"
    // The skill package is not imported here: this example does not use it,
    // and Go rejects unused imports.
)

// Create reduction middleware: manages tool output length
reductionMW, _ := reduction.New(ctx, &reduction.Config{
    Backend:           filesystemBackend,     // Storage backend
    MaxLengthForTrunc: 50000,                  // Max length for single tool output
    MaxTokensForClear: 30000,                  // Token threshold to trigger cleanup
})

// Create summarization middleware: automatically compresses conversation history
summarizationMW, _ := summarization.New(ctx, &summarization.Config{
    Model: chatModel,                          // Model used to generate summaries
    Trigger: &summarization.TriggerCondition{
        ContextTokens: 190000,                 // Token threshold to trigger summarization
    },
})

// Combine multiple middlewares (conceptual example; when using DeepAgent, replace adk.NewChatModelAgent with deep.New)
agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
    Handlers: []adk.ChatModelAgentMiddleware{  // Note: config field name is Handlers, conceptually equivalent to Middlewares
        summarizationMW,   // Outermost: conversation history summarization
        reductionMW,       // Middle layer: tool output reduction
    },
})