---
title: "Chapter 4: Tool and Filesystem Access"
---
The goal of this chapter is to add Tool capabilities to the Agent, enabling it to access the filesystem.
## Why We Need Tools
In the first three chapters, the Agent we implemented could only have conversations — it couldn't perform actual operations.
**Agent limitations:**
- Can only generate text replies
- Cannot access external resources (files, APIs, databases, etc.)
- Cannot execute actual tasks (calculations, queries, modifications, etc.)
**The role of Tools:**
- **Tools are capability extensions for the Agent**: They enable the Agent to perform concrete operations
- **Tools encapsulate specific implementations**: The Agent doesn't care how a Tool works internally, only about its input and output
- **Tools are composable**: An Agent can have multiple Tools and choose which to call as needed
**Simple analogy:**
- **Agent** = "intelligent assistant" (can understand instructions, but needs tools to execute)
- **Tool** = "a tool in the toolbox" (file operations, network requests, database queries, etc.)
## Why We Need Filesystem Access
This example is ChatWithDoc (chat with documentation), with the goal of helping users learn the Eino framework and write Eino code. So, what is the best documentation?
**The answer is: the Eino repository's code itself.**
- **Code**: Source code shows the framework's actual implementation
- **Comments**: Code comments provide design rationale and usage instructions
- **Examples**: Example code demonstrates best practices
With filesystem access capabilities, the Agent can directly read Eino source code, comments, and examples, providing users with the most accurate and up-to-date technical support.
## Key Concepts
### Tool Interface
`Tool` is the interface in Eino that defines executable capabilities:
```go
// BaseTool provides tool metadata that ChatModel uses to decide whether and how to call the tool
type BaseTool interface {
	Info(ctx context.Context) (*schema.ToolInfo, error)
}

// InvokableTool is a tool that can be executed by ToolsNode
type InvokableTool interface {
	BaseTool
	// InvokableRun executes the tool; arguments are a JSON-encoded string, returns a string result
	InvokableRun(ctx context.Context, argumentsInJSON string, opts ...Option) (string, error)
}

// StreamableTool is the streaming variant of InvokableTool
type StreamableTool interface {
	BaseTool
	// StreamableRun executes the tool in streaming mode, returns a StreamReader
	StreamableRun(ctx context.Context, argumentsInJSON string, opts ...Option) (*schema.StreamReader[string], error)
}
```
**Interface hierarchy:**
- `BaseTool`: Base interface, only provides metadata
- `InvokableTool`: Executable tool (extends BaseTool)
- `StreamableTool`: Streaming tool (extends BaseTool)
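To make the hierarchy concrete, here is a minimal sketch of a hand-written tool. It deliberately avoids importing Eino: `ToolInfo` and `InvokableTool` below are simplified local stand-ins (the variadic options are dropped and the metadata is reduced to a name and description), and `WordCountTool` and `callTool` are hypothetical names used only for illustration.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// ToolInfo is a simplified local stand-in for schema.ToolInfo (assumption:
// the real type carries more than a name and a description).
type ToolInfo struct {
	Name string
	Desc string
}

// InvokableTool mirrors the shape of the interface above, without options.
type InvokableTool interface {
	Info(ctx context.Context) (*ToolInfo, error)
	InvokableRun(ctx context.Context, argumentsInJSON string) (string, error)
}

// WordCountTool is a hypothetical tool that counts whitespace-separated words.
type WordCountTool struct{}

// Info returns the metadata a model would use to decide whether to call the tool.
func (WordCountTool) Info(ctx context.Context) (*ToolInfo, error) {
	return &ToolInfo{Name: "word_count", Desc: "Count the words in a text"}, nil
}

// InvokableRun decodes the JSON arguments, does the work, and returns a string.
func (WordCountTool) InvokableRun(ctx context.Context, argumentsInJSON string) (string, error) {
	var args struct {
		Text string `json:"text"`
	}
	if err := json.Unmarshal([]byte(argumentsInJSON), &args); err != nil {
		return "", fmt.Errorf("invalid arguments: %w", err)
	}
	return strconv.Itoa(len(strings.Fields(args.Text))), nil
}

// callTool runs a tool with a background context.
func callTool(t InvokableTool, argumentsInJSON string) (string, error) {
	return t.InvokableRun(context.Background(), argumentsInJSON)
}

func main() {
	out, err := callTool(WordCountTool{}, `{"text": "read the Eino source"}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

The caller only sees a JSON string going in and a string coming out, which is what lets the Agent treat every Tool the same way regardless of its internals.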
### Backend Interface
`Backend` is an abstract interface in Eino for filesystem operations:
```go
type Backend interface {
	// List file information in a directory
	LsInfo(ctx context.Context, req *LsInfoRequest) ([]FileInfo, error)
	// Read file content, supports line offset and limits
	Read(ctx context.Context, req *ReadRequest) (*FileContent, error)
	// Search for matching content in files
	GrepRaw(ctx context.Context, req *GrepRequest) ([]GrepMatch, error)
	// Match files based on glob patterns
	GlobInfo(ctx context.Context, req *GlobInfoRequest) ([]FileInfo, error)
	// Write file content
	Write(ctx context.Context, req *WriteRequest) error
	// Edit file content (string replacement)
	Edit(ctx context.Context, req *EditRequest) error
}
```
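The comment on `Read` mentions a line offset and a limit. The sketch below shows one plausible reading of those semantics using only the standard library. The function name `readLines`, the zero-based offset, and the "limit <= 0 means no limit" rule are assumptions for illustration, not the documented behavior of `Backend.Read`.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// readLines returns up to limit lines of content, skipping the first offset
// lines. A limit <= 0 means "no limit". These semantics are an assumption.
func readLines(content string, offset, limit int) ([]string, error) {
	var lines []string
	scanner := bufio.NewScanner(strings.NewReader(content))
	for i := 0; scanner.Scan(); i++ {
		if i < offset {
			continue // still before the requested window
		}
		if limit > 0 && len(lines) >= limit {
			break // window is full
		}
		lines = append(lines, scanner.Text())
	}
	return lines, scanner.Err()
}

func main() {
	content := "package main\n\nimport \"fmt\"\n\nfunc main() {}\n"
	lines, err := readLines(content, 2, 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(strings.Join(lines, "\n"))
}
```

Reading a window rather than the whole file keeps large source files from flooding the model's context.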
### LocalBackend
`LocalBackend` is the local filesystem implementation of Backend, directly accessing the operating system's filesystem:
```go
import localbk "github.com/cloudwego/eino-ext/adk/backend/local"
backend, err := localbk.NewBackend(ctx, &localbk.Config{})
```
**Features:**
- Directly accesses the local filesystem, implemented using Go standard library
- Supports all Backend interface methods
- Supports executing shell commands (ExecuteStreaming)
- Path safety: Requires absolute paths to prevent directory traversal attacks
- Zero configuration: Works out of the box with no additional setup
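The path-safety point can be illustrated with a small standard-library check. This is a sketch of the general technique, not LocalBackend's actual implementation: the helper `resolveInRoot` and the idea of confining paths to a single root directory are assumptions, and the example paths assume a Unix-style filesystem.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// resolveInRoot checks that p is an absolute path located inside root and
// returns its cleaned form. It does not resolve symlinks.
func resolveInRoot(root, p string) (string, error) {
	if !filepath.IsAbs(p) {
		return "", fmt.Errorf("path %q must be absolute", p)
	}
	cleaned := filepath.Clean(p) // collapses "." and ".." segments
	rel, err := filepath.Rel(filepath.Clean(root), cleaned)
	if err != nil {
		return "", err
	}
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("path %q escapes root %q", p, root)
	}
	return cleaned, nil
}

func main() {
	for _, p := range []string{
		"/srv/eino/adk/../compose/graph.go",
		"/srv/eino/../secret",
		"main.go",
	} {
		resolved, err := resolveInRoot("/srv/eino", p)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("ok:", resolved)
	}
}
```

Requiring absolute paths removes any dependence on the process's working directory, and cleaning the path before the comparison is what defeats `..` segments.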
## Implementation: Using DeepAgent
This chapter uses the DeepAgent prebuilt Agent, which provides first-class configuration for Backend and StreamingShell, making it easy to register filesystem-related tools.
### From ChatModelAgent to DeepAgent: When to Switch?
Previous chapters used `ChatModelAgent`, which can handle multi-turn conversations. But to access the filesystem, we need to switch to `DeepAgent`.
**ChatModelAgent vs DeepAgent comparison:**
| Capability | ChatModelAgent | DeepAgent |
|-----------|----------------|-----------|
| Multi-turn conversation | Yes | Yes |
| Add custom Tools | Yes, manually register each Tool | Yes, manual or automatic registration |
| Filesystem access (Backend) | Not built in; all file tools must be created and registered manually | Yes, first-class config, auto-registered |
| Command execution (StreamingShell) | Not built in; must be created manually | Yes, first-class config, auto-registered |
| Built-in task management | No | Yes, `write_todos` tool |
| Sub-Agent support | No | Yes |
**Selection guide:**
- Pure conversation scenarios (no external access) -> Use `ChatModelAgent`
- Need filesystem access or command execution -> Use `DeepAgent`
### Why Use DeepAgent?
Compared to using `ChatModelAgent` directly, DeepAgent offers these advantages:
1. **First-class configuration**: Backend and StreamingShell are first-class config options, so you only need to pass them in
2. **Automatic tool registration**: Configuring a Backend automatically registers filesystem tools, no manual creation needed
3. **Built-in task management**: Provides the `write_todos` tool for task planning and tracking
4. **Sub-Agent support**: Can configure specialized sub-Agents for specific tasks
5. **Broader capabilities**: Combines filesystem access, command execution, task management, and sub-Agents in a single Agent
### Code Implementation
```go
import (
	"log"

	localbk "github.com/cloudwego/eino-ext/adk/backend/local"
	"github.com/cloudwego/eino/adk/prebuilt/deep"
)

// Create LocalBackend
backend, err := localbk.NewBackend(ctx, &localbk.Config{})
if err != nil {
	log.Fatalf("failed to create backend: %v", err)
}

// Create DeepAgent, automatically registers filesystem tools.
// cm and instruction are the ChatModel and system prompt, assumed to be set up as in earlier chapters.
agent, err := deep.New(ctx, &deep.Config{
	Name:           "Ch04ToolAgent",
	Description:    "ChatWithDoc agent with filesystem access via LocalBackend.",
	ChatModel:      cm,
	Instruction:    instruction,
	Backend:        backend, // Provides filesystem operation capabilities
	StreamingShell: backend, // Provides command execution capabilities
	MaxIteration:   50,
})
if err != nil {
	log.Fatalf("failed to create agent: %v", err)
}
```
### Tools Automatically Registered by DeepAgent
When `Backend` and `StreamingShell` are configured, DeepAgent automatically registers the following tools:
- `read_file`: Read file content
- `write_file`: Write file content
- `edit_file`: Edit file content
- `glob`: Find files based on glob patterns
- `grep`: Search for content in files
- `execute`: Execute shell commands
## Code Location
- Entry code: [cmd/ch04/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch04/main.go)
## Prerequisites
Same as Chapter 1: you need to configure an available ChatModel (OpenAI or Ark).
This chapter also reads the `PROJECT_ROOT` environment variable. Setting it is optional (see the running instructions below).
## Running
In the `examples/quickstart/chatwitheino` directory, run:
```bash
# Optional: Set the root directory path for the Eino core library
# When not set, the Agent defaults to using the current working directory (the chatwitheino directory) as root
# To let the Agent search the complete Eino codebase, point this to the eino core library root
export PROJECT_ROOT=/path/to/eino
# Verify the path is correct (you should see directories like adk, components, compose, etc.)
ls $PROJECT_ROOT
go run ./cmd/ch04
```
**`PROJECT_ROOT` explanation:**
- **When not set**: The root defaults to the current working directory (the `chatwitheino` directory), so the Agent can only access files in this example project. This is sufficient for quick experimentation.
- **When set**: Points to the Eino core library root directory, and the Agent can search the complete Eino framework codebase (core library, extension library, examples library). This is the full ChatWithEino usage scenario.
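The fallback rule described above can be sketched in a few lines. This is an illustration of the described behavior, not the code in `cmd/ch04/main.go`: the helper name `projectRoot` and the injected `getenv` parameter (used here only to keep the function easy to test) are assumptions.

```go
package main

import (
	"fmt"
	"os"
)

// projectRoot returns the value of PROJECT_ROOT, or fallback when the
// variable is unset or empty.
func projectRoot(getenv func(string) string, fallback string) string {
	if root := getenv("PROJECT_ROOT"); root != "" {
		return root
	}
	return fallback
}

func main() {
	cwd, err := os.Getwd()
	if err != nil {
		fmt.Println("cannot determine working directory:", err)
		return
	}
	fmt.Println("agent root:", projectRoot(os.Getenv, cwd))
}
```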
**Recommended three-repository directory structure (for the full experience):**
```
eino/                       # PROJECT_ROOT (Eino core library)
├── adk/
├── components/
├── compose/
├── ext/                    # eino-ext (extension components, e.g., OpenAI, Ark implementations)
├── examples/               # eino-examples (this repository, where this example is located)
│   └── quickstart/
│       └── chatwitheino/
└── ...
```
You can use the `dev_setup.sh` script to automatically set up the above directory structure:
```bash
# Run in the eino root directory to automatically clone extension and example repos to the correct locations
bash scripts/dev_setup.sh
```
Output example:
```text
you> List the files in the current directory
[assistant] Let me list the files in the current directory...
[tool call] glob(pattern: "*")
[tool result] Found 5 files:
- main.go
- go.mod
- go.sum
- README.md
- cmd/
you> Read the contents of main.go
[assistant] Let me read the main.go file...
[tool call] read_file(file_path: "main.go")
[tool result] File contents:
...
```
**Note:** A Tool error during execution (for example, invalid parameters or a file that does not exist) may cause the Agent to stop. This is expected at this stage: such errors are common, and the next chapter covers in detail how to handle them gracefully.
## Tool Call Flow
When the Agent needs to call a Tool:
```
+------------------------------------------+
| User: List the files in the current dir  |
+------------------------------------------+
                     |
                     v
         +------------------------+
         | Agent analyzes intent  |
         | Decides to call glob   |
         +------------------------+
                     |
                     v
         +------------------------+
         | Generate Tool Call     |
         | {"pattern": "*"}       |
         +------------------------+
                     |
                     v
         +------------------------+
         | Execute Tool           |
         | glob("*")              |
         +------------------------+
                     |
                     v
         +------------------------+
         | Return Tool Result     |
         | {"files": [...]}       |
         +------------------------+
                     |
                     v
         +------------------------+
         | Agent generates reply  |
         | "Found 5 files..."     |
         +------------------------+
```
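The "Execute Tool" step in the flow above amounts to looking up a tool by name and passing it the JSON arguments the model produced. The following is a simplified, standard-library sketch of that dispatch, not Eino's ToolsNode: `ToolCall`, `toolFunc`, `globTool`, and `dispatch` are hypothetical names, and the glob tool matches against a fixed in-memory file list instead of a real filesystem.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path"
	"strings"
)

// ToolCall is a hypothetical, simplified form of a model-issued tool call.
type ToolCall struct {
	Name      string
	Arguments string // JSON-encoded arguments
}

// toolFunc executes a tool given JSON arguments and returns a string result.
type toolFunc func(argumentsInJSON string) (string, error)

// globTool returns a tool that matches a pattern against a fixed file list.
func globTool(files []string) toolFunc {
	return func(argumentsInJSON string) (string, error) {
		var args struct {
			Pattern string `json:"pattern"`
		}
		if err := json.Unmarshal([]byte(argumentsInJSON), &args); err != nil {
			return "", fmt.Errorf("invalid arguments: %w", err)
		}
		var matched []string
		for _, f := range files {
			ok, err := path.Match(args.Pattern, f)
			if err != nil {
				return "", err
			}
			if ok {
				matched = append(matched, f)
			}
		}
		return strings.Join(matched, "\n"), nil
	}
}

// dispatch looks up the requested tool in the registry and runs it.
func dispatch(registry map[string]toolFunc, call ToolCall) (string, error) {
	run, ok := registry[call.Name]
	if !ok {
		return "", fmt.Errorf("unknown tool %q", call.Name)
	}
	return run(call.Arguments)
}

func main() {
	registry := map[string]toolFunc{
		"glob": globTool([]string{"main.go", "go.mod", "go.sum", "README.md"}),
	}
	result, err := dispatch(registry, ToolCall{Name: "glob", Arguments: `{"pattern": "*.go"}`})
	if err != nil {
		fmt.Println("tool error:", err)
		return
	}
	fmt.Println(result)
}
```

In the real flow the string result is handed back to the model, which uses it to write the final reply.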
## Chapter Summary
- **Tool**: Capability extensions for the Agent, enabling it to perform concrete operations
- **Backend**: Abstract interface for filesystem operations, providing unified file operation capabilities
- **LocalBackend**: Local filesystem implementation of Backend, directly accessing the OS filesystem
- **DeepAgent**: A prebuilt advanced Agent with first-class configuration for Backend and StreamingShell
- **Automatic tool registration**: Configuring a Backend automatically registers filesystem tools
- **Tool call flow**: Agent analyzes intent -> Generates Tool Call -> Executes Tool -> Returns result -> Generates reply
## Further Thinking
**Other Tool types:**
- HTTP Tool: Call external APIs
- Database Tool: Query databases
- Calculator Tool: Perform calculations
- Code Executor Tool: Run code
**Other Backend implementations:**
- Other storage backends can be implemented based on the Backend interface
- For example: cloud storage, database storage, etc.
- LocalBackend already provides complete filesystem operation capabilities
**Custom Tool creation:**
If you need to create custom Tools, you can use `utils.InferTool` to build a Tool automatically from a Go function. See:
- [Tool interface documentation](https://github.com/cloudwego/eino/tree/main/components/tool)
- [Tool creation examples](https://github.com/cloudwego/eino-examples/tree/main/components/tool)
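The idea behind building a Tool from a function can be sketched with generics: wrap a typed function so that it accepts JSON arguments and returns a string, matching the shape of `InvokableRun`. This is not the implementation of `utils.InferTool`, and it skips the tool metadata a real helper would also need to produce. The names `wrap`, `AddArgs`, `AddResult`, `add`, and `runAdd` are hypothetical.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
)

// wrap adapts a typed function into a JSON-string-in, string-out function.
func wrap[T, D any](fn func(ctx context.Context, in T) (D, error)) func(ctx context.Context, argumentsInJSON string) (string, error) {
	return func(ctx context.Context, argumentsInJSON string) (string, error) {
		var in T
		if err := json.Unmarshal([]byte(argumentsInJSON), &in); err != nil {
			return "", fmt.Errorf("invalid arguments: %w", err)
		}
		out, err := fn(ctx, in)
		if err != nil {
			return "", err
		}
		encoded, err := json.Marshal(out)
		if err != nil {
			return "", err
		}
		return string(encoded), nil
	}
}

// AddArgs and AddResult are the typed input and output of the example function.
type AddArgs struct {
	A int `json:"a"`
	B int `json:"b"`
}

type AddResult struct {
	Sum int `json:"sum"`
}

// add is an ordinary typed function with no knowledge of JSON.
func add(ctx context.Context, in AddArgs) (AddResult, error) {
	return AddResult{Sum: in.A + in.B}, nil
}

// runAdd calls the wrapped function with a background context.
func runAdd(argumentsInJSON string) (string, error) {
	return wrap(add)(context.Background(), argumentsInJSON)
}

func main() {
	out, err := runAdd(`{"a": 2, "b": 3}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

The benefit of this style is that tool authors write plain typed Go functions and leave the JSON decoding and encoding to a single shared adapter.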