
By 廖万里 · 12 hours ago · AI

# Getting Started with AI Agent Development

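Preface

AI agents are one of the most active areas in AI today. Unlike the traditional question-and-answer style of AI interaction, an agent can plan autonomously, call tools, and reason over multiple steps, which lets it complete complex tasks. This article starts from scratch: it walks through the core concepts behind AI agents and then builds a working agent system in code.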

---

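Chapter 1: What Is an AI Agent

1.1 From Chatbot to Agent

How a traditional chatbot works:

User asks → model generates an answer → done

How an AI agent works:

User asks → understand the task → plan the steps → act (call tools) →
observe the result → continue or adjust → final answer

Key differences:

| Feature | Chatbot | Agent |
|------|---------|-------|
| Interaction mode | One reply per request | Autonomous multi-step execution |
| Capability boundary | Text generation only | Can call external tools |
| Decision-making | None | Can plan and reflect |
| Task scope | Simple Q&A | Complex multi-step tasks |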
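The agent workflow above can be sketched as a small loop. This is only an illustration under stated assumptions: `scripted_model` is a hypothetical stand-in for an LLM, and the `add_one` tool is made up so the loop runs without any API key.

```python
from typing import Callable

# Hypothetical tool table, for illustration only.
TOOLS: dict[str, Callable[[str], str]] = {
    "add_one": lambda x: str(int(x) + 1),
}


def scripted_model(history: list[str]) -> dict:
    """Stand-in for an LLM: request a tool once, then answer from the observation."""
    observations = [h for h in history if h.startswith("observation:")]
    if not observations:
        return {"type": "tool", "name": "add_one", "input": "41"}
    return {"type": "answer", "text": observations[-1].split(":", 1)[1].strip()}


def run_agent(question: str, model=scripted_model, max_steps: int = 5) -> str:
    """Act, observe, and repeat until the model answers or the step budget runs out."""
    history = [f"user: {question}"]
    for _ in range(max_steps):
        decision = model(history)
        if decision["type"] == "answer":
            return decision["text"]
        # Act: run the requested tool, then feed the observation back.
        result = TOOLS[decision["name"]](decision["input"])
        history.append(f"observation: {result}")
    return "step limit reached"
```

A chatbot corresponds to calling the model once and returning its text; the loop and the step budget are what turn it into an agent.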

1.2 Core Components of an Agent

A complete AI agent usually includes the following components:

from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable


class AgentState(Enum):
    """Agent lifecycle states."""
    IDLE = "idle"
    PLANNING = "planning"
    EXECUTING = "executing"
    REFLECTING = "reflecting"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class Tool:
    """Tool definition."""
    name: str
    description: str
    function: Callable
    parameters: dict  # JSON Schema


@dataclass
class AgentMemory:
    """Agent memory."""
    short_term: list  # short-term memory (current conversation)
    long_term: list   # long-term memory (past experience)
    working: dict     # working memory (state of the current task)


@dataclass
class AgentPlan:
    """Task plan."""
    goal: str
    steps: list[dict]
    current_step: int
    results: list[Any]
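One way to make the state enum do some work is to guard transitions between states. The sketch below redefines `AgentState` so it runs on its own; the `ALLOWED` table and the `transition` helper are assumptions for illustration, since the article does not say which transitions are legal.

```python
from enum import Enum


class AgentState(Enum):
    IDLE = "idle"
    PLANNING = "planning"
    EXECUTING = "executing"
    REFLECTING = "reflecting"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transition rules (not specified by the article).
ALLOWED = {
    AgentState.IDLE: {AgentState.PLANNING},
    AgentState.PLANNING: {AgentState.EXECUTING, AgentState.FAILED},
    AgentState.EXECUTING: {AgentState.REFLECTING, AgentState.COMPLETED, AgentState.FAILED},
    AgentState.REFLECTING: {AgentState.PLANNING, AgentState.EXECUTING,
                            AgentState.COMPLETED, AgentState.FAILED},
    AgentState.COMPLETED: {AgentState.IDLE},
    AgentState.FAILED: {AgentState.IDLE},
}


def transition(current: AgentState, new: AgentState) -> AgentState:
    """Return the new state, or raise if the move is not in the table."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {new.value}")
    return new
```

Centralizing the check like this makes an unexpected jump (for example, straight from idle to completed) fail loudly instead of silently corrupting the agent's state.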

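1.3 Comparing Mainstream Agent Frameworks

| Framework | Characteristics | Best suited for |
|------|------|----------|
| LangChain | Rich ecosystem, component-based design | Rapid prototyping |
| AutoGPT | Highly autonomous, task decomposition | Automated task execution |
| CrewAI | Multi-agent collaboration | Team-style collaborative workflows |
| Custom framework | Flexible and fully under your control | Bespoke requirements |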

---

Chapter 2: Implementing Core Agent Capabilities

2.1 The Tool-Calling System

Tool calling is the agent's core capability; it lets the model interact with the outside world:

import json
from typing import Any, Callable
from pydantic import BaseModel, Field


class ToolDefinition(BaseModel):
    """Tool definition model."""
    name: str = Field(..., description="Tool name")
    description: str = Field(..., description="Tool description")
    parameters: dict = Field(default_factory=dict, description="Parameter definitions")


class ToolRegistry:
    """Tool registry."""

    def __init__(self):
        self._tools: dict[str, ToolDefinition] = {}
        self._functions: dict[str, Callable] = {}

    def register(self, name: str, description: str, parameters: dict):
        """Decorator that registers a function as a tool."""
        def decorator(func: Callable) -> Callable:
            self._tools[name] = ToolDefinition(
                name=name,
                description=description,
                parameters=parameters
            )
            self._functions[name] = func
            return func
        return decorator

    def get_tool_schema(self) -> list[dict]:
        """Return tool schemas in the Anthropic Messages API format (`input_schema`)."""
        return [
            {
                "name": tool.name,
                "description": tool.description,
                "input_schema": {
                    "type": "object",
                    "properties": tool.parameters,
                    # Simplification: every declared parameter is treated as required
                    "required": list(tool.parameters.keys())
                }
            }
            for tool in self._tools.values()
        ]

    def execute(self, name: str, **kwargs) -> Any:
        """Execute a tool by name."""
        if name not in self._functions:
            raise ValueError(f"Tool '{name}' does not exist")
        return self._functions[name](**kwargs)


# Usage example
registry = ToolRegistry()


@registry.register(
    name="search_web",
    description="Search the web for information",
    parameters={
        "query": {
            "type": "string",
            "description": "Search keywords"
        }
    }
)
def search_web(query: str) -> str:
    """Simulated web search."""
    # In a real project, call an actual search API here
    search_results = {
        "Python": "Python is a general-purpose programming language...",
        "AI Agent": "An AI agent is an AI system capable of autonomous decision-making..."
    }
    return search_results.get(query, f"No information found for '{query}'")


@registry.register(
    name="execute_code",
    description="Execute Python code",
    parameters={
        "code": {
            "type": "string",
            "description": "The Python code to execute"
        }
    }
)
def execute_code(code: str) -> str:
    """Run code with builtins stripped. This is not a real sandbox; production needs proper isolation."""
    try:
        local_vars = {}
        exec(code, {"__builtins__": {}}, local_vars)
        return str(local_vars.get("result", "Code executed successfully"))
    except Exception as e:
        return f"Execution error: {e}"
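Writing the `parameters` dict by hand duplicates what the function signature already says. As a rough sketch, the schema could be derived with `inspect`; the `_JSON_TYPES` mapping and the `schema_from_signature` helper are assumptions for illustration and only cover a few primitive annotations.

```python
import inspect
from typing import Callable

# Assumed mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def schema_from_signature(func: Callable) -> dict:
    """Build an `input_schema`-style dict from a function's parameters."""
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string".
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        # Only parameters without a default are marked as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}


def summarize(text: str, focus: str = "main points") -> str:
    """Hypothetical tool used only to demonstrate the helper."""
    return f"{focus}: {text[:20]}"
```

Unlike the registry above, this marks `focus` as optional because it has a default, which matches how the function can actually be called.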

2.2 The Task Planner

The planner breaks a complex task down into executable steps:

import json
import re
import anthropic


class TaskPlanner:
    """Task planner."""

    def __init__(self, client: anthropic.Anthropic, model: str = "claude-3-sonnet-20240229"):
        self.client = client
        self.model = model

    def plan(self, task: str, available_tools: list[str]) -> dict:
        """
        Plan the steps needed to carry out a task.

        Returns:
            {
                "goal": "the task goal",
                "steps": [
                    {"step": 1, "action": "search_web", "params": {"query": "..."}},
                    {"step": 2, "action": "analyze", "reason": "..."},
                    ...
                ]
            }
        """
        prompt = f"""
You are a task-planning expert. Break the following task down into concrete steps.

Task: {task}

Available tools: {', '.join(available_tools)}

Output the plan as JSON in the following format:
{{
    "goal": "description of the task goal",
    "steps": [
        {{
            "step": 1,
            "action": "tool name or reasoning",
            "params": {{"parameter name": "parameter value"}},
            "expected_output": "expected output"
        }}
    ]
}}

Notes:
1. Steps must be concrete and executable
2. Make sensible use of the available tools
3. Break complex tasks down appropriately
4. Every step must have a clear purpose
"""
        response = self.client.messages.create(
            model=self.model,
            max_tokens=2048,
            messages=[{"role": "user", "content": prompt}]
        )

        # Parse the JSON response
        try:
            content = response.content[0].text
            # Extract the JSON portion
            json_match = re.search(r'\{[\s\S]*\}', content)
            if json_match:
                return json.loads(json_match.group())
        except json.JSONDecodeError:
            pass

        # Fall back to a single reasoning step if no valid plan was returned
        return {"goal": task, "steps": [{"step": 1, "action": "reasoning", "params": {}}]}
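Parsing JSON out of model text only shows that the output is syntactically valid; it does not show that the plan is usable. A minimal sketch of an extra validation pass follows, assuming a plan is acceptable only if every step's `action` is an available tool or the literal `reasoning`. The `parse_plan` name is hypothetical.

```python
import json
import re
from typing import Optional


def parse_plan(text: str, available_tools: list[str]) -> Optional[dict]:
    """Extract a plan from model output; return None if it is unusable."""
    match = re.search(r"\{[\s\S]*\}", text)
    if not match:
        return None
    try:
        plan = json.loads(match.group())
    except json.JSONDecodeError:
        return None
    steps = plan.get("steps") if isinstance(plan, dict) else None
    if not isinstance(steps, list) or not steps:
        return None
    # Reject plans that reference tools the agent does not have.
    allowed = set(available_tools) | {"reasoning"}
    for step in steps:
        if not isinstance(step, dict) or step.get("action") not in allowed:
            return None
    return plan
```

Returning `None` lets the caller decide whether to re-prompt the model or fall back to a single reasoning step.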

2.3 The Memory System

The memory system lets the agent keep context and accumulate experience:

from collections import deque
from datetime import datetime
from typing import Any
import pickle
import os


class AgentMemorySystem:
    """Agent memory system."""

    def __init__(self, max_short_term: int = 10, storage_path: str = "./memory"):
        self.max_short_term = max_short_term
        self.storage_path = storage_path

        # Short-term memory (sliding window)
        self.short_term: deque = deque(maxlen=max_short_term)

        # Working memory (current task)
        self.working_memory: dict = {}

        # Long-term memory (persisted to disk)
        self.long_term: list = []

        os.makedirs(storage_path, exist_ok=True)
        self._load_long_term_memory()

    def add_short_term(self, role: str, content: str):
        """Add a short-term memory entry."""
        self.short_term.append({
            "role": role,
            "content": content,
            "timestamp": datetime.now().isoformat()
        })

    def set_working(self, key: str, value: Any):
        """Set a working-memory value."""
        self.working_memory[key] = value

    def get_working(self, key: str) -> Any:
        """Get a working-memory value."""
        return self.working_memory.get(key)

    def add_long_term(self, experience: dict):
        """Add a long-term memory entry."""
        experience["timestamp"] = datetime.now().isoformat()
        self.long_term.append(experience)
        self._save_long_term_memory()

    def recall_relevant(self, query: str, top_k: int = 3) -> list:
        """Recall relevant memories (simplified; use a vector database in production)."""
        # Simple substring matching
        relevant = []
        for memory in self.long_term:
            if query.lower() in str(memory).lower():
                relevant.append(memory)
        return relevant[:top_k]

    def get_context_for_llm(self) -> str:
        """Build the context string passed to the LLM."""
        context_parts = []

        # Short-term memory
        if self.short_term:
            context_parts.append("## Recent conversation")
            for msg in self.short_term:
                context_parts.append(f"{msg['role']}: {msg['content']}")

        # Working memory
        if self.working_memory:
            context_parts.append("\n## Current task state")
            for key, value in self.working_memory.items():
                context_parts.append(f"- {key}: {value}")

        return "\n".join(context_parts)

    def _save_long_term_memory(self):
        """Persist long-term memory."""
        with open(os.path.join(self.storage_path, "long_term.pkl"), 'wb') as f:
            pickle.dump(self.long_term, f)

    def _load_long_term_memory(self):
        """Load long-term memory."""
        path = os.path.join(self.storage_path, "long_term.pkl")
        if os.path.exists(path):
            # Only load pickle files you wrote yourself; unpickling untrusted data can run arbitrary code
            with open(path, 'rb') as f:
                self.long_term = pickle.load(f)
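Substring matching only finds a memory when the whole query appears verbatim in it. A slightly more forgiving sketch ranks memories by how many query words they share. It assumes each memory dict has a `content` field (a hypothetical key) and that the text is whitespace-delimited, so it would need a proper tokenizer for Chinese.

```python
def recall_by_overlap(memories: list[dict], query: str, top_k: int = 3) -> list[dict]:
    """Rank memories by the number of words they share with the query."""
    query_words = set(query.lower().split())
    scored = []
    for memory in memories:
        words = set(str(memory.get("content", "")).lower().split())
        score = len(query_words & words)
        if score > 0:
            scored.append((score, memory))
    # Highest overlap first; ties keep insertion order because sort is stable.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [memory for _, memory in scored[:top_k]]
```

This is still keyword matching, not semantic search; it is a stepping stone before moving to embeddings and a vector store.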

---

Chapter 3: Building a Complete Agent from Scratch

3.1 Implementing the Core Agent Class

import anthropic
from typing import Optional
import json


class IntelligentAgent:
    """Core agent implementation."""

    def __init__(
        self,
        name: str = "Assistant",
        model: str = "claude-3-sonnet-20240229",
        api_key: Optional[str] = None
    ):
        self.name = name
        self.model = model
        self.client = anthropic.Anthropic(api_key=api_key)

        # Initialize components
        self.tools = ToolRegistry()
        self.memory = AgentMemorySystem()
        self.planner = TaskPlanner(self.client, model)

        # Register built-in tools
        self._register_builtin_tools()

        # Agent state
        self.state = AgentState.IDLE
        self.max_iterations = 10

    def _register_builtin_tools(self):
        """Register built-in tools."""
        @self.tools.register(
            name="think",
            description="Think about the next action; used for complex reasoning",
            parameters={
                "thought": {
                    "type": "string",
                    "description": "The current thought"
                },
                "conclusion": {
                    "type": "string",
                    "description": "The conclusion reached"
                }
            }
        )
        def think(thought: str, conclusion: str) -> str:
            return f"Thought: {thought}\nConclusion: {conclusion}"

    def register_tool(self, name: str, description: str, parameters: dict):
        """Convenience method for registering a custom tool."""
        return self.tools.register(name, description, parameters)

    def process_tool_calls(self, content_blocks: list) -> list:
        """Execute every tool_use block and collect the results."""
        results = []
        for block in content_blocks:
            if block.type == "tool_use":
                tool_name = block.name
                tool_input = block.input

                print(f"[tool call] {tool_name}({tool_input})")

                try:
                    result = self.tools.execute(tool_name, **tool_input)
                    results.append({
                        "tool_use_id": block.id,
                        "name": tool_name,
                        "result": result
                    })
                except Exception as e:
                    results.append({
                        "tool_use_id": block.id,
                        "name": tool_name,
                        "result": f"Error: {e}"
                    })
        return results

    def run(self, user_input: str) -> str:
        """
        Run the agent on a user request.

        This is the agent's main loop:
        1. Receive the user input
        2. Plan the task
        3. Execute tool calls
        4. Reflect and adjust
        5. Return the final result
        """
        self.state = AgentState.PLANNING
        self.memory.add_short_term("user", user_input)

        # Build the messages
        messages = [{
            "role": "user",
            "content": f"""
You are {self.name}, an intelligent assistant.

Available tools: {[t.name for t in self.tools._tools.values()]}

Memory context:
{self.memory.get_context_for_llm()}

User request: {user_input}

Analyze the task, decide whether tools are needed, and then carry it out.
"""
        }]

        final_response = ""

        for _ in range(self.max_iterations):
            self.state = AgentState.EXECUTING

            # Call the LLM
            response = self.client.messages.create(
                model=self.model,
                max_tokens=4096,
                tools=self.tools.get_tool_schema(),
                messages=messages
            )

            # Check for tool calls
            tool_calls = [b for b in response.content if getattr(b, "type", None) == "tool_use"]

            if not tool_calls:
                # No tool calls: this is the final answer
                self.state = AgentState.COMPLETED
                # Join all text blocks; the first block is not guaranteed to be text
                final_response = "".join(
                    b.text for b in response.content if getattr(b, "type", None) == "text"
                )
                break

            # Handle the tool calls
            tool_results = self.process_tool_calls(response.content)

            # Feed the results back to the model
            messages.append({
                "role": "assistant",
                "content": response.content
            })
            messages.append({
                "role": "user",
                "content": [
                    {
                        "type": "tool_result",
                        "tool_use_id": r["tool_use_id"],
                        "content": str(r["result"])
                    }
                    for r in tool_results
                ]
            })
        else:
            # The loop used up its iterations without producing a final answer
            self.state = AgentState.FAILED
            final_response = "The task hit the iteration limit. Please simplify the request or ask step by step."

        # Record the result
        self.memory.add_short_term("assistant", final_response)

        return final_response
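The loop has to separate text from tool requests in each response, and that logic can be exercised offline. The sketch below assumes content blocks expose `type`, `text`, `name`, `input`, and `id` attributes, as the loop above reads them; the `SimpleNamespace` objects are fakes for testing, not SDK types, and `split_blocks` is a hypothetical helper.

```python
from types import SimpleNamespace


def split_blocks(content: list) -> tuple[str, list]:
    """Return (joined text, tool_use blocks) from a response's content list."""
    text_parts, tool_calls = [], []
    for block in content:
        kind = getattr(block, "type", None)
        if kind == "text":
            text_parts.append(block.text)
        elif kind == "tool_use":
            tool_calls.append(block)
    return "".join(text_parts), tool_calls


# Fake blocks shaped like the objects the loop reads.
fake_content = [
    SimpleNamespace(type="text", text="Let me look that up."),
    SimpleNamespace(type="tool_use", id="call_1", name="search_web",
                    input={"query": "AI Agent"}),
]
```

Keeping this logic in a pure function makes it easy to unit test the loop's branching without calling the API.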

3.2 Hands-On: Building a Research Assistant Agent

# A complete research assistant example

import json

import anthropic
import requests
from bs4 import BeautifulSoup


class ResearchAgent(IntelligentAgent):
    """Research assistant agent."""

    def __init__(self, **kwargs):
        super().__init__(name="Research Assistant", **kwargs)
        self._setup_research_tools()

    def _setup_research_tools(self):
        """Register the research tools."""

        @self.tools.register(
            name="web_search",
            description="Search the web for information",
            parameters={
                "query": {
                    "type": "string",
                    "description": "Search keywords"
                }
            }
        )
        def web_search(query: str) -> str:
            """Run a web search (example implementation)."""
            # In a real project, use an actual search API (such as Serper or SerpAPI)
            # Mock data is used here for demonstration
            mock_results = {
                "AI Agent": [
                    {"title": "AI Agent Getting-Started Guide", "url": "https://example.com/agent-guide", "snippet": "..."},
                    {"title": "Building Intelligent Agents", "url": "https://example.com/build-agent", "snippet": "..."}
                ]
            }

            for key in mock_results:
                if key.lower() in query.lower():
                    return json.dumps(mock_results[key], ensure_ascii=False)

            return json.dumps(
                [{"title": f"Search: {query}", "url": "https://example.com", "snippet": "Sample result"}],
                ensure_ascii=False
            )

        @self.tools.register(
            name="fetch_webpage",
            description="Fetch the content of a web page",
            parameters={
                "url": {
                    "type": "string",
                    "description": "Web page URL"
                }
            }
        )
        def fetch_webpage(url: str) -> str:
            """Fetch a page and return its text."""
            try:
                headers = {'User-Agent': 'Mozilla/5.0 (Research Bot)'}
                response = requests.get(url, headers=headers, timeout=10)
                soup = BeautifulSoup(response.text, 'html.parser')

                # Extract the body text
                text = soup.get_text(separator='\n', strip=True)
                return text[:3000]  # cap the length
            except Exception as e:
                return f"Fetch failed: {e}"

        @self.tools.register(
            name="summarize",
            description="Summarize a piece of text",
            parameters={
                "text": {
                    "type": "string",
                    "description": "The text to summarize"
                },
                "focus": {
                    "type": "string",
                    "description": "What the summary should focus on"
                }
            }
        )
        def summarize(text: str, focus: str = "the main points") -> str:
            """Summarize text."""
            # Call the LLM to produce the summary
            summary_prompt = f"Summarize the following content, focusing on {focus}:\n\n{text[:2000]}"

            response = self.client.messages.create(
                model=self.model,
                max_tokens=500,
                messages=[{"role": "user", "content": summary_prompt}]
            )

            return response.content[0].text


# Usage example
if __name__ == "__main__":
    agent = ResearchAgent()

    # Run a research task
    result = agent.run("Please research the current state of AI agents and give me a brief report")
    print(result)
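If you would rather not depend on `bs4` for the page-fetching tool, visible text can be extracted with the standard library alone. This is a rough sketch that only skips `<script>` and `<style>` contents; `VisibleTextParser` and `visible_text` are hypothetical names, and real pages may need more careful handling.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text while skipping <script> and <style> contents."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def visible_text(html: str, limit: int = 3000) -> str:
    """Return newline-joined visible text, truncated to `limit` characters."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)[:limit]
```

The `limit` default mirrors the 3000-character cap used by `fetch_webpage`, so the result stays small enough to hand back to the model.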

---

Chapter 4: Advanced Agent Patterns

4.1 The ReAct Pattern (Reasoning + Acting)

ReAct is a classic agent pattern that alternates between reasoning and acting:

import re


class ReActAgent:
    """ReAct-style agent."""
    
    def __init__(self, client, tools: ToolRegistry):
        self.client = client
        self.tools = tools
    
    def run(self, question: str) -> str:
        """The ReAct loop."""
        prompt = f"""
        Answer the question: {question}
        
        Available tools: {', '.join(self.tools._tools)}
        
        Use the following format:
        Thought: think about the next step
        Action: tool_name[argument]
        Observation: the result returned by the tool
        ... (repeat until you have the answer)
        Thought: I now know the answer
        Answer: the final answer
        """
        
        messages = [{"role": "user", "content": prompt}]
        
        for _ in range(5):  # at most 5 rounds
            response = self.client.messages.create(
                model="claude-3-sonnet-20240229",
                max_tokens=1024,
                messages=messages
            )
            
            content = response.content[0].text
            
            # Parse the Action line and run the tool
            action_match = re.search(r'Action: (\w+)\[(.*?)\]', content)
            if action_match:
                tool_name = action_match.group(1)
                params = action_match.group(2)
                
                # Assumes every tool takes a single `query` argument
                try:
                    result = self.tools.execute(tool_name, query=params)
                except Exception as e:
                    # Report the failure to the model instead of crashing the loop
                    result = f"Error: {e}"
                
                messages.append({"role": "assistant", "content": content})
                messages.append({"role": "user", "content": f"Observation: {result}"})
                continue
            
            # No further Action: return the answer
            if "Answer:" in content:
                return content.split("Answer:")[-1].strip()
        
        return "Unable to complete the reasoning"
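With a text-only ReAct prompt, the model can keep writing past its `Action:` line and invent its own `Observation:` and `Answer:`. One mitigation is to discard everything after the first `Observation:` before parsing. The `parse_react_step` helper below is a hypothetical sketch of that idea; the Messages API also accepts a `stop_sequences` parameter, which could cut generation at the same point.

```python
import re
from typing import Optional

ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*?)\]")


def parse_react_step(content: str) -> tuple[str, Optional[tuple[str, str]]]:
    """Return (trimmed text, (tool, argument)) or (trimmed text, None)."""
    # Drop anything the model wrote after its own "Observation:" line.
    trimmed = content.split("Observation:", 1)[0].rstrip()
    match = ACTION_RE.search(trimmed)
    if match:
        return trimmed, (match.group(1), match.group(2).strip())
    return trimmed, None
```

The trimmed text, rather than the raw output, is what should be appended to the message history so that the fabricated observation never reaches the next round.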

4.2 Multi-Agent Collaboration

Several agents can divide the work and collaborate on a complex task:

import json
import re
from typing import Protocol

import anthropic


class AgentRole(Protocol):
    """Agent role protocol."""
    name: str
    expertise: str

    def process(self, task: str) -> str: ...


class MultiAgentSystem:
    """Multi-agent collaboration system."""

    def __init__(self, coordinator_model: str = "claude-3-opus-20240229"):
        self.agents: dict[str, AgentRole] = {}
        self.coordinator_model = coordinator_model
        self.client = anthropic.Anthropic()

    def add_agent(self, agent: AgentRole):
        """Add an expert agent."""
        self.agents[agent.name] = agent

    def solve(self, task: str) -> str:
        """Coordinate several agents to solve a problem."""
        # 1. Analyze the task and assign it to suitable agents
        assignment = self._assign_tasks(task)

        # 2. Each agent does its part
        results = {}
        for agent_name, subtask in assignment.items():
            if agent_name in self.agents:
                results[agent_name] = self.agents[agent_name].process(subtask)

        # 3. Combine the results
        return self._synthesize(results)

    def _assign_tasks(self, task: str) -> dict:
        """Assign subtasks."""
        agent_list = [f"- {name}: {agent.expertise}" for name, agent in self.agents.items()]

        prompt = f"""
Task: {task}

Available experts:
{chr(10).join(agent_list)}

Assign the task to suitable experts and output JSON:
{{"expert name": "subtask description"}}
"""
        response = self.client.messages.create(
            model=self.coordinator_model,
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}]
        )

        try:
            json_match = re.search(r'\{[\s\S]*\}', response.content[0].text)
            return json.loads(json_match.group()) if json_match else {}
        except (json.JSONDecodeError, IndexError, AttributeError):
            # Malformed or empty response: assign nothing
            return {}

    def _synthesize(self, results: dict) -> str:
        """Combine the agents' results."""
        prompt = "Combine the following expert opinions into a final answer:\n\n"
        for name, result in results.items():
            prompt += f"[{name}]\n{result}\n\n"

        response = self.client.messages.create(
            model=self.coordinator_model,
            max_tokens=2048,
            messages=[{"role": "user", "content": prompt}]
        )

        return response.content[0].text


# Concrete expert roles
class CodeExpert:
    name = "Code Expert"
    expertise = "programming, code review, debugging"

    def __init__(self, client):
        self.client = client

    def process(self, task: str) -> str:
        response = self.client.messages.create(
            model="claude-3-sonnet-20240229",
            max_tokens=1024,
            system="You are a code expert who focuses on programming questions.",
            messages=[{"role": "user", "content": task}]
        )
        return response.content[0].text


class ResearchExpert:
    name = "Research Expert"
    expertise = "information search, data analysis, report writing"

    def __init__(self, client):
        self.client = client

    def process(self, task: str) -> str:
        response = self.client.messages.create(
            model="claude-3-sonnet-20240229",
            max_tokens=1024,
            system="You are a research expert who focuses on gathering and analyzing information.",
            messages=[{"role": "user", "content": task}]
        )
        return response.content[0].text
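The experts run one after another in `solve`, although their subtasks are independent and mostly wait on network calls. A minimal sketch of running them concurrently with a thread pool follows; `EchoExpert` is a hypothetical stand-in that avoids real API calls, and `run_parallel` is an assumed helper rather than part of the system above.

```python
from concurrent.futures import ThreadPoolExecutor


class EchoExpert:
    """Hypothetical stand-in for an expert that would normally call an LLM."""

    def __init__(self, name: str):
        self.name = name
        self.expertise = "demo"

    def process(self, task: str) -> str:
        return f"{self.name} handled: {task}"


def run_parallel(agents: dict, assignment: dict[str, str],
                 max_workers: int = 4) -> dict[str, str]:
    """Run each assigned subtask on its agent concurrently."""
    # Ignore assignments that name an unknown agent.
    known = {name: task for name, task in assignment.items() if name in agents}
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(agents[name].process, task)
                   for name, task in known.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:
                # One failing expert should not sink the whole run.
                results[name] = f"failed: {exc}"
    return results
```

Threads suit this case because the work is I/O-bound; a failure in one expert is recorded as a string so the synthesis step can still run on the rest.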

---

Chapter 5: Production Best Practices

5.1 Error Handling and Retries

import time
from functools import wraps

import anthropic


def retry_with_backoff(max_retries: int = 3, backoff_factor: float = 2.0):
    """Retry decorator with exponential backoff."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except anthropic.APIError:
                    # RateLimitError is a subclass of APIError, so this covers both
                    if attempt == max_retries - 1:
                        raise
                    time.sleep(backoff_factor ** attempt)
        return wrapper
    return decorator


class RobustAgent(IntelligentAgent):
    """Fault-tolerant agent."""

    @retry_with_backoff(max_retries=3)
    def _run_with_retry(self, user_input: str) -> str:
        # API errors propagate from here so the decorator can retry them
        return super().run(user_input)

    def run(self, user_input: str) -> str:
        try:
            return self._run_with_retry(user_input)
        except Exception as e:
            self.state = AgentState.FAILED
            return f"Execution failed: {e}. Please try again later."
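Fixed exponential delays make many clients retry at the same moments. A common refinement is to add random jitter. The sketch below is a generic variant, not tied to any SDK: `retry_with_jitter`, its `retry_on` default, and the injectable `sleep` parameter (there so tests do not actually wait) are all assumptions for illustration.

```python
import random
import time
from functools import wraps


def retry_with_jitter(max_retries: int = 3, base_delay: float = 1.0,
                      retry_on: tuple = (ConnectionError, TimeoutError),
                      sleep=time.sleep):
    """Retry on the given exceptions, waiting a random time up to an exponential cap."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    if attempt == max_retries - 1:
                        raise
                    # Full jitter: uniform in [0, base_delay * 2**attempt].
                    sleep(random.uniform(0, base_delay * 2 ** attempt))
        return wrapper
    return decorator
```

Exceptions outside `retry_on` propagate immediately, which keeps programming errors from being retried as if they were transient failures.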

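5.2 Security Considerations

import json
from typing import Any


class SecurityError(Exception):
    """Raised when a tool call is rejected for safety reasons."""


class SecureAgent(IntelligentAgent):
    """Agent with basic safety checks."""
    
    # Allowlist of tools that may be executed
    ALLOWED_TOOLS = {"web_search", "think", "summarize"}
    
    # Blocklist of dangerous patterns
    BLOCKED_PATTERNS = [
        "rm -rf",
        "DROP TABLE",
        "exec(",
        "__import__",
        "eval(",
    ]
    
    def validate_tool_input(self, tool_name: str, params: dict) -> bool:
        """Check that a tool call is safe to run."""
        if tool_name not in self.ALLOWED_TOOLS:
            return False
        
        # Check the parameters for dangerous patterns
        params_str = json.dumps(params, ensure_ascii=False)
        for pattern in self.BLOCKED_PATTERNS:
            if pattern in params_str:
                return False
        
        return True
    
    def execute_tool(self, name: str, **kwargs) -> Any:
        """Validate a tool call, then run it through the registry."""
        if not self.validate_tool_input(name, kwargs):
            raise SecurityError(f"Tool '{name}' or its parameters are not allowed")
        
        # IntelligentAgent has no execute_tool method, so call the registry directly
        return self.tools.execute(name, **kwargs)
    
    def process_tool_calls(self, content_blocks: list) -> list:
        """Route every tool call through execute_tool so validation always runs."""
        results = []
        for block in content_blocks:
            if block.type != "tool_use":
                continue
            try:
                result = self.execute_tool(block.name, **block.input)
            except Exception as e:
                result = f"Error: {e}"
            results.append({"tool_use_id": block.id, "name": block.name, "result": result})
        return results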
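Substring blocklists are easy to bypass, so tools that take structured input benefit from structured checks. As one example, a tool like `fetch_webpage` could validate its URL before making a request. The sketch below assumes a policy of allowing only `http`/`https` and rejecting loopback, private, and link-local IP literals; the `is_safe_url` name and the policy itself are illustrative, and it does not resolve DNS, so a hostname pointing at an internal address would still pass.

```python
import ipaddress
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}
BLOCKED_HOSTS = {"localhost"}


def is_safe_url(url: str) -> bool:
    """Return True if the URL passes a basic scheme and host check."""
    try:
        parsed = urlparse(url)
        host = parsed.hostname
    except ValueError:
        return False
    if parsed.scheme not in ALLOWED_SCHEMES or not host:
        return False
    host = host.lower()
    if host in BLOCKED_HOSTS:
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A hostname rather than an IP literal; DNS is not checked here.
        return True
    return not (ip.is_private or ip.is_loopback or ip.is_link_local)
```

A check like this belongs inside the tool itself, so it still applies if the tool is ever called outside the agent's validation path.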

5.3 Monitoring and Logging

import logging
from datetime import datetime
import json


class MonitoredAgent(IntelligentAgent):
    """Agent with monitoring."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.logger = self._setup_logger()
        # tool_calls and total_tokens are declared here but not updated by this class
        self.metrics = {
            "total_requests": 0,
            "tool_calls": 0,
            "errors": 0,
            "total_tokens": 0
        }

    def _setup_logger(self):
        logger = logging.getLogger(f"agent.{self.name}")
        logger.setLevel(logging.INFO)

        # Avoid attaching a second handler when several agents share a name
        if not logger.handlers:
            handler = logging.FileHandler(f"agent_{self.name}.log")
            handler.setFormatter(logging.Formatter(
                '%(asctime)s - %(levelname)s - %(message)s'
            ))
            logger.addHandler(handler)

        return logger

    def run(self, user_input: str) -> str:
        self.metrics["total_requests"] += 1
        start_time = datetime.now()

        self.logger.info(f"Started processing: {user_input[:100]}...")

        try:
            result = super().run(user_input)

            duration = (datetime.now() - start_time).total_seconds()
            self.logger.info(f"Finished processing in {duration:.2f}s")

            return result

        except Exception as e:
            self.metrics["errors"] += 1
            self.logger.error(f"Processing failed: {e}")
            raise

    def get_metrics(self) -> dict:
        """Return a copy of the performance metrics."""
        return self.metrics.copy()
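The `total_tokens` metric needs a source. Responses from the Anthropic Messages API carry a `usage` object with `input_tokens` and `output_tokens`, so a small accumulator can be fed each response. The `UsageTracker` class below is a hypothetical sketch, and the `SimpleNamespace` object is a fake response used only so the example runs offline.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class UsageTracker:
    """Accumulate request and token counts across LLM calls."""
    requests: int = 0
    input_tokens: int = 0
    output_tokens: int = 0

    def record(self, response) -> None:
        """Add one response's usage; tolerate responses without usage data."""
        self.requests += 1
        usage = getattr(response, "usage", None)
        if usage is not None:
            self.input_tokens += getattr(usage, "input_tokens", 0)
            self.output_tokens += getattr(usage, "output_tokens", 0)

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


# Fake response with the same shape as the fields read above.
fake_response = SimpleNamespace(
    usage=SimpleNamespace(input_tokens=120, output_tokens=30)
)
```

To wire this in, the agent would call `record` on each response inside its loop, which means the metric has to be collected where the API call happens rather than in the `run` wrapper.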

---

Chapter 6: Agent Development Tools and Resources

6.1 Debugging Tools

from datetime import datetime
from typing import Any


class AgentDebugger:
    """Agent debugger."""
    
    def __init__(self, agent: IntelligentAgent):
        self.agent = agent
        self.trace: list[dict] = []
    
    def trace_tool_call(self, name: str, params: dict, result: Any):
        """Record a tool call."""
        self.trace.append({
            "type": "tool_call",
            "name": name,
            "params": params,
            "result": str(result)[:500],  # truncate
            "timestamp": datetime.now().isoformat()
        })
    
    def trace_llm_call(self, messages: list, response: str):
        """Record an LLM call."""
        self.trace.append({
            "type": "llm_call",
            "messages_count": len(messages),
            "response_preview": response[:200],
            "timestamp": datetime.now().isoformat()
        })
    
    def get_trace_report(self) -> str:
        """Generate a trace report."""
        report = ["# Agent Execution Trace Report\n"]
        
        for i, entry in enumerate(self.trace, 1):
            report.append(f"## Step {i}: {entry['type']}")
            for key, value in entry.items():
                if key != "type":
                    report.append(f"- {key}: {value}")
            report.append("")
        
        return "\n".join(report)
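The debugger only records what it is told, so something has to call `trace_tool_call`. One low-friction option is to wrap each tool function so that tracing happens automatically. The `traced` helper below is a hypothetical sketch that appends entries in the same shape as the debugger's tool-call records.

```python
from datetime import datetime
from functools import wraps
from typing import Any, Callable


def traced(func: Callable, trace: list, name: str = "") -> Callable:
    """Wrap a tool so every call appends an entry to `trace`."""
    label = name or func.__name__

    @wraps(func)
    def wrapper(**kwargs) -> Any:
        entry = {
            "type": "tool_call",
            "name": label,
            "params": kwargs,
            "timestamp": datetime.now().isoformat(),
        }
        try:
            result = func(**kwargs)
            entry["result"] = str(result)[:500]  # truncate long results
            return result
        except Exception as exc:
            entry["result"] = f"error: {exc}"
            raise
        finally:
            # Record the call whether it succeeded or failed.
            trace.append(entry)

    return wrapper
```

Because failures are recorded before the exception propagates, the trace shows the failing call even when the agent loop aborts.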

6.2 Recommended Learning Resources

Official documentation:

  • Anthropic API documentation: https://docs.anthropic.com
  • OpenAI Agents guide: https://platform.openai.com/docs/agents

Open-source frameworks:

  • LangChain: https://github.com/langchain-ai/langchain
  • CrewAI: https://github.com/joaomdmoura/crewAI
  • AutoGPT: https://github.com/Significant-Gravitas/AutoGPT

Academic papers:

  • ReAct: Synergizing Reasoning and Acting in Language Models
  • Toolformer: Language Models Can Teach Themselves to Use Tools
  • Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

---

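Conclusion

AI agents represent a new paradigm for AI applications: from passive response to active execution, and from a single capability to a combination of tools. Learning to build agents means being able to build AI applications that do real work.

Key takeaways:

1. Understand the essence: Agent = LLM + tool calling + planning + memory
2. Design in components: tool registration, task planning, and memory management each have their own job
3. Execute iteratively: the act-observe-adjust loop is the agent's core pattern
4. Put safety first: production systems must account for security restrictions and error handling
5. Keep optimizing: use monitoring and debugging to keep improving the agent

The future is already here, so go build your first agent!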

---

*This article runs to about 4,800 characters in the original Chinese.*

Permalink: https://www.kkkliao.cn/?id=649 (reproduction requires permission)


Copyright notice: this article was published by 廖万里's blog. Please credit the source when reposting.

