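Method Templates - QitOS

QitOS provides method templates: ready-made Agent + Critic combinations that implement well-known agent reasoning patterns. Each template packages a dedicated state, critic, and agent, so you can apply the pattern without rewriting the control loop.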

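What is a method template?

A method template consists of the following components:

Component	Role
State	A `StateSchema` subclass holding pattern-specific fields (reflection records, drafts, refinement counts, and so on)
Critic	A `Critic` subclass that drives the pattern loop by returning `retry`, `continue`, or `stop` at the appropriate moments
Agent	An `AgentModule` subclass with pattern-aware `build_system_prompt()`, `prepare()`, and `reduce()`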

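The critic is the core: it evaluates every step and decides whether to retry (with an instruction patch), continue, or stop. The agent enriches its prompt with pattern context (earlier reflections, draft versions, critiques).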
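The control flow described above can be sketched without QitOS at all. The snippet below is a minimal, standalone illustration rather than the library's engine: `Verdict`, `run_loop`, `toy_step`, and `toy_critic` are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Verdict:
    """Hypothetical stand-in for a critic's result (not the QitOS class)."""
    action: str                              # "retry", "continue", or "stop"
    instruction_patch: Optional[str] = None  # extra guidance for the next step


def run_loop(step: Callable[[List[str]], str],
             critic: Callable[[str], Verdict],
             max_steps: int) -> List[str]:
    """Run steps until the critic says stop or the step budget runs out."""
    hints: List[str] = []    # instruction patches carried into later steps
    outputs: List[str] = []
    for _ in range(max_steps):
        output = step(hints)
        outputs.append(output)
        verdict = critic(output)
        if verdict.action == "stop":
            break
        if verdict.action == "retry" and verdict.instruction_patch:
            hints.append(verdict.instruction_patch)
        # "continue" simply moves on to the next step
    return outputs


def toy_step(hints: List[str]) -> str:
    # A real agent would call an LLM; here the draft version tracks the hints.
    return "draft v%d" % (len(hints) + 1)


def toy_critic(output: str) -> Verdict:
    # Accept the third draft, ask for a retry before that.
    if output.endswith("v3"):
        return Verdict("stop")
    return Verdict("retry", "Be more specific.")


outputs = run_loop(toy_step, toy_critic, max_steps=10)
print(outputs)
```

The point of the sketch is that the pattern logic lives in the critic, while the loop itself stays generic.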

Self-Refine

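The Self-Refine pattern (Madaan et al. 2023) iterates through generate → critique → refine until the quality reaches a threshold or the maximum number of refinements is reached.

When to use

Text generation tasks where quality matters more than speed
Summarization, translation, and code generation with self-evaluation
Any task where iterative improvement reliably produces better output

Quick start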

from qitos.recipes.self_refine import SelfRefineAgent, SelfRefineCritic

agent = SelfRefineAgent(llm=my_llm)
result = agent.run(
    task="Write a concise abstract for a research paper on ...",
    critics=[SelfRefineCritic(max_refinements=3, quality_threshold=0.7)],
    max_steps=10,
    return_state=True,
)
print(result.state.draft)        # the refined output
print(result.state.final_result) # set if a FINAL ANSWER was extracted

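How it works

Generate: The agent produces an initial draft.
Critique: SelfRefineCritic scores the draft with a heuristic (longer drafts that have gone through more refinements score higher; very short drafts are penalized). In production, replace it with an LLM-based scorer.
Refine: If the score is below quality_threshold and refinements remain, the critic returns retry with an instruction_patch asking the agent to improve. The agent sees the previous draft and critique in its next prompt.
Accept: When the score reaches the threshold or max_refinements is reached, the critic returns continue or stop.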
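To make the critique and accept steps concrete, here is a rough standalone sketch of a length-and-refinement heuristic. The weights, the 5-word and 100-word cutoffs, and the names `heuristic_score` and `decide` are assumptions chosen for illustration; they're not the values or functions used by SelfRefineCritic.

```python
def heuristic_score(draft: str, refinement_count: int) -> float:
    """Toy quality score in [0.0, 1.0]; all constants here are assumptions."""
    words = len(draft.split())
    if words < 5:
        return 0.1  # very short drafts are penalized
    length_part = min(words / 100.0, 1.0) * 0.6           # longer drafts score higher
    refine_part = min(refinement_count, 3) / 3.0 * 0.4    # so do refined drafts
    return round(length_part + refine_part, 3)


def decide(draft: str, refinement_count: int,
           max_refinements: int = 3, quality_threshold: float = 0.7) -> str:
    """Return "accept" when the draft is good enough or the budget is spent."""
    score = heuristic_score(draft, refinement_count)
    if score >= quality_threshold or refinement_count >= max_refinements:
        return "accept"
    return "retry"


long_draft = " ".join(["word"] * 100)
print(decide("too short", 0))   # below threshold, budget remains
print(decide(long_draft, 1))    # long draft with one refinement
print(decide("too short", 3))   # budget exhausted
```

A heuristic like this rewards length rather than quality, which is why the text recommends swapping in an LLM-based scorer for production use.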

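SelfRefineCritic parameters

Parameter	Default	Description
`max_refinements`	3	Maximum number of refinement iterations
`quality_threshold`	0.7	Minimum score for accepting a draft (0.0–1.0)

SelfRefineState fields

Field	Type	Description
`draft`	`str`	Current draft text
`refinement_count`	`int`	Number of refinements performed so far
`max_refinements`	`int`	Refinement budget
`critique_history`	`List[str]`	Accumulated critiques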

Reflexion

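The Reflexion pattern (Shinn et al. 2023) iterates through act → evaluate → reflect → retry with memory. On failure, the critic generates a verbal reflection that is stored in the state and injected into future prompts.

When to use

Debugging and error-correction tasks where the agent needs to learn from failures
Tasks that should try a different strategy after a failure
Multi-attempt problems (coding, reasoning) where reflection improves later attempts

Quick start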

from qitos.recipes.reflexion import ReflexionAgent, ReflexionCritic

agent = ReflexionAgent(llm=my_llm)
result = agent.run(
    task="Debug the failing test in tests/test_module.py",
    critics=[ReflexionCritic(max_reflections=3)],
    max_steps=15,
    return_state=True,
)
print(result.state.reflections)    # list of verbal reflections
print(result.state.final_result)  # set if a FINAL ANSWER was extracted

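How it works

Act: The agent takes actions to complete the task.
Evaluate: ReflexionCritic checks for failure (errors, non-zero return codes, empty results).
Reflect: On failure, the critic generates a verbal reflection from the error context and returns retry, with the reflection as the instruction_patch. The reflection is stored in state.reflections.
Retry with memory: The agent's build_system_prompt() includes all earlier reflections, so the LLM can avoid repeating the same mistakes.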
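The evaluate and retry-with-memory steps can be sketched as two small functions. This is a standalone illustration, not the ReflexionCritic or ReflexionAgent code: the `result` dictionary layout (`error`, `returncode`, `output` keys) and the prompt wording are assumptions.

```python
from typing import Any, Dict, List


def is_failure(result: Dict[str, Any]) -> bool:
    """Treat errors, non-zero return codes, and empty output as failures."""
    if result.get("error"):
        return True
    if result.get("returncode", 0) != 0:
        return True
    return not result.get("output")


def build_system_prompt(base: str, reflections: List[str]) -> str:
    """Append every stored reflection so the model can avoid earlier mistakes."""
    if not reflections:
        return base
    numbered = ["%d. %s" % (i, text) for i, text in enumerate(reflections, start=1)]
    return base + "\n\nReflections from earlier failed attempts:\n" + "\n".join(numbered)


reflections: List[str] = []
attempt = {"output": "", "returncode": 1, "error": "AssertionError in test_parse"}
if is_failure(attempt):
    # A real critic would ask an LLM to write this reflection.
    reflections.append("The parser fix did not handle empty input; check that case first.")

prompt = build_system_prompt("You are a debugging agent.", reflections)
print(prompt)
```

Because the reflections are appended to the system prompt on every step, the memory grows with each failure, which is one reason a reflection budget such as max_reflections is useful.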

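ReflexionCritic parameters

Parameter	Default	Description
`max_reflections`	3	Maximum number of reflection iterations before a forced stop
`success_threshold`	0.6	Minimum score for a trajectory to count as successful

ReflexionState fields

Field	Type	Description
`reflections`	`List[str]`	Accumulated verbal reflections
`reflection_count`	`int`	Number of reflections generated
`max_reflections`	`int`	Reflection budget
`last_action_success`	`bool`	Whether the previous step succeeded
`attempt`	`int`	Current attempt number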

LATS

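The LATS pattern (Zhou et al. 2023) applies Monte Carlo tree search to language agents: select → expand → evaluate → backpropagate. Failed trajectories generate reflections that steer future exploration away from similar mistakes.

When to use

Tasks that require systematic exploration of multiple solution paths
Logic puzzles, programming challenges, and multi-step reasoning
Problems where trying different strategies raises the success rate

Quick start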

from qitos.recipes.lats import LATSAgent, LATSCritic

agent = LATSAgent(llm=my_llm)
result = agent.run(
    task="Solve the logic puzzle ...",
    critics=[LATSCritic(max_simulations=5, exploration_weight=1.41)],
    max_steps=20,
    return_state=True,
)
print(result.state.best_answer)    # best answer found
print(result.state.best_reward)    # best reward score

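LATSCritic parameters

Parameter	Default	Description
`max_simulations`	5	Maximum number of simulation iterations
`exploration_weight`	1.41	UCB1 exploration constant (c)
`success_threshold`	0.8	Reward threshold for early stopping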
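For readers unfamiliar with the exploration constant, the sketch below shows the standard UCB1 score, mean reward plus c times the square root of ln(parent visits) / child visits, and how it drives child selection. It is a generic illustration, not the LATSCritic implementation; the function names and the `(total_reward, visits)` tuple layout are assumptions.

```python
import math
from typing import Dict, Tuple


def ucb1(total_reward: float, visits: int, parent_visits: int,
         exploration_weight: float = 1.41) -> float:
    """Standard UCB1 score; unvisited children are always tried first."""
    if visits == 0:
        return math.inf
    exploit = total_reward / visits
    explore = exploration_weight * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_child(children: Dict[str, Tuple[float, int]], parent_visits: int,
                 exploration_weight: float = 1.41) -> str:
    """Pick the child name with the highest UCB1 score."""
    return max(
        children,
        key=lambda name: ucb1(children[name][0], children[name][1],
                              parent_visits, exploration_weight),
    )


# name -> (total_reward, visits)
children = {"a": (2.4, 4), "b": (0.5, 1)}
greedy = select_child(children, parent_visits=5, exploration_weight=0.0)
balanced = select_child(children, parent_visits=5, exploration_weight=1.41)
print(greedy, balanced)
```

With the weight at 0 the search is purely greedy and keeps choosing the child with the best average reward; raising it shifts selection toward rarely visited children.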

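LATSState fields

Field	Type	Description
`simulations_done`	`int`	Number of simulations completed
`max_simulations`	`int`	Simulation budget
`best_reward`	`float`	Best reward so far
`best_answer`	`str`	Answer from the best trajectory
`reflections`	`List[str]`	Reflections on failed paths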

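MoA (Mixture of Agents)

The MoA pattern (Wang et al. 2024) runs multiple proposers independently and synthesizes their outputs: propose → aggregate. Diversity among proposals improves quality even when individual proposers are weak.

When to use

Tasks that benefit from diverse perspectives or creative responses
Analysis, evaluation, and synthesis problems
Improving quality through ensemble reasoning

Quick start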

from qitos.recipes.moa import MoAOrchestrator, MoACritic

agent = MoAOrchestrator(llm=my_llm)
result = agent.run(
    task="Evaluate the system architecture design of ...",
    critics=[MoACritic(proposer_count=3, max_rounds=1)],
    max_steps=15,
    return_state=True,
)
print(result.state.synthesis)  # the aggregated answer

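MoACritic parameters

Parameter	Default	Description
`proposer_count`	3	Expected number of proposals
`max_rounds`	1	Maximum number of propose-aggregate rounds
`quality_threshold`	0.6	Minimum score for accepting the synthesis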
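The propose → aggregate flow can be pictured with a single round of plain functions. This is a conceptual sketch only, not how MoAOrchestrator is built: `mixture_of_agents`, the toy proposers, and `toy_aggregator` are hypothetical, and a real setup would back each of them with an LLM call.

```python
from typing import Callable, List

Proposer = Callable[[str], str]
Aggregator = Callable[[str, List[str]], str]


def mixture_of_agents(task: str, proposers: List[Proposer],
                      aggregator: Aggregator) -> str:
    """One propose -> aggregate round."""
    # Each proposer sees only the task, never the other proposals.
    proposals = [propose(task) for propose in proposers]
    # The aggregator sees the task plus every proposal and synthesizes them.
    return aggregator(task, proposals)


def make_proposer(angle: str) -> Proposer:
    # Toy proposer: a real one would prompt a model with a distinct persona.
    return lambda task: "[%s] view on: %s" % (angle, task)


def toy_aggregator(task: str, proposals: List[str]) -> str:
    # Toy aggregator: a real one would ask a model to merge the proposals.
    return "Synthesis of %d proposals:\n%s" % (len(proposals), "\n".join(proposals))


proposers = [make_proposer(a) for a in ("security", "performance", "cost")]
synthesis = mixture_of_agents("review the caching layer", proposers, toy_aggregator)
print(synthesis)
```

Keeping proposers isolated from one another is what preserves the diversity that the aggregation step relies on.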

Magentic-One

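The Magentic-One pattern (Fourney et al. 2024) uses an orchestrator with a dual-ledger architecture: plan → delegate → track progress → replan on stall. The orchestrator maintains a fact bank and a task ledger, delegates to specialist agents, and replans when progress stalls.

When to use

Complex multi-step tasks that require coordinating different capabilities
Scenarios where the orchestrator must adjust its plan based on intermediate results
Open-ended problems involving research, coding, and analysis subtasks

Quick start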

from qitos.recipes.magentic_one import (
    MagenticOneOrchestrator,
    ProgressCritic,
    MagenticOneState,
)

agent = MagenticOneOrchestrator(llm=my_llm)
result = agent.run(
    task="Research and summarize the latest findings on ...",
    critics=[ProgressCritic(max_stalls=3)],
    max_steps=30,
    return_state=True,
)
print(result.state.fact_bank)        # accumulated facts
print(result.state.completed_tasks)  # completed subtasks

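ProgressCritic parameters

Parameter	Default	Description
`max_stalls`	3	Maximum number of consecutive steps without progress
`progress_threshold`	0.5	Minimum score for a step to count as progress

MagenticOneState fields

Field	Type	Description
`fact_bank`	`List[str]`	Accumulated facts and conjectures
`task_ledger`	`List[str]`	Current subtask plan
`completed_tasks`	`List[str]`	Completed subtasks
`stall_count`	`int`	Consecutive steps without progress
`specialist_calls`	`int`	Number of delegations to specialists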
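The stall-tracking idea behind `max_stalls` and `progress_threshold` can be sketched as a small counter. The snippet is a standalone illustration under assumptions, not ProgressCritic's logic: in particular, resetting the counter on any step that makes progress and triggering a replan once the counter reaches `max_stalls` are guesses at reasonable behavior, and the function names are hypothetical.

```python
from typing import List


def update_stall_count(stall_count: int, progress_score: float,
                       progress_threshold: float = 0.5) -> int:
    """Reset on progress, otherwise count one more consecutive stalled step."""
    if progress_score >= progress_threshold:
        return 0
    return stall_count + 1


def should_replan(stall_count: int, max_stalls: int = 3) -> bool:
    """Assumed rule: replan once the stall budget is used up."""
    return stall_count >= max_stalls


def trace(scores: List[float]) -> List[int]:
    """Return the stall count after each step for a sequence of scores."""
    counts: List[int] = []
    stall_count = 0
    for score in scores:
        stall_count = update_stall_count(stall_count, score)
        counts.append(stall_count)
    return counts


counts = trace([0.8, 0.2, 0.1, 0.3])
print(counts, should_replan(counts[-1]))
```

Counting only consecutive stalls means a single productive step gives the current plan a fresh budget.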

Scaffolding tool

Use the qit new CLI command to scaffold a new agent project from the built-in cookiecutter template:

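# Install QitOS with the cookiecutter extra (quoted so the brackets survive shells such as zsh)
pip install "qitos[cookiecutter]"

# Create a new agent with default settings
qit new --agent-name my_agent --agent-description "My custom agent"

# List available templates
qit list-templates

# Create with full customization
qit new \
  --agent-name code_reviewer \
  --agent-description "Reviews code for security issues" \
  --author "My Team" \
  --default-model qwen-plus \
  --max-steps 20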

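The scaffolded project includes:

src/agent.py: Agent class with State, init_state, build_system_prompt, and reduce
configs/default.yaml: default model and step configuration
tests/test_agent.py: basic smoke test
snowl_compat.py: Snowl evaluation compatibility adapter
eval_config.yaml: evaluation configuration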

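Building a custom method template

To create a custom method template, follow the same Agent + Critic pattern:

Define the state: extend StateSchema and add the tracking fields your pattern needs
Implement the critic: return retry with an instruction_patch and a state_patch whenever the pattern needs another iteration
Implement the agent: its build_system_prompt() injects pattern context from the state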

from dataclasses import dataclass, field
from typing import Any, Dict, List
from qitos import AgentModule, Decision, StateSchema
from qitos.engine.critic import Critic
from qitos.engine.critic_result import CriticResult
from qitos.kit.parser import ReActTextParser

# 1. State: pattern-specific tracking fields on top of StateSchema
@dataclass
class MyMethodState(StateSchema):
    iteration: int = 0
    max_iterations: int = 5
    history: List[str] = field(default_factory=list)

# 2. Critic: drives the loop by returning retry until the budget is spent
class MyMethodCritic(Critic):
    def evaluate(self, state, decision, results):
        # Ignore states that belong to other agents
        if not isinstance(state, MyMethodState):
            return CriticResult(action="continue")
        if state.iteration < state.max_iterations:
            return CriticResult(
                action="retry",
                score=0.5,
                instruction_patch="Try a different approach.",
                state_patch={"iteration": state.iteration + 1},
            )
        return CriticResult(action="stop", reason="Maximum iterations reached.")

# 3. Agent: injects pattern context from the state into the prompt
class MyMethodAgent(AgentModule[MyMethodState, Dict[str, Any], Any]):
    def __init__(self, llm=None, **kwargs):
        super().__init__(llm=llm, model_parser=ReActTextParser(), **kwargs)

    def init_state(self, task, **kwargs):
        return MyMethodState(task=task, max_steps=kwargs.get("max_steps", 10))

    def build_system_prompt(self, state):
        prompt = "You are a method agent. Iterate until done."
        if state.history:
            prompt += "\n\nPrevious attempts:\n" + "\n".join(state.history)
        return prompt

    def prepare(self, state, observation):
        return f"Task: {state.task}\nIteration: {state.iteration}"

    def reduce(self, state, observation, decision, action_results):
        # Returns the state unchanged; append to state.history here if you
        # want earlier attempts to show up in the system prompt.
        return state
