Skip to main content
This is where the course moves from “pattern” to “operator workflow.” You are still inside the same QitOS kernel, but now the agent must behave well over many steps inside a workspace. That raises new design questions:
  • should you still build the tool surface by hand?
  • how much workflow discipline belongs in the system prompt?
  • when is HistoryPolicy enough?
  • when do you need CompactHistory or explicit memory?
The lesson studies examples/real/claude_code_agent.py.

What changes from lesson 2

BranchLesson 2Lesson 3
ToolsManual registry around a trimmed CodingToolSetcoding_tools(...) preset registry
PromptPlanner + executor promptsOne workflow-heavy system prompt
StatePlan and cursorTodos, mode, target file, verification command, optional doc URL
HistoryDefault behaviorExplicit HistoryPolicy(max_messages=16, max_tokens=2800)
MemoryNoneStill none by default, but now memory becomes a real design option
CompactionNot introducedIntroduced as an upgrade path for longer runs

The system prompt now defines workflow discipline

Unlike lesson 1, this prompt is not just a parser contract. It also encodes operating style:
Workflow:
- Start by writing a todo list with `todo_write`.
- If you are unsure which tool to use, call `tool_search`.
- Read before you edit.
- Make the smallest correct change.
- Run verification immediately after editing.
- Only use `web_fetch` when the task needs documentation.
The full prompt in the example also lists preferred tool patterns. In v0.4, the output contract is no longer hardcoded into the example prompt. It is injected by the active protocol:
Thought: <short reasoning>
Action: <tool_name>(arg=value, ...)
or
Final Answer: <what changed + verification proof>
This is a major QitOS design lesson:
  • the runtime stays the same
  • the prompt can still become much more operational

The default lesson parser remains ReAct on purpose

Even though the prompt is richer, the parser is still:
model_parser=ReActTextParser()
That is deliberate. You are learning how much of an agent’s behavior can be changed through:
  • better state
  • better tools
  • better workflow prompting
without immediately changing the parser or protocol.

The example is now preset-first

Under the hood, the example still builds an OpenAICompatibleModel transport for the current endpoint. But v0.4 adds one new layer before that transport is created:
  • resolve a FamilyPreset
  • build a HarnessPolicy
  • choose protocol, parser, tool delivery mode, and context defaults
That means one example can switch across Qwen, Kimi, MiniMax, gpt-oss, and Gemma 4 without changing the agent implementation itself. For most of those families, the harness is still text/JSON-first:
  • the model returns text
  • the tool schema is either injected into the prompt or passed via tool parameters
  • the parser turns text into a Decision
MiniMax keeps its model-specific tool-call parser, but the switching path is now still driven by the same preset system. This remains a strong research default because it is:
  • easy to compare across providers
  • easy to inspect in traces
  • easy to adapt to local endpoints
If you later need a family-specific protocol, QitOS can support it, but the preset system makes that coupling explicit and traceable.
1

Start from a preset tool registry

The lesson uses:
super().__init__(
    toolset=[
        coding_tools(
            workspace_root=workspace_root,
            shell_timeout=30,
            include_notebook=True,
        )
    ],
    llm=llm,
    model_parser=ReActTextParser(),
)
This is the point in the course where presets become the right abstraction.coding_tools(...) gives you a coherent workspace bundle instead of forcing you to hand-register every file, shell, task, and notebook tool.The lesson here is:
  • build tools by hand while learning the kernel
  • switch to presets when the agent surface becomes operationally large
2

Understand what the preset is buying you

coding_tools(...) is the standard full coding bundle.In practice, that gives the agent access to:
  • file inspection and editing
  • shell execution
  • task/todo helpers
  • optional notebook support
  • optional web and documentation tools
Once an agent reaches this level, toolset choice becomes a big part of agent design.You are no longer selecting one tool at a time. You are selecting a working environment.
3

Design state for long-running work

The state now carries workflow signals:
@dataclass
class ClaudeCodeState(StateSchema):
    scratchpad: list[str] = field(default_factory=list)
    todos: list[dict[str, Any]] = field(default_factory=list)
    target_file: str = TARGET_FILE
    test_command: str = TEST_COMMAND
    doc_url: str = DOC_URL
    mode: str = "work"
Why this state shape works:
  • todos exposes a work queue that survives multiple steps
  • mode lets the agent remember whether it is planning or executing
  • doc_url adds optional external grounding without forcing browsing
  • scratchpad keeps the recent compressed trajectory
This is more advanced than lesson 2, but it is still disciplined:every field exists because it changes future behavior
4

Use reduce to absorb structured tool output

reduce() listens for tool-driven workflow state:
if isinstance(first, dict):
    if first.get("todos"):
        state.todos = list(first.get("todos") or [])
    if first.get("current_mode"):
        state.mode = str(first.get("current_mode"))
    if int(first.get("returncode", 1)) == 0:
        state.final_result = (
            "Verification passed with the canonical coding toolset."
        )
if state.metadata.get("todos"):
    state.todos = list(state.metadata.get("todos") or [])
if state.metadata.get("mode"):
    state.mode = str(state.metadata.get("mode"))
This is the long-running agent version of the same old lesson:tools do work, but reduce() still decides what the agent should remember.
5

Introduce explicit history control

The run passes:
history_policy=HistoryPolicy(max_messages=16, max_tokens=2800)
This is the first course lesson where message-window management matters.HistoryPolicy answers:
  • how many recent messages are retained
  • how many tokens can be spent on history
  • when older interaction context stops being sent verbatim
This is not the same as memory and not the same as compaction.
6

Learn the boundary between history, compaction, and memory

In this lesson, the example still does not attach a custom history= or memory=.That is a meaningful choice:
  • HistoryPolicy controls the message budget
  • state stores immediate workflow artifacts like todos and mode
  • no separate memory store is needed yet
Add CompactHistory when the run becomes long enough that simple trimming loses too much context:
from qitos.kit import CompactConfig, CompactHistory, WindowMemory

super().__init__(
    toolset=[coding_tools(workspace_root=workspace_root)],
    llm=llm,
    model_parser=ReActTextParser(),
    history=CompactHistory(
        llm=llm,
        config=CompactConfig(
            max_tokens=2800,
            keep_last_messages=10,
            keep_last_rounds=4,
        ),
    ),
    memory=WindowMemory(window_size=30),
)
Read that carefully:
  • HistoryPolicy trims the message request
  • CompactHistory summarizes and preserves old interaction history
  • Memory stores reusable records outside the immediate message stream
Those are three different layers.
7

Understand when to upgrade the protocol

This lesson still uses text ReAct, and that is usually the right call.Consider a protocol upgrade only when you need something specific:
  • JSON or XML if you need stricter structured output than text ReAct
  • Terminus if the agent is driving a live terminal session
  • a model-specific parser such as MiniMaxToolCallParser if the provider emits native structured tool calls you actually want to preserve
Do not upgrade the protocol just because the agent became more advanced.
8

Run it like an operator and inspect it like a researcher

Run:
python examples/real/claude_code_agent.py
Inspect:
qita board --logdir runs
In qita, inspect:
  • whether todos appear early and remain coherent
  • whether mode changes match the intended workflow
  • how the prompt and parser still stay in the simple ReAct path
  • whether history trimming changes the model’s behavior
  • whether the run would benefit from CompactHistory

The right mental model for long-running agents

By this point in the course, you should think in layers:
  • state is what the next step definitely needs
  • history is what the next model call may need
  • compaction is how old history is compressed
  • memory is what should outlive the immediate turn structure
That separation is one of the most important QitOS design decisions.

Full example

The full runnable lesson lives at:

What lesson 4 adds

Lesson 4 keeps the long-running structure, but changes the domain completely. That means you will learn how to specialize:
  • tool composition
  • prompt policy
  • state semantics
  • reduce() logic
without inventing a new runtime.

Next lesson: Code security audit agent

Turn the same kernel into a defensive review agent with ranked findings and audit-specific traces.

Related guide: observability

Review qita board, replay, and export before studying the final audit workflow.