- should you still build the tool surface by hand?
- how much workflow discipline belongs in the system prompt?
- when is `HistoryPolicy` enough?
- when do you need `CompactHistory` or explicit memory?
The runnable example for this lesson is `examples/real/claude_code_agent.py`.
## What changes from lesson 2
| Branch | Lesson 2 | Lesson 3 |
|---|---|---|
| Tools | Manual registry around a trimmed `CodingToolSet` | `coding_tools(...)` preset registry |
| Prompt | Planner + executor prompts | One workflow-heavy system prompt |
| State | Plan and cursor | Todos, mode, target file, verification command, optional doc URL |
| History | Default behavior | Explicit `HistoryPolicy(max_messages=16, max_tokens=2800)` |
| Memory | None | Still none by default, but now memory becomes a real design option |
| Compaction | Not introduced | Introduced as an upgrade path for longer runs |
## The system prompt now defines workflow discipline

Unlike lesson 1, this prompt is not just a parser contract. It also encodes operating style:

- the runtime stays the same
- the prompt can still become much more operational
## The default lesson parser remains ReAct on purpose

Even though the prompt is richer, the parser is still plain text ReAct. What improves instead:

- better state
- better tools
- better workflow prompting
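To make the "parser stays simple" point concrete, here is a toy sketch of a text-ReAct parse step. This is not the framework's actual parser; `parse_react` and the decision-dict shape are illustrative assumptions.

```python
# Toy sketch (NOT the library's real parser): turn a ReAct-style
# completion into a structured decision dict.
import json
import re

def parse_react(text: str) -> dict:
    """Extract Thought / Action / Action Input from a ReAct completion."""
    thought = re.search(r"Thought:\s*(.*)", text)
    action = re.search(r"Action:\s*(\S+)", text)
    args = re.search(r"Action Input:\s*(\{.*\})", text, re.DOTALL)
    if action is None:
        # No tool call: treat the whole completion as a final answer.
        return {"kind": "final", "content": text.strip()}
    return {
        "kind": "tool_call",
        "thought": thought.group(1).strip() if thought else "",
        "tool": action.group(1),
        "args": json.loads(args.group(1)) if args else {},
    }

completion = (
    "Thought: I should read the failing test first.\n"
    "Action: read_file\n"
    'Action Input: {"path": "tests/test_api.py"}'
)
decision = parse_react(completion)
```

The point of staying in this shape is that a richer prompt changes what the model writes, not how the harness reads it.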
## The example is now preset-first

Under the hood, the example still builds an `OpenAICompatibleModel` transport for the current endpoint.
But v0.4 adds one new layer before that transport is created:
- resolve a `FamilyPreset`
- build a `HarnessPolicy`
- choose protocol, parser, tool delivery mode, and context defaults

This lets the example cover multiple model families, including gpt-oss and Gemma 4, without changing the agent implementation itself.
For most of those families, the harness is still text/JSON-first:

- the model returns text
- the tool schema is either injected into the prompt or passed via tool parameters
- the parser turns text into a `Decision`

This text-first design is:

- easy to compare across providers
- easy to inspect in traces
- easy to adapt to local endpoints
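The preset-resolution step above can be sketched with plain dataclasses. This is an illustrative mock: `FamilyPreset`/`HarnessPolicy` exist in the framework, but the field names, registry, and `resolve_policy` helper here are assumptions.

```python
# Illustrative mock of the preset layer; field names and the registry
# are assumptions, not the framework's real API.
from dataclasses import dataclass

@dataclass(frozen=True)
class HarnessPolicy:
    protocol: str          # e.g. "react"
    parser: str            # e.g. "text_react"
    tool_delivery: str     # "prompt_injected" or "native_tool_params"
    max_context_tokens: int

# One registry entry per model family; the agent code never branches on family.
FAMILY_PRESETS = {
    "gpt-oss": HarnessPolicy("react", "text_react", "prompt_injected", 8192),
    "gemma": HarnessPolicy("react", "text_react", "prompt_injected", 8192),
    "native-tools": HarnessPolicy("react", "tool_call", "native_tool_params", 32768),
}

def resolve_policy(family: str) -> HarnessPolicy:
    """Look up the harness defaults for a model family."""
    return FAMILY_PRESETS[family]

policy = resolve_policy("gpt-oss")
```

The design choice to notice: the agent consumes a `HarnessPolicy`, never a family name, so swapping endpoints is a registry edit rather than a code change.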
## Start from a preset tool registry

The lesson uses the `coding_tools(...)` preset. This is the point in the course where presets become the right abstraction.

`coding_tools(...)` gives you a coherent workspace bundle instead of forcing you to hand-register every file, shell, task, and notebook tool.

The lesson here is:

- build tools by hand while learning the kernel
- switch to presets when the agent surface becomes operationally large
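A toy illustration of what the preset buys you; `coding_tools`, its parameters, and the tool names here are stand-ins, not the framework's real signatures.

```python
# Stand-in tools (bodies elided); in the real bundle these wrap the workspace.
def read_file(path: str) -> str: ...
def edit_file(path: str, patch: str) -> str: ...
def run_shell(cmd: str) -> str: ...
def update_todos(items: list) -> str: ...

def coding_tools(include_notebook: bool = False) -> dict:
    """Return one coherent workspace bundle instead of hand-registering tools."""
    registry = {
        "read_file": read_file,
        "edit_file": edit_file,
        "run_shell": run_shell,
        "update_todos": update_todos,
    }
    if include_notebook:
        # Optional capabilities are opt-in flags, not extra registration calls.
        registry["run_notebook_cell"] = lambda code: ...
    return registry

tools = coding_tools()
```

Compared with hand-registration, the preset keeps the tool surface consistent across lessons and makes optional capabilities an argument rather than extra wiring.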
## Understand what the preset is buying you

`coding_tools(...)` is the standard full coding bundle. In practice, that gives the agent access to:

- file inspection and editing
- shell execution
- task/todo helpers
- optional notebook support
- optional web and documentation tools
## Design state for long-running work

The state now carries workflow signals. Why this state shape works:

- `todos` exposes a work queue that survives multiple steps
- `mode` lets the agent remember whether it is planning or executing
- `doc_url` adds optional external grounding without forcing browsing
- `scratchpad` keeps the recent compressed trajectory
## Use `reduce()` to absorb structured tool output
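The pattern this section describes can be sketched as follows. The signature and tool names are illustrative assumptions, not the framework's real `reduce()` contract.

```python
# Toy sketch of a reduce() step: fold structured tool output into state,
# deciding what the agent should remember. Names are illustrative.
def reduce(state: dict, tool_name: str, tool_output: dict) -> dict:
    new_state = dict(state)
    if tool_name == "update_todos":
        new_state["todos"] = tool_output["items"]   # absorb the work queue
    elif tool_name == "set_mode":
        new_state["mode"] = tool_output["mode"]     # planning vs executing
    else:
        # Everything else only touches the compressed scratchpad.
        new_state["scratchpad"] = f"last tool: {tool_name}"
    return new_state

state = {"todos": [], "mode": "planning", "scratchpad": ""}
state = reduce(state, "update_todos", {"items": ["fix parser", "run tests"]})
state = reduce(state, "set_mode", {"mode": "executing"})
```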
`reduce()` listens for tool-driven workflow state. `reduce()` still decides what the agent should remember.

## Introduce explicit history control
The run passes an explicit `HistoryPolicy`. This is the first course lesson where message-window management matters.

`HistoryPolicy` answers:

- how many recent messages are retained
- how many tokens can be spent on history
- when older interaction context stops being sent verbatim
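The effect of those two budgets can be sketched like this. This is not the framework's implementation, and the word-count "tokenizer" is a deliberate crude proxy.

```python
# Illustrative sketch of HistoryPolicy-style trimming: keep only the most
# recent messages that fit both the message and token budgets.
from dataclasses import dataclass

@dataclass
class HistoryPolicy:
    max_messages: int = 16
    max_tokens: int = 2800

def trim_history(messages: list[dict], policy: HistoryPolicy) -> list[dict]:
    recent = messages[-policy.max_messages:]
    kept, budget = [], policy.max_tokens
    for msg in reversed(recent):               # walk newest-first
        cost = len(msg["content"].split())     # crude token proxy
        if cost > budget:
            break                              # older context stops being sent
        kept.append(msg)
        budget -= cost
    return list(reversed(kept))                # restore chronological order

history = [{"role": "user", "content": f"step {i} output"} for i in range(40)]
window = trim_history(history, HistoryPolicy(max_messages=16, max_tokens=2800))
```

Note that trimming simply drops older messages; nothing is preserved, which is exactly the gap `CompactHistory` fills later in this lesson.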
## Learn the boundary between history, compaction, and memory

In this lesson, the example still does not attach a custom `history=` or `memory=`. Read that carefully. That is a meaningful choice:

- `HistoryPolicy` controls the message budget
- state stores immediate workflow artifacts like todos and mode
- no separate memory store is needed yet

Upgrade to `CompactHistory` when the run becomes long enough that simple trimming loses too much context:

- `HistoryPolicy` trims the message request
- `CompactHistory` summarizes and preserves old interaction history
- `Memory` stores reusable records outside the immediate message stream
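The trimming-versus-compaction distinction can be sketched in a few lines. This mock uses string truncation as the "summary"; a real `CompactHistory` would summarize with a model call, and all names here are assumptions.

```python
# Illustrative only: compaction replaces old messages with a summary stub,
# instead of dropping them the way plain trimming does.
def compact_history(messages: list[dict], keep_recent: int = 6) -> list[dict]:
    """Summarize everything older than the last `keep_recent` messages."""
    if len(messages) <= keep_recent:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = {
        "role": "system",
        "content": f"[compacted {len(old)} earlier messages: "
                   + "; ".join(m["content"][:30] for m in old[-3:]) + "]",
    }
    return [summary] + recent

msgs = [{"role": "user", "content": f"turn {i}"} for i in range(20)]
compacted = compact_history(msgs)
```

The layering to notice: trimming bounds what is *sent*, compaction preserves a digest of what was *dropped*, and memory (not shown) would persist records across runs entirely outside the message stream.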
## Understand when to upgrade the protocol

This lesson still uses text ReAct, and that is usually the right call. Consider a protocol upgrade only when you need something specific:
- JSON or XML if you need stricter structured output than text ReAct
- Terminus if the agent is driving a live terminal session
- a model-specific parser such as `MiniMaxToolCallParser` if the provider emits native structured tool calls you actually want to preserve
## Run it like an operator and inspect it like a researcher

Run the example, then inspect the trace in qita:

- whether todos appear early and remain coherent
- whether `mode` changes match the intended workflow
- how the prompt and parser still stay in the simple ReAct path
- whether history trimming changes the model’s behavior
- whether the run would benefit from `CompactHistory`
## The right mental model for long-running agents

By this point in the course, you should think in layers:

- state is what the next step definitely needs
- history is what the next model call may need
- compaction is how old history is compressed
- memory is what should outlive the immediate turn structure
## Full example

The full runnable lesson lives at `examples/real/claude_code_agent.py`.

## What lesson 4 adds

Lesson 4 keeps the long-running structure, but changes the domain completely. That means you will learn how to specialize:

- tool composition
- prompt policy
- state semantics
- `reduce()` logic
## Next lesson: Code security audit agent
Turn the same kernel into a defensive review agent with ranked findings and audit-specific traces.
## Related guide: observability
Review the qita board, replay, and export before studying the final audit workflow.
