This is the first complete agent in the course. It is small on purpose, but it is not fake. You will build a real coding agent with:
  • a typed state
  • a real LLM harness
  • a real system prompt
  • a real parser
  • real tools
  • a real reduce() loop
  • real qita traces
You will study examples/patterns/react.py, but the lesson is written so you do not need to reverse-engineer that file to understand why it works.

What you are building

The task is tiny:
  • open buggy_module.py
  • fix add(a, b) so it returns a + b
  • run a verification command
That small task is useful because it lets you see the whole QitOS kernel without extra orchestration noise.

The design for this lesson

| Design branch | Choice in this lesson | Why this is the right first choice |
| --- | --- | --- |
| Task shape | One-file bug fix with one verification command | Easy to verify, easy to trace |
| State | scratchpad, target_file, test_command | Just enough to influence the next step |
| Model harness | OpenAICompatibleModel returning text | Simple and portable across providers |
| Prompt contract | REACT_SYSTEM_PROMPT | Explicit one-tool-per-turn text protocol |
| Parser | ReActTextParser | Direct match for Thought: / Action: output |
| Tools | Compact CodingToolSet inside a manual ToolRegistry | Learn tool design before presets |
| Memory | None | The run is too short to justify separate memory |
| History | Default Engine history behavior | Do not introduce context control before you need it |
| Traceability | qita board | Learn to inspect the kernel from day one |

Why we start with the text ReAct harness

The first lesson uses:
OpenAICompatibleModel(...)
with:
model_parser=ReActTextParser()
That gives you the most transparent possible path:
messages -> text model output -> ReAct parser -> Decision -> tool execution
We do not start with native tool calling, XML, JSON, or model-specific harnesses because those add coupling before you understand the core loop.
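To make that path concrete, here is a minimal offline sketch of the same loop in plain Python. It is not the QitOS Engine: scripted_model, parse, run, and the TOOLS table are hypothetical stand-ins, and the model is scripted so the example runs without an API key.

```python
from typing import Callable

# Hypothetical tool table: tool name -> callable taking keyword arguments.
TOOLS: dict[str, Callable[..., str]] = {
    "read_file": lambda path: f"<contents of {path}>",
}


def scripted_model(messages: list[str]) -> str:
    """Offline stand-in for the LLM: act once, then finish after an observation."""
    if any(m.startswith("Observation:") for m in messages):
        return "Thought: I have what I need.\nFinal Answer: done"
    return "Thought: I should read the file.\nAction: read_file(path=buggy_module.py)"


def parse(text: str) -> tuple[str, str]:
    """Return ("final", answer) or ("action", "tool(args)") from ReAct-style text."""
    for line in text.splitlines():
        if line.startswith("Final Answer:"):
            return "final", line.removeprefix("Final Answer:").strip()
        if line.startswith("Action:"):
            return "action", line.removeprefix("Action:").strip()
    raise ValueError("model output did not follow the text protocol")


def run(task: str, max_steps: int = 4) -> str:
    messages = [f"Task: {task}"]
    for _ in range(max_steps):
        text = scripted_model(messages)          # messages -> text model output
        kind, payload = parse(text)              # text -> parsed decision
        if kind == "final":
            return payload
        name, _, rest = payload.partition("(")   # decision -> tool execution
        pairs = (p.strip() for p in rest.rstrip(")").split(","))
        kwargs = dict(p.split("=", 1) for p in pairs if p)
        messages.append(f"Observation: {TOOLS[name](**kwargs)}")
    return "step budget exhausted"


result = run("fix add(a, b)")
```

Every arrow in the path is one line of run(), which is why a text harness is easy to trace: nothing happens between the model's output and the tool call except parsing.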

The system prompt is a contract, not decoration

The lesson uses the canonical ReAct prompt:
You are a reliable ReAct agent.

Rules:
- Use at most one tool call per response.
- Never invent tool names or arguments.
- If a tool result is enough to conclude, output final answer directly.

Output contract (strict):
Thought: <one concise reasoning sentence>
Action: <tool_name>(arg=value, ...)
or
Final Answer: <final answer only>
In code, that is:
def build_system_prompt(self, state: ReactState) -> str | None:
    return render_prompt(
        REACT_SYSTEM_PROMPT,
        {"tool_schema": self.tool_registry.get_tool_descriptions()},
    )
This matters because ReActTextParser is not doing magic. It expects exactly this style of output. The first durable QitOS lesson is:
  • prompt format and parser choice are one design decision
  • if you change one, you usually need to change the other

The full model harness for this lesson

The example builds the model like this:
def build_model() -> OpenAICompatibleModel:
    return OpenAICompatibleModel(
        model=MODEL_NAME,
        api_key=api_key,
        base_url=MODEL_BASE_URL,
        temperature=0.2,
        max_tokens=2048,
    )
Why this harness is appropriate here:
  • it works with OpenAI-compatible endpoints
  • it keeps the response in plain text
  • it stays compatible with REACT_SYSTEM_PROMPT, which describes the tools by injecting their schema into the prompt text
  • it keeps the lesson portable across research labs and local gateways
You are not choosing the “best model” here. You are choosing the simplest harness that exposes the kernel clearly.
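The snippet above references MODEL_NAME, MODEL_BASE_URL, and api_key without showing where they come from. One possible way to supply them, sketched here under assumptions, is to read them from a settings mapping such as os.environ. The ModelSettings class, the load_model_settings helper, and the MODEL_API_KEY key name are hypothetical and not part of QitOS.

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical setting names; use whatever your deployment already defines.
REQUIRED = ("MODEL_NAME", "MODEL_BASE_URL", "MODEL_API_KEY")


@dataclass(frozen=True)
class ModelSettings:
    model: str
    api_key: str
    base_url: str
    temperature: float = 0.2   # same sampling defaults as the lesson
    max_tokens: int = 2048


def load_model_settings(env: Mapping[str, str]) -> ModelSettings:
    """Fail early and name every missing setting instead of failing mid-run."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing settings: {', '.join(missing)}")
    return ModelSettings(
        model=env["MODEL_NAME"],
        api_key=env["MODEL_API_KEY"],
        base_url=env["MODEL_BASE_URL"],
    )


# Placeholder values only; in a real script you would pass os.environ.
hosted = load_model_settings({
    "MODEL_NAME": "example-model",
    "MODEL_BASE_URL": "https://api.example.com/v1",
    "MODEL_API_KEY": "placeholder-key",
})
local = load_model_settings({
    "MODEL_NAME": "example-model",
    "MODEL_BASE_URL": "http://localhost:8000/v1",
    "MODEL_API_KEY": "placeholder-key",
})
```

Only base_url differs between the two settings objects, which is the portability the harness is chosen for: the agent code does not change when the endpoint does.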

1. Design the state around the next step

The state is intentionally small:
@dataclass
class ReactState(StateSchema):
    scratchpad: list[str] = field(default_factory=list)
    target_file: str = "buggy_module.py"
    test_command: str = (
        'python -c "import buggy_module; assert buggy_module.add(20, 22) == 42"'
    )
Why these fields?
  • scratchpad stores the compressed trajectory that the next model step can use
  • target_file keeps the agent grounded in one artifact
  • test_command turns “done” into an executable success condition
This is your first QitOS habit: add only state that changes future decisions.
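To see why test_command works as an executable success condition, the following sketch runs the same assertion against a temporary copy of the module before and after a fix. The helper name verification_passes and the specific bug written to the file (subtraction instead of addition) are assumptions for illustration; the lesson does not say what the original bug is.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def verification_passes(workspace: Path) -> bool:
    """Run the lesson's assertion inside `workspace`; exit code 0 means done."""
    check = "import buggy_module; assert buggy_module.add(20, 22) == 42"
    # sys.executable stands in for the bare `python` in test_command.
    # -B avoids writing bytecode, so a quick rewrite of the file is never
    # masked by a stale cached .pyc with the same size and timestamp.
    proc = subprocess.run(
        [sys.executable, "-B", "-c", check],
        cwd=workspace,
        capture_output=True,
    )
    return proc.returncode == 0


with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    module = workspace / "buggy_module.py"

    module.write_text("def add(a, b):\n    return a - b\n")  # hypothetical bug
    before_fix = verification_passes(workspace)

    module.write_text("def add(a, b):\n    return a + b\n")  # the intended fix
    after_fix = verification_passes(workspace)
```

The agent never has to judge whether the patch "looks right". The return code of one command answers the question, and that is what makes the run easy to verify and trace.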

2. Expose a minimal tool surface

The example uses a manual registry so you can see exactly what is being exposed:
registry = ToolRegistry()
registry.include(
    CodingToolSet(
        workspace_root=workspace_root,
        include_notebook=False,
        enable_lsp=False,
        enable_tasks=False,
        enable_web=False,
        expose_modern_names=False,
    )
)
This is important. CodingToolSet is a bundle, but you still control its surface.
For lesson 1, the right tool surface is just enough to:
  • inspect files
  • edit files
  • run the verification command
Do not expose a richer toolset until the task requires it.
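As an illustration of what "just enough" can look like, here is a self-contained sketch of a three-tool surface confined to a workspace directory. MiniRegistry, build_registry, and the tool names are hypothetical; they do not reflect the ToolRegistry or CodingToolSet APIs.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable


class MiniRegistry:
    """Hypothetical registry: only registered names can ever be called."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., str]] = {}

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self._tools[name] = fn

    def describe(self) -> str:
        """Text a system prompt could embed as the tool schema."""
        items = sorted(self._tools.items())
        return "\n".join(f"- {name}: {fn.__doc__}" for name, fn in items)

    def call(self, name: str, **kwargs: str) -> str:
        if name not in self._tools:
            raise KeyError(f"tool not exposed: {name}")
        return self._tools[name](**kwargs)


def build_registry(root: Path) -> MiniRegistry:
    root = root.resolve()

    def resolve(path: str) -> Path:
        target = (root / path).resolve()
        if not target.is_relative_to(root):
            raise ValueError("path escapes the workspace")
        return target

    def read_file(path: str) -> str:
        """Return the text of a file inside the workspace."""
        return resolve(path).read_text()

    def write_file(path: str, content: str) -> str:
        """Overwrite a file inside the workspace."""
        resolve(path).write_text(content)
        return f"wrote {path}"

    def run_command(command: str) -> str:
        """Run a shell command in the workspace and report its exit code."""
        proc = subprocess.run(command, shell=True, cwd=root, capture_output=True)
        return f"returncode={proc.returncode}"

    registry = MiniRegistry()
    for fn in (read_file, write_file, run_command):
        registry.register(fn.__name__, fn)
    return registry


with tempfile.TemporaryDirectory() as tmp:
    registry = build_registry(Path(tmp))
    registry.call("write_file", path="buggy_module.py",
                  content="def add(a, b):\n    return a + b\n")
    source = registry.call("read_file", path="buggy_module.py")
    status = registry.call("run_command", command=f'"{sys.executable}" -c "pass"')
    exposed = registry.describe()
```

Anything not registered simply does not exist from the model's point of view: it is absent from describe() and rejected by call(). That is the sense in which you control the surface even when the tools come from a bundle.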

3. Bind the prompt to the parser

The agent constructor pairs the prompt contract and the parser:
super().__init__(
    tool_registry=registry,
    llm=llm,
    model_parser=ReActTextParser(),
)
Read that as one sentence: “This agent asks the model to speak ReAct text, and the Engine parses that text with the ReAct parser.”
In QitOS, this pairing is the harness.
Later lessons will change prompts and protocols. For now, keep this pair fixed.

4. Prepare only the context the next step needs

prepare() curates the current step’s input:
def prepare(self, state: ReactState) -> str:
    lines = [
        f"Task: {state.task}",
        f"Target file: {state.target_file}",
        f"Verification command: {state.test_command}",
        f"Step: {state.current_step}/{state.max_steps}",
    ]
    if state.scratchpad:
        lines.append("Recent trajectory:")
        lines.extend(state.scratchpad[-8:])
    return "\n".join(lines)
This is the second core habit: prepare() is not a state dump. It is a prompt-ready view of state.
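The difference between a view and a dump shows up once the scratchpad grows. The sketch below uses a hypothetical SketchState stand-in (it assumes task, current_step, and max_steps are supplied by the base state in the real class) and a simplified prepare_view to show that the prompt stays bounded no matter how long the trajectory gets.

```python
from dataclasses import dataclass, field


@dataclass
class SketchState:
    """Stand-in for ReactState; field names mirror the lesson where possible."""
    task: str
    target_file: str = "buggy_module.py"
    current_step: int = 0
    max_steps: int = 8
    scratchpad: list[str] = field(default_factory=list)


def prepare_view(state: SketchState, window: int = 8) -> str:
    """Render only what the next model step needs, not the whole state."""
    lines = [
        f"Task: {state.task}",
        f"Target file: {state.target_file}",
        f"Step: {state.current_step}/{state.max_steps}",
    ]
    if state.scratchpad:
        lines.append("Recent trajectory:")
        lines.extend(state.scratchpad[-window:])  # newest entries only
    return "\n".join(lines)


long_run = SketchState(
    task="fix add(a, b)",
    current_step=7,
    scratchpad=[f"Observation: {i}" for i in range(20)],
)
view_lines = prepare_view(long_run).splitlines()
fresh_lines = prepare_view(SketchState(task="fix add(a, b)")).splitlines()
```

With 20 scratchpad entries the view still contains only the three header lines, the "Recent trajectory:" label, and the last eight entries. A fresh state produces just the header lines.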

5. Use reduce to define what the agent remembers

The agent builds up its working memory inside reduce():
def reduce(
    self,
    state: ReactState,
    observation: dict[str, Any],
    decision: Decision[Action],
) -> ReactState:
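    # Tool outputs are read from "action_results"; tolerate a non-dict observation.
    action_results = (
        observation.get("action_results", [])
        if isinstance(observation, dict)
        else []
    )
    # Record the step in the same Thought / Action / Observation shape the prompt uses.
    if decision.rationale:
        state.scratchpad.append(f"Thought: {decision.rationale}")
    if decision.actions:
        state.scratchpad.append(f"Action: {format_action(decision.actions[0])}")
    if action_results:
        first = action_results[0]
        state.scratchpad.append(f"Observation: {first}")
        # A zero return code is treated as the success signal.
        if isinstance(first, dict) and int(first.get("returncode", 1)) == 0:
            state.final_result = "Patch applied and verification passed."
    # Keep only the 30 most recent entries so the scratchpad stays bounded.
    state.scratchpad = state.scratchpad[-30:]
    return state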
Three lessons are hidden in this one function:
  • not every observation belongs in future context
  • state is where you keep the compressed working memory
  • final_result is a clean, explicit success signal

6. Notice what we are not using yet

We do not use:
  • decide() overrides
  • explicit planning
  • memory adapters
  • custom history implementations
  • context compaction
  • model-specific protocol overrides
That is not because QitOS lacks them. It is because lesson 1 is about seeing the default path clearly before you bend it.

7. Run the example and inspect the kernel with qita

Run it:
python examples/patterns/react.py
Then inspect it:
qita board --logdir runs
In qita, check:
  • the exact prompt text sent to the model
  • whether the parser produced clean Thought and Action fields
  • whether the tool output made the verification condition obvious
  • whether final_result is set at the first true success condition

Why there is no separate memory or compaction yet

For this lesson, the right memory choice is “none.” Why:
  • the run is short
  • the useful context is already visible in scratchpad
  • adding retrieval or compaction here would blur the architecture before you understand it
In QitOS, memory is not a badge of sophistication. It is an answer to a concrete long-run problem.

Full example

The full runnable lesson lives at:

What lesson 2 changes

Lesson 2 keeps the same model harness and the same execution parser, but introduces a new idea: planning should become explicit state and explicit control flow, not a longer hidden thought.

Next lesson: PlanAct

Add a planner, a cursor, and a decide() gate without changing the core runtime.

Related reference: kit

Review ReActTextParser, prompt templates, and coding tool surfaces used in this lesson.