This lesson is the first time you deliberately bend the default loop. You are still building an ordinary QitOS agent, but now you will introduce:
  • a planning artifact in state
  • a planner prompt separate from the execution prompt
  • a decide() override that only handles the planning boundary
The key idea is that you still do not introduce a second runtime.

What changes from lesson 1

| Branch | Lesson 1 | Lesson 2 |
| --- | --- | --- |
| Control | Default LLM path every step | decide() intercepts only the planning boundary |
| Prompting | One ReAct system prompt | One planning prompt plus one execution prompt |
| State | Scratchpad + task fields | Add plan_steps and cursor |
| Parser | ReActTextParser | Still ReActTextParser for execution |
| Tools | Compact coding tools | Same compact coding tools |
| Memory and history | None beyond state | Still no separate memory or compaction |
That last row matters. You are adding planning, not context complexity.

The two-prompt architecture

This lesson uses two prompt contracts.

Planner prompt

You are a planning module.
Break the task into 3-7 atomic executable steps.

Constraints:
- Each step must be actionable and verifiable.
- Prefer tool-executable operations over vague reasoning.
- No prose outside the numbered list.
In code, that is PLAN_DRAFT_PROMPT.

Executor prompt

You are the execution module for a Plan-Act agent.

You will receive the global task and one current plan step.
Execute only the current step. Do not jump ahead.

Output contract (strict):
Thought: <one sentence>
Action: <tool_name>(arg=value, ...)
or
Final Answer: <step result>
In code, that is PLAN_EXEC_SYSTEM_PROMPT. The design lesson is:
  • planning and acting can use different prompts
  • but they still flow through the same AgentModule + Engine runtime
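To make the execution contract concrete, here is a minimal sketch of parsing that strict output format. This is not the real ReActTextParser, just an illustration of the contract it enforces; the function name and return shape are hypothetical:

```python
import re

def parse_react_output(text: str) -> dict:
    """Parse the strict Thought/Action or Final Answer contract (illustrative only)."""
    final = re.search(r"Final Answer:\s*(.+)", text, re.DOTALL)
    if final:
        return {"type": "final", "answer": final.group(1).strip()}
    thought = re.search(r"Thought:\s*(.+)", text)
    action = re.search(r"Action:\s*(\w+)\((.*)\)", text)
    if thought and action:
        return {
            "type": "action",
            "thought": thought.group(1).strip(),
            "tool": action.group(1),
            "raw_args": action.group(2),
        }
    raise ValueError("Output does not match the ReAct contract")

result = parse_react_output("Thought: run the tests\nAction: run_shell(cmd='pytest')")
```

A strict contract like this is what makes the executor prompt's "Output contract (strict)" section enforceable: anything the regexes reject is a parse failure, not a silent fallback.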

The parser story in this lesson

The planner path does not use ReActTextParser. Instead:
  • _plan() renders PLAN_DRAFT_PROMPT
  • NumberedPlanBuilder calls the same LLM harness
  • the builder parses a numbered list into list[str]
The execution path does use ReActTextParser. That means lesson 2 already teaches a subtle but important QitOS idea: different phases of the same agent can use different parsing contracts, as long as the control boundary is explicit.
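A numbered-list parser of the kind the builder implies can be sketched in a few lines. This is hypothetical, not the real NumberedPlanBuilder internals, but it shows how a numbered list becomes a list[str] while enforcing the planner prompt's 3-7 step constraint:

```python
import re

def parse_numbered_plan(text: str) -> list[str]:
    """Extract '1. step' or '1) step' lines into an ordered list of step strings."""
    steps = []
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)[.)]\s+(.*\S)", line)
        if m:
            steps.append(m.group(2))
    # Enforce the planner prompt's 3-7 step constraint.
    if not 3 <= len(steps) <= 7:
        raise ValueError(f"expected 3-7 steps, got {len(steps)}")
    return steps

plan = parse_numbered_plan("1. Read the file\n2. Fix the bug\n3. Run the tests")
```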

The model harness stays intentionally boring

Just like lesson 1, the example uses:
OpenAICompatibleModel(...)
Why keep the same harness?
  • so you can isolate the effect of planning
  • so prompt and parser changes are easy to interpret
  • so the new lesson teaches one new idea instead of five
1. Extend state with a plan and a cursor

The state adds only what execution needs:
@dataclass
class PlanActState(StateSchema):
    plan_steps: list[str] = field(default_factory=list)
    cursor: int = 0
    target_file: str = "buggy_module.py"
    test_command: str = TEST_COMMAND
    scratchpad: list[str] = field(default_factory=list)
This is the first time the course makes a hidden reasoning artifact explicit. Why store the plan in state?
  • the trace can show it
  • prepare() can surface it
  • reduce() can advance it
  • your own logic can rewrite it later if needed
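Because the plan lives in ordinary state, any code that can see the state can inspect or mutate it. A minimal stand-in (not the real StateSchema, just the two planning fields) makes that life cycle visible:

```python
from dataclasses import dataclass, field

@dataclass
class MiniPlanState:
    """Stand-in for PlanActState; just the planning fields."""
    plan_steps: list[str] = field(default_factory=list)
    cursor: int = 0

state = MiniPlanState()
state.plan_steps = ["read file", "patch bug", "run tests"]  # what _plan() would do
state.cursor += 1                                           # what reduce() would do
current = state.plan_steps[state.cursor]                    # what prepare() surfaces
```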
2. Use a dedicated plan builder

The planner is initialized once:
self.plan_builder = NumberedPlanBuilder()
And called like this:
prompt = render_prompt(
    PLAN_DRAFT_PROMPT,
    {
        "task": (
            f"{state.task}\n"
            f"Target file: {state.target_file}\n"
            f"Last step must run: {state.test_command}"
        ),
    },
)
plan = self.plan_builder.build(self.llm, prompt)
This is the right QitOS move: planning becomes a named artifact with a dedicated parser, not an unstructured paragraph in the main scratchpad.
3. Use decide() only as the planning gate

The control logic is small:
def decide(self, state: PlanActState, observation: dict[str, Any]):
    if not state.plan_steps or state.cursor >= len(state.plan_steps):
        if not self._plan(state):
            return Decision.final("Failed to build a valid plan.")
        return Decision.wait("plan_ready")
    return None
That return None is the whole point. Once a plan exists, the Engine goes back to its default LLM path: prompt -> ReActTextParser -> Decision -> tool execution. So lesson 2 is not about replacing the runtime. It is about adding one explicit control boundary to it.
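The control flow of that gate can be exercised with stand-in classes (hypothetical stubs, not the real QitOS Decision or state types; the stub pretends _plan() always succeeds):

```python
from dataclasses import dataclass, field

@dataclass
class Decision:
    """Stub mirroring the Decision.wait / Decision.final shape from the lesson."""
    kind: str
    payload: str

    @classmethod
    def wait(cls, tag): return cls("wait", tag)
    @classmethod
    def final(cls, msg): return cls("final", msg)

@dataclass
class State:
    plan_steps: list = field(default_factory=list)
    cursor: int = 0

def decide(state):
    # Plan missing or exhausted -> (re)plan; otherwise fall through to the default path.
    if not state.plan_steps or state.cursor >= len(state.plan_steps):
        state.plan_steps = ["step one", "step two"]  # pretend _plan() succeeded
        return Decision.wait("plan_ready")
    return None

s = State()
first = decide(s)   # planning boundary fires
second = decide(s)  # plan exists -> default LLM path (None)
```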
4. Bind execution prompt and parser clearly

Execution still uses:
super().__init__(
    tool_registry=registry,
    llm=llm,
    model_parser=ReActTextParser(),
)
and:
def build_system_prompt(self, state: PlanActState) -> str | None:
    return render_prompt(
        PLAN_EXEC_SYSTEM_PROMPT,
        {
            "current_step": self._current_step_text(state),
            "tool_schema": self.tool_registry.get_tool_descriptions(),
        },
    )
So the planning phase and the execution phase are visibly different:
  • numbered plan builder for planning
  • ReAct text contract for execution
5. Make the plan visible in prepare()

prepare() now renders both the global task and the current plan step:
def prepare(self, state: PlanActState) -> str:
    lines = [
        f"Task: {state.task}",
        f"Plan cursor: {state.cursor}/{len(state.plan_steps)}",
        f"Current plan step: {self._current_step_text(state)}",
        f"Step: {state.current_step}/{state.max_steps}",
    ]
    return "\n".join(lines)
This changes the agent’s working memory shape. Instead of re-reasoning over the entire task every step, the model reasons over:
  • one task
  • one explicit plan
  • one current plan item
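Under hypothetical state values, the rendered observation looks roughly like this (plain stand-in values, not the real StateSchema):

```python
# Stand-in rendering of what prepare() hands the model each step.
task = "Fix the failing test"
cursor, plan_steps = 1, ["read file", "patch bug", "run tests"]
lines = [
    f"Task: {task}",
    f"Plan cursor: {cursor}/{len(plan_steps)}",
    f"Current plan step: {plan_steps[cursor]}",
]
observation = "\n".join(lines)
```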
6. Advance plan progress in reduce()

Progress becomes ordinary state logic:
if isinstance(first, dict) and first.get("status") == "success":
    state.cursor += 1
if isinstance(first, dict) and int(first.get("returncode", 1)) == 0:
    state.final_result = "Verification passed."
    state.cursor = len(state.plan_steps)
The important lesson is not the exact condition. It is the placement: reduce() is where you decide what counts as plan completion.
7. Keep memory and history simple on purpose

Lesson 2 still does not add:
  • a memory adapter
  • HistoryPolicy tuning
  • CompactHistory
Why not? Because the plan itself already compresses the task into a better working form. Introducing context compaction here would blur whether behavior changed because of planning or because of context management.
8. Run it and inspect the planning boundary in qita

Run:
python examples/patterns/planact.py
Inspect:
qita board --logdir runs
In the trace, pay attention to:
  • the step where Decision.wait("plan_ready") appears
  • the moment plan_steps becomes part of state
  • the fact that later execution still uses the same ReAct parser path

Why PlanAct is still the same kernel

Researchers often think adding planning requires:
  • a separate planner service
  • a planner-executor loop outside the framework
  • a second agent runtime
This lesson is showing the opposite design:
  • a planner is just another controlled model call
  • a plan is just another state artifact
  • execution is still the normal Engine path
That is one of the deepest QitOS ideas.

Full example

The full runnable lesson lives at:

What lesson 3 adds

Lesson 3 keeps the same kernel again, but now the agent becomes operationally long-running. That means you will finally introduce:
  • preset toolsets instead of manual wiring
  • a workflow-oriented system prompt
  • explicit history control
  • the point where context compaction and memory become real design questions

Next lesson: Claude Code-style agent

Move from pattern design to a long-running workspace agent with presets, history policy, and qita-driven debugging.

Related guide: memory and history

Review the distinction between state, history, compaction, and memory before the long-running lesson.