Introduction

QitOS gives you a stable AgentModule + Engine kernel for building reproducible LLM agents. Whether you are prototyping ReAct loops, running GAIA benchmarks, or shipping a production coding agent, QitOS keeps your runs traceable, your patterns composable, and your results reproducible. A long-running QitOS coding run stays readable — it does not degrade into opaque glue code.

Two ways to author agents in QitOS

QitOS keeps two authoring paths intentionally.

Research-first

This is the default path for researchers who want the most control. You handwrite:

the system prompt
the parser (the component that converts raw model output into a structured Decision)
the model protocol (the output format contract expected from the model)
the model transport (the client that sends requests to the model API)
the tool surface (the set of tools the agent can call)

This is the most PyTorch-like path in the framework: direct, explicit, and easy to modify for experiments.

Preset-first

This is the fastest path when you want a stable baseline or quick model-family switching. You start from:

family presets (pre-configured defaults for a model family, covering protocol, transport, and tool delivery)
harness policies (rules that control how the model harness resolves at runtime)
preset tool builders (pre-assembled tool bundles for common workflows)

This path is especially useful when one agent should switch across Qwen, Kimi, MiniMax, gpt-oss, and Gemma 4 without rewriting the agent.

Who QitOS is for

QitOS is built for three kinds of practitioners:

Researchers — prototype ReAct, PlanAct, Tree-of-Thought, Reflexion, and new agent methods, then diff, replay, and publish the runs.
Agent builders — build tool-using agents on a stable execution loop, instead of piling framework glue code on top of raw LLM calls.
Evaluators — run GAIA, Tau-Bench, and CyBench workflows on the same kernel you use in product agents, so benchmark results actually transfer to real use.

Key capabilities

Reproducible runs

Every QitOS run writes a manifest.json, events.jsonl, and steps.jsonl to a local run directory. The manifest captures the model ID, prompt hash, config hash, seed, and tool manifest (the serialized description of all tools available to the run) — everything you need to reproduce or compare runs exactly.

Built-in observability with qita

qita is QitOS’s built-in trace viewer. After any run, launch the board to inspect step-by-step execution, replay the trajectory (the temporal sequence of prompts, decisions, tool calls, and observations across steps), and export traces to standalone HTML.

qita board --logdir runs

The board runs at http://127.0.0.1:8765 and auto-refreshes as new runs appear.

Canonical agent patterns

QitOS ships with canonical implementations of four established reasoning patterns, each runnable from examples/patterns/:

Pattern	Description
ReAct	A reasoning-acting loop where the model alternates between thinking and taking one action per step
PlanAct	Generate an explicit plan first, then execute it step by step
Tree-of-Thought	Branch into multiple candidate paths, score them, then choose the best one
Reflexion	An actor-critic loop where a critic evaluates each step and the actor retries based on feedback

Benchmark adapters

QitOS includes adapters for GAIA, Tau-Bench, and CyBench that run on the same AgentModule + Engine kernel you use for your own agents. You don’t need a separate evaluation harness.

Critics and hooks

QitOS provides two complementary runtime extension mechanisms:

Critics control agent behavior quality at runtime. A critic can continue, stop, or retry a step — and on retry, it can patch the agent’s instructions or state. This is unique to QitOS: no other framework offers runtime behavior correction with instruction and state patches.
Hooks observe the Engine loop at defined lifecycle points without controlling execution. Use hooks for logging, metrics collection, and audit trails.

Checkpoint and fork

Save an agent’s state at any step, resume after interruption, and fork a run to explore alternative paths. QitOS checkpointing supports both in-memory and SQLite storage, time-travel within the same thread, and true branching into new threads.

MCP integration

Bridge Model Context Protocol (MCP) servers into QitOS agent tool registries. Connect to MCP servers via stdio or HTTP transport, filter which tools to expose, and manage async server lifecycles.

Featured designs

These are the design decisions that distinguish QitOS from a loose collection of agent utilities.

Single-kernel architecture

QitOS is opinionated about one thing above all: there is one runtime kernel per run.

AgentModule defines policy (what the agent should do at each step)
Engine owns execution (the actual step loop)
tools, parsers, critics, memory, and tracing attach to that kernel instead of spawning separate orchestrators

This is what keeps examples, benchmarks, and production-style agents comparable.

Protocol-aware prompting and parsing

QitOS treats prompt format and parser choice as a first-class contract — they must match each other.

ReAct prompts pair with ReActTextParser
JSON prompts pair with JsonDecisionParser
XML prompts pair with XmlDecisionParser
more structured variants such as Terminus and MiniMax tool-call parsers follow the same model-response → parser → Decision path

This is why traces stay understandable instead of collapsing into provider-specific glue code.

Preset-first agent authoring

QitOS gives you reusable authoring blocks instead of forcing every agent to rebuild the same wiring.

preset tool bundles such as coding_tools(...), advanced_coding_tools(...), web_tools(), and task_tools(...)
reusable memory adapters such as WindowMemory, SummaryMemory, VectorMemory, and MarkdownFileMemory
reusable history strategies such as WindowHistory, TokenBudgetSummaryHistory, and CompactHistory
reusable planners such as NumberedPlanBuilder and DynamicTreeSearch

The framework is designed so researchers can change the policy without rebuilding the whole stack.

Long-running context control

Long-running agents are a first-class concern in QitOS. You can control context growth through:

HistoryPolicy for selecting which messages to send to the model
token-budget-aware summarization with TokenBudgetSummaryHistory
multi-stage compaction (gradually compressing older context while keeping recent messages intact) with CompactHistory
memory adapters for semantic or persistent recall across steps

The framework helps you study and shape long runs instead of pretending context is free.

Trace-first observability with qita

QitOS assumes that if a run matters, it should be inspectable afterward. Every traced run produces structured artifacts (saved files that record what happened during the run), and qita turns them into:

a board for comparing runs
replay for step-by-step inspection
export for shareable standalone HTML

This trace-first design is why the same kernel works well for experiments, benchmarks, and debugging.

Domain specialization without a new runtime

One of the strongest design features in QitOS is that domain agents are still ordinary QitOS agents — they do not need a separate runtime. Product-style agents, benchmark runners, and domain agents all reuse the same runtime. Full product applications belong in qitos-zoo; domain behavior is expressed through:

state design
prompt policy
tool composition
reduce() semantics

This makes specialization easier to reason about and easier to reproduce.

Status

QitOS is currently Alpha. The stable foundation is the AgentModule + Engine kernel, the qita trace/observability flow, canonical examples, and benchmark adapters. Higher-level convenience APIs, some kit modules, and experimental toolsets are likely to evolve. If you are evaluating QitOS for adoption, start from the kernel and examples rather than assuming all higher-level APIs are frozen.

Where to go next

Quick start

Run your first agent in under 2 minutes

Tutorials

Follow the four-lesson research track for designing agents in the QitOS mindmap

Installation

Install QitOS and its optional extras

Core concepts

Understand AgentModule, Engine, State, and Tools

Agent patterns

ReAct, PlanAct, Tree-of-Thought, and Reflexion examples

Kit reference

Explore pre-built tools, memory, parsers, planners, and history strategies

Tracing & qita

Learn how QitOS makes every run inspectable, replayable, and exportable

​Two ways to author agents in QitOS

​Research-first

​Preset-first

​Who QitOS is for

​Key capabilities

​Reproducible runs

​Built-in observability with qita

​Canonical agent patterns

​Benchmark adapters

​Critics and hooks

​Checkpoint and fork

​MCP integration

​Featured designs

​Single-kernel architecture

​Protocol-aware prompting and parsing

​Preset-first agent authoring

​Long-running context control

​Trace-first observability with qita

​Domain specialization without a new runtime

​Status

​Where to go next

Quick start

Tutorials

Installation

Core concepts

Agent patterns

Kit reference

Tracing & qita

Two ways to author agents in QitOS

Research-first

Preset-first

Who QitOS is for

Key capabilities

Reproducible runs

Built-in observability with qita

Canonical agent patterns

Benchmark adapters

Critics and hooks

Checkpoint and fork

MCP integration

Featured designs

Single-kernel architecture

Protocol-aware prompting and parsing

Preset-first agent authoring

Long-running context control

Trace-first observability with qita

Domain specialization without a new runtime

Status

Where to go next