# QitOS

## Docs

- [CyBench](https://qitor.mintlify.app/benchmarks/cybench.md): Run CTF-style security evaluation tasks with CyBenchAdapter, Docker isolation, and guided subtask scoring.
- [Desktop Starter Benchmark](https://qitor.mintlify.app/benchmarks/desktop-starter.md): The official v0.5 desktop benchmark path for OSWorld-compatible starter tasks.
- [GAIA](https://qitor.mintlify.app/benchmarks/gaia.md): Run the GAIA general AI assistant benchmark with QitOS using the GaiaAdapter and a ReAct web research agent.
- [OSWorld Benchmark Adapter](https://qitor.mintlify.app/benchmarks/osworld.md): The benchmark-specific OSWorld adapter path in QitOS, distinct from the desktop starter benchmark and the reusable baseline recipe.
- [Benchmarks](https://qitor.mintlify.app/benchmarks/overview.md): Run desktop-starter, OSWorld, GAIA, Tau-Bench, and CyBench through one official QitOS benchmark path with normalized outputs and trace artifacts.
- [Tau-Bench](https://qitor.mintlify.app/benchmarks/tau-bench.md): Evaluate tool-agent-user interaction on retail and airline customer service tasks with TauBenchAdapter.
- [Why QitOS v0.5 Ships Desktop First](https://qitor.mintlify.app/blog/desktop-osworld-starter.md): Why the v0.5 release focuses on one OSWorld-compatible desktop starter path instead of several incomplete multimodal tracks.
- [Why gold presets matter](https://qitor.mintlify.app/blog/gold-presets-preview.md): QitOS v0.4 moves from one-off provider wiring to reusable family-level harness defaults.
- [QitOS Blog](https://qitor.mintlify.app/blog/index.md): Design notes, field reports, and practical lessons from building agents with QitOS.
- [Why reproducible runs matter in QitOS](https://qitor.mintlify.app/blog/reproducible-runs.md): A short field note on official runs, benchmark output normalization, and best-effort replay.
- [Why QitOS Keeps a Single Kernel](https://qitor.mintlify.app/blog/single-kernel.md): Why examples, benchmarks, and production-style agents in QitOS all run on the same AgentModule + Engine kernel.
- [Agent Module](https://qitor.mintlify.app/concepts/agent-module.md): AgentModule is the policy layer that defines your agent's strategy: how it initializes state, builds prompts, decides what to do, and reduces observations into the next state.
- [Engine](https://qitor.mintlify.app/concepts/engine.md): Engine is the execution kernel that owns the agent loop: it calls your AgentModule hooks in sequence, executes actions, manages tracing, and enforces stop criteria.
- [Family presets](https://qitor.mintlify.app/concepts/family-presets.md): Why QitOS v0.4 introduces family presets, harness policies, and transport adapters.
- [Glossary](https://qitor.mintlify.app/concepts/glossary.md): Shared language for runs, trajectories, actions, artifacts, replay, and benchmark outputs in QitOS.
- [Official Runs](https://qitor.mintlify.app/concepts/official-runs.md): What qualifies as an official QitOS run, what gets recorded, and what best-effort replay really means.
- [State and Task](https://qitor.mintlify.app/concepts/state-and-task.md): StateSchema is the typed container for everything your agent tracks across steps. Task is the structured package that describes what the agent should accomplish.
- [Tools and Registry](https://qitor.mintlify.app/concepts/tools-and-registry.md): QitOS tools are plain Python callables marked with the @tool decorator. ToolRegistry collects them and makes them available to the Engine for execution.
- [Tracing](https://qitor.mintlify.app/concepts/tracing.md): QitOS writes structured trace artifacts for every run so you can replay, diff, export, and audit agent behavior.
- [Add a model-family preset](https://qitor.mintlify.app/guides/add-a-family-preset.md): How to extend QitOS v0.4 with a new family preset without forking the runtime.
- [Agent Patterns](https://qitor.mintlify.app/guides/agent-patterns.md): ReAct, PlanAct, Tree-of-Thought, and Reflexion patterns with working code examples.
- [Benchmarks and Recipes](https://qitor.mintlify.app/guides/benchmarks-and-recipes.md): How QitOS separates framework capabilities, benchmark adapters, and reproducible recipe baselines.
- [First Agent](https://qitor.mintlify.app/guides/build-your-first-agent.md): Build a working QitOS minimal coding agent from scratch: state, model, tools, run, and qita inspection.
- [Computer Use and Desktop Env](https://qitor.mintlify.app/guides/computer-use-desktop.md): Build OSWorld-inspired desktop agents on QitOS without binding yourself to a provider-specific computer-use API.
- [Critics & Stop Criteria](https://qitor.mintlify.app/guides/critics-and-stop-criteria.md): Validate steps with critics and control when the Engine halts with stop criteria.
- [Desktop Benchmark Starter](https://qitor.mintlify.app/guides/desktop-benchmark-starter.md): How the official desktop benchmark path fits the v0.5 QitOS story.
- [Memory & History](https://qitor.mintlify.app/guides/memory-and-history.md): Manage conversation history and long-term memory in QitOS agents.
- [Multimodal Core and Desktop Starter](https://qitor.mintlify.app/guides/multimodal-core.md): How QitOS v0.5 turns the multimodal foundation into one complete desktop research path.
- [Observability](https://qitor.mintlify.app/guides/observability.md): Inspect, replay, and export agent runs with the qita web board.
- [Qwen family best practice](https://qitor.mintlify.app/guides/qwen-family-best-practice.md): How to run Qwen and qwen-plus with the native tool-call lane in QitOS.
- [Third-Party Benchmark Integration](https://qitor.mintlify.app/guides/third-party-benchmark-integration.md): The canonical QitOS contract for adding a new benchmark family without leaking benchmark logic into the framework kernel.
- [Installation](https://qitor.mintlify.app/installation.md): Install QitOS from PyPI or source, with optional extras for models and benchmarks.
- [Introduction](https://qitor.mintlify.app/introduction.md): QitOS is a research-first agent framework for building reproducible LLM agents with a clean AgentModule + Engine kernel.
- [Prerequisites](https://qitor.mintlify.app/prerequisites.md): Get a remote model API, set your API key, and verify your endpoint before running QitOS.
- [Quickstart](https://qitor.mintlify.app/quickstart.md): Run your first QitOS minimal coding agent in under 2 minutes, then inspect it with qita.
- [API Reference](https://qitor.mintlify.app/reference/api.md): Complete reference for the public QitOS Python API: everything exported from the qitos package.
- [CLI Reference](https://qitor.mintlify.app/reference/cli.md): Reference for qita and qit, including the minimal coding-agent demo and the official benchmark CLI added in QitOS v0.3.
- [Configuration](https://qitor.mintlify.app/reference/configuration.md): All configuration options available in QitOS: AgentModule.run() arguments, Engine constructor arguments, tracing controls, and environment variables.
- [Kit Reference](https://qitor.mintlify.app/reference/kit.md): Reference for qitos.kit, covering reusable parsers, memory adapters, tool sets, planning helpers, critics, and models.
- [Model family matrix](https://qitor.mintlify.app/reference/model-family-matrix.md): The built-in QitOS v0.4 gold presets and their default harness policies.
- [Lesson 3: Claude Code-style agent](https://qitor.mintlify.app/tutorials/claude-code.md): Build a long-running coding agent and learn when to move from manual registries to presets, history control, and compaction.
- [Lesson 4: Code security audit agent](https://qitor.mintlify.app/tutorials/code-security-audit.md): Specialize the QitOS kernel into a reproducible defensive audit agent with domain tools, ranked findings, and review-grade traces.
- [Tutorials](https://qitor.mintlify.app/tutorials/index.md): A self-contained course for researchers designing agents in the QitOS mindmap.
- [Inspect a GUI Failure in qita](https://qitor.mintlify.app/tutorials/inspect-a-gui-failure-in-qita.md): Use qita's visual timeline and replay preview to understand a failed desktop run.
- [Lesson 2: PlanAct](https://qitor.mintlify.app/tutorials/planact.md): Add explicit planning to the ReAct kernel and learn how QitOS separates planner control from executor control.
- [Lesson 1: ReAct](https://qitor.mintlify.app/tutorials/react.md): Build the first real QitOS agent and learn the full prompt-parser-tool-reduce loop from end to end.
- [Tutorial: Replay and Inspect a Failed Run](https://qitor.mintlify.app/tutorials/replay-and-inspect-failed-runs.md): Use qita board, replay, export, and diff to understand why an official QitOS run behaved differently.
- [Tutorial: Reproducible Benchmark Runs](https://qitor.mintlify.app/tutorials/reproducible-benchmark-runs.md): How to run GAIA, Tau-Bench, or CyBench through the official QitOS benchmark path and inspect the resulting artifacts.
- [Run Your First Desktop Benchmark](https://qitor.mintlify.app/tutorials/run-your-first-desktop-benchmark.md): Run the official v0.5 desktop benchmark and inspect its trace.
- [Switch model families in one example](https://qitor.mintlify.app/tutorials/switch-model-families.md): Use the same Claude Code-style agent across Qwen, Kimi, MiniMax, gpt-oss, and Gemma 4.