- qita for trace inspection
- qit for demos, benchmarks, and developer workflows
qit demo
Use qit demo when you want the fastest path to a real model-backed QitOS run.
qit demo minimal
- reads your OpenAI-compatible model config from env vars or flags
- seeds a tiny buggy workspace
- runs the minimal coding agent on that workspace
- writes a qita-ready trace under ./runs

qit demo minimal \
  --workspace ./playground/minimal_coding_agent \
  --logdir ./runs \
  --model-name Qwen/Qwen3-8B \
  --base-url https://api.siliconflow.cn/v1/ \
  --api-key sk-... \
  --task "Fix the bug in buggy_module.py and make the verification command pass." \
  --max-steps 8 \
  --render
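The invocation above fails late if no model credentials are available, so a small preflight wrapper can help. The following is a minimal sketch, not part of QitOS: the function name run_demo and the QIT_DRY_RUN variable are hypothetical, and it assumes the key is supplied through OPENAI_API_KEY as in the recommended workflow. Only flags that appear in this section are used.

```shell
#!/bin/sh
# Hypothetical helper: refuse to start the demo when no API key is available.
# QIT_DRY_RUN is a variable of this sketch only, not a qit feature.
run_demo() {
    if [ -z "${OPENAI_API_KEY:-}" ]; then
        echo "OPENAI_API_KEY is not set; export it or pass --api-key" >&2
        return 1
    fi
    # Build the command line from flags documented in this section.
    set -- qit demo minimal \
        --workspace ./playground/minimal_coding_agent \
        --logdir ./runs \
        --max-steps 8
    if [ -n "${QIT_DRY_RUN:-}" ]; then
        echo "$*"    # print the command instead of running it
    else
        "$@"
    fi
}
```

Calling run_demo with QIT_DRY_RUN=1 prints the command that would run, which is a cheap way to check the flags before spending model tokens.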
qita
Use qita when you want to inspect traced runs.
qita board
- run list and filtering
- compare pickers
- run detail links
- replay links
- raw and HTML export
qita replay
qita export
qit bench
qit bench is the canonical benchmark CLI in v0.3.
qit bench run
- loads benchmark tasks
- constructs RunSpec and ExperimentSpec
- produces normalized BenchmarkRunResult rows
- desktop-starter for the canonical starter benchmark family
- osworld for the benchmark-specific OSWorld adapter path
- gaia, tau-bench, and cybench for the migrated benchmark families now living under qitos.benchmark.*
- desktop as a compatibility alias for desktop-starter
- qitos.benchmark.* for benchmark adapters and evaluators
- qitos.recipes.* for canonical baseline methods
- examples/* as thin entrypoints only
qit bench eval
qit bench replay
qit bench export
qit skill
Recommended workflow
For a first run:

  export OPENAI_API_KEY=...
  qit demo minimal
  qita board

  qit bench run
  qit bench eval
  qita board
  qit bench replay
  qit bench export
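
The benchmark sequence above can be wrapped so that it stops at the first failing step. This is only a sketch under stated assumptions: bench_loop and QIT_DRY_RUN are hypothetical names, the commands are shown without arguments because this section does not list any, and real invocations will likely need benchmark-specific options. It also assumes each command returns when done; if qita board stays in the foreground, run that step separately.

```shell
#!/bin/sh
# Hypothetical helper: run the benchmark loop in order, stopping on failure.
# QIT_DRY_RUN is a variable of this sketch only; set it to print the steps.
bench_loop() {
    for step in "qit bench run" "qit bench eval" "qita board" \
                "qit bench replay" "qit bench export"; do
        if [ -n "${QIT_DRY_RUN:-}" ]; then
            echo "$step"
        else
            # Word splitting is intentional: each step is a plain command line.
            $step || { echo "step failed: $step" >&2; return 1; }
        fi
    done
}
```

With QIT_DRY_RUN=1 the function only lists the five steps in order, which makes the intended sequence easy to review before running it for real.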
