An official QitOS run is not just a run that produced a trace. It is a run with enough structure to be compared, replayed, exported, and discussed as a research artifact.
Minimum contract
A run counts as an official QitOS run when its trace manifest includes:
- a RunSpec
- an ExperimentSpec for benchmark work
- a standard manifest.json, events.jsonl, and steps.jsonl
- replay and export compatibility with qita
- a normalized benchmark result row when the run comes from qit bench run or a benchmark example wrapper
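The file-level part of this contract can be checked mechanically. The following is a minimal sketch, not the QitOS implementation: the three file names come from the list above, while the function name and the manifest keys run_spec and experiment_spec are hypothetical stand-ins for wherever the RunSpec and ExperimentSpec are actually stored.

```python
import json
import tempfile
from pathlib import Path

# File names are taken from the contract above.
REQUIRED_FILES = ("manifest.json", "events.jsonl", "steps.jsonl")


def missing_contract_items(trace_dir, benchmark=False):
    """Return the contract items that a trace directory lacks."""
    trace_dir = Path(trace_dir)
    missing = [name for name in REQUIRED_FILES if not (trace_dir / name).is_file()]

    manifest_path = trace_dir / "manifest.json"
    if manifest_path.is_file():
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        # ASSUMPTION: "run_spec" and "experiment_spec" are hypothetical key names.
        if "run_spec" not in manifest:
            missing.append("run_spec")
        # An ExperimentSpec is only required for benchmark work.
        if benchmark and "experiment_spec" not in manifest:
            missing.append("experiment_spec")
    return missing


# Usage: build a throwaway trace directory that lacks steps.jsonl.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "manifest.json").write_text(json.dumps({"run_spec": {"seed": 7}}))
    (root / "events.jsonl").write_text("")
    demo_plain = missing_contract_items(root)
    demo_bench = missing_contract_items(root, benchmark=True)
```

Returning a list of missing items, rather than a boolean, keeps the check useful as a diagnostic: an empty list means the file-level contract is met under these assumptions.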
Why this matters
Without that contract, two runs may both finish, but you still cannot answer the important questions:
- were they using the same parser contract?
- were they using the same tool surface?
- was the benchmark split the same?
- can I replay the failure later?
- can I diff the run config instead of guessing?
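Once both runs carry a structured config, the questions above reduce to a dictionary diff. The sketch below illustrates the idea with plain Python. The config layout, the key names, and the parser names are made up for the example and are not taken from the QitOS manifest schema.

```python
def flatten(config, prefix=""):
    """Flatten nested dicts into {"a.b.c": value} form."""
    flat = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_configs(left, right):
    """Return {path: (left_value, right_value)} for every differing path.

    A path missing on one side is reported with None on that side, so a
    missing key and an explicit None are not distinguished here.
    """
    a, b = flatten(left), flatten(right)
    return {
        path: (a.get(path), b.get(path))
        for path in sorted(a.keys() | b.keys())
        if a.get(path) != b.get(path)
    }


# Hypothetical run configs: same seed and split, different parser.
run_a = {"parser_name": "react_text", "benchmark": {"split": "dev"}, "seed": 7}
run_b = {"parser_name": "json_action", "benchmark": {"split": "dev"}, "seed": 7}
changed = diff_configs(run_a, run_b)
```

Here changed contains only the parser_name entry, which answers "were they using the same parser contract?" without reading either trace by hand.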
Best-effort replay
QitOS currently provides research-grade best-effort replay, not strict byte-for-byte determinism: replay captures enough state to inspect and compare runs, but it does not guarantee identical re-execution. QitOS records enough information to make replay and comparison useful:
- seed
- git_sha
- package_version
- prompt_protocol
- parser_name
- tool_manifest
- environment summary
- step/event traces
That level of replay is useful:
- for debugging and inspection
- for prompt/parser/tool regressions
- for benchmark comparison
- for sharing runs with collaborators
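To make the recorded fields concrete, here is a rough sketch of how such a record could be captured at the start of a run using only the Python standard library. It is an illustration, not how QitOS does it: the function names, the distribution name "qitos", and the example protocol, parser, and tool names are all assumptions, and the environment summary is reduced to two fields.

```python
import importlib.metadata
import platform
import random
import subprocess


def current_git_sha():
    """Return the current commit hash, or None outside a git checkout."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


def installed_version(package):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return None


def capture_replay_metadata(seed, prompt_protocol, parser_name, tool_names,
                            package="qitos"):  # ASSUMPTION: distribution name
    # Seeding only fixes Python's own RNG; model sampling and external tools
    # stay outside its control, which is why replay is best-effort.
    random.seed(seed)
    return {
        "seed": seed,
        "git_sha": current_git_sha(),
        "package_version": installed_version(package),
        "prompt_protocol": prompt_protocol,
        "parser_name": parser_name,
        "tool_manifest": sorted(tool_names),
        "environment": {
            "python": platform.python_version(),
            "platform": platform.platform(),
        },
    }


# Hypothetical protocol, parser, and tool names.
record = capture_replay_metadata(7, "react_v1", "react_text", ["shell", "read_file"])
```

Fields that cannot be determined are stored as None rather than raising, so a run outside a git checkout still produces a record that can be compared later.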
Where you see this in practice
Open qita board and qita replay on a trace directory:
Canonical path
For benchmark work, the canonical path is:
The examples under examples/benchmarks/ remain available, but they are now thin wrappers around the same official runner contract.
