Tracing is the persistence layer behind QitOS observability. Every traced run writes a self-contained directory of structured artifacts that capture what happened:
<trace_logdir>/<run_id>/
  manifest.json
  events.jsonl
  steps.jsonl

What each file means

| File | Purpose |
| --- | --- |
| manifest.json | Run summary, reproducibility metadata, benchmark metadata, and official-run fields |
| events.jsonl | Event stream across runtime phases |
| steps.jsonl | One structured record per completed step |
The raw files are the source of truth. qita is the human inspection surface built on top of them.
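
Because the raw files are plain JSON and JSON Lines, they can be read with the Python standard library alone. The following is a minimal sketch that assumes only the directory layout shown above. The helper names `read_jsonl` and `load_trace` are hypothetical and not part of QitOS, and nothing is assumed about the fields inside each record.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Return one parsed JSON object per non-empty line of a JSON Lines file."""
    records = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # tolerate blank lines, e.g. a trailing newline
                records.append(json.loads(line))
    return records


def load_trace(run_dir):
    """Load the three artifacts of one traced run (hypothetical helper, not QitOS API)."""
    run_dir = Path(run_dir)
    manifest = json.loads((run_dir / "manifest.json").read_text(encoding="utf-8"))
    return {
        "manifest": manifest,
        "events": read_jsonl(run_dir / "events.jsonl"),
        "steps": read_jsonl(run_dir / "steps.jsonl"),
    }
```

With a run directory such as `./runs/<run_id>`, `len(load_trace(run_dir)["steps"])` gives the number of completed steps, since steps.jsonl holds one record per completed step.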

Why tracing is a first-class feature

QitOS is built for agent research, not only one-off demos, so the framework must preserve:
  • how a run stopped
  • what prompt/parser contract it used
  • what tool surface it saw
  • how context changed over time
  • which config fields matter for replay and comparison
For these reasons, tracing is enabled by default in AgentModule.run(...).

Trace metadata in v0.3

The v0.3 closure adds stronger reproducibility metadata to the manifest, including:
  • git_sha
  • package_version
  • benchmark_name
  • benchmark_split
  • model_family
  • prompt_protocol
  • parser_name
  • tool_manifest
  • run_spec
  • experiment_spec
  • official_run
  • replay_mode
  • token / latency / cost summaries
These fields support qita comparison and benchmark result normalization.
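
As an illustration of how these fields can drive a comparison, the sketch below reports which reproducibility fields differ between two manifests. It is not how qita implements comparison. The field names come from the list above, but the sketch assumes they are top-level keys of manifest.json, which this page does not confirm. `diff_manifests` and `load_manifest` are hypothetical helper names.

```python
import json
from pathlib import Path

# Field names are taken from the list above. Treating them as top-level keys
# of manifest.json is an assumption of this sketch.
COMPARE_FIELDS = [
    "git_sha", "package_version", "benchmark_name", "benchmark_split",
    "model_family", "prompt_protocol", "parser_name", "tool_manifest",
    "run_spec", "experiment_spec", "official_run", "replay_mode",
]


def load_manifest(run_dir):
    """Read manifest.json from a run directory."""
    return json.loads((Path(run_dir) / "manifest.json").read_text(encoding="utf-8"))


def diff_manifests(manifest_a, manifest_b, fields=COMPARE_FIELDS):
    """Return {field: (value_a, value_b)} for every field whose values differ."""
    differences = {}
    for field in fields:
        value_a = manifest_a.get(field)  # a missing key compares as None
        value_b = manifest_b.get(field)
        if value_a != value_b:
            differences[field] = (value_a, value_b)
    return differences
```

An empty result means the two runs agree on every listed field, which is a reasonable precondition before comparing their scores.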

Best-effort replay

Tracing in QitOS supports best-effort research replay: QitOS records enough information to inspect and compare runs, but it does not promise strict deterministic re-execution against remote models or external environments.

Use traces for:
  • debugging long trajectories
  • comparing prompt/parser/tool changes
  • exporting artifacts for review
  • replaying benchmark failures
Do not rely on traces for exact token-level reproducibility from remote providers.

qita on top of traces

Once traces exist, use:
qita board --logdir ./runs
qita replay --run ./runs/<run_id>
qita export --run ./runs/<run_id> --html ./report.html
qita also supports run comparison so you can ask why two runs diverged instead of reading raw JSON by hand.
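
For intuition about what a divergence check involves, here is a minimal sketch that works directly on two steps.jsonl files. It is an illustration only and not qita's comparison logic. `read_steps` and `first_divergence` are hypothetical helpers, and the sketch makes no assumption about the fields in a step record beyond each line being a JSON object.

```python
import json
from pathlib import Path


def read_steps(run_dir):
    """Parse steps.jsonl from a run directory into a list of dicts."""
    steps = []
    with (Path(run_dir) / "steps.jsonl").open(encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                steps.append(json.loads(line))
    return steps


def first_divergence(steps_a, steps_b, key=lambda step: step):
    """Return the index of the first step where the runs differ, else None.

    `key` projects each record onto the parts worth comparing, so volatile
    values such as timings can be ignored.
    """
    for index, (step_a, step_b) in enumerate(zip(steps_a, steps_b)):
        if key(step_a) != key(step_b):
            return index
    if len(steps_a) != len(steps_b):
        # One run is a strict prefix of the other: they diverge where it ends.
        return min(len(steps_a), len(steps_b))
    return None
```

Comparing whole records will usually report a difference at index 0 if records carry per-run values such as timestamps, so in practice a `key` function that selects only the stable fields is likely needed.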