Step 1: open the board
For each run, the board shows:
- stop reason
- step count
- event count
- token usage
- parser warnings
- official-run and replay metadata
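When many runs are listed, it can help to tally the same columns offline before picking one to open. The sketch below assumes run summaries are available as plain dictionaries. The field names (`run_id`, `stop_reason`, `steps`, `events`, `tokens`, `parser_warnings`) and the `final` stop reason are hypothetical stand-ins for the board columns, not the QitOS schema.

```python
from collections import Counter

# Hypothetical run summaries. Field names mirror the board columns above
# but are assumptions, not the real QitOS data model.
runs = [
    {"run_id": "a1", "stop_reason": "final", "steps": 7,
     "events": 31, "tokens": 5200, "parser_warnings": 0},
    {"run_id": "b2", "stop_reason": "max_steps", "steps": 20,
     "events": 88, "tokens": 19400, "parser_warnings": 3},
    {"run_id": "c3", "stop_reason": "exception", "steps": 4,
     "events": 12, "tokens": 2100, "parser_warnings": 0},
]


def summarize(runs):
    """Tally stop reasons, total token usage, and runs with parser warnings."""
    return {
        "stop_reasons": Counter(r["stop_reason"] for r in runs),
        "total_tokens": sum(r["tokens"] for r in runs),
        "runs_with_parser_warnings": [
            r["run_id"] for r in runs if r["parser_warnings"] > 0
        ],
    }


summary = summarize(runs)
print(summary)
```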
Step 2: open one failed run
Pick a run with stop_reason=max_steps, exception, or obvious parser trouble.
Then open the run and check its metadata:
- official run
- replay mode
- git SHA
- package
- seed
- prompt protocol
- parser
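The selection rule in this step can be sketched as a small filter. This is a minimal illustration, assuming run summaries are plain dictionaries. The `stop_reason`, `parser_warnings`, and `steps` keys and the helper names are hypothetical, and treating `exception` as a stop reason value is also an assumption.

```python
# Stop reasons treated as failures in this sketch (assumed values).
FAILURE_STOP_REASONS = {"max_steps", "exception"}


def needs_inspection(run):
    """True if the run stopped abnormally or reported parser warnings."""
    return (
        run.get("stop_reason") in FAILURE_STOP_REASONS
        or run.get("parser_warnings", 0) > 0
    )


def pick_failed_run(runs):
    """Return the most suspicious run, or None if every run looks clean.

    Runs with more parser warnings come first; ties go to the longer run.
    """
    candidates = [r for r in runs if needs_inspection(r)]
    candidates.sort(
        key=lambda r: (r.get("parser_warnings", 0), r.get("steps", 0)),
        reverse=True,
    )
    return candidates[0] if candidates else None


# Hypothetical data for illustration only.
example_runs = [
    {"run_id": "a1", "stop_reason": "final", "steps": 7, "parser_warnings": 0},
    {"run_id": "b2", "stop_reason": "max_steps", "steps": 20, "parser_warnings": 3},
    {"run_id": "c3", "stop_reason": "exception", "steps": 4, "parser_warnings": 0},
]
picked = pick_failed_run(example_runs)
print(picked["run_id"] if picked else "no failed runs")
```

The ordering is only one reasonable choice; sorting by first failure step or by token usage would work equally well.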
Step 3: inspect parser and context telemetry
In the run page, look for:
- parser diagnostics
- context occupancy timeline
- compaction markers
- model response summaries
Use these to decide whether the failure came from:
- a protocol mismatch
- poor tool choice
- context saturation
- benchmark setup failure
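One way to make this triage repeatable is to encode it as a few explicit checks. The sketch below is a rough heuristic, not how QitOS classifies failures. Every field name (`setup_error`, `parser_errors`, `context_occupancy`, `repeated_tool_calls`) and both thresholds are assumptions chosen for illustration.

```python
# Assumed thresholds; tune them against your own runs.
SATURATION_THRESHOLD = 0.9   # fraction of the context window in use
REPEATED_CALL_THRESHOLD = 3  # identical tool calls in a row


def triage(run):
    """Map hypothetical telemetry fields to the likely failure causes."""
    causes = []
    if run.get("setup_error"):
        causes.append("benchmark setup failure")
    if run.get("parser_errors", 0) > 0:
        causes.append("protocol mismatch")
    occupancy = run.get("context_occupancy", [])
    if occupancy and max(occupancy) >= SATURATION_THRESHOLD:
        causes.append("context saturation")
    if run.get("repeated_tool_calls", 0) >= REPEATED_CALL_THRESHOLD:
        causes.append("poor tool choice")
    return causes or ["unclassified"]


# Hypothetical telemetry for a single failed run.
telemetry = {
    "setup_error": None,
    "parser_errors": 2,
    "context_occupancy": [0.21, 0.48, 0.93],
    "repeated_tool_calls": 1,
}
likely_causes = triage(telemetry)
print(likely_causes)
```

A run can match more than one cause, so the function returns a list rather than a single label.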
Step 4: compare two runs
Use the board compare controls or open the diff route directly.
The comparison covers:
- stop reason
- final result
- step count
- event count
- token usage
- latency
- cost
- parser diagnostics
- first failure step
- run config diff
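The same comparison can be approximated offline when two runs are available as plain data. The sketch below assumes hypothetical dictionaries for run summaries, configs, and events. None of the field names come from QitOS, and the board's own diff may compute these values differently.

```python
def diff_fields(a, b, fields):
    """Return {field: (a_value, b_value)} for every field that differs."""
    return {f: (a.get(f), b.get(f)) for f in fields if a.get(f) != b.get(f)}


def diff_config(a_cfg, b_cfg):
    """Diff two flat config dicts, including keys present on one side only."""
    keys = sorted(set(a_cfg) | set(b_cfg))
    return {k: (a_cfg.get(k), b_cfg.get(k)) for k in keys if a_cfg.get(k) != b_cfg.get(k)}


def first_failure_step(events):
    """Return the step of the first event flagged as an error, else None."""
    for event in events:
        if event.get("error"):
            return event["step"]
    return None


# Hypothetical runs for illustration only.
run_a = {"stop_reason": "final", "steps": 7, "tokens": 5200,
         "config": {"parser": "json", "seed": 1}}
run_b = {"stop_reason": "max_steps", "steps": 20, "tokens": 19400,
         "config": {"parser": "xml", "seed": 1}}
events_b = [{"step": 1}, {"step": 2, "error": "parse failure"}, {"step": 3}]

metric_diff = diff_fields(run_a, run_b, ["stop_reason", "steps", "tokens"])
config_diff = diff_config(run_a["config"], run_b["config"])
failure_step = first_failure_step(events_b)
print(metric_diff, config_diff, failure_step)
```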
Step 5: export what matters
When you need to share a failure with a collaborator:
Best-effort replay reminder
Replay in QitOS is currently best effort. It is strong enough for:
- research debugging
- benchmark review
- prompt/parser regression analysis
- artifact sharing
