WandbTraceProcessor implements the TraceProcessor ABC and streams QitOS run data to a Weights & Biases project. Once attached, it automatically logs per-span metrics during the run and writes a final summary when the trace ends.

Installation

pip install "qitos[wandb]"
This installs the wandb SDK as an optional dependency. Without it, importing WandbTraceProcessor raises an ImportError.
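Because the integration is optional, code that must also run without the extra installed can guard the import. The following is a minimal sketch of that pattern; the helper name build_wandb_processor and the HAVE_WANDB_PROCESSOR flag are hypothetical and not part of QitOS.

```python
# Guarded import: degrade gracefully when the wandb extra is not installed.
try:
    from qitos.tracing.wandb_processor import WandbTraceProcessor
    HAVE_WANDB_PROCESSOR = True
except ImportError:  # qitos or the wandb SDK is missing
    WandbTraceProcessor = None
    HAVE_WANDB_PROCESSOR = False


def build_wandb_processor(**kwargs):
    """Hypothetical helper: return a processor, or None if W&B support is unavailable."""
    if not HAVE_WANDB_PROCESSOR:
        return None
    return WandbTraceProcessor(**kwargs)


processor = build_wandb_processor(project="my-qitos-runs")
if processor is None:
    print("W&B logging disabled: install qitos[wandb] to enable it")
```

The caller can then skip add_trace_processor when the helper returns None, so the rest of the script runs unchanged.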

Quick start

from qitos.tracing import add_trace_processor
from qitos.tracing.wandb_processor import WandbTraceProcessor

processor = WandbTraceProcessor(
    project="my-qitos-runs",
    name="gaia-eval-001",
    tags=["benchmark", "gaia"],
    config={"model": "gpt-4o", "max_steps": 15},
)
add_trace_processor(processor)

# `agent` is an existing QitOS agent instance.
result = agent.run(task="...", return_state=True)
When the run starts, WandbTraceProcessor calls wandb.init() with the provided arguments. When the trace ends, whether normally or on error, it writes the final summary and, unless auto_finish=False, calls wandb.finish().

Constructor parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| project | str | "qitos" | W&B project name passed to wandb.init |
| name | str \| None | None | W&B run name. Falls back to the QitOS trace name |
| config | dict \| None | None | Dictionary passed as config to wandb.init |
| tags | list[str] \| None | None | Tags for the W&B run |
| entity | str \| None | None | W&B entity (user or team) |
| auto_finish | bool | True | Whether to call wandb.finish() when the trace ends |

What gets logged

Per-span metrics

The processor intercepts span-end events and logs metrics incrementally during the run.

| Span type | Metrics logged |
| --- | --- |
| GenerationSpanData | generation/prompt_tokens, generation/completion_tokens, generation/total_tokens, generation/model |
| StepSpanData | step/number |
| CriticSpanData | critic/score, critic/name |
| ToolSpanData | tool/name |
| ActSpanData | action/name |

Each wandb.log() call increments an internal step counter so that the W&B time-series charts align with the agent’s progression through the run.
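The step counter can be pictured with a toy stand-in. This sketch does not use wandb or QitOS; the StepLogger class is hypothetical, the starting value of 0 is an assumption, and the metric values and tool name are made up for illustration.

```python
class StepLogger:
    """Toy stand-in for a metrics sink with an explicit, increasing step counter."""

    def __init__(self):
        self.step = 0      # assumed starting value
        self.records = []  # list of (step, metrics) tuples

    def log(self, metrics):
        # Record the metrics at the current step, then advance the counter.
        self.records.append((self.step, dict(metrics)))
        self.step += 1


logger = StepLogger()
logger.log({"step/number": 1})
logger.log({
    "generation/prompt_tokens": 812,
    "generation/completion_tokens": 64,
    "generation/total_tokens": 876,
})
logger.log({"tool/name": "web_search"})  # hypothetical tool name

for step, metrics in logger.records:
    print(step, metrics)
```

Because every logged record gets its own step, metrics from different span types end up ordered on one shared axis.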

Final summary

When the trace ends, the processor writes aggregate metrics to run.summary:

| Summary key | Description |
| --- | --- |
| total_tokens | Cumulative prompt + completion tokens across all generation spans |
| total_steps | Number of step spans processed |
| total_tool_calls | Count of tool and action spans |
| critic/avg_score | Mean of all critic scores (only if at least one critic score was logged) |
| critic/min_score | Minimum critic score |
| critic/max_score | Maximum critic score |
| stop_reason | The run’s stop reason, extracted from trace metadata |
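The aggregation described in the table can be sketched in a few lines. This is an illustration, not the QitOS source: the summarize function is hypothetical, the stop-reason strings are invented, and omitting the min and max keys when no critic score exists is an assumption (the table only states this for the average).

```python
from statistics import mean


def summarize(prompt_tokens, completion_tokens, steps, tool_calls,
              critic_scores, stop_reason):
    """Build a summary dict shaped like the table above (illustrative only)."""
    summary = {
        "total_tokens": sum(prompt_tokens) + sum(completion_tokens),
        "total_steps": steps,
        "total_tool_calls": tool_calls,
        "stop_reason": stop_reason,
    }
    # Critic aggregates are only written when at least one score exists.
    if critic_scores:
        summary["critic/avg_score"] = mean(critic_scores)
        summary["critic/min_score"] = min(critic_scores)
        summary["critic/max_score"] = max(critic_scores)
    return summary


with_critic = summarize([100, 200], [10, 30], steps=2, tool_calls=3,
                        critic_scores=[0.5, 1.0], stop_reason="final")
without_critic = summarize([100], [10], steps=1, tool_calls=0,
                           critic_scores=[], stop_reason="max_steps")
print(with_critic)
print(without_critic)
```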

Combining with other processors

add_trace_processor appends to the global processor list, so you can combine WandbTraceProcessor with any other TraceProcessor (for example, the default LegacyTraceWriterProcessor that writes to disk):
from qitos.tracing import add_trace_processor
from qitos.tracing.wandb_processor import WandbTraceProcessor

wandb_processor = WandbTraceProcessor(
    project="my-qitos-runs",
    config={"model": "gpt-4o"},
)
add_trace_processor(wandb_processor)

# The default file-based trace writer is still active.
result = agent.run(task="...", return_state=True)
To replace all processors (removing the default writer), use set_trace_processors:
from qitos.tracing import set_trace_processors

set_trace_processors([wandb_processor])
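The difference between appending and replacing can be shown with a toy registry. This is a simplified model, not the QitOS internals; the function names and the string placeholders standing in for processor objects are hypothetical.

```python
_processors = []  # toy stand-in for a global processor list


def add_processor(processor):
    """Append, keeping whatever is already registered."""
    _processors.append(processor)


def set_processors(processors):
    """Replace the whole list with the given processors."""
    _processors[:] = list(processors)


add_processor("default-file-writer")
add_processor("wandb")
after_add = list(_processors)

set_processors(["wandb"])
after_set = list(_processors)

print(after_add)
print(after_set)
```

After the append, both entries are registered; after the replace, only the W&B entry remains, so nothing is written to disk any more.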

Using with presets for config

Family presets provide recommended model parameters. Use them to populate the W&B config dictionary so that your W&B dashboard reflects the same settings the agent used:
from qitos.harness import resolve_family_preset
from qitos.tracing import add_trace_processor
from qitos.tracing.wandb_processor import WandbTraceProcessor

preset = resolve_family_preset("qwen")

processor = WandbTraceProcessor(
    project="qwen-experiments",
    config={
        "model": preset.model_id,
        "max_steps": preset.recommended_max_steps,
        "max_tokens": preset.recommended_max_tokens,
    },
    tags=[preset.family],
)
add_trace_processor(processor)

result = agent.run(task="...", return_state=True)

Lifecycle control

auto_finish

By default, auto_finish=True and the processor calls wandb.finish() automatically when on_trace_end fires. Set auto_finish=False if you want to continue logging custom metrics to the same W&B run after the QitOS trace ends:
import wandb
from qitos.tracing import add_trace_processor
from qitos.tracing.wandb_processor import WandbTraceProcessor

processor = WandbTraceProcessor(
    project="my-qitos-runs",
    auto_finish=False,
)
add_trace_processor(processor)

result = agent.run(task="...", return_state=True)

# Log additional custom metrics to the same W&B run
wandb.log({"custom/accuracy": 0.92})

wandb.finish()

shutdown()

Call shutdown() to close the W&B run early (for example, on SIGTERM or in a notebook cleanup step):
processor.shutdown()
This calls wandb.finish() if a run is active and auto_finish is True. It is safe to call multiple times.
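For cleanup steps, one option is to wrap the run in a context manager so that shutdown() is called even if the run raises. The sketch below assumes only that the processor exposes shutdown(); the closing_processor helper and the FakeProcessor stand-in are hypothetical and exist to keep the example runnable without QitOS or wandb.

```python
from contextlib import contextmanager


@contextmanager
def closing_processor(processor):
    """Hypothetical helper: always call processor.shutdown() on exit."""
    try:
        yield processor
    finally:
        processor.shutdown()


class FakeProcessor:
    """Stand-in with an idempotent shutdown(), used only for this sketch."""

    def __init__(self):
        self.active = True
        self.finish_calls = 0

    def shutdown(self):
        if self.active:  # only the first call does any work
            self.active = False
            self.finish_calls += 1


fake = FakeProcessor()
with closing_processor(fake):
    pass  # agent.run(...) would go here

fake.shutdown()  # a second call is a no-op
print(fake.finish_calls)
```

With the real processor, the body of the with block would contain the agent.run call.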

force_flush()

Call force_flush() to ensure all buffered metrics are written to the W&B backend:
processor.force_flush()
This logs an empty record at the current step counter, which triggers a flush of the W&B internal buffer.