qitos.kit provides concrete, reusable building blocks for common agent authoring patterns. All components attach to the AgentModule + Engine pipeline and do not introduce a second orchestrator.
from qitos.kit import ReActTextParser, WindowMemory, CodingToolSet, ...

How to read this page

Use this page as a capability map for QitOS authoring.
| If you are deciding… | Start here |
| --- | --- |
| How the model should emit actions | Parsers |
| How to keep useful context in long runs | Memory and history/compaction |
| Which tools to expose | Tool sets |
| Which ready-made registries to start from | Preset builders |
| How to add planning or search structure | Planning |
| How to start screenshot-first multimodal work | ScreenshotEnv and the visual guide |
| How to build desktop / computer-use agents | DesktopEnv, ComputerUseToolSet, and the desktop guide |

Capability map

Parsers

  • ReActTextParser for Thought: / Action: protocols
  • JsonDecisionParser for JSON decision objects
  • XmlDecisionParser for XML protocols
  • MiniMaxToolCallParser when the model returns function-call-like structures
  • TerminusJsonParser and TerminusXmlParser for explicit termination-aware formats

Tool presets

  • coding_tools(...) for the canonical coding workspace
  • computer_use_tools() for provider-neutral desktop / GUI action workflows
  • advanced_coding_tools(...) for a Claude-style coding preset
  • web_tools() for web research and extraction
  • task_tools(...) for persistent task-board workflows
  • security_audit_tools(...) for defensive repository review
  • thinking_tools() for explicit thought-recording flows
  • notebook_tools(...), report_tools(...), and epub_tools(...) for narrower scenarios

Environments

  • ScreenshotEnv for screenshot-first multimodal and GUI-adjacent workflows
  • DesktopEnv for OSWorld-inspired desktop and computer-use loops
  • TextWebEnv for text-browser-style web observation
  • TmuxEnv for interactive terminal workflows
ScreenshotEnv is the first built-in multimodal environment. Use it when you want to test screenshot-based reasoning, visual-web prompting, or the new qita visual-asset path without committing to a full benchmark adapter. DesktopEnv builds on the same multimodal core but adds an OSWorld-style desktop lane: screenshot + accessibility + terminal observation, GUI controller ops, and container-first provider boundaries.

Long-running context blocks

  • WindowHistory for simple recency windows
  • TokenBudgetSummaryHistory for token-budget summarization
  • CompactHistory for multi-stage compaction with warning and summary events
  • WindowMemory, SummaryMemory, VectorMemory, and MarkdownFileMemory for cross-step recall

Planning

  • NumberedPlanBuilder for explicit plans
  • PlanCursor for plan execution bookkeeping
  • DynamicTreeSearch for branch selection and search-driven agents
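
To make the plan-related entries above concrete, here is a minimal, standalone sketch of the bookkeeping they describe: turning a numbered plan into steps and tracking which step is current. It does not use or reproduce NumberedPlanBuilder or PlanCursor; the names `parse_numbered_plan` and `SimpleCursor` are hypothetical and exist only for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Matches lines such as "1. Do something" or "2) Do something else".
STEP_RE = re.compile(r"^\s*(\d+)[.)]\s+(.*\S)\s*$")


def parse_numbered_plan(text: str) -> List[str]:
    """Collect the step text from every numbered line, ignoring other lines."""
    steps = []
    for line in text.splitlines():
        match = STEP_RE.match(line)
        if match:
            steps.append(match.group(2))
    return steps


@dataclass
class SimpleCursor:
    """Hypothetical stand-in for plan execution bookkeeping (not qitos.kit's PlanCursor)."""

    steps: List[str]
    index: int = 0

    def current(self) -> Optional[str]:
        # None signals that every step has been consumed.
        return self.steps[self.index] if self.index < len(self.steps) else None

    def advance(self) -> None:
        # Move to the next step, never past the end of the plan.
        if self.index < len(self.steps):
            self.index += 1

    @property
    def done(self) -> bool:
        return self.index >= len(self.steps)


plan = parse_numbered_plan("Plan:\n1. Read the failing test\n2) Patch the bug\n3. Re-run tests")
cursor = SimpleCursor(plan)
```

An agent loop would typically call `current()` when building the next prompt and `advance()` once a step's action succeeds.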

Parsers convert raw LLM output strings into typed Decision objects. Choose the parser that matches the output format your prompt requests.
Prompt format and parser must match exactly. If your prompt asks for Thought: / Action: blocks, use ReActTextParser. If it asks for JSON, use JsonDecisionParser. If it asks for XML, use XmlDecisionParser.
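
One way to keep the prompt format and the parser from drifting apart is to derive the parser choice from a single format setting. The lookup below is a hypothetical helper, not part of qitos.kit; it only maps a format label (the labels "react", "json", and "xml" are assumptions) to the name of the parser class documented on this page.

```python
# Hypothetical helper: the format labels are illustrative, the class names
# are the parsers described on this page.
PARSER_FOR_FORMAT = {
    "react": "ReActTextParser",
    "json": "JsonDecisionParser",
    "xml": "XmlDecisionParser",
}


def parser_name_for(prompt_format: str) -> str:
    """Return the parser class name for a prompt format, or fail loudly."""
    key = prompt_format.strip().lower()
    if key not in PARSER_FOR_FORMAT:
        raise ValueError(f"no parser registered for prompt format {prompt_format!r}")
    return PARSER_FOR_FORMAT[key]
```

Failing on an unknown format at configuration time is usually cheaper than debugging unparsed model output later in a run.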

ReActTextParser

Parses ReAct-style text output with labeled blocks such as Thought: and Action:.
from qitos.kit import ReActTextParser

class ReActTextParser(BaseParser[dict[str, Any]])
Constructor
def __init__(
    self,
    *,
    thought_keys: Optional[Sequence[str]] = None,
    reflection_keys: Optional[Sequence[str]] = None,
    action_keys: Optional[Sequence[str]] = None,
    final_keys: Optional[Sequence[str]] = None,
)
| Parameter | Default recognized keys | Description |
| --- | --- | --- |
| thought_keys | thought, thinking, think, rationale | Keys for the reasoning block |
| reflection_keys | reflection, reflect, selfreflection | Keys for self-reflection blocks |
| action_keys | action, tool, call | Keys for the action block |
| final_keys | finalanswer, final, answer | Keys for the final answer block |
Usage
from qitos.kit import ReActTextParser

parser = ReActTextParser()

# Expects output like:
# Thought: I should search for the answer.
# Action: search(query="QitOS architecture")
Pass the parser to AgentModule.run() or to the Engine constructor.
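
The following standalone sketch illustrates how labeled blocks and key aliases can fit together. It is not the ReActTextParser implementation. The alias lists are copied from the default keys listed above; the label regex, the `normalize` step (lowercasing and dropping separators so that "Final Answer" matches finalanswer), and the function name `split_blocks` are assumptions made for illustration.

```python
import re
from typing import Dict, Optional, Sequence

# Default aliases as documented above; the rest of this sketch is illustrative.
DEFAULT_KEYS: Dict[str, Sequence[str]] = {
    "thought": ("thought", "thinking", "think", "rationale"),
    "reflection": ("reflection", "reflect", "selfreflection"),
    "action": ("action", "tool", "call"),
    "final": ("finalanswer", "final", "answer"),
}

# A label is a run of letters, spaces, underscores, or hyphens before a colon.
LABEL_RE = re.compile(r"^\s*([A-Za-z][A-Za-z _-]*?)\s*:\s*(.*)$")


def normalize(label: str) -> str:
    """Assumed normalization: lowercase and keep letters only."""
    return re.sub(r"[^a-z]", "", label.lower())


def split_blocks(
    text: str, keys: Optional[Dict[str, Sequence[str]]] = None
) -> Dict[str, str]:
    """Group lines under the most recent recognized label."""
    keys = keys or DEFAULT_KEYS
    alias_to_field = {alias: name for name, aliases in keys.items() for alias in aliases}
    blocks: Dict[str, str] = {}
    current: Optional[str] = None
    for line in text.splitlines():
        match = LABEL_RE.match(line)
        name = alias_to_field.get(normalize(match.group(1))) if match else None
        if name is not None:
            # A recognized label starts a new block.
            current = name
            blocks[current] = match.group(2).strip()
        elif current is not None and line.strip():
            # Unlabeled (or unrecognized-label) lines continue the open block.
            blocks[current] += "\n" + line.strip()
    return blocks


blocks = split_blocks(
    'Thought: I should search for the answer.\nAction: search(query="QitOS architecture")'
)
```

Passing a custom `keys` mapping plays the role that thought_keys, action_keys, and the other constructor arguments play in the real parser: it changes which labels are recognized without changing the parsing logic.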

JsonDecisionParser

Parses JSON-formatted model output. Supports a mode field ("act", "final", or "wait") and configurable key names.
from qitos.kit import JsonDecisionParser

class JsonDecisionParser(BaseParser[dict[str, Any]])
Constructor
def __init__(
    self,
    *,
    thought_keys: Optional[Sequence[str]] = None,
    reflection_keys: Optional[Sequence[str]] = None,
    action_keys: Optional[Sequence[str]] = None,
    final_keys: Optional[Sequence[str]] = None,
)
Same parameter names as ReActTextParser. All keys default to the same values.
Usage
from qitos.kit import JsonDecisionParser

parser = JsonDecisionParser()

# Expects output like:
# {"thought": "I need to search.", "action": {"name": "search", "args": {"query": "..."}}}
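
As a rough illustration of what reading such a JSON decision involves, the sketch below uses only the standard library. It is not the JsonDecisionParser implementation. The three mode values and the default key names come from this page; inferring the mode when the field is absent, the key normalization, and the names `pick` and `read_decision` are assumptions.

```python
import json
import re
from typing import Any, Dict, Optional, Sequence

THOUGHT_KEYS = ("thought", "thinking", "think", "rationale")
ACTION_KEYS = ("action", "tool", "call")
FINAL_KEYS = ("finalanswer", "final", "answer")
MODES = ("act", "final", "wait")


def pick(data: Dict[str, Any], keys: Sequence[str]) -> Optional[Any]:
    """Return the first value whose normalized key is one of the aliases."""
    for name, value in data.items():
        if re.sub(r"[^a-z]", "", name.lower()) in keys:
            return value
    return None


def read_decision(raw: str) -> Dict[str, Any]:
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("decision must be a JSON object")
    action = pick(data, ACTION_KEYS)
    final = pick(data, FINAL_KEYS)
    mode = data.get("mode")
    if mode is None:
        # Assumption: with no explicit mode, infer it from the fields present.
        mode = "act" if action is not None else "final" if final is not None else "wait"
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}")
    return {"mode": mode, "thought": pick(data, THOUGHT_KEYS), "action": action, "final": final}


decision = read_decision(
    '{"thought": "I need to search.", "action": {"name": "search", "args": {"query": "..."}}}'
)
```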

XmlDecisionParser

Parses XML-formatted model output. Supports both an XML mode attribute and configurable tag names.
from qitos.kit import XmlDecisionParser

class XmlDecisionParser(BaseParser[dict[str, Any]])
Constructor
def __init__(
    self,
    *,
    thought_keys: Optional[Sequence[str]] = None,
    reflection_keys: Optional[Sequence[str]] = None,
    action_keys: Optional[Sequence[str]] = None,
    final_keys: Optional[Sequence[str]] = None,
    xml_think_tags: Optional[Sequence[str]] = None,
    xml_reflection_tags: Optional[Sequence[str]] = None,
    xml_action_tags: Optional[Sequence[str]] = None,
    xml_final_tags: Optional[Sequence[str]] = None,
)
xml_*_tags take priority over *_keys for XML parsing. Defaults: xml_think_tags=("think", "thought", "thinking", "rationale"), xml_action_tags=("action", "tool", "call"), xml_final_tags=("final_answer", "final", "answer").
Usage
from qitos.kit import XmlDecisionParser

parser = XmlDecisionParser()

# Expects output like:
# <response mode="act">
#   <think>I should search.</think>
#   <action name="search"><query>QitOS docs</query></action>
# </response>
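
The sketch below shows one way the example response above could be read with the standard library's xml.etree.ElementTree. It is not the XmlDecisionParser implementation. The tag lists are the documented defaults; treating child elements of the action tag as arguments, and the names `first_child` and `read_response`, are assumptions. It also assumes well-formed XML, which real model output does not always provide.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict, Optional, Sequence

THINK_TAGS = ("think", "thought", "thinking", "rationale")
ACTION_TAGS = ("action", "tool", "call")
FINAL_TAGS = ("final_answer", "final", "answer")


def first_child(root: ET.Element, tags: Sequence[str]) -> Optional[ET.Element]:
    """Return the first direct child matching any tag, in priority order."""
    for tag in tags:
        node = root.find(tag)
        if node is not None:  # compare to None; Element truthiness is unreliable
            return node
    return None


def text_of(node: Optional[ET.Element]) -> Optional[str]:
    return (node.text or "").strip() if node is not None else None


def read_response(raw: str) -> Dict[str, Any]:
    root = ET.fromstring(raw)
    action = first_child(root, ACTION_TAGS)
    result: Dict[str, Any] = {
        "mode": root.get("mode"),
        "thought": text_of(first_child(root, THINK_TAGS)),
        "action": None,
        "final": text_of(first_child(root, FINAL_TAGS)),
    }
    if action is not None:
        # Assumption: each child element of the action tag is one named argument.
        args = {child.tag: (child.text or "").strip() for child in action}
        result["action"] = {"name": action.get("name"), "args": args}
    return result


response = read_response(
    '<response mode="act">\n'
    "  <think>I should search.</think>\n"
    '  <action name="search"><query>QitOS docs</query></action>\n'
    "</response>"
)
```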

Terminus parsers

Terminus parsers are specialized variants that handle agent formats with explicit <terminus> or JSON-level termination signals. Use them with their matching system prompts (TERMINUS_JSON_SYSTEM_PROMPT, TERMINUS_XML_SYSTEM_PROMPT).
from qitos.kit import TerminusJsonParser, TerminusXmlParser
TerminusJsonParser and TerminusXmlParser share the constructor signatures of JsonDecisionParser and XmlDecisionParser, respectively.