Everything you need to ship reliable agents.

From tracing to evaluation to monitoring — one platform, no stitching tools together.

Instrument

One-line tracing

Add @observe to any function. Every LLM call, tool use, retrieval, and decision inside it is captured as a span tree — no manual instrumentation.

OpenAI & Anthropic wrappers

wrap_openai() and wrap_anthropic() trace every completion automatically. Drop one in, change nothing else.
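Under the hood, a wrapper like this proxies the client and records each completion as it passes through. The sketch below uses a fake client and hypothetical names to show the shape of the technique, not the SDK's real internals.

```python
RECORDED = []  # stand-in for the trace export pipeline

class TracedClient:
    """Proxy that records every completion made through the wrapped client."""
    def __init__(self, client):
        self._client = client

    def complete(self, **kwargs):
        result = self._client.complete(**kwargs)
        RECORDED.append({"input": kwargs, "output": result})
        return result

class FakeClient:
    """Stand-in for a real OpenAI/Anthropic client (no network calls)."""
    def complete(self, **kwargs):
        return "hello"

traced = TracedClient(FakeClient())
out = traced.complete(model="gpt-4o-mini", prompt="hi")
```

The calling code is unchanged: it still calls `complete()` with the same arguments, which is what "drop it in, change nothing else" means in practice.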

Nested span trees

Agents call tools that call other agents. 2signal captures the full call hierarchy with timing, tokens, and cost at every level.
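Rolling up tokens and cost through a call hierarchy is a simple recursion over the span tree. The data model below is illustrative only; the platform's actual span schema may differ.

```python
from dataclasses import dataclass, field

@dataclass
class Span:
    """A node in the call hierarchy with its own cost/token usage."""
    name: str
    cost: float = 0.0
    tokens: int = 0
    children: list = field(default_factory=list)

    def total_cost(self):
        # Own cost plus everything beneath this span.
        return self.cost + sum(c.total_cost() for c in self.children)

    def total_tokens(self):
        return self.tokens + sum(c.total_tokens() for c in self.children)

# An agent that calls a sub-agent that calls a tool:
tool = Span("search_tool", cost=0.001, tokens=50)
sub_agent = Span("summarizer_agent", cost=0.004, tokens=800, children=[tool])
root = Span("planner_agent", cost=0.010, tokens=1500, children=[sub_agent])
```

Totals at any level answer "what did this subtree cost?" — the root gives the whole request.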

Evaluate

LLM-as-Judge

Define criteria in plain English. An LLM scores every trace on a pass/fail or 1–5 scale with written reasoning.
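The pattern looks roughly like this: plain-English criteria are folded into a scoring prompt, and the judge's reply is parsed into a score plus reasoning. The judge call is stubbed with a canned reply here, and the prompt/response format is illustrative, not the platform's actual API.

```python
CRITERIA = "The answer must cite at least one source and stay under 100 words."

def build_judge_prompt(criteria, output):
    """Fold plain-English criteria into a scoring prompt for the judge LLM."""
    return (
        f"Score the following output from 1-5 against these criteria:\n"
        f"{criteria}\n\nOutput:\n{output}\n\n"
        "Reply as 'SCORE: <n>' followed by your reasoning."
    )

def parse_judgment(reply):
    """Extract the numeric score and written reasoning from the reply."""
    lines = reply.splitlines()
    score = int(lines[0].split("SCORE:")[1].strip())
    return {"score": score, "reasoning": "\n".join(lines[1:])}

# Stubbed judge reply in place of a real LLM call:
reply = "SCORE: 4\nCites one source; slightly over the word limit."
verdict = parse_judgment(reply)
```

The written reasoning is kept alongside the score, which is what makes judge results reviewable rather than opaque.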

Deterministic checks

Contains, Regex Match, and JSON Schema evaluators for hard rules — output must include X, match pattern Y, or conform to schema Z.
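Minimal versions of all three checks fit in a few lines. These are simplified stand-ins (the schema check just verifies required keys, not full JSON Schema) to show the semantics, not the platform's evaluator implementations.

```python
import json
import re

def contains(output, needle):
    """Contains check: output must include X."""
    return needle in output

def regex_match(output, pattern):
    """Regex Match check: output must match pattern Y."""
    return re.search(pattern, output) is not None

def conforms(output, required_keys):
    """Simplified schema check: valid JSON with the required keys present."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False
    return all(k in data for k in required_keys)

ok = conforms('{"answer": "42", "sources": []}', ["answer", "sources"])
```

Because these checks are deterministic, they cost nothing per trace and never disagree with themselves between runs.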

Performance evaluators

Latency and Cost evaluators enforce budgets automatically. Flag any trace over 2 seconds or $0.10.
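A budget evaluator reduces to a threshold comparison per trace. The sketch below mirrors the 2-second / $0.10 example above; the function and field names are illustrative.

```python
def over_budget(trace, max_latency_s=2.0, max_cost_usd=0.10):
    """Return which budgets (if any) this trace exceeded."""
    flags = []
    if trace["latency_s"] > max_latency_s:
        flags.append("latency")
    if trace["cost_usd"] > max_cost_usd:
        flags.append("cost")
    return flags

flags = over_budget({"latency_s": 3.1, "cost_usd": 0.04})
```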

Similarity scoring

TF-IDF cosine similarity compares outputs against expected answers. No LLM calls — runs locally, instantly.
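A self-contained version of the technique fits in one function: TF-IDF vectors over the two texts, then the cosine of the angle between them. This is a simplification of what a similarity evaluator might do — the platform's exact tokenization and weighting may differ.

```python
import math
from collections import Counter

def tfidf_cosine(a, b):
    """TF-IDF cosine similarity over a two-document corpus (output vs. expected)."""
    docs = [a.lower().split(), b.lower().split()]
    vocab = set(docs[0]) | set(docs[1])
    # Smoothed IDF over the two-document corpus.
    idf = {w: math.log(2 / sum(w in d for d in docs)) + 1 for w in vocab}

    def vec(doc):
        tf = Counter(doc)
        return {w: tf[w] / len(doc) * idf[w] for w in doc}

    va, vb = vec(docs[0]), vec(docs[1])
    dot = sum(va[w] * vb.get(w, 0.0) for w in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

score = tfidf_cosine("the capital of France is Paris",
                     "Paris is the capital of France")
```

Note the bag-of-words property: reordering the same words gives similarity 1.0, and texts with no shared words score 0.0 — fast and local, but blind to word order.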

Monitor

Trace explorer

Filter by status, evaluator score, latency, cost, or tags. Click any trace to see the full span tree with inputs, outputs, and timing.
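The filter semantics can be illustrated locally over a list of trace records. Field names here are hypothetical; this is not the explorer's query API.

```python
def filter_traces(traces, status=None, max_latency_s=None, tags=None):
    """Keep traces matching every provided filter (None means 'any')."""
    out = []
    for t in traces:
        if status is not None and t["status"] != status:
            continue
        if max_latency_s is not None and t["latency_s"] > max_latency_s:
            continue
        if tags is not None and not set(tags) <= set(t["tags"]):
            continue
        out.append(t)
    return out

traces = [
    {"id": 1, "status": "error", "latency_s": 0.9, "tags": ["prod"]},
    {"id": 2, "status": "ok", "latency_s": 3.2, "tags": ["prod"]},
    {"id": 3, "status": "ok", "latency_s": 1.1, "tags": ["prod", "rag"]},
]
fast_ok = filter_traces(traces, status="ok", max_latency_s=2.0)
```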

Regression detection

Compare agent versions side-by-side. See which evaluators degraded, which spans got slower, and where costs increased.
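At its core, version comparison is a per-metric diff between two runs. The sketch below uses hypothetical metric names to show the idea.

```python
def score_diff(v1_metrics, v2_metrics):
    """Per-metric change from version 1 to version 2 (positive = increased)."""
    return {k: round(v2_metrics[k] - v1_metrics[k], 3)
            for k in v1_metrics if k in v2_metrics}

v1 = {"helpfulness": 4.2, "latency_s": 1.1, "cost_usd": 0.03}
v2 = {"helpfulness": 3.8, "latency_s": 1.4, "cost_usd": 0.05}
diff = score_diff(v1, v2)
```

Here a drop in an evaluator score and a rise in latency or cost would both surface as regressions in the side-by-side view.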

Usage tracking

Per-project trace counts, token usage, and cost rollups. Warnings at 80% and 100% of plan limits.
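The warning thresholds reduce to a simple percentage check. The 80% and 100% cutoffs come from the description above; the function name and return values are illustrative.

```python
def usage_warning(used, limit):
    """Return a warning label at 80% and 100% of the plan limit, else None."""
    pct = used / limit
    if pct >= 1.0:
        return "limit reached"
    if pct >= 0.8:
        return "approaching limit"
    return None

warning = usage_warning(850, 1000)
```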

Model routing

Route queries to the right model by complexity. Simple questions get cheap models. Save 30–50% on LLM costs without losing quality.
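A toy complexity router makes the idea concrete: short, simple queries go to a cheap model, long or multi-step ones to a stronger one. The heuristic and model names below are illustrative, not the platform's actual routing logic.

```python
def route(query, cheap="small-model", strong="large-model"):
    """Pick a model tier from a crude complexity heuristic."""
    complex_markers = ("step by step", "analyze", "compare", "write code")
    is_complex = (len(query.split()) > 30
                  or any(m in query.lower() for m in complex_markers))
    return strong if is_complex else cheap

m1 = route("What is the capital of France?")
m2 = route("Compare these two architectures step by step.")
```

Production routers would use a classifier rather than keywords, but the cost lever is the same: most traffic is simple and never needs the expensive model.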

Integrate

Python SDK

pip install twosignal. Supports Python 3.9+. Background export with daemon threads — zero impact on agent latency.
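The daemon-thread export pattern looks like this: spans go onto an in-process queue and a background worker drains it off the hot path. A sketch of the pattern, not the SDK's actual exporter.

```python
import queue
import threading

EXPORTED = []        # stand-in for the trace backend
_q = queue.Queue()   # hand-off between agent code and the exporter

def _worker():
    while True:
        span = _q.get()
        EXPORTED.append(span)  # in a real exporter, this is the network send
        _q.task_done()

# daemon=True: the thread never blocks interpreter shutdown.
threading.Thread(target=_worker, daemon=True).start()

def record(span):
    _q.put(span)  # returns immediately; agent latency is unaffected

record({"name": "llm_call", "latency_s": 0.4})
_q.join()  # demo only: wait for the background export to finish
```

The agent-facing call is a queue put, so the slow part (serialization, network I/O) never sits on the request path.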

REST API

Language-agnostic trace ingestion, scoring, and querying. Build custom integrations in any language.
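An ingestion call from any language is just an authenticated JSON POST. The endpoint path, header scheme, and payload shape below are hypothetical placeholders, not the documented API; the sketch builds the request without sending it.

```python
import json

def build_ingest_request(api_key, trace):
    """Assemble a trace-ingestion request (placeholder URL and schema)."""
    return {
        "method": "POST",
        "url": "https://api.example.com/v1/traces",  # placeholder, not the real endpoint
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps(trace),
    }

req = build_ingest_request("sk-demo", {"name": "llm_call", "status": "ok"})
```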

Async by default

Traces persist to S3 first, then are processed through Redis queues. Evaluations run in background workers — your agent never waits.
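The persist-first, process-later pattern can be sketched with in-memory stand-ins for S3 and Redis: write the raw trace to durable storage, then enqueue only its key for a background worker.

```python
from collections import deque

storage = {}          # stands in for S3 (durable store)
work_queue = deque()  # stands in for a Redis queue

def ingest(trace_id, trace):
    storage[trace_id] = trace    # durable write happens first
    work_queue.append(trace_id)  # then the ID is queued for evaluation

def worker_step():
    """One background-worker iteration: fetch the trace and evaluate it."""
    trace_id = work_queue.popleft()
    trace = storage[trace_id]
    return {"trace_id": trace_id, "evaluated": True, "status": trace["status"]}

ingest("t1", {"status": "ok"})
result = worker_step()
```

Ordering matters: because the trace is durable before it is queued, a crashed worker can always re-fetch it, and the producer never waits on evaluation.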