Get started with 2Signal
2Signal is a testing and reliability platform for AI agents. Instrument your agent with the Python, TypeScript, Go, Java, or Ruby SDK, run evaluations on every trace, and see what broke and why.
Quickstarts
Log your first trace
Install the Python SDK and send your first trace to 2Signal in under 5 minutes.
Run your first eval
Set up automated evaluations to measure and improve your AI agent's quality.
Workflow
Follow 2Signal's structured workflow to build, evaluate, and improve AI agents:
Instrument
Add the @observe decorator and wrap your LLM providers to capture traces.
Evaluate
Configure evaluators to automatically score traces for quality, cost, and latency.
Observe
Monitor your agents in production with the dashboard, CLI, or TUI.
Core Concepts
Traces
A trace represents one complete execution of your agent — from input to output. Traces contain spans.
Spans
Spans represent individual operations within a trace: an LLM call, a tool invocation, a retrieval step. Spans can be nested (a span can have child spans).
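The trace/span relationship can be pictured as a simple nested structure. The field names below are illustrative, not 2Signal's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Span:
    name: str
    kind: str  # e.g. "llm", "tool", "retrieval"
    children: list["Span"] = field(default_factory=list)

@dataclass
class Trace:
    input: str
    output: str
    spans: list[Span] = field(default_factory=list)

# One complete agent run: a planning LLM call whose child span is the
# tool invocation it triggered.
trace = Trace(
    input="What's the weather in Oslo?",
    output="It's 4°C and raining.",
    spans=[
        Span("plan", "llm", children=[Span("get_weather", "tool")]),
    ],
)
```

This nesting is what lets the dashboard show, for example, which tool call inside which LLM step produced a bad output.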
Evaluators
Evaluators score your traces automatically. 2Signal includes 8 evaluator types: LLM Judge, Contains, Regex Match, JSON Schema, Similarity, Latency, Cost, and Custom (via API). Evaluators run asynchronously — they never slow down your agent.
Scores
Each evaluator produces a score (0–1) with an optional label ("pass"/"fail") and reasoning. Scores are attached to traces or individual spans.
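To make the score shape concrete, here is a minimal sketch of what a "Contains" evaluator conceptually computes: a 0–1 score plus an optional label and reasoning. The function and dict keys are hypothetical stand-ins, not 2Signal's evaluator API:

```python
def contains_evaluator(trace_output: str, expected: str) -> dict:
    """Minimal sketch of a 'Contains'-style check: score 1.0 if the
    expected substring appears (case-insensitively) in the trace
    output, else 0.0. Illustrative only -- not the 2Signal SDK."""
    passed = expected.lower() in trace_output.lower()
    return {
        "score": 1.0 if passed else 0.0,
        "label": "pass" if passed else "fail",
        "reasoning": f"expected substring {expected!r} "
                     f"{'found' if passed else 'not found'} in output",
    }

result = contains_evaluator("The capital of France is Paris.", "paris")
print(result["label"])  # -> pass
```

In 2Signal the equivalent check runs asynchronously against traces after they are logged, so the evaluation never adds latency to the agent itself.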
Three Interfaces
Use whichever fits your workflow — they all share the same auth and data:
- Web Dashboard — full GUI at app.2signal.dev
- CLI — non-interactive commands for scripting and CI/CD
- TUI Dashboard — interactive terminal dashboard with real-time monitoring
Next Steps
- Python SDK / TypeScript SDK / Go SDK / Java SDK / Ruby SDK — instrument your agent
- LangChain / LlamaIndex / CrewAI / AutoGen — framework integrations
- REST API Reference — direct API integration
- Evaluators — configure automated testing
- Prompt Templates — version-controlled prompt management
- Trace Replay — re-run traces with different models or prompts