Testing & reliability for AI agents

Catch failures before your users do.

Three lines of code. Every LLM call traced, every response evaluated, every regression caught.

10
Integrations
7
Built-in evaluators
<3
Lines to instrument
0ms
Latency overhead

Drop-in support for major frameworks

OpenAI
Anthropic
Google Gemini
Mistral
Cohere
Groq
LangChain
LlamaIndex
CrewAI
AutoGen

One decorator. Full visibility.

Add @observe to your agent. Every LLM call, tool use, and decision is traced automatically.

agent.py
from twosignal import TwoSignal, observe

ts = TwoSignal()

@observe                          # that's it
def support_agent(query):
    context = retrieve_docs(query)
    response = llm.chat(query, context)
    return response
Every call traced · Tokens + cost tracked · Evals: 7 built-in
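
For readers curious what a tracing decorator does under the hood, here is a minimal sketch. It is not the twosignal implementation: `observe_sketch` and the in-memory `TRACES` list are hypothetical names, and the sketch records only the call name, duration, and status, not tokens, cost, or nested LLM calls.

```python
import functools
import time

# Hypothetical in-memory trace store; a real SDK would send these to a backend.
TRACES = []


def observe_sketch(fn):
    """Record name, duration, and outcome of each call without altering it."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        record = {"name": fn.__name__, "status": "ok"}
        try:
            # The wrapped function runs unchanged and its result is returned as-is.
            return fn(*args, **kwargs)
        except Exception as exc:
            record["status"] = "error"
            record["error"] = repr(exc)
            raise  # re-raise so the caller still sees the original exception
        finally:
            record["duration_s"] = time.perf_counter() - start
            TRACES.append(record)

    return wrapper


@observe_sketch
def support_agent(query):
    # Stand-in for retrieval plus an LLM call.
    return f"answer to: {query}"
```

Because the wrapper returns the original result and re-raises the original exception, the decorated function behaves the same from the caller's point of view; the only side effect is one record appended to `TRACES` per call.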

See what your agents are doing.

Run evaluations on every trace. See what broke, why, and how to fix it.

Your CI pipeline
$ 2signal run --suite regression-v3

  accuracy     0.92  (+0.03)
  safety       0.98
  latency      1.2s  (+0.4s) ⚠
  regressions  3

✓ 844 passed  ✗ 3 failed  34s
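
A regression check like the one in this output can be thought of as comparing the current run's scores against a stored baseline. The sketch below illustrates that idea in plain Python. `find_regressions`, `exit_code`, and the `tolerance` parameter are assumptions for illustration, not part of the 2signal CLI. The baseline accuracy of 0.89 follows from the +0.03 shown above; the safety values are invented to show a failing case.

```python
def find_regressions(baseline, current, tolerance=0.01):
    """Return metrics whose score fell by more than `tolerance`.

    Assumes higher is better for every metric passed in.
    """
    regressions = {}
    for metric, old in baseline.items():
        new = current.get(metric)
        if new is not None and old - new > tolerance:
            # Store the signed change so a drop shows up as a negative number.
            regressions[metric] = round(new - old, 4)
    return regressions


def exit_code(regressions):
    # A non-zero exit code is what makes a CI job fail.
    return 1 if regressions else 0


baseline = {"accuracy": 0.89, "safety": 0.98}
current = {"accuracy": 0.92, "safety": 0.95}
regressions = find_regressions(baseline, current)
```

Here accuracy improved, so it is ignored, while the safety drop exceeds the tolerance and would fail the build.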
Your dashboard
Agent                Accuracy  Latency  Score
support-agent-v2.1   0.92      1.2s     86
booking-agent-v2.1   0.71      2.8s     50
refund-agent-v1.3    0.44      0.9s     32

How it works

No behavior changes. Your agent runs exactly the same — we just watch.

1. Instrument

Install the SDK and add @observe to your agent function. Every LLM call, tool use, and decision gets traced — your agent's behavior doesn't change.

pip install twosignal

2. Evaluate

Each trace runs through your configured evaluators automatically. Accuracy, safety, latency, cost — scored in the background, no extra code needed.

7 evaluators · async
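
One common way to score traces "in the background" is a queue drained by a worker thread, so the request path only pays for an enqueue. The sketch below shows that pattern under stated assumptions: `BackgroundEvaluator`, `latency_evaluator`, and the 2-second threshold are hypothetical and are not taken from the twosignal SDK.

```python
import queue
import threading


def latency_evaluator(trace):
    # Hypothetical evaluator: pass if the call finished within 2 seconds.
    return {"name": "latency", "passed": trace["duration_s"] <= 2.0}


class BackgroundEvaluator:
    """Scores traces on a worker thread so callers never wait for evaluators."""

    def __init__(self, evaluators):
        self.evaluators = evaluators
        self.results = []
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, trace):
        # Returns immediately; scoring happens on the worker thread.
        self._queue.put(trace)

    def _run(self):
        while True:
            trace = self._queue.get()
            try:
                scores = [evaluate(trace) for evaluate in self.evaluators]
                self.results.append({"trace_id": trace["id"], "scores": scores})
            except Exception as exc:
                # A failing evaluator is recorded instead of killing the worker.
                self.results.append({"trace_id": trace.get("id"), "error": repr(exc)})
            finally:
                self._queue.task_done()

    def flush(self):
        # Block until every submitted trace has been scored.
        self._queue.join()
```

In this design the agent calls `submit` and moves on; `flush` is only needed in tests or at shutdown, when you want to wait for pending scores.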

3. Ship

Scores drop, latency spikes, or a new prompt breaks something? You'll know immediately — before your users open a support ticket.

Traces · Scores · Alerts

How we’re different

Most observability tools make you choose between tracing and testing. We don’t.

2signal                                            Typical tools
Testing + monitoring in one tool                   Separate tools for tracing and evals
3 lines of code to instrument                      Verbose SDK setup and manual span creation
7 evaluators run automatically on every trace      Write your own eval logic or configure per-trace
Async evals, zero latency overhead                 Inline evaluation slows down your agent
10,000 traces free, unlimited deterministic evals  Free tiers with aggressive usage gates

What engineers are saying

Teams ship more reliable agents with 2signal.

We were spending hours manually reviewing agent outputs. 2signal's evaluators catch the regressions we used to miss, automatically.

Sarah Chen
ML Engineer, AI Startup

The @observe decorator took 2 minutes to add. We had full visibility into our agent pipeline the same day.

Marcus Rodriguez
Founding Engineer, Series A Company

Model routing alone cut our LLM costs by 60%. Simple queries go to mini, complex ones to GPT-4o. Took 10 minutes to set up.

Priya Patel
CTO, Developer Tools

10,000 traces free. No credit card.

Unlimited deterministic evals on every plan. Bring your own API key for unlimited LLM evals.
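
To illustrate the distinction, a deterministic eval is assumed here to mean a check that needs no model call, so the same output always gets the same verdict. The two functions below are hypothetical examples of such checks and are not 2signal's built-in evaluators.

```python
import json
import re

# Loose pattern for anything that looks like an email address.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def valid_json(output):
    """Pass if the output parses as JSON."""
    try:
        json.loads(output)
        return True
    except (TypeError, ValueError):
        # json.JSONDecodeError is a subclass of ValueError.
        return False


def no_email_leak(output):
    """Pass if the output contains nothing that looks like an email address."""
    return EMAIL_PATTERN.search(output) is None
```

Checks like these cost no LLM tokens to run, which is one plausible reason they can be offered without a quota.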

Free
$0
/month
  • 10,000 traces/mo
  • 500 LLM evals/mo
  • Unlimited deterministic evals
  • 7-day retention
  • 1 seat
  • Community support
Start Free
Pro
$23
/month, billed annually
  • 50,000 traces/mo
  • 5,000 LLM evals/mo
  • Unlimited deterministic evals
  • 30-day retention
  • 3 seats
  • Email alerts
  • Overage billing
Get Started
Popular
Growth
$63
/month, billed annually
  • 150,000 traces/mo
  • 15,000 LLM evals/mo
  • Unlimited deterministic evals
  • 60-day retention
  • Unlimited seats
  • Email + Slack alerts
  • Basic roles
  • Overage billing
Get Started
Team
$159
/month, billed annually
  • 500,000 traces/mo
  • 50,000 LLM evals/mo
  • Unlimited deterministic evals
  • 90-day retention
  • Unlimited seats
  • Full RBAC
  • All alert channels
  • Dedicated support
Get Started
Enterprise
Unlimited traces & evals, SSO/SAML, audit logs, SLA, self-hosted option, and dedicated support.
Contact Sales
Startup Program
Under $1M ARR? Get 50% off any paid plan for your first 6 months.
Apply Now

Start free. Ship with confidence.

Set up in under two minutes. See your first trace before your coffee gets cold.