# Testing & Validation
Signal provides multiple layers of testing for AI agents — from unit-testing individual evaluators to running full regression suites against datasets. Combine these approaches to catch issues at every stage of development and deployment.
## Testing Approaches
| Approach | What It Tests | When to Use | Tools |
|---|---|---|---|
| Unit Testing | Individual evaluators in isolation | During evaluator development | Vitest, Pytest |
| Dataset Evaluation | Agent behavior across many inputs | Before deploys, regression testing | Dashboard, CLI, CI/CD |
| Production Monitoring | Live agent performance | Always-on | Evaluators, Alerts |
| Trace Replay | Model/prompt changes against real data | A/B testing, optimization | Dashboard |
## Unit Testing Evaluators
Test evaluator configs in isolation before deploying them.
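A minimal sketch of what such a test can look like with Pytest. The `conciseness_evaluator` below is a hypothetical evaluator invented for illustration — it is not part of Signal's API — but the pattern (call the evaluator directly with known inputs, assert on its score) carries over to real evaluator configs.

```python
# Hypothetical evaluator: scores 1.0 when a response stays within a word
# budget, 0.0 otherwise. Illustrative only -- not Signal's API.
def conciseness_evaluator(response: str, max_words: int = 50) -> dict:
    n = len(response.split())
    return {
        "score": 1.0 if n <= max_words else 0.0,
        "reason": f"{n} words (limit {max_words})",
    }

# Pytest discovers and runs functions named test_*.
def test_short_response_passes():
    assert conciseness_evaluator("The refund was issued.")["score"] == 1.0

def test_long_response_fails():
    assert conciseness_evaluator("word " * 100)["score"] == 0.0
```

Because the evaluator runs in isolation, these tests need no agent, no network, and no dashboard — they can run on every commit.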
## Dataset Evaluation
Run evaluators against curated datasets for regression testing.
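The core loop is: run the agent over every case in a curated dataset, score each output with an evaluator, and gate on an aggregate pass rate. The sketch below shows that shape with illustrative stand-ins (the agent, evaluator, dataset, and function names are all assumptions, not Signal's API).

```python
# Exact-match evaluator: 1.0 on a case-insensitive match, else 0.0.
def exact_match(output: str, expected: str) -> float:
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0

# Run every dataset case through the agent, score it, and gate on the
# overall pass rate -- the "regression test" for agent behavior.
def run_dataset(cases, agent, evaluator, threshold: float = 0.9) -> dict:
    scores = [evaluator(agent(case["input"]), case["expected"]) for case in cases]
    pass_rate = sum(scores) / len(scores)
    return {"pass_rate": pass_rate, "passed": pass_rate >= threshold}

# Toy agent with canned answers, just to show the shape of a run.
cases = [
    {"input": "capital of France?", "expected": "Paris"},
    {"input": "2 + 2?", "expected": "4"},
]
agent = lambda q: {"capital of France?": "Paris", "2 + 2?": "4"}[q]
report = run_dataset(cases, agent, exact_match)
```

A failing `report["passed"]` is the signal to block a deploy; the same loop is what a CI/CD integration runs on your behalf.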
## Production Monitoring
Monitor live agents with always-on evaluation and alerts.
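Conceptually, always-on monitoring means scoring live traffic with evaluators and alerting when an aggregate drifts below a threshold. The rolling-window monitor below is a self-contained sketch of that idea; the class name, parameters, and alert logic are assumptions for illustration, not Signal's alerting API.

```python
from collections import deque

class RollingMonitor:
    """Track a rolling window of evaluator scores and flag when the
    window's mean falls below an alert threshold. Illustrative sketch."""

    def __init__(self, window: int = 100, alert_below: float = 0.8):
        self.scores = deque(maxlen=window)
        self.alert_below = alert_below

    def record(self, score: float) -> bool:
        """Record one score; return True when an alert should fire.
        Waits until the window is full to avoid noisy early alerts."""
        self.scores.append(score)
        if len(self.scores) < self.scores.maxlen:
            return False
        mean = sum(self.scores) / len(self.scores)
        return mean < self.alert_below
```

In a real deployment the scores would come from evaluators running on live traces, and the `True` branch would page someone instead of returning a boolean.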
## Trace Replay Testing
A/B test models and prompts against production data.
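The essence of trace replay is re-running recorded production inputs through two candidate configurations and comparing evaluator scores side by side. The sketch below shows that comparison loop; the trace format, variant callables, and evaluator here are illustrative stand-ins, not Signal's replay API.

```python
# Replay recorded inputs through two candidate agents and compare
# mean evaluator scores per variant.
def replay_ab(traces, variant_a, variant_b, evaluator) -> dict:
    a = [evaluator(variant_a(t["input"]), t) for t in traces]
    b = [evaluator(variant_b(t["input"]), t) for t in traces]
    return {"variant_a": sum(a) / len(a), "variant_b": sum(b) / len(b)}

# Toy traces and variants: variant_b matches the recorded expectation
# on every input, variant_a only sometimes.
traces = [
    {"input": "greet", "expected": "hello"},
    {"input": "part", "expected": "goodbye"},
]
match = lambda out, trace: 1.0 if out == trace["expected"] else 0.0
variant_a = lambda q: "hello"  # always answers "hello"
variant_b = lambda q: {"greet": "hello", "part": "goodbye"}[q]
scores = replay_ab(traces, variant_a, variant_b, match)
```

Because both variants see identical real-world inputs, any score gap is attributable to the model or prompt change itself rather than to traffic differences.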