Glossary

Definitions of key terms used throughout the 2Signal documentation.

Agent
A software system that uses one or more LLMs to take actions, make decisions, or interact with users. In 2Signal, an agent is the top-level application being traced and evaluated.
API Key
A secret token used to authenticate SDK and REST API requests. API keys are project-scoped — each key belongs to a single project and can only access that project's data.
Cost
The monetary cost of an LLM call, calculated from token usage and the model's pricing. Cost is stored per span and aggregated per trace as total_cost.
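As a rough illustration of how a per-span cost could be derived from token usage, the sketch below multiplies token counts by per-million-token prices and sums the span costs into a trace total. The `PRICES` table, the model name, the price values, and the `span_cost` helper are hypothetical and not taken from 2Signal; only the `total_cost` name comes from the definition above.

```python
# Hypothetical prices in USD per 1 million tokens; real model pricing differs.
PRICES = {
    "example-small": {"input": 0.50, "output": 1.50},
}

def span_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one LLM call, from its token usage and the model's pricing."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000

def total_cost(span_costs: list[float]) -> float:
    """Aggregate per-span costs into a per-trace total."""
    return sum(span_costs)

# Two LLM calls that belong to the same trace.
costs = [
    span_cost("example-small", input_tokens=1_000, output_tokens=200),
    span_cost("example-small", input_tokens=3_000, output_tokens=600),
]
trace_total = total_cost(costs)
```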
Dataset
A collection of test cases (input + optional expected output) used for systematic evaluation and regression testing. Datasets enable reproducible experiments.
Evaluator
A function that scores a trace or span on a specific quality dimension. 2Signal includes 7 built-in evaluators: Contains, Regex Match, JSON Schema, Similarity, Latency, Cost, and LLM Judge.
Evaluator Config
A named, saved evaluator configuration attached to a project. Defines the evaluator type, parameters, and whether it runs automatically on new traces.
Experiment
A run of your agent against all items in a dataset, producing scores for each item. Experiments let you compare agent versions, models, or prompts quantitatively.
Flush
The process of sending buffered trace events from the SDK to the 2Signal API. A flush happens automatically every 1 second or once 100 events are buffered, whichever comes first.
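The "time or size, whichever comes first" rule can be sketched with a small buffer class. This is not the SDK's implementation: `EventBuffer`, its parameters, and the `send` callback are hypothetical names, and the sketch only checks the age limit when an event is added, whereas a real SDK would more likely use a background timer.

```python
import time

class EventBuffer:
    """Illustrative buffer that flushes on a size limit or an age limit."""

    def __init__(self, send, max_events=100, max_age_seconds=1.0, clock=time.monotonic):
        self._send = send                # callable that receives a list of events
        self._max_events = max_events
        self._max_age = max_age_seconds
        self._clock = clock              # injectable so the timing can be tested
        self._events = []
        self._last_flush = clock()

    def add(self, event):
        self._events.append(event)
        self.maybe_flush()

    def maybe_flush(self):
        too_many = len(self._events) >= self._max_events
        too_old = self._clock() - self._last_flush >= self._max_age
        if self._events and (too_many or too_old):
            self.flush()

    def flush(self):
        # Swap the buffer out first so new events are not lost while sending.
        batch, self._events = self._events, []
        self._last_flush = self._clock()
        if batch:
            self._send(batch)

# Usage: collect batches in a list instead of calling a real API.
sent = []
buffer = EventBuffer(send=sent.append)
for i in range(250):
    buffer.add({"event": i})
buffer.flush()  # send whatever is still buffered, e.g. at shutdown
```

Calling `flush()` explicitly before the process exits, as in the last line, is a common way to avoid losing events that have not yet hit either limit.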
LLM Judge
An evaluator that uses an LLM (such as GPT-4o-mini) to score traces based on custom criteria. Useful for subjective quality dimensions like helpfulness, accuracy, and tone.
Model Routing
Automatic selection of the best LLM for each request based on complexity, token count, or keyword rules. Routes simple queries to cheaper models and complex queries to more capable ones.
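A minimal sketch of rule-based routing, assuming a token-count threshold and a keyword list. The function name, the model names, the threshold, and the keywords are all hypothetical, and the whitespace split is only a crude stand-in for a real tokenizer.

```python
def route_model(prompt, token_limit=500, keywords=("analyze", "prove", "refactor")):
    """Pick a model name from simple token-count and keyword rules."""
    approx_tokens = len(prompt.split())  # crude estimate, not a real tokenizer
    lowered = prompt.lower()
    if approx_tokens > token_limit or any(word in lowered for word in keywords):
        return "capable-model"   # hypothetical name for a more capable model
    return "cheap-model"         # hypothetical name for a cheaper model

simple_choice = route_model("What time is it?")
complex_choice = route_model("Please analyze this contract for risks.")
```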
Observe
The @observe decorator in the Python SDK. Applied to a function to automatically trace its execution, capturing inputs, outputs, errors, and timing.
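To show the general idea behind a tracing decorator, the sketch below records inputs, output, error, and duration for each call. It is not the 2Signal `@observe` implementation: `observe_sketch` and the in-memory `RECORDS` list are hypothetical stand-ins, and a real SDK would send this data to its backend rather than keep it in a list.

```python
import functools
import time

RECORDS = []  # hypothetical in-memory sink standing in for a tracing backend

def observe_sketch(func):
    """Record inputs, output, error, and timing for each call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        record = {
            "name": func.__name__,
            "inputs": {"args": args, "kwargs": kwargs},
            "output": None,
            "error": None,
        }
        start = time.perf_counter()
        try:
            record["output"] = func(*args, **kwargs)
            return record["output"]
        except Exception as exc:
            record["error"] = repr(exc)
            raise  # re-raise so the caller still sees the failure
        finally:
            record["duration_s"] = time.perf_counter() - start
            RECORDS.append(record)
    return wrapper

@observe_sketch
def divide(a, b):
    return a / b

divide(6, 3)
```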
Organization
The top-level entity in 2Signal. Organizations contain projects, members, and billing. Plans and usage limits are set at the organization level.
Project
A container for traces, evaluators, datasets, and API keys within an organization. Typically maps to a single agent or application.
Retention
How long trace data is stored before automatic deletion. Retention varies by plan: 7 days (Free), 30 days (Pro), 90 days (Team), unlimited (Enterprise).
Score
A numeric value (0–1) produced by an evaluator for a specific trace or span. Scores can include a label (pass/fail) and reasoning text explaining the result.
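The shape of a score can be sketched with a small substring check that returns a value, a label, and reasoning. The `Score` dataclass, its field names, and `contains_evaluator` are hypothetical illustrations and do not reproduce the built-in Contains evaluator or the 2Signal data model.

```python
from dataclasses import dataclass

@dataclass
class Score:
    value: float     # numeric score between 0 and 1
    label: str       # "pass" or "fail"
    reasoning: str   # short explanation of the result

def contains_evaluator(output: str, expected_substring: str) -> Score:
    """Score 1.0 if the output contains the expected substring, else 0.0."""
    found = expected_substring in output
    verdict = "contains" if found else "does not contain"
    return Score(
        value=1.0 if found else 0.0,
        label="pass" if found else "fail",
        reasoning=f"Output {verdict} {expected_substring!r}.",
    )

passing = contains_evaluator("Your refund has been issued.", "refund")
failing = contains_evaluator("Please try again later.", "refund")
```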
Span
A single operation within a trace — an LLM call, a tool invocation, a retrieval step, or any other discrete unit of work. Spans can be nested to represent parent-child relationships.
Span Type
The category of a span: AGENT, TOOL, LLM, CHAIN, RETRIEVAL, or CUSTOM. Span types help organize and filter trace data.
Trace
A complete record of one execution of your agent — from input to output. Traces contain one or more spans representing the individual operations that occurred.
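One way to picture how a trace is built from nested spans is a small tree structure. The `Span` dataclass, the `walk` helper, and the span names below are hypothetical; the span type names are the ones listed under Span Type.

```python
from dataclasses import dataclass, field
from enum import Enum

class SpanType(Enum):
    AGENT = "AGENT"
    TOOL = "TOOL"
    LLM = "LLM"
    CHAIN = "CHAIN"
    RETRIEVAL = "RETRIEVAL"
    CUSTOM = "CUSTOM"

@dataclass
class Span:
    name: str
    type: SpanType
    children: list["Span"] = field(default_factory=list)

def walk(span, depth=0):
    """Yield (depth, span) pairs in parent-before-child order."""
    yield depth, span
    for child in span.children:
        yield from walk(child, depth + 1)

# One execution of an agent: a root span with nested child spans.
root = Span("support-agent", SpanType.AGENT, [
    Span("search-docs", SpanType.RETRIEVAL),
    Span("draft-answer", SpanType.CHAIN, [
        Span("chat-completion", SpanType.LLM),
    ]),
])
outline = [(depth, span.name) for depth, span in walk(root)]
```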
Usage
Monthly counters tracking how many traces, spans, and evaluator runs your organization has consumed. Usage determines whether you're within your plan limits.
Wrapper
A function such as wrap_openai() or wrap_anthropic() that patches an LLM client to automatically capture model, tokens, cost, and input/output for every call.
