# POST /api/v1/scores
Submit evaluation scores for traces or individual spans. Scores are typically created by evaluators automatically, but you can also submit them via the API for custom evaluation logic.
## Request

```http
POST /api/v1/scores
Authorization: Bearer ts_...
Content-Type: application/json

{
  "traceId": "trace-abc-123",
  "spanId": null,
  "evaluatorName": "accuracy",
  "value": 0.92,
  "label": "pass",
  "reasoning": "Output matches expected answer with high confidence"
}
```

## curl Example
```bash
curl -X POST https://api.2signal.dev/api/v1/scores \
  -H "Authorization: Bearer ts_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "traceId": "trace-abc-123",
    "evaluatorName": "accuracy",
    "value": 0.92,
    "label": "pass"
  }'
```

## Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| `traceId` | UUID | Yes | Trace to score |
| `spanId` | UUID | No | Score a specific span instead of the whole trace |
| `evaluatorName` | string | Yes | Name of the evaluator (e.g., `"accuracy"`, `"helpfulness"`) |
| `value` | number | Yes | Score value, 0.0 to 1.0 |
| `label` | string | No | Human-readable label (e.g., `"pass"`, `"fail"`, `"good"`) |
| `reasoning` | string | No | Explanation of why this score was given |
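As a quick sketch of the required-versus-optional split in the table above, a helper like the following (the function name is hypothetical, not part of the API) could build a valid payload, omitting optional fields rather than sending them:

```python
def make_score_payload(trace_id, evaluator_name, value,
                       span_id=None, label=None, reasoning=None):
    # Required fields per the Parameters table
    payload = {
        "traceId": trace_id,
        "evaluatorName": evaluator_name,
        "value": value,
    }
    # Optional fields are included only when provided
    for key, val in (("spanId", span_id),
                     ("label", label),
                     ("reasoning", reasoning)):
        if val is not None:
            payload[key] = val
    return payload
```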
## Response (201)

```json
{
  "id": "score-xyz",
  "traceId": "trace-abc-123",
  "spanId": null,
  "evaluatorName": "accuracy",
  "value": 0.92,
  "label": "pass",
  "reasoning": "Output matches expected answer with high confidence"
}
```

## Score Values
- Scores must be between 0.0 (worst) and 1.0 (best)
- Binary evaluators typically return `0.0` or `1.0`; continuous evaluators may return any value in the range
- Values outside 0–1 will be rejected with a `400`
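Since out-of-range values are rejected with a `400`, it can be useful to validate client-side before submitting. A minimal sketch (the helper name is an illustration, not part of the API):

```python
def validate_score(value: float) -> float:
    """Reject values outside the accepted 0.0-1.0 range before
    sending, mirroring the server's 400 response."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"score {value} is outside [0.0, 1.0]")
    return value
```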
## Use Cases

### Custom Evaluators

Build your own evaluation logic and submit scores via the API:
```python
import httpx

def evaluate_trace(trace_id, output, expected):
    # Compare the model output to the expected answer
    # using your own similarity metric
    score = compute_similarity(output, expected)
    httpx.post(
        "https://api.2signal.dev/api/v1/scores",
        headers={"Authorization": "Bearer ts_..."},
        json={
            "traceId": trace_id,
            "evaluatorName": "custom-similarity",
            "value": score,
            "label": "pass" if score > 0.8 else "fail",
        },
    )
```

### Human Review
Submit human review scores alongside automated evaluations:
```json
{
  "traceId": "trace-abc-123",
  "evaluatorName": "human-review",
  "value": 1.0,
  "label": "approved",
  "reasoning": "Reviewed by QA team - response is accurate"
}
```

### Span-Level Scoring
Score individual spans (e.g., a specific LLM call) rather than the whole trace:
```json
{
  "traceId": "trace-abc-123",
  "spanId": "span-llm-456",
  "evaluatorName": "hallucination-check",
  "value": 0.0,
  "label": "fail",
  "reasoning": "LLM output contains fabricated citation"
}
```

## Error Responses
| Status | When |
|---|---|
| 400 | Missing required field, score value out of range |
| 401 | Invalid or missing API key |
| 404 | Trace or span not found |
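The statuses above can be mapped to distinct exceptions in client code so callers can react differently to each. A minimal sketch (the helper and the exception choices are illustrative, not part of any SDK):

```python
def raise_for_score_error(status_code: int) -> None:
    """Translate the documented error statuses from
    POST /api/v1/scores into Python exceptions."""
    errors = {
        400: ValueError("missing required field or score out of range"),
        401: PermissionError("invalid or missing API key"),
        404: LookupError("trace or span not found"),
    }
    if status_code in errors:
        raise errors[status_code]
    # 201 and other non-error statuses fall through silently
```

This could be called with `response.status_code` after each submission to surface a retryable auth problem (`401`) separately from a payload bug (`400`).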