Webhook Evaluator

The Webhook evaluator sends trace data to your own HTTP endpoint for scoring. Use it when you need custom ML models, domain-specific logic, or integration with external systems that the built-in evaluators don't cover.

Prerequisites

  • A 2Signal project with traces flowing in
  • An HTTPS endpoint that accepts POST requests and returns scores

Step 1: Build Your Endpoint

Your endpoint receives a JSON payload with the trace data and must return a score:

Request (from 2Signal)

POST https://your-service.com/evaluate
Content-Type: application/json

{
  "traceId": "abc-123",
  "input": "What is the return policy?",
  "output": "You can return items within 30 days...",
  "config": {
    "custom_field": "any config you set"
  }
}

Expected Response

{
  "score": 0.85,
  "label": "pass",
  "reasoning": "Response correctly addresses the question with specific policy details."
}

2Signal validates the response against a Zod schema: score (a number from 0 to 1) is required, while label and reasoning are optional.
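To catch schema problems before 2Signal does, you can mirror these rules in a small check of your own. The following is a minimal sketch, not 2Signal's actual schema: the function name `validate_response` is hypothetical, and it assumes that label and reasoning must be strings when present (inferred from the example response above).

```python
import math
from typing import Any


def validate_response(body: Any) -> list[str]:
    """Return a list of problems; an empty list means the body looks valid."""
    if not isinstance(body, dict):
        return ["response body must be a JSON object"]

    problems = []
    score = body.get("score")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        problems.append("score is required and must be a number")
    elif math.isnan(score) or not 0 <= score <= 1:
        problems.append("score must be between 0 and 1")

    # Assumption: optional fields are strings when they are present.
    for field in ("label", "reasoning"):
        if field in body and not isinstance(body[field], str):
            problems.append(f"{field} must be a string when present")
    return problems
```

Running this against your endpoint's output in a unit test gives an early signal when a model change starts producing out-of-range scores.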

Example Endpoint (Python/FastAPI)

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class EvalRequest(BaseModel):
    traceId: str
    input: str | None = None
    output: str | None = None
    config: dict | None = None

class EvalResponse(BaseModel):
    score: float
    label: str | None = None
    reasoning: str | None = None

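def run_your_model(input_text: str | None, output_text: str | None) -> float:
    # Placeholder: replace with your own model or scoring logic.
    # This stub only checks that the output is non-empty.
    return 1.0 if output_text and output_text.strip() else 0.0

@app.post("/evaluate")
async def evaluate(req: EvalRequest) -> EvalResponse:
    # Clamp to the 0 to 1 range that 2Signal requires for score
    score = min(max(run_your_model(req.input, req.output), 0.0), 1.0)
    return EvalResponse(
        score=score,
        label="pass" if score > 0.7 else "fail",
        reasoning=f"Custom model scored {score:.2f}"
    )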

Step 2: Configure the Evaluator

In the dashboard, create a new evaluator:

{
  "name": "custom-domain-check",
  "type": "WEBHOOK",
  "config": {
    "url": "https://your-service.com/evaluate",
    "headers": {
      "Authorization": "Bearer your-secret-token"
    },
    "timeout_ms": 10000
  }
}
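On the receiving side, your endpoint should verify the Authorization header it was configured with. The sketch below shows one way to do that using only the standard library; `is_authorized` and `EXPECTED_TOKEN` are hypothetical names, and in practice the token would come from an environment variable or secret store rather than a literal.

```python
import hmac

# Placeholder matching the example config; load the real value from a secret store.
EXPECTED_TOKEN = "your-secret-token"


def is_authorized(headers: dict[str, str], expected_token: str = EXPECTED_TOKEN) -> bool:
    """Check a 'Bearer <token>' Authorization header against the expected token."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    scheme, _, token = normalized.get("authorization", "").partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest runs in constant time, which avoids leaking
    # how much of the token matched.
    return hmac.compare_digest(token.strip().encode(), expected_token.encode())
```

Call it at the top of your request handler and return a 401 response when it reports False.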

Step 3: Enable and Test

Enable the evaluator. The next trace ingested will trigger a POST to your endpoint. Check the trace detail page to see the score.
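Before enabling the evaluator, it can help to send the sample payload to your endpoint yourself. This is a rough sketch using only the Python standard library, not a 2Signal tool: `build_request` and `send` are hypothetical helpers, and the payload simply copies the example request from Step 1, so real requests may differ.

```python
import json
import urllib.request

# Copied from the example request in Step 1.
SAMPLE_PAYLOAD = {
    "traceId": "abc-123",
    "input": "What is the return policy?",
    "output": "You can return items within 30 days...",
    "config": {"custom_field": "any config you set"},
}


def build_request(url: str, token: str, payload: dict = SAMPLE_PAYLOAD) -> urllib.request.Request:
    """Build a POST request shaped like the example request from Step 1."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout_s: float = 10.0) -> dict:
    """Send the request and parse the JSON body; raises on HTTP errors or timeouts."""
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `send(build_request("https://your-service.com/evaluate", "your-secret-token"))` should return a dictionary containing a score between 0 and 1 if the endpoint is working.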

Production Considerations

  • HTTPS required in production — Webhook evaluators enforce HTTPS to protect trace data in transit.
  • Concurrency limit — Up to 10 concurrent requests per evaluator to prevent overwhelming your endpoint.
  • Automatic retries — Failed requests are retried with exponential backoff.
  • Timeout — Requests that exceed the configured timeout (default 10s) are scored as failures.
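Because failed requests are retried, your endpoint may be asked to score the same trace more than once. One way to avoid re-running an expensive model is to cache results by traceId, sketched below. This assumes retries reuse the same traceId, which the text above does not state explicitly; `ScoreCache` and its parameters are hypothetical, and the cache is in-process and not thread-safe.

```python
from typing import Callable


class ScoreCache:
    """Remember results per trace so a retried request reuses the first score."""

    def __init__(self, scorer: Callable[[str, str], dict], max_entries: int = 10_000):
        self._scorer = scorer
        self._max_entries = max_entries
        self._results: dict[str, dict] = {}

    def evaluate(self, trace_id: str, input_text: str, output_text: str) -> dict:
        cached = self._results.get(trace_id)
        if cached is not None:
            return cached  # retry of a trace that was already scored

        result = self._scorer(input_text, output_text)
        if len(self._results) >= self._max_entries:
            # dicts preserve insertion order, so this evicts the oldest entry.
            self._results.pop(next(iter(self._results)))
        self._results[trace_id] = result
        return result
```

If you run several workers, a shared store would be needed for the same effect, since each process here keeps its own cache.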

Debugging

If your webhook evaluator is not scoring traces, check:

  • Your endpoint returns a valid JSON response with a score field
  • The score is a number between 0 and 1
  • Your endpoint responds within the timeout window
  • The URL is HTTPS (required in production)
  • Any authentication headers are correct
