OpenAI Wrapper

Wrap your OpenAI client to automatically trace every chat.completions.create() call, capturing the model, input messages, output, token usage, and cost.

Installation

pip install twosignal[openai]

Usage

from twosignal import TwoSignal
from twosignal.wrappers.openai import wrap_openai
from openai import OpenAI

ts = TwoSignal()
client = wrap_openai(OpenAI())

# this call is now automatically traced
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)

What Gets Captured

| Field | Description |
| --- | --- |
| name | e.g., openai.chat.completions.create(gpt-4o-mini) |
| type | LLM |
| model | Model name (e.g., gpt-4o-mini) |
| input | Input messages array |
| output | Full response object |
| model_parameters | temperature, max_tokens, etc. |
| usage.prompt_tokens | Input token count |
| usage.completion_tokens | Output token count |
| usage.total_tokens | Total token count |
| cost | Estimated cost in USD |
| status | OK or ERROR |
| error_message | Exception message if the call failed |

Async Client

from openai import AsyncOpenAI

async_client = wrap_openai(AsyncOpenAI())

response = await async_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)

Combining with @observe

The wrapper works with @observe — LLM calls become child spans:

@observe
def my_agent(query):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": query}],
    )
    return response.choices[0].message.content

# trace tree:
# my_agent (CUSTOM)
# └── openai.chat.completions.create(gpt-4o-mini) (LLM)

Supported Models

Cost tracking works automatically for these models:

  • gpt-4o, gpt-4o-mini
  • gpt-4.1, gpt-4.1-mini, gpt-4.1-nano
  • gpt-4-turbo, gpt-4
  • gpt-3.5-turbo
  • o3-mini

Other models are traced normally but cost will be null.
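To picture how the cost field is derived from token usage, here is a minimal sketch. The prices and the `estimate_cost` helper are illustrative only — they are not part of the twosignal API, and real per-token prices change, so consult OpenAI's pricing page:

```python
# Illustrative USD prices per 1M tokens as (input, output).
# Example numbers only -- real prices change; check OpenAI's pricing page.
PRICES = {
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
}

def estimate_cost(model, prompt_tokens, completion_tokens):
    """Return estimated USD cost, or None for models without pricing data."""
    if model not in PRICES:
        return None  # the call is still traced, but cost stays null
    in_price, out_price = PRICES[model]
    return (prompt_tokens * in_price + completion_tokens * out_price) / 1_000_000

print(estimate_cost("gpt-4o-mini", 1000, 500))   # known model -> 0.00045
print(estimate_cost("my-custom-model", 1000, 500))  # unknown model -> None
```

This mirrors the behavior described above: models without a known price still produce a full span, just with a null cost.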

Error Tracking

If the OpenAI call fails, the span records the error and re-raises the exception:

import openai

try:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": query}],
    )
except openai.RateLimitError:
    # span shows status: ERROR, error_message: "Rate limit exceeded"
    handle_rate_limit()
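Because the wrapper re-raises after recording, you can layer your own retry logic on top and each failed attempt still gets its own ERROR span. A minimal generic sketch — the helper name and backoff schedule are illustrative, not part of twosignal:

```python
import time

def retry_with_backoff(fn, retries=3, base_delay=1.0, retryable=(Exception,)):
    """Call fn(), retrying on retryable exceptions with exponential backoff.

    Each failed attempt still produces its own ERROR span from the wrapper.
    """
    for attempt in range(retries):
        try:
            return fn()
        except retryable:
            if attempt == retries - 1:
                raise  # out of retries; let the last error propagate
            time.sleep(base_delay * 2 ** attempt)

# Usage (illustrative) -- pass the OpenAI exception type as retryable:
# result = retry_with_backoff(
#     lambda: client.chat.completions.create(model="gpt-4o-mini", messages=msgs),
#     retryable=(openai.RateLimitError,),
# )
```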

Limitations

  • Only chat.completions.create() is wrapped. Other methods (embeddings, images, etc.) are unaffected.
  • Streaming responses are traced, but token usage may not be available until the stream completes.
  • The wrapper is safe to apply multiple times — it detects if already wrapped.
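On the streaming point: the official OpenAI chat completions API can append a final chunk carrying token usage when you pass stream_options={"include_usage": True}. A sketch of draining such a stream — drain_stream is an illustrative helper, not part of twosignal, shown here against stand-in chunk objects:

```python
from types import SimpleNamespace

def drain_stream(stream):
    """Collect streamed text; usage arrives only on the final chunk (when the
    request was made with stream_options={"include_usage": True})."""
    parts, usage = [], None
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
        if getattr(chunk, "usage", None) is not None:
            usage = chunk.usage  # only the final chunk carries usage
    return "".join(parts), usage

# With a real (wrapped) client this would look like:
# stream = client.chat.completions.create(
#     model="gpt-4o-mini",
#     messages=[{"role": "user", "content": "Hello"}],
#     stream=True,
#     stream_options={"include_usage": True},
# )
# text, usage = drain_stream(stream)
```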

Migrating Existing Code

Add two lines and wrap your client — no other changes needed:

  from twosignal import TwoSignal
+ from twosignal.wrappers.openai import wrap_openai
  from openai import OpenAI

+ ts = TwoSignal()
- client = OpenAI()
+ client = wrap_openai(OpenAI())

  # all your existing code works unchanged
  response = client.chat.completions.create(...)
