OpenAI Wrapper
Wrap your OpenAI client to automatically trace every chat.completions.create() call with model, input messages, output, token usage, and cost.
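Conceptually, the wrapper replaces `chat.completions.create` with a proxy that records the call before delegating to the real method. A minimal sketch of that pattern (illustrative only, not the library's actual implementation; `FakeCompletions`, `trace_create`, and the log format are stand-ins):

```python
import functools

class FakeCompletions:
    """Stand-in for client.chat.completions (illustrative only)."""
    def create(self, model, messages):
        return {"model": model, "content": "hi", "usage": {"total_tokens": 7}}

def trace_create(completions, log):
    """Patch .create() so every call is recorded before delegating."""
    original = completions.create

    @functools.wraps(original)
    def traced(*args, **kwargs):
        response = original(*args, **kwargs)
        log.append({
            "name": f"openai.chat.completions.create({kwargs.get('model')})",
            "usage": response["usage"],
        })
        return response

    completions.create = traced
    return completions

log = []
client = trace_create(FakeCompletions(), log)
client.create(model="gpt-4o-mini", messages=[{"role": "user", "content": "Hello"}])
print(log[0]["name"])  # openai.chat.completions.create(gpt-4o-mini)
```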
Installation
```shell
pip install twosignal[openai]
```
Usage
```python
from twosignal import TwoSignal
from twosignal.wrappers.openai import wrap_openai
from openai import OpenAI

ts = TwoSignal()
client = wrap_openai(OpenAI())

# this call is now automatically traced
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
```
What Gets Captured
| Field | Description |
|---|---|
| `name` | e.g., `openai.chat.completions.create(gpt-4o-mini)` |
| `type` | `LLM` |
| `model` | Model name (e.g., `gpt-4o-mini`) |
| `input` | Input messages array |
| `output` | Full response object |
| `model_parameters` | `temperature`, `max_tokens`, etc. |
| `usage.prompt_tokens` | Input token count |
| `usage.completion_tokens` | Output token count |
| `usage.total_tokens` | Total token count |
| `cost` | Estimated cost in USD |
| `status` | `OK` or `ERROR` |
| `error_message` | Exception message if the call failed |
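The `usage.*` fields map directly onto the `usage` object of the OpenAI response. A sketch of that mapping, using a mocked response shaped like a `ChatCompletion` (the mock and the flat dict are illustrative, not the library's internal span format):

```python
from types import SimpleNamespace

# Mocked response shaped like an OpenAI ChatCompletion (illustrative)
response = SimpleNamespace(
    model="gpt-4o-mini",
    usage=SimpleNamespace(prompt_tokens=12, completion_tokens=30, total_tokens=42),
)

# Build the captured fields from the response, following the table above
span = {
    "name": f"openai.chat.completions.create({response.model})",
    "type": "LLM",
    "model": response.model,
    "usage.prompt_tokens": response.usage.prompt_tokens,
    "usage.completion_tokens": response.usage.completion_tokens,
    "usage.total_tokens": response.usage.total_tokens,
    "status": "OK",
}
print(span["usage.total_tokens"])  # 42
```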
Async Client
```python
from openai import AsyncOpenAI

async_client = wrap_openai(AsyncOpenAI())

response = await async_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
```
Combining with @observe
The wrapper works with @observe — LLM calls become child spans:
```python
@observe
def my_agent(query):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": query}],
    )
    return response.choices[0].message.content

# trace tree:
# my_agent (CUSTOM)
# └── openai.chat.completions.create(gpt-4o-mini) (LLM)
```
Supported Models
Cost tracking works automatically for these models:
`gpt-4o`, `gpt-4o-mini`, `gpt-4.1`, `gpt-4.1-mini`, `gpt-4.1-nano`, `gpt-4-turbo`, `gpt-4`, `gpt-3.5-turbo`, `o3-mini`
Other models are traced normally, but `cost` will be `null`.
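Cost is derived from the token counts and a per-model price table, falling back to `null` for unknown models. A minimal sketch of that behavior, with illustrative prices (not the library's actual pricing data):

```python
# Illustrative per-1M-token prices in USD (input, output) — NOT real pricing data
PRICES = {
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
}

def estimate_cost(model, prompt_tokens, completion_tokens):
    """Return the estimated USD cost, or None for models without pricing."""
    if model not in PRICES:
        return None  # cost stays null for unsupported models
    in_price, out_price = PRICES[model]
    return (prompt_tokens * in_price + completion_tokens * out_price) / 1_000_000

print(estimate_cost("gpt-4o-mini", 1000, 500))     # small fraction of a cent
print(estimate_cost("my-custom-model", 1000, 500))  # None
```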
Error Tracking
If the OpenAI call fails, the span records the error and re-raises the exception:
```python
import openai

try:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": query}],
    )
except openai.RateLimitError:
    # span shows status: ERROR, error_message: "Rate limit exceeded"
    handle_rate_limit()
```
Limitations
- Only `chat.completions.create()` is wrapped. Other methods (embeddings, images, etc.) are unaffected.
- Streaming responses are traced, but token usage may not be available until the stream completes.
- The wrapper is safe to apply multiple times — it detects if already wrapped.
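Double-wrap detection can be done with a marker attribute on the patched method. A minimal sketch of the idea (the `_ts_wrapped` flag and `wrap_once` helper are illustrative, not the library's internals):

```python
def wrap_once(client):
    """Patch client.create with tracing, skipping if already patched."""
    if getattr(client.create, "_ts_wrapped", False):
        return client  # already wrapped — do nothing

    original = client.create

    def traced(*args, **kwargs):
        # ... record the span here ...
        return original(*args, **kwargs)

    traced._ts_wrapped = True  # marker so a second wrap is a no-op
    client.create = traced
    return client

class Client:
    """Stand-in for an OpenAI client (illustrative only)."""
    def create(self, **kwargs):
        return "response"

c = Client()
wrap_once(wrap_once(c))          # safe to apply twice
print(c.create(model="gpt-4o"))  # response
```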
Migrating Existing Code
Add two lines and change the client constructor — no other changes needed:
```diff
  from twosignal import TwoSignal
+ from twosignal.wrappers.openai import wrap_openai
  from openai import OpenAI

+ ts = TwoSignal()
- client = OpenAI()
+ client = wrap_openai(OpenAI())

  # all your existing code works unchanged
  response = client.chat.completions.create(...)
```