OpenAI Wrapper
Wrap your OpenAI client to automatically trace every chat.completions.create() call with model, input messages, output, token usage, and cost.
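Conceptually, the wrapper replaces `chat.completions.create` with a proxy that records the call before delegating to the real method. A minimal sketch of that pattern (illustrative only, not the library's actual implementation; `FakeCompletions`, `trace_create`, and the log format are stand-ins):

```python
import functools

class FakeCompletions:
    """Stand-in for client.chat.completions (illustrative only)."""
    def create(self, model, messages):
        return {"model": model, "content": "hi", "usage": {"total_tokens": 7}}

def trace_create(completions, log):
    """Patch .create() so every call is recorded before delegating."""
    original = completions.create

    @functools.wraps(original)
    def traced(*args, **kwargs):
        response = original(*args, **kwargs)
        log.append({
            "name": f"openai.chat.completions.create({kwargs.get('model')})",
            "usage": response["usage"],
        })
        return response

    completions.create = traced
    return completions

log = []
client = trace_create(FakeCompletions(), log)
client.create(model="gpt-4o-mini", messages=[{"role": "user", "content": "Hello"}])
print(log[0]["name"])  # openai.chat.completions.create(gpt-4o-mini)
```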
Installation
```shell
pip install twosignal[openai]
```
Usage
```python
from twosignal import TwoSignal
from twosignal.wrappers.openai import wrap_openai
from openai import OpenAI

ts = TwoSignal()
client = wrap_openai(OpenAI())

# this call is now automatically traced
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
```
What Gets Captured
| Field | Description |
|---|---|
| `name` | e.g., `openai.chat.completions.create(gpt-4o-mini)` |
| `type` | `LLM` |
| `model` | Model name (e.g., `gpt-4o-mini`) |
| `input` | Input messages array |
| `output` | Full response object |
| `model_parameters` | `temperature`, `max_tokens`, etc. |
| `usage.prompt_tokens` | Input token count |
| `usage.completion_tokens` | Output token count |
| `usage.total_tokens` | Total token count |
| `cost` | Estimated cost in USD |
| `status` | `OK` or `ERROR` |
| `error_message` | Exception message if the call failed |
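The `usage.*` fields map directly onto the `usage` object of the OpenAI response. A sketch of that mapping, using a mocked response shaped like a `ChatCompletion` (the mock and the flat dict are illustrative, not the library's internal span format):

```python
from types import SimpleNamespace

# Mocked response shaped like an OpenAI ChatCompletion (illustrative)
response = SimpleNamespace(
    model="gpt-4o-mini",
    usage=SimpleNamespace(prompt_tokens=12, completion_tokens=30, total_tokens=42),
)

# Build the captured fields from the response, following the table above
span = {
    "name": f"openai.chat.completions.create({response.model})",
    "type": "LLM",
    "model": response.model,
    "usage.prompt_tokens": response.usage.prompt_tokens,
    "usage.completion_tokens": response.usage.completion_tokens,
    "usage.total_tokens": response.usage.total_tokens,
    "status": "OK",
}
print(span["usage.total_tokens"])  # 42
```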
Async Client
```python
from openai import AsyncOpenAI

async_client = wrap_openai(AsyncOpenAI())

response = await async_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
```
Combining with @observe
The wrapper works with @observe — LLM calls become child spans:
```python
@observe
def my_agent(query):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": query}],
    )
    return response.choices[0].message.content

# trace tree:
# my_agent (CUSTOM)
# └── openai.chat.completions.create(gpt-4o-mini) (LLM)
```
Supported Models
Cost tracking works automatically for these models:
`gpt-4o`, `gpt-4o-mini`, `gpt-4.1`, `gpt-4.1-mini`, `gpt-4.1-nano`, `gpt-4-turbo`, `gpt-4`, `gpt-3.5-turbo`, `o3-mini`
Other models are traced normally, but `cost` will be `null`.
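Cost is derived from the token counts and a per-model price table, falling back to `null` for unknown models. A minimal sketch of that behavior, with illustrative prices (not the library's actual pricing data):

```python
# Illustrative per-1M-token prices in USD (input, output) — NOT real pricing data
PRICES = {
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
}

def estimate_cost(model, prompt_tokens, completion_tokens):
    """Return the estimated USD cost, or None for models without pricing."""
    if model not in PRICES:
        return None  # cost stays null for unsupported models
    in_price, out_price = PRICES[model]
    return (prompt_tokens * in_price + completion_tokens * out_price) / 1_000_000

print(estimate_cost("gpt-4o-mini", 1000, 500))     # small fraction of a cent
print(estimate_cost("my-custom-model", 1000, 500))  # None
```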
Error Tracking
If the OpenAI call fails, the span records the error and re-raises the exception:
```python
import openai

try:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": query}],
    )
except openai.RateLimitError:
    # span shows status: ERROR, error_message: "Rate limit exceeded"
    handle_rate_limit()
```
Limitations
- Only `chat.completions.create()` is wrapped. Other methods (embeddings, images, etc.) are unaffected.
- Streaming responses are traced, but token usage may not be available until the stream completes.
- The wrapper is safe to apply multiple times — it detects if already wrapped.
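Double-wrap detection can be done with a marker attribute on the patched method. A minimal sketch of the idea (the `_ts_wrapped` flag and `wrap_once` helper are illustrative, not the library's internals):

```python
def wrap_once(client):
    """Patch client.create with tracing, skipping if already patched."""
    if getattr(client.create, "_ts_wrapped", False):
        return client  # already wrapped — do nothing

    original = client.create

    def traced(*args, **kwargs):
        # ... record the span here ...
        return original(*args, **kwargs)

    traced._ts_wrapped = True  # marker so a second wrap is a no-op
    client.create = traced
    return client

class Client:
    """Stand-in for an OpenAI client (illustrative only)."""
    def create(self, **kwargs):
        return "response"

c = Client()
wrap_once(wrap_once(c))          # safe to apply twice
print(c.create(model="gpt-4o"))  # response
```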
Migrating Existing Code
Add two lines and change the client constructor — no other changes needed:
```diff
  from twosignal import TwoSignal
+ from twosignal.wrappers.openai import wrap_openai
  from openai import OpenAI

+ ts = TwoSignal()
- client = OpenAI()
+ client = wrap_openai(OpenAI())

  # all your existing code works unchanged
  response = client.chat.completions.create(...)
```