Overview

DeepintShield provides complete OpenAI API compatibility through protocol adaptation. The integration handles request transformation, response normalization, and error mapping between OpenAI’s API specification and DeepintShield’s internal processing pipeline.

This integration lets you use DeepintShield features such as governance, load balancing, semantic caching, and multi-provider support while keeping your existing OpenAI SDK-based architecture.

Endpoint: /openai


Install with the OpenAI extra:

pip install "deepintshield[openai]"

from deepintshield import DeepintShield

shield = DeepintShield(virtual_key="sk-bf-your-virtual-key")
client = shield.openai()  # pre-wired openai.OpenAI

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

Use multiple providers through the same OpenAI SDK format by prefixing model names with the provider:

import openai

client = openai.OpenAI(
    base_url="http://localhost:8080/openai",
    api_key="dummy-key"
)

# OpenAI models (default)
openai_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello from OpenAI!"}]
)

# Anthropic models via OpenAI SDK format
anthropic_response = client.chat.completions.create(
    model="anthropic/claude-3-sonnet-20240229",
    messages=[{"role": "user", "content": "Hello from Claude!"}]
)

# Google Vertex models via OpenAI SDK format
vertex_response = client.chat.completions.create(
    model="vertex/gemini-pro",
    messages=[{"role": "user", "content": "Hello from Gemini!"}]
)

# Azure models
azure_response = client.chat.completions.create(
    model="azure/gpt-4o",
    messages=[{"role": "user", "content": "Hello from Azure!"}]
)

# Local Ollama models
ollama_response = client.chat.completions.create(
    model="ollama/llama3.1:8b",
    messages=[{"role": "user", "content": "Hello from Ollama!"}]
)
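
The naming convention used above can be sketched as a small helper. This is only an illustration of the `provider/model` format, assuming that an unprefixed name falls back to OpenAI as the comments in the example indicate; `split_model` is a hypothetical name and not part of DeepintShield or the OpenAI SDK.

```python
def split_model(name: str, default_provider: str = "openai") -> tuple[str, str]:
    """Split a 'provider/model' string into (provider, model).

    Assumption: names without a prefix belong to the default provider.
    Only the first '/' is treated as the separator.
    """
    provider, sep, model = name.partition("/")
    if not sep:
        # No prefix present, e.g. "gpt-4o-mini"
        return default_provider, name
    return provider, model


for name in ("gpt-4o-mini", "anthropic/claude-3-sonnet-20240229", "ollama/llama3.1:8b"):
    print(name, "->", split_model(name))
```

A helper like this can be handy on the client side for logging or for choosing per-provider settings before a request is sent.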

Pass custom headers required by DeepintShield plugins (such as governance or telemetry):

import openai

client = openai.OpenAI(
    base_url="http://localhost:8080/openai",
    api_key="dummy-key",
    default_headers={
        "x-bf-vk": "vk_12345",  # Virtual key for governance
    }
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello with custom headers!"}]
)
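
For orientation, the request that such a client produces can be sketched with the standard library alone. The OpenAI SDK appends `/chat/completions` to `base_url` and sends `api_key` as a Bearer token; the helper name `build_chat_request` is hypothetical, and the snippet only builds the request object without sending it.

```python
import json
import urllib.request


def build_chat_request(base_url: str, virtual_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request carrying x-bf-vk."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer dummy-key",
            "x-bf-vk": virtual_key,  # virtual key for governance
        },
        method="POST",
    )


req = build_chat_request("http://localhost:8080/openai", "vk_12345", "gpt-4o-mini", "Hello!")
print(req.get_method(), req.full_url)
```

Sending it with `urllib.request.urlopen(req)` would require a running DeepintShield instance at that address.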

Pass API keys directly in requests to bypass DeepintShield’s load balancing. Any provider’s API key works (OpenAI, Anthropic, Mistral, etc.), because DeepintShield only looks for the Authorization or x-api-key headers. This requires the Allow Direct API keys option to be enabled in the DeepintShield configuration.

Learn more: See Key Management for enabling direct API key usage.

import openai

# Using OpenAI's API key directly
client_with_direct_key = openai.OpenAI(
    base_url="http://localhost:8080/openai",
    api_key="sk-your-openai-key"  # OpenAI's API key works
)

openai_response = client_with_direct_key.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello from GPT!"}]
)

# Or pass different provider keys per request
client = openai.OpenAI(
    base_url="http://localhost:8080/openai",
    api_key="dummy-key"
)

# Use OpenAI key for GPT models
openai_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello GPT!"}],
    extra_headers={
        "Authorization": "Bearer sk-your-openai-key"
    }
)

# Use Anthropic key for Claude models
anthropic_response = client.chat.completions.create(
    model="anthropic/claude-3-sonnet-20240229",
    messages=[{"role": "user", "content": "Hello Claude!"}],
    extra_headers={
        "x-api-key": "sk-ant-your-anthropic-key"
    }
)

# Use Gemini key for Gemini models
gemini_response = client.chat.completions.create(
    model="gemini/gemini-2.5-flash",
    messages=[{"role": "user", "content": "Hello Gemini!"}],
    extra_headers={
        "x-goog-api-key": "sk-gemini-your-gemini-key"
    }
)
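
The per-request pattern above can be wrapped in a small lookup so callers do not repeat header names. This is a sketch that mirrors only the three header choices shown in the example; the table and the function `direct_key_headers` are hypothetical, and other providers would need their own entries.

```python
# Header name and value template per provider, taken from the example above.
DIRECT_KEY_HEADERS = {
    "openai": ("Authorization", "Bearer {key}"),
    "anthropic": ("x-api-key", "{key}"),
    "gemini": ("x-goog-api-key", "{key}"),
}


def direct_key_headers(model: str, key: str) -> dict[str, str]:
    """Return the extra_headers dict for a direct provider key.

    Assumption: unprefixed model names are OpenAI models.
    """
    provider = model.split("/", 1)[0] if "/" in model else "openai"
    if provider not in DIRECT_KEY_HEADERS:
        raise ValueError(f"no direct-key header mapping for provider {provider!r}")
    header, template = DIRECT_KEY_HEADERS[provider]
    return {header: template.format(key=key)}


print(direct_key_headers("anthropic/claude-3-sonnet-20240229", "sk-ant-your-anthropic-key"))
```

The result can be passed as `extra_headers=direct_key_headers(model, key)` in `client.chat.completions.create(...)`.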

For Azure, you can use the AzureOpenAI client and point it at the DeepintShield integration endpoint. The x-bf-azure-endpoint header is required and specifies your Azure resource endpoint.

from openai import AzureOpenAI

azure_client = AzureOpenAI(
    api_key="your-azure-api-key",
    api_version="2024-02-01",
    azure_endpoint="http://localhost:8080/openai",  # Point to DeepintShield
    default_headers={
        "x-bf-azure-endpoint": "https://your-resource.openai.azure.com"
    }
)

azure_response = azure_client.chat.completions.create(
    model="gpt-4-deployment",  # Your deployment name
    messages=[{"role": "user", "content": "Hello from Azure!"}]
)
print(azure_response.choices[0].message.content)

Submit inference requests asynchronously and poll for results later using the x-bf-async header. This is useful for long-running requests where you don’t want to hold a connection open. See Async Inference for full details.

import openai
import time

client = openai.OpenAI(
    base_url="http://localhost:8080/openai",
    api_key="dummy-key"
)

# Submit async request
initial = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me a short story."}],
    extra_headers={"x-bf-async": "true"}
)

# If choices are present, the request completed synchronously
if initial.choices:
    print(initial.choices[0].message.content)
else:
    # Poll until completed
    while True:
        time.sleep(2)
        poll = client.chat.completions.create(
            model="openai/gpt-4o-mini",
            messages=[{"role": "user", "content": "Tell me a short story."}],
            extra_headers={"x-bf-async-id": initial.id}
        )
        if poll.choices:
            print(poll.choices[0].message.content)
            break

The same flow using the Responses API:

import openai
import time

client = openai.OpenAI(
    base_url="http://localhost:8080/openai",
    api_key="dummy-key"
)

# Submit async request
initial = client.responses.create(
    model="openai/gpt-4o-mini",
    input="Tell me a short story.",
    extra_headers={"x-bf-async": "true"}
)

# If status is "completed", the request completed synchronously
if initial.status == "completed":
    print(initial.output_text)
else:
    # Poll until completed
    while True:
        time.sleep(2)
        poll = client.responses.create(
            model="openai/gpt-4o-mini",
            input="Tell me a short story.",
            extra_headers={"x-bf-async-id": initial.id}
        )
        if poll.status == "completed":
            print(poll.output_text)
            break

| Header | Description |
| --- | --- |
| x-bf-async: true | Submit the request as an async job. Returns immediately with a job ID. |
| x-bf-async-id: <job-id> | Poll for results of a previously submitted async job. |
| x-bf-async-job-result-ttl: <seconds> | Override the default result TTL (default: 3600s). |
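
The polling loops in the examples run until the job completes. In practice a bounded wait is often preferable; the following is a minimal sketch of a generic helper with a timeout. `poll_until` and its parameters are hypothetical names, and the 60-second default is an arbitrary choice, not a DeepintShield setting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    interval: float = 2.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call fetch() every `interval` seconds until is_done(result) is true.

    Raises TimeoutError when the next attempt would start after the deadline.
    `sleep` and `clock` are injectable so the helper can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        if clock() + interval > deadline:
            raise TimeoutError(f"async job not completed within {timeout} seconds")
        sleep(interval)


# Example wiring for the Chat Completions flow (requires a running gateway):
# poll_until(
#     lambda: client.chat.completions.create(
#         model="openai/gpt-4o-mini",
#         messages=[{"role": "user", "content": "Tell me a short story."}],
#         extra_headers={"x-bf-async-id": initial.id},
#     ),
#     lambda r: bool(r.choices),
# )
```

For the Responses API, the completion check would be `lambda r: r.status == "completed"` instead.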

The OpenAI integration supports every feature that is available in both the OpenAI SDK and DeepintShield core functionality: if both support a feature, it works through this integration without extra configuration.