Overview
DeepintShield provides complete OpenAI API compatibility through protocol adaptation. The integration handles request transformation, response normalization, and error mapping between OpenAI’s API specification and DeepintShield’s internal processing pipeline.
This integration enables you to utilize DeepintShield’s features like governance, load balancing, semantic caching, multi-provider support, and more, all while preserving your existing OpenAI SDK-based architecture.
Endpoint: /openai
Install with the OpenAI extra:

    pip install "deepintshield[openai]"

Python (DeepintShield SDK):

    from deepintshield import DeepintShield

    shield = DeepintShield(virtual_key="sk-bf-your-virtual-key")
    client = shield.openai()  # pre-wired openai.OpenAI

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

Python (OpenAI SDK):

    import openai

    client = openai.OpenAI(
        base_url="https://app.deepintshield.com/openai",
        api_key="sk-bf-your-virtual-key",
        default_headers={"x-bf-vk": "sk-bf-your-virtual-key"},
    )

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

JavaScript (OpenAI SDK):

    import OpenAI from "openai";

    const openai = new OpenAI({
      baseURL: "https://app.deepintshield.com/openai",
      apiKey: "sk-bf-your-virtual-key",
      defaultHeaders: { "x-bf-vk": "sk-bf-your-virtual-key" },
    });

    const response = await openai.chat.completions.create({
      model: "gpt-4o-mini",
      messages: [{ role: "user", content: "Hello!" }],
    });

    console.log(response.choices[0].message.content);

Provider/Model Usage Examples
Use multiple providers through the same OpenAI SDK format by prefixing model names with the provider:

Python:

    import openai

    client = openai.OpenAI(
        base_url="http://localhost:8080/openai",
        api_key="dummy-key",
    )

    # OpenAI models (default)
    openai_response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello from OpenAI!"}],
    )

    # Anthropic models via OpenAI SDK format
    anthropic_response = client.chat.completions.create(
        model="anthropic/claude-3-sonnet-20240229",
        messages=[{"role": "user", "content": "Hello from Claude!"}],
    )

    # Google Vertex models via OpenAI SDK format
    vertex_response = client.chat.completions.create(
        model="vertex/gemini-pro",
        messages=[{"role": "user", "content": "Hello from Gemini!"}],
    )

    # Azure models
    azure_response = client.chat.completions.create(
        model="azure/gpt-4o",
        messages=[{"role": "user", "content": "Hello from Azure!"}],
    )

    # Local Ollama models
    ollama_response = client.chat.completions.create(
        model="ollama/llama3.1:8b",
        messages=[{"role": "user", "content": "Hello from Ollama!"}],
    )

JavaScript:

    import OpenAI from "openai";

    const openai = new OpenAI({
      baseURL: "http://localhost:8080/openai",
      apiKey: "dummy-key",
    });

    // OpenAI models (default)
    const openaiResponse = await openai.chat.completions.create({
      model: "gpt-4o-mini",
      messages: [{ role: "user", content: "Hello from OpenAI!" }],
    });

    // Anthropic models via OpenAI SDK format
    const anthropicResponse = await openai.chat.completions.create({
      model: "anthropic/claude-3-sonnet-20240229",
      messages: [{ role: "user", content: "Hello from Claude!" }],
    });

    // Google Vertex models via OpenAI SDK format
    const vertexResponse = await openai.chat.completions.create({
      model: "vertex/gemini-pro",
      messages: [{ role: "user", content: "Hello from Gemini!" }],
    });

    // Azure models
    const azureResponse = await openai.chat.completions.create({
      model: "azure/gpt-4o",
      messages: [{ role: "user", content: "Hello from Azure!" }],
    });

    // Local Ollama models
    const ollamaResponse = await openai.chat.completions.create({
      model: "ollama/llama3.1:8b",
      messages: [{ role: "user", content: "Hello from Ollama!" }],
    });

Adding Custom Headers
Pass custom headers required by DeepintShield plugins (such as governance or telemetry):

Python:

    import openai

    client = openai.OpenAI(
        base_url="http://localhost:8080/openai",
        api_key="dummy-key",
        default_headers={
            "x-bf-vk": "vk_12345",  # Virtual key for governance
        },
    )

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello with custom headers!"}],
    )

JavaScript:

    import OpenAI from "openai";

    const openai = new OpenAI({
      baseURL: "http://localhost:8080/openai",
      apiKey: "dummy-key",
      defaultHeaders: {
        "x-bf-vk": "vk_12345", // Virtual key for governance
      },
    });

    const response = await openai.chat.completions.create({
      model: "gpt-4o-mini",
      messages: [{ role: "user", content: "Hello with custom headers!" }],
    });

Using Direct Keys
Pass API keys directly in requests to bypass DeepintShield’s load balancing. You can pass any provider’s API key (OpenAI, Anthropic, Mistral, etc.) since DeepintShield only looks for Authorization or x-api-key headers. This requires the Allow Direct API keys option to be enabled in DeepintShield configuration.
Learn more: See Key Management for enabling direct API key usage.
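
The per-request examples in this section send each provider's key in a different header. As a minimal sketch, that mapping can be kept in one place. The helper `direct_key_headers` and the table `_DIRECT_KEY_HEADERS` are hypothetical names introduced here for illustration; the header names are copied from this section's examples, not from a DeepintShield API reference.

```python
# Hypothetical helper, not part of the OpenAI SDK or DeepintShield.
# Header names mirror the per-request examples in this section.
_DIRECT_KEY_HEADERS = {
    "openai": ("Authorization", "Bearer {key}"),
    "anthropic": ("x-api-key", "{key}"),
    "gemini": ("x-goog-api-key", "{key}"),
}


def direct_key_headers(provider: str, key: str) -> dict[str, str]:
    """Return the single header that carries `key` for `provider`."""
    try:
        name, template = _DIRECT_KEY_HEADERS[provider]
    except KeyError:
        raise ValueError(
            f"no direct-key header known for provider {provider!r}"
        ) from None
    return {name: template.format(key=key)}
```

The result can be passed as `extra_headers=direct_key_headers("anthropic", "sk-ant-your-anthropic-key")` on an individual request.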
Python:

    import openai

    # Using OpenAI's API key directly
    client_with_direct_key = openai.OpenAI(
        base_url="http://localhost:8080/openai",
        api_key="sk-your-openai-key",  # OpenAI's API key works
    )

    openai_response = client_with_direct_key.chat.completions.create(
        model="openai/gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello from GPT!"}],
    )

    # Or pass different provider keys per request
    client = openai.OpenAI(
        base_url="http://localhost:8080/openai",
        api_key="dummy-key",
    )

    # Use OpenAI key for GPT models
    openai_response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello GPT!"}],
        extra_headers={"Authorization": "Bearer sk-your-openai-key"},
    )

    # Use Anthropic key for Claude models
    anthropic_response = client.chat.completions.create(
        model="anthropic/claude-3-sonnet-20240229",
        messages=[{"role": "user", "content": "Hello Claude!"}],
        extra_headers={"x-api-key": "sk-ant-your-anthropic-key"},
    )

    # Use Gemini key for Gemini models
    gemini_response = client.chat.completions.create(
        model="gemini/gemini-2.5-flash",
        messages=[{"role": "user", "content": "Hello Gemini!"}],
        extra_headers={"x-goog-api-key": "sk-gemini-your-gemini-key"},
    )

JavaScript:

    import OpenAI from "openai";

    // Using OpenAI's API key directly
    const openaiWithDirectKey = new OpenAI({
      baseURL: "http://localhost:8080/openai",
      apiKey: "sk-your-openai-key", // OpenAI's API key works
    });

    const openaiResponse = await openaiWithDirectKey.chat.completions.create({
      model: "openai/gpt-4o-mini",
      messages: [{ role: "user", content: "Hello from GPT!" }],
    });

    // Or pass different provider keys per request
    const openai = new OpenAI({
      baseURL: "http://localhost:8080/openai",
      apiKey: "dummy-key",
    });

    // Per-request headers go in the second (request options) argument,
    // not in the request body.

    // Use OpenAI key for GPT models
    const openaiResponseWithHeader = await openai.chat.completions.create(
      {
        model: "gpt-4o-mini",
        messages: [{ role: "user", content: "Hello GPT!" }],
      },
      { headers: { "Authorization": "Bearer sk-your-openai-key" } }
    );

    // Use Anthropic key for Claude models
    const anthropicResponseWithHeader = await openai.chat.completions.create(
      {
        model: "anthropic/claude-3-sonnet-20240229",
        messages: [{ role: "user", content: "Hello Claude!" }],
      },
      { headers: { "x-api-key": "sk-ant-your-anthropic-key" } }
    );

    // Use Gemini key for Gemini models
    const geminiResponseWithHeader = await openai.chat.completions.create(
      {
        model: "gemini/gemini-2.5-flash",
        messages: [{ role: "user", content: "Hello Gemini!" }],
      },
      { headers: { "x-goog-api-key": "sk-gemini-your-gemini-key" } }
    );

For Azure, you can use the AzureOpenAI client and point it to the DeepintShield integration endpoint. The x-bf-azure-endpoint header is required to specify your Azure resource endpoint.
Python:

    from openai import AzureOpenAI

    azure_client = AzureOpenAI(
        api_key="your-azure-api-key",
        api_version="2024-02-01",
        azure_endpoint="http://localhost:8080/openai",  # Point to DeepintShield
        default_headers={
            "x-bf-azure-endpoint": "https://your-resource.openai.azure.com",
        },
    )

    azure_response = azure_client.chat.completions.create(
        model="gpt-4-deployment",  # Your deployment name
        messages=[{"role": "user", "content": "Hello from Azure!"}],
    )
    print(azure_response.choices[0].message.content)

JavaScript:

    import { AzureOpenAI } from "openai";

    const azureClient = new AzureOpenAI({
      apiKey: "your-azure-api-key",
      apiVersion: "2024-02-01",
      baseURL: "http://localhost:8080/openai", // Point to DeepintShield
      defaultHeaders: {
        "x-bf-azure-endpoint": "https://your-resource.openai.azure.com",
      },
    });

    const azureResponse = await azureClient.chat.completions.create({
      model: "gpt-4-deployment", // Your deployment name
      messages: [{ role: "user", content: "Hello from Azure!" }],
    });

    console.log(azureResponse.choices[0].message.content);

Async Inference
Submit inference requests asynchronously and poll for results later using the x-bf-async header. This is useful for long-running requests where you don’t want to hold a connection open. See Async Inference for full details.
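
A polling loop for a long-running job usually needs an upper bound, so that a job which never completes does not block the caller forever. The following is a minimal sketch of such a loop, independent of any SDK. `poll_until`, its parameters, and its defaults are hypothetical names chosen for illustration and are not part of DeepintShield or the OpenAI SDK.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call `fetch` repeatedly until `is_done(result)` is true.

    Raises TimeoutError if the next wait would pass the deadline.
    `sleep` and `clock` are injectable so the loop can be tested
    without real delays.
    """
    deadline = clock() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        if clock() + interval > deadline:
            raise TimeoutError(f"job not finished after {timeout} seconds")
        sleep(interval)
```

With the OpenAI SDK, `fetch` would wrap the polling request (the one carrying the `x-bf-async-id` header) and `is_done` would check for `choices` or for `status == "completed"`, depending on the API used.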
Chat Completions
Python:

    import openai
    import time

    client = openai.OpenAI(
        base_url="http://localhost:8080/openai",
        api_key="dummy-key",
    )

    # Submit async request
    initial = client.chat.completions.create(
        model="openai/gpt-4o-mini",
        messages=[{"role": "user", "content": "Tell me a short story."}],
        extra_headers={"x-bf-async": "true"},
    )

    # If choices are present, the request completed synchronously
    if initial.choices:
        print(initial.choices[0].message.content)
    else:
        # Poll until completed
        while True:
            time.sleep(2)
            poll = client.chat.completions.create(
                model="openai/gpt-4o-mini",
                messages=[{"role": "user", "content": "Tell me a short story."}],
                extra_headers={"x-bf-async-id": initial.id},
            )
            if poll.choices:
                print(poll.choices[0].message.content)
                break

JavaScript:

    import OpenAI from "openai";

    const openai = new OpenAI({
      baseURL: "http://localhost:8080/openai",
      apiKey: "dummy-key",
    });

    // Submit async request
    const initial = await openai.chat.completions.create(
      {
        model: "openai/gpt-4o-mini",
        messages: [{ role: "user", content: "Tell me a short story." }],
      },
      { headers: { "x-bf-async": "true" } }
    );

    // If choices are present, the request completed synchronously
    if (initial.choices?.length > 0) {
      console.log(initial.choices[0].message.content);
    } else {
      // Poll until completed
      while (true) {
        await new Promise((r) => setTimeout(r, 2000));
        const poll = await openai.chat.completions.create(
          {
            model: "openai/gpt-4o-mini",
            messages: [{ role: "user", content: "Tell me a short story." }],
          },
          { headers: { "x-bf-async-id": initial.id } }
        );
        if (poll.choices?.length > 0) {
          console.log(poll.choices[0].message.content);
          break;
        }
      }
    }

Responses API
Python:

    import openai
    import time

    client = openai.OpenAI(
        base_url="http://localhost:8080/openai",
        api_key="dummy-key",
    )

    # Submit async request
    initial = client.responses.create(
        model="openai/gpt-4o-mini",
        input="Tell me a short story.",
        extra_headers={"x-bf-async": "true"},
    )

    # If status is "completed", the request completed synchronously
    if initial.status == "completed":
        print(initial.output_text)
    else:
        # Poll until completed
        while True:
            time.sleep(2)
            poll = client.responses.create(
                model="openai/gpt-4o-mini",
                input="Tell me a short story.",
                extra_headers={"x-bf-async-id": initial.id},
            )
            if poll.status == "completed":
                print(poll.output_text)
                break

JavaScript:

    import OpenAI from "openai";

    const openai = new OpenAI({
      baseURL: "http://localhost:8080/openai",
      apiKey: "dummy-key",
    });

    // Submit async request
    const initial = await openai.responses.create(
      { model: "openai/gpt-4o-mini", input: "Tell me a short story." },
      { headers: { "x-bf-async": "true" } }
    );

    // If status is "completed", the request completed synchronously
    if (initial.status === "completed") {
      console.log(initial.output_text);
    } else {
      // Poll until completed
      while (true) {
        await new Promise((r) => setTimeout(r, 2000));
        const poll = await openai.responses.create(
          { model: "openai/gpt-4o-mini", input: "Tell me a short story." },
          { headers: { "x-bf-async-id": initial.id } }
        );
        if (poll.status === "completed") {
          console.log(poll.output_text);
          break;
        }
      }
    }

Async Headers
| Header | Description |
|---|---|
| `x-bf-async: true` | Submit the request as an async job. Returns immediately with a job ID. |
| `x-bf-async-id: <job-id>` | Poll for results of a previously submitted async job. |
| `x-bf-async-job-result-ttl: <seconds>` | Override the default result TTL (default: 3600s). |
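
As a small sketch, the three headers above can be built by two helpers and passed through the SDK's `extra_headers` argument. The function names are hypothetical. Sending the TTL header together with the submit request, and requiring the TTL to be a positive whole number of seconds, are assumptions made here and are not confirmed by this page.

```python
from typing import Optional


def async_submit_headers(result_ttl_seconds: Optional[int] = None) -> dict[str, str]:
    """Headers for submitting a request as an async job.

    Assumption: the TTL override is sent on the submit request.
    """
    headers = {"x-bf-async": "true"}
    if result_ttl_seconds is not None:
        if result_ttl_seconds <= 0:
            raise ValueError("result_ttl_seconds must be positive")
        headers["x-bf-async-job-result-ttl"] = str(result_ttl_seconds)
    return headers


def async_poll_headers(job_id: str) -> dict[str, str]:
    """Headers for polling a previously submitted async job."""
    if not job_id:
        raise ValueError("job_id must be a non-empty string")
    return {"x-bf-async-id": job_id}
```

For example, `extra_headers=async_submit_headers(600)` would request a ten-minute result TTL under these assumptions, and `extra_headers=async_poll_headers(initial.id)` would poll for the result.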
Supported Features
The OpenAI integration supports all features that are available in both the OpenAI SDK and DeepintShield core functionality. If the OpenAI SDK supports a feature and DeepintShield supports it, the integration will work seamlessly.
Next Steps
- Files and Batch API - File uploads and batch processing
- Anthropic SDK - Claude integration patterns
- Google GenAI SDK - Gemini integration patterns
- Configuration - DeepintShield setup and configuration
- Core Features - Advanced DeepintShield capabilities