Cerebras
Overview
Section titled “Overview”Cerebras is a fully OpenAI-compatible provider leveraging the complete set of OpenAI API features. DeepIntShield delegates all functionality to the OpenAI provider implementation with standard parameter filtering. Key characteristics:
- Complete OpenAI compatibility - All chat, text, and streaming features supported
- Full tool calling - Function definitions and parallel tool execution
- Streaming support - Server-Sent Events with token usage tracking
- Parameter preservation - Passes through all standard OpenAI parameters
- Responses API - Full support with format conversion
Supported Operations
Section titled “Supported Operations”| Operation | Non-Streaming | Streaming | Endpoint |
|---|---|---|---|
| Chat Completions | ✅ | ✅ | /v1/chat/completions |
| Responses API | ✅ | ✅ | /v1/chat/completions |
| Text Completions | ✅ | ✅ | /v1/completions |
| List Models | ✅ | - | /v1/models |
| Embeddings | ❌ | ❌ | - |
| Image Generation | ❌ | ❌ | - |
| Speech (TTS) | ❌ | ❌ | - |
| Transcriptions (STT) | ❌ | ❌ | - |
| Files | ❌ | ❌ | - |
| Batch | ❌ | ❌ | - |
1. Chat Completions
Section titled “1. Chat Completions”Request Parameters
Section titled “Request Parameters”Cerebras supports all standard OpenAI chat completion parameters. For full parameter reference and behavior, see OpenAI Chat Completions.
Filtered Parameters
Section titled “Filtered Parameters”Removed for Cerebras compatibility:
prompt_cache_key- Not supportedverbosity- Anthropic-specificstore- Not supportedservice_tier- OpenAI-specific
Reasoning Parameter
Section titled “Reasoning Parameter”Cerebras delegates to OpenAI via ToOpenAIChatRequest, so reasoning parameters are transformed: reasoning.effort values (e.g., minimal → low) are mapped per the OpenAI-compatible providers convention, and reasoning.max_tokens is cleared/omitted (removed during conversion).
Cerebras supports all standard OpenAI message types, tools, responses, and streaming formats. For details on message handling, tool conversion, responses, and streaming, refer to OpenAI Chat Completions.
2. Responses API
Section titled “2. Responses API”DeepIntShield converts Responses API format to Chat Completions internally, then converts response back:
DeepIntShieldResponsesRequest → ToChatRequest() → ChatCompletion → ToDeepIntShieldResponsesResponse()Same parameter support as Chat Completions with response format differences (output items instead of message content).
3. Text Completions
Section titled “3. Text Completions”Cerebras supports legacy text completion API:
| Parameter | Mapping |
|---|---|
prompt | Sent as-is |
max_tokens | max_tokens |
temperature | temperature |
top_p | top_p |
stop | stop sequences |
Response returns choices[].text with completion text.
4. Text Completions Streaming
Section titled “4. Text Completions Streaming”Streaming text completions use same SSE format as chat streaming.
5. List Models
Section titled “5. List Models”Lists available models from Cerebras with capabilities and context length information.
Unsupported Features
Section titled “Unsupported Features”| Feature | Reason |
|---|---|
| Embedding | Not offered by Cerebras API |
| Image Generation | Not offered by Cerebras API |
| Speech/TTS | Not offered by Cerebras API |
| Transcription/STT | Not offered by Cerebras API |
| Batch Operations | Not offered by Cerebras API |
| File Management | Not offered by Cerebras API |
Caveats
Section titled “Caveats”User Field Size Limit
Severity: Low Behavior: User field > 64 characters is silently dropped Impact: Longer user identifiers are lost Code: SanitizeUserField enforces 64-char max