Cerebras

Overview

Cerebras is a fully OpenAI-compatible provider leveraging the complete set of OpenAI API features. DeepIntShield delegates all functionality to the OpenAI provider implementation with standard parameter filtering. Key characteristics:

Complete OpenAI compatibility - All chat, text, and streaming features supported
Full tool calling - Function definitions and parallel tool execution
Streaming support - Server-Sent Events with token usage tracking
Parameter preservation - Passes through all standard OpenAI parameters
Responses API - Full support with format conversion

Supported Operations

Operation	Non-Streaming	Streaming	Endpoint
Chat Completions	✅	✅	`/v1/chat/completions`
Responses API	✅	✅	`/v1/chat/completions`
Text Completions	✅	✅	`/v1/completions`
List Models	✅	-	`/v1/models`
Embeddings	❌	❌	-
Image Generation	❌	❌	-
Speech (TTS)	❌	❌	-
Transcriptions (STT)	❌	❌	-
Files	❌	❌	-
Batch	❌	❌	-

1. Chat Completions

Request Parameters

Cerebras supports all standard OpenAI chat completion parameters. For full parameter reference and behavior, see OpenAI Chat Completions.

Filtered Parameters

Removed for Cerebras compatibility:

prompt_cache_key - Not supported
verbosity - Anthropic-specific
store - Not supported
service_tier - OpenAI-specific

Reasoning Parameter

Cerebras delegates to OpenAI via ToOpenAIChatRequest, so reasoning parameters are transformed: reasoning.effort values (e.g., minimal → low) are mapped per the OpenAI-compatible providers convention, and reasoning.max_tokens is cleared/omitted (removed during conversion).

Cerebras supports all standard OpenAI message types, tools, responses, and streaming formats. For details on message handling, tool conversion, responses, and streaming, refer to OpenAI Chat Completions.

2. Responses API

DeepIntShield converts Responses API format to Chat Completions internally, then converts response back:

DeepIntShieldResponsesRequest
  → ToChatRequest()
  → ChatCompletion
  → ToDeepIntShieldResponsesResponse()

Same parameter support as Chat Completions with response format differences (output items instead of message content).

3. Text Completions

Cerebras supports legacy text completion API:

Parameter	Mapping
`prompt`	Sent as-is
`max_tokens`	max_tokens
`temperature`	temperature
`top_p`	top_p
`stop`	stop sequences

Response returns choices[].text with completion text.

4. Text Completions Streaming

Streaming text completions use same SSE format as chat streaming.

5. List Models

Lists available models from Cerebras with capabilities and context length information.

Unsupported Features

Feature	Reason
Embedding	Not offered by Cerebras API
Image Generation	Not offered by Cerebras API
Speech/TTS	Not offered by Cerebras API
Transcription/STT	Not offered by Cerebras API
Batch Operations	Not offered by Cerebras API
File Management	Not offered by Cerebras API

Caveats

User Field Size Limit

Severity: Low Behavior: User field > 64 characters is silently dropped Impact: Longer user identifiers are lost Code: SanitizeUserField enforces 64-char max