
Perplexity

Perplexity is an OpenAI-compatible API with built-in web search capabilities and reasoning support. DeepIntShield performs conversions including:

  • OpenAI-compatible base - Uses OpenAI’s chat format as foundation
  • Web search parameters - Search mode, domain filters, recency filters, and location-based search
  • Reasoning effort mapping - reasoning.effort mapped to Perplexity’s reasoning_effort with special handling for “minimal”
  • Search results inclusion - Citations, search results, and videos included in response
  • Special usage tracking - Citation tokens, search queries, and reasoning tokens tracked separately
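| Operation | Non-Streaming | Streaming | Endpoint |
|---|---|---|---|
| Chat Completions | Yes | Yes | /chat/completions |
| Responses API | Yes | Yes | /chat/completions |
| Text Completions | No | No | - |
| Embeddings | No | No | - |
| Image Generation | No | No | - |
| Speech (TTS) | No | No | - |
| Transcriptions (STT) | No | No | - |
| Files | No | No | - |
| Batch | No | No | - |
| List Models | No | No | - |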

Perplexity supports most OpenAI chat completion parameters. For standard parameter reference, see OpenAI Chat Completions.

  • No function calling: tools and tool_choice are silently dropped
  • Dropped parameters: stop, logit_bias, logprobs, top_logprobs, seed, parallel_tool_calls, service_tier
  • Reasoning: Uses reasoning_effort instead of reasoning object (see Reasoning & Effort)
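The parameter handling listed above can be pictured as a filter over the incoming request body. The sketch below is illustrative only: the name `strip_unsupported` and the `DROPPED_PARAMS` set are hypothetical and are not taken from DeepIntShield's code; only the parameter names come from the list above.

```python
# Parameters the list above documents as silently dropped for Perplexity.
# The set and the function name are hypothetical, for illustration only.
DROPPED_PARAMS = {
    "tools", "tool_choice",
    "stop", "logit_bias", "logprobs", "top_logprobs",
    "seed", "parallel_tool_calls", "service_tier",
}


def strip_unsupported(request: dict) -> dict:
    """Return a copy of the request without the dropped parameters."""
    return {k: v for k, v in request.items() if k not in DROPPED_PARAMS}


cleaned = strip_unsupported({
    "model": "sonar",
    "messages": [{"role": "user", "content": "Hi"}],
    "stop": ["\n"],
    "tools": [],
    "temperature": 0.2,
})
```

Because the drop is silent, a client that relies on `stop` or `tools` gets no error. Running a check like this on the client side is one way to notice the mismatch early.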

Use extra_params (SDK) or pass them directly in the request body (Gateway) to set Perplexity-specific search and configuration fields:

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sonar",
    "messages": [{"role": "user", "content": "What is the latest news?"}],
    "search_mode": "web",
    "language_preference": "en",
    "return_images": true,
    "return_related_questions": true,
    "disable_search": false,
    "search_domain_filter": ["news.example.com"],
    "search_recency_filter": "week"
  }'
| Parameter | Type | Description |
|---|---|---|
| search_mode | string | Search mode: "web", "academic", "news", etc. |
| language_preference | string | Language preference (e.g., "en", "fr") |
| search_domain_filter | string[] | Restrict search to specific domains |
| return_images | boolean | Include images in search results |
| return_related_questions | boolean | Return related questions |
| search_recency_filter | string | Recency filter: "hour", "day", "week", "month", "year" |
| search_after_date_filter | string | Search results after date (ISO format) |
| search_before_date_filter | string | Search results before date (ISO format) |
| last_updated_after_filter | string | Content last updated after date |
| last_updated_before_filter | string | Content last updated before date |
| disable_search | boolean | Disable web search entirely |
| enable_search_classifier | boolean | Enable search classifier |
| top_k | integer | Top-k results to use |

| Parameter | Type | Description |
|---|---|---|
| web_search_options | object[] | Array of web search option configurations with user location support |
| media_response.overrides.return_videos | boolean | Return videos in results |
| media_response.overrides.return_images | boolean | Return images in results |

Configure detailed search behavior including location:

{
  "web_search_options": [
    {
      "search_context_size": "high",
      "user_location": {
        "latitude": 40.7128,
        "longitude": -74.0060,
        "city": "New York",
        "country": "US",
        "region": "NY"
      },
      "image_search_relevance_enhanced": true
    }
  ]
}
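For clients that assemble these bodies programmatically, a small helper can keep the search fields consistent. The following is a minimal sketch, assuming a hypothetical client-side function `build_search_request`. It is not part of DeepIntShield or any SDK, and it validates only the recency values listed in the parameter table.

```python
# Recency values taken from the parameter table above.
RECENCY_VALUES = {"hour", "day", "week", "month", "year"}


def build_search_request(model, prompt, recency=None, domains=None, user_location=None):
    """Build a Chat Completions body with optional Perplexity search fields.

    Hypothetical helper for illustration; field names follow the tables above.
    """
    if recency is not None and recency not in RECENCY_VALUES:
        raise ValueError(f"unsupported search_recency_filter: {recency!r}")

    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if recency is not None:
        body["search_recency_filter"] = recency
    if domains:
        body["search_domain_filter"] = list(domains)
    if user_location:
        # web_search_options is documented as an array of option objects.
        body["web_search_options"] = [{"user_location": dict(user_location)}]
    return body


body = build_search_request(
    "sonar",
    "What is the latest news?",
    recency="week",
    domains=["news.example.com"],
    user_location={"city": "New York", "country": "US", "region": "NY"},
)
```

The resulting dictionary can be serialized with `json.dumps` and sent as the request body to the gateway's /v1/chat/completions endpoint.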
  • reasoning.effort → reasoning_effort
  • Supported efforts: "low", "medium", "high"
  • Special conversion: "minimal" → "low" (Perplexity normalizes to low/medium/high)
  • reasoning.max_tokens is silently dropped (Perplexity doesn’t support token budget control)
// Request
{"reasoning": {"effort": "high"}}
// Perplexity conversion
{"reasoning_effort": "high"}
// Special case: "minimal" effort
{"reasoning": {"effort": "minimal"}}
→ {"reasoning_effort": "low"}
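The mapping rules above fit in a few lines. This is a minimal sketch of the documented behavior, not the code in chat.go; the function name `convert_reasoning` is hypothetical.

```python
def convert_reasoning(request: dict) -> dict:
    """Replace the OpenAI-style reasoning object with reasoning_effort.

    Hypothetical illustration of the documented rules:
    - reasoning.effort becomes reasoning_effort
    - "minimal" is mapped to "low"
    - reasoning.max_tokens is dropped
    """
    out = {k: v for k, v in request.items() if k != "reasoning"}
    reasoning = request.get("reasoning") or {}
    effort = reasoning.get("effort")
    if effort is not None:
        out["reasoning_effort"] = "low" if effort == "minimal" else effort
    # reasoning.max_tokens is never copied, so it is dropped implicitly.
    return out


converted = convert_reasoning(
    {"model": "sonar", "reasoning": {"effort": "minimal", "max_tokens": 1000}}
)
```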

Perplexity responses include additional fields for search integration:

  • citations[] - Source citations from search
  • search_results[] - Full search results with metadata
  • videos[] - Video results from search

These fields are preserved in the DeepIntShield response for client use.

Extended usage tracking specific to Perplexity:

| Field | Source | Description |
|---|---|---|
| completion_tokens_details.citation_tokens | usage.citation_tokens | Tokens used for citations |
| completion_tokens_details.num_search_queries | usage.num_search_queries | Number of web search queries performed |
| completion_tokens_details.reasoning_tokens | usage.reasoning_tokens | Tokens consumed by reasoning process |
| usage.cost | usage.cost | Cost of the request |
{
  "id": "...",
  "choices": [...],
  "usage": {
    "prompt_tokens": 100,
    "completion_tokens": 150,
    "total_tokens": 250,
    "completion_tokens_details": {
      "citation_tokens": 25,
      "num_search_queries": 3,
      "reasoning_tokens": 40
    },
    "cost": { "prompt_cost": 0.001, "completion_cost": 0.002 }
  },
  "citations": ["https://example.com/article1", "https://example.com/article2"],
  "search_results": [
    {
      "title": "...",
      "url": "...",
      "snippet": "...",
      "date": "2025-01-15"
    }
  ],
  "videos": [
    {
      "title": "...",
      "url": "...",
      "duration": 300
    }
  ]
}
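A client can read the search-specific fields straight from the parsed response. The sketch below assumes the response shape shown in the example above; `summarize_usage` is a hypothetical helper name, and missing fields default to zero or an empty list as a defensive choice rather than documented behavior.

```python
def summarize_usage(response: dict) -> dict:
    """Collect the Perplexity-specific usage and citation fields."""
    usage = response.get("usage", {})
    details = usage.get("completion_tokens_details", {})
    return {
        "citation_tokens": details.get("citation_tokens", 0),
        "num_search_queries": details.get("num_search_queries", 0),
        "reasoning_tokens": details.get("reasoning_tokens", 0),
        "citations": response.get("citations", []),
    }


# Values copied from the example response above.
sample = {
    "usage": {
        "prompt_tokens": 100,
        "completion_tokens": 150,
        "total_tokens": 250,
        "completion_tokens_details": {
            "citation_tokens": 25,
            "num_search_queries": 3,
            "reasoning_tokens": 40,
        },
    },
    "citations": ["https://example.com/article1", "https://example.com/article2"],
}
summary = summarize_usage(sample)
```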

Perplexity uses the OpenAI-compatible streaming format. Event sequence:

  • chat.completion.chunk events with delta updates
  • Standard OpenAI finish reason mapping
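To illustrate how a client might consume such a stream, the sketch below accumulates delta content from server-sent event lines. It assumes the usual OpenAI conventions of `data: ` prefixed lines and a closing `data: [DONE]` sentinel; whether the gateway emits the sentinel is an assumption here, and `collect_stream` is a hypothetical helper name.

```python
import json


def collect_stream(lines):
    """Join delta content from chat.completion.chunk events.

    Returns (full_text, finish_reason). Lines that do not start with
    "data:" are ignored.
    """
    text_parts = []
    finish_reason = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                text_parts.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return "".join(text_parts), finish_reason


# Hand-written sample events, not captured from a real stream.
events = [
    'data: {"object": "chat.completion.chunk", "choices": [{"delta": {"content": "Hel"}, "finish_reason": null}]}',
    'data: {"object": "chat.completion.chunk", "choices": [{"delta": {"content": "lo"}, "finish_reason": "stop"}]}',
    "data: [DONE]",
]
text, reason = collect_stream(events)
```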

No Tool Support

Severity: High
Behavior: Tool-related parameters are silently dropped
Impact: Function calling not available
Code: chat.go:8-36

Reasoning Effort Mapping

Severity: Medium
Behavior: "minimal" effort is mapped to "low" (Perplexity only supports low/medium/high)
Impact: Requested minimal effort becomes low effort
Code: chat.go:30-36, responses.go:25-30

Reasoning Max Tokens Dropped

Severity: Low
Behavior: reasoning.max_tokens is silently dropped
Impact: No control over reasoning token budget
Code: chat.go:29-36

Stop Sequences Not Supported

Severity: Low
Behavior: stop parameter is silently dropped
Impact: Stop sequences not enforced
Code: chat.go:8-36


The Responses API is adapted for Perplexity by converting to the Chat Completions format internally and returning results in Responses format.

| Parameter | Transformation |
|---|---|
| max_output_tokens | Direct pass-through to max_tokens |
| temperature, top_p | Direct pass-through |
| instructions | Converted to system message (prepended) |
| reasoning.effort | Mapped to reasoning_effort (see Reasoning & Effort) |
| text.format | Passed through as response_format |
| input (string/array) | Converted to messages |

The Responses API accepts the same Perplexity-specific search and configuration parameters as Chat Completions (see Perplexity-Specific Parameters).

curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sonar",
    "instructions": "You are a helpful assistant with web search capabilities",
    "input": "What is the latest news in technology?",
    "search_mode": "news",
    "return_images": true
  }'
  • instructions becomes a system message prepended to input messages
  • input (string or array) converted to user message(s)
  • Response converted to Responses API format with same search results and extended usage details
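The request-side conversion described above can be sketched as follows. This is a simplified illustration, not the code in responses.go: `responses_to_chat` is a hypothetical name, array input items are assumed to already carry `role` and `content`, and the `reasoning` and `text.format` mappings are left out for brevity.

```python
# Responses-only fields that are consumed or omitted by this sketch.
RESPONSES_ONLY = {"instructions", "input", "max_output_tokens", "reasoning", "text"}


def responses_to_chat(request: dict) -> dict:
    """Convert a Responses API request into a Chat Completions request."""
    messages = []
    if request.get("instructions"):
        # instructions becomes a system message placed before the input.
        messages.append({"role": "system", "content": request["instructions"]})

    user_input = request.get("input")
    if isinstance(user_input, str):
        messages.append({"role": "user", "content": user_input})
    elif isinstance(user_input, list):
        # Assumption: each item is already a {"role": ..., "content": ...} dict.
        messages.extend(user_input)

    # model, temperature, top_p and the Perplexity search fields pass through.
    chat = {k: v for k, v in request.items() if k not in RESPONSES_ONLY}
    chat["messages"] = messages
    if "max_output_tokens" in request:
        chat["max_tokens"] = request["max_output_tokens"]
    return chat


chat_request = responses_to_chat({
    "model": "sonar",
    "instructions": "You are a helpful assistant with web search capabilities",
    "input": "What is the latest news in technology?",
    "search_mode": "news",
    "return_images": True,
    "max_output_tokens": 256,
})
```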

Responses are handled the same way as for Chat Completions: search results, citations, and extended usage tracking are preserved.

Responses streaming uses the same OpenAI-compatible streaming as Chat Completions, with results adapted to Responses format.