Reranking
Use reranking to sort documents by relevance for search, retrieval, and context selection.
Provider Model Examples
- Cohere: cohere/rerank-v3.5
- vLLM: vllm/BAAI/bge-reranker-v2-m3
- Bedrock: bedrock/<rerank-model-or-arn>
- Vertex AI: vertex/<ranking-model>
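The model identifiers above pair a provider prefix with a model name that may itself contain slashes (as in the vLLM example). A minimal sketch of splitting such an identifier, assuming the provider is everything before the first slash; `split_model_id` is a hypothetical helper, not part of DeepIntShield:

```python
from __future__ import annotations


def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider/model' on the FIRST slash only (an assumption)."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected provider/model, got {model_id!r}")
    return provider, model


print(split_model_id("cohere/rerank-v3.5"))
print(split_model_id("vllm/BAAI/bge-reranker-v2-m3"))
```

Splitting on the first slash keeps namespaced model names such as BAAI/bge-reranker-v2-m3 intact.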
Basic Request
    curl --location 'http://localhost:8080/v1/rerank' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "cohere/rerank-v3.5",
        "query": "What is DeepIntShield?",
        "documents": [
          {"text": "DeepIntShield is an AI gateway that unifies many LLM providers."},
          {"text": "Paris is the capital of France."},
          {"text": "DeepIntShield exposes an OpenAI-compatible API."}
        ]
      }'

Request Parameters
- model (required): model in provider/model format
- query (required): query used for ranking
- documents (required): array of documents with text (optional id, meta)
- top_n (optional): maximum number of results
- max_tokens_per_doc (optional): provider-dependent document token cap
- priority (optional): provider-dependent priority hint
- return_documents (optional): include matched document content in each result
- fallbacks (optional): fallback models in provider/model format
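The parameters above can be assembled programmatically. The following is a minimal sketch using only the Python standard library; `build_rerank_payload` and `send_rerank` are hypothetical helper names, and the URL is the localhost address from the curl example. Optional fields are left out of the body when unset, which is an assumption about how a client would normally call the endpoint.

```python
from __future__ import annotations

import json
import urllib.request
from typing import Any


def build_rerank_payload(
    model: str,
    query: str,
    documents: list[dict[str, Any]],
    **optional: Any,
) -> dict[str, Any]:
    """Build a rerank request body; optional fields set to None are omitted."""
    payload: dict[str, Any] = {"model": model, "query": query, "documents": documents}
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


def send_rerank(payload: dict[str, Any], url: str = "http://localhost:8080/v1/rerank") -> dict[str, Any]:
    """POST the payload as JSON and decode the JSON response (not called below)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


payload = build_rerank_payload(
    model="cohere/rerank-v3.5",
    query="What is DeepIntShield?",
    documents=[
        {"text": "DeepIntShield is an AI gateway that unifies many LLM providers."},
        {"text": "Paris is the capital of France."},
    ],
    top_n=1,
    return_documents=None,  # unset, so it is dropped from the body
)
print(json.dumps(payload, indent=2))
```

Calling `send_rerank(payload)` requires a gateway listening at the given URL, so the sketch only prints the request body.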
Example with Options
    curl --location 'http://localhost:8080/v1/rerank' \
      --header 'Content-Type: application/json' \
      --data '{
        "model": "cohere/rerank-v3.5",
        "query": "gateway observability",
        "top_n": 2,
        "return_documents": true,
        "documents": [
          {"id": "a", "text": "DeepIntShield supports observability plugins like OTEL and Maxim."},
          {"id": "b", "text": "DeepIntShield can run in Kubernetes and ECS."},
          {"id": "c", "text": "Token counting is available at /v1/responses/input_tokens."}
        ]
      }'

vLLM Endpoint Compatibility
When using a vllm/... model, DeepIntShield sends rerank requests to /v1/rerank first and automatically retries on /rerank when the upstream endpoint responds with 404, 405, or 501.
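The retry rule described above can be illustrated with a short sketch. This is not the gateway's implementation, only a model of the documented behavior; `rerank_with_path_fallback`, the `post` callable, and `fake_upstream` are hypothetical names introduced for the example.

```python
from __future__ import annotations

from typing import Callable

# Status codes that trigger a retry on the alternate path, per the text above.
RETRYABLE_STATUSES = {404, 405, 501}
PRIMARY_PATH = "/v1/rerank"
FALLBACK_PATH = "/rerank"


def rerank_with_path_fallback(
    post: Callable[[str], tuple[int, dict]],
) -> tuple[str, int, dict]:
    """Try /v1/rerank first, then /rerank if the status is 404, 405, or 501.

    `post` takes a path and returns (status_code, body). Returns the path
    that produced the final answer together with its status and body.
    """
    status, body = post(PRIMARY_PATH)
    if status in RETRYABLE_STATUSES:
        status, body = post(FALLBACK_PATH)
        return FALLBACK_PATH, status, body
    return PRIMARY_PATH, status, body


def fake_upstream(path: str) -> tuple[int, dict]:
    """Simulated upstream that only serves /rerank."""
    if path == FALLBACK_PATH:
        return 200, {"results": []}
    return 404, {"error": "not found"}


used_path, status, body = rerank_with_path_fallback(fake_upstream)
print(used_path, status)  # prints: /rerank 200
```

Other error statuses, such as 500, are returned as-is in this sketch because the text lists only 404, 405, and 501 as retry triggers.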
Response Shape
    {
      "results": [
        {
          "index": 0,
          "relevance_score": 0.98,
          "document": {
            "id": "a",
            "text": "DeepIntShield supports observability plugins like OTEL and Maxim."
          }
        },
        {
          "index": 2,
          "relevance_score": 0.63,
          "document": {
            "id": "c",
            "text": "Token counting is available at /v1/responses/input_tokens."
          }
        }
      ],
      "model": "rerank-v3.5",
      "usage": {
        "prompt_tokens": 52,
        "completion_tokens": 0,
        "total_tokens": 52
      },
      "extra_fields": {
        "request_type": "rerank",
        "provider": "cohere",
        "latency": 245,
        "chunk_index": 0
      }
    }

Common Validation Errors
- Missing query -> query is required for rerank
- Empty documents -> documents are required for rerank
- Blank document text -> document text is required for rerank at index N
- top_n < 1 -> top_n must be at least 1
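The validation rules and response shape above can be mirrored on the client side. The sketch below is a rough illustration, not the gateway's code: `validate_rerank_request` and `attach_documents` are hypothetical helpers, whitespace-only strings are treated as blank (an assumption), all errors are collected rather than only the first, and each result's index is assumed to point into the original documents array, as the example response suggests.

```python
from __future__ import annotations

from typing import Any


def validate_rerank_request(payload: dict[str, Any]) -> list[str]:
    """Return messages matching the documented validation errors."""
    errors: list[str] = []
    if not str(payload.get("query") or "").strip():
        errors.append("query is required for rerank")
    documents = payload.get("documents") or []
    if not documents:
        errors.append("documents are required for rerank")
    for i, doc in enumerate(documents):
        if not str(doc.get("text") or "").strip():
            errors.append(f"document text is required for rerank at index {i}")
    top_n = payload.get("top_n")
    if top_n is not None and top_n < 1:
        errors.append("top_n must be at least 1")
    return errors


def attach_documents(
    documents: list[dict[str, Any]], response: dict[str, Any]
) -> list[tuple[dict[str, Any], float]]:
    """Pair each result with its input document using the result's index."""
    return [
        (documents[result["index"]], result["relevance_score"])
        for result in response["results"]
    ]


documents = [
    {"id": "a", "text": "DeepIntShield supports observability plugins like OTEL and Maxim."},
    {"id": "b", "text": "DeepIntShield can run in Kubernetes and ECS."},
    {"id": "c", "text": "Token counting is available at /v1/responses/input_tokens."},
]
# Trimmed copy of the example response shown above.
response = {
    "results": [
        {"index": 0, "relevance_score": 0.98},
        {"index": 2, "relevance_score": 0.63},
    ]
}

print(validate_rerank_request({"query": "", "documents": documents, "top_n": 0}))
ranked = attach_documents(documents, response)
print([(doc["id"], score) for doc, score in ranked])  # [('a', 0.98), ('c', 0.63)]
```

Mapping by index works whether or not return_documents is set, since the input documents are already available to the caller.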
Next Steps
Now that you understand reranking, explore these related topics:
Essential Topics
- Multimodal AI - Process images, audio, and multimedia content
- Tool Calling - Enable AI models to use external tools and functions
- Provider Configuration - Multiple providers for redundancy
- Integrations - Drop-in compatibility with existing SDKs
Advanced Topics
- Core Features - Advanced DeepIntShield capabilities
- Architecture - How DeepIntShield works internally
- Deployment - Production setup and scaling