Quick Technical Overview
OpenRouter exposes a REST API that is fully compatible with the OpenAI chat completions format. Every endpoint documented below accepts standard JSON request bodies and returns JSON responses over HTTPS. The base URL for all requests is https://openrouter.ai/api/v1. Switch your existing OpenAI-compatible client to this base URL, replace your API key with an OpenRouter key, and you gain access to hundreds of models across dozens of providers through a single integration surface.
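For instance, a minimal sketch of that switch using the official openai Python package (v1+); the model identifier is illustrative and the key is a placeholder:

```python
# A minimal sketch of pointing an existing OpenAI client at OpenRouter.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",  # the OpenRouter base URL
    api_key="sk-or-...",                      # your OpenRouter API key
)

completion = client.chat.completions.create(
    model="openai/gpt-4o",  # illustrative model identifier
    messages=[{"role": "user", "content": "Hello from OpenRouter!"}],
)
print(completion.choices[0].message.content)
```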
Chat Completions
The primary endpoint for generating AI model responses. Accepts messages arrays, system prompts, and generation parameters identical to the OpenAI format while adding provider routing controls.
View endpoint details →
Model Listing & Discovery
Retrieve the full catalog of available models programmatically, including real-time pricing, context window limits, and supported features for every model on the platform.
Explore model endpoints →
Streaming & Real-Time Delivery
Enable token-by-token streaming for interactive applications using standard server-sent events. Compatible with all major streaming client libraries without modification.
Read streaming docs →
Utility & Account Endpoints
Access usage statistics, credit balance, and key management endpoints that support operational workflows for team environments and production monitoring.
View utility endpoints →
Chat Completions Endpoint
The chat completions endpoint is the core of the OpenRouter API — the single interface through which your application sends prompts to any supported AI model and receives generated responses. It accepts POST requests with a JSON body that matches the OpenAI chat completions schema, making migration from direct provider integrations nearly effortless.
Beyond standard OpenAI parameters like model, messages, temperature, and max_tokens, OpenRouter extends the request body with routing controls. The provider field lets you specify preferences for which AI lab handles your request. The transforms array supports middleware operations including prompt compression, response filtering, and custom routing rules. Fallback configuration through the models array enables automatic failover when a primary model is unavailable, with priority ordering that respects your cost and quality preferences.
Responses from the chat completions endpoint include the standard choices array with message content, finish reason, and usage statistics. OpenRouter adds a model field to the response identifying the exact model that processed your request — useful when fallback routing selects an alternative model. The usage object includes token counts for both prompt and completion, plus the total cost in credits consumed by the request.
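A raw-HTTP sketch of the extended request and response fields described above; the shapes of the provider and models fields follow this page's description, and the exact accepted values are assumptions to verify against the live reference:

```python
# Sketch: send a request with fallback routing, then read back the model
# that actually served it and the usage/cost data. Field shapes assumed.
import requests

resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": "Bearer sk-or-..."},  # placeholder key
    json={
        "model": "anthropic/claude-3.5-sonnet",  # primary model (illustrative)
        "models": ["openai/gpt-4o"],             # fallbacks, in priority order
        "provider": {"order": ["Anthropic"]},    # provider routing preference
        "messages": [{"role": "user", "content": "Summarize SSE in one line."}],
    },
    timeout=60,
)
data = resp.json()
print(data["model"])                             # the model that actually ran
print(data["choices"][0]["message"]["content"])
print(data["usage"])                             # token counts and cost
```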
| Method | Path | Description |
|---|---|---|
| POST | /api/v1/chat/completions | Generate a chat completion from any supported model with standard or streaming response |
| GET | /api/v1/models | List all available models with pricing, context windows, and feature support |
| GET | /api/v1/models/:model | Retrieve detailed information for a single model by identifier |
| GET | /api/v1/credits | Check current account credit balance and usage summary |
| POST | /api/v1/auth/keys | Generate a new API key with configurable permission scope |
| DELETE | /api/v1/auth/keys/:key_id | Revoke an existing API key immediately |
| GET | /api/v1/usage | Retrieve token consumption and cost data filtered by date range, model, or tag |
Model Discovery Endpoints
The model listing endpoint provides programmatic access to the entire OpenRouter model catalog — over 200 models from more than a dozen AI labs. A GET request to /api/v1/models returns a JSON array where each entry includes the model identifier, human-readable name, provider attribution, per-token pricing, context window size, and supported features like function calling or vision capabilities.
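A minimal sketch of fetching the catalog; the field names follow the description above, and the response envelope is assumed to be either a bare array or wrapped in a data key, so the snippet handles both:

```python
# Fetch the model catalog and print a few entries. Schema assumed per the
# description above; confirm field names against the live reference.
import requests

payload = requests.get("https://openrouter.ai/api/v1/models", timeout=30).json()
models = payload.get("data", payload) if isinstance(payload, dict) else payload

for m in models[:5]:
    print(m["id"], m.get("name"), m.get("context_length"), m.get("pricing"))
```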
Filtering the model list is supported through query parameters. Append ?provider=anthropic to see only Anthropic models, or ?free=true to list only the free-tier models available on the platform. The ?search= parameter performs a substring match against model names and descriptions, useful for building model selection interfaces in your own applications. Pricing information in the response uses the same per-token format shown in the OpenRouter dashboard, so your application can calculate estimated costs before sending requests.
Models that support advanced features include a capabilities object in their listing entry. This object indicates support for function calling, structured JSON output, vision input, and tool use — letting your application determine at runtime whether a selected model can handle the task you intend to send it.
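A sketch of such a runtime check, assuming the capabilities object takes the shape described above; the feature names are illustrative, not a confirmed schema:

```python
# Hypothetical capability check against a model listing entry.
def supports(model_entry: dict, feature: str) -> bool:
    """Return True if the model's listing advertises the given feature."""
    return bool(model_entry.get("capabilities", {}).get(feature, False))

# Usage against a fetched catalog: keep only models that claim tool use.
# tool_models = [m for m in models if supports(m, "tool_use")]
```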
Request and Response Format Conventions
Every OpenRouter endpoint accepts application/json request bodies and returns application/json responses. Authentication is handled via the Authorization: Bearer <key> header. Error responses follow a consistent structure: an error object with a code string, a message string describing the failure, and optionally a metadata object with additional diagnostic information. HTTP status codes follow REST conventions — 200 for success, 400 for malformed requests, 401 for authentication failures, 429 for rate limit exceeded, and 500 for server-side issues.
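A sketch of handling an error body under these conventions; the empty messages array is intended to provoke a validation failure:

```python
# Parse the error object (code, message) from a non-200 response.
import requests

resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": "Bearer sk-or-..."},  # placeholder key
    json={"model": "openai/gpt-4o", "messages": []},  # likely a 400
    timeout=30,
)
if resp.status_code != 200:
    err = resp.json().get("error", {})
    print(resp.status_code, err.get("code"), err.get("message"))
```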
Streaming Response Endpoints
Streaming delivers model-generated tokens to your client incrementally, dramatically reducing the perceived latency of AI-powered applications. OpenRouter supports streaming on the chat completions endpoint by setting "stream": true in your request body. The platform then returns a stream of server-sent events (SSE), each containing a chunk of the response with a delta of new tokens.
The streaming format is byte-for-byte compatible with OpenAI's streaming implementation. Existing client code that handles OpenAI streaming — whether written with the Python OpenAI library, the JavaScript openai npm package, or a custom SSE parser — will work with OpenRouter streaming after changing only the base URL and API key. Each SSE event carries a data payload containing a JSON chunk (including the completion id and token delta), and the stream closes with a final data: [DONE] sentinel.
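Because the wire format matches OpenAI's, the OpenAI Python client's streaming mode works unmodified; a sketch, with an illustrative model identifier:

```python
# Streaming via the official OpenAI client pointed at OpenRouter; the
# client parses the SSE frames and the [DONE] sentinel internally.
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="sk-or-...")

stream = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Stream a haiku about latency."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content  # None on non-content frames
    if delta:
        print(delta, end="", flush=True)
```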
For applications built on frameworks that do not natively support SSE, OpenRouter also supports the standard chunked transfer encoding approach used by many HTTP client libraries. The behavior is identical from the developer's perspective — the only difference is how the transport layer delivers the byte stream to your application code.
Utility and Account Endpoints
Beyond model invocation, OpenRouter provides utility endpoints that support operational workflows. The credits endpoint returns your current balance as a simple JSON number, making it easy to build balance-checking into deployment scripts or monitoring dashboards. The usage endpoint accepts date range and filter parameters, returning time-series token consumption data suitable for feeding into external analytics systems or cost-tracking spreadsheets.
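A balance-check sketch for a deployment script or monitor; the response shape follows this page's description and should be confirmed against the live reference:

```python
# Query the credits endpoint and report the remaining balance.
import requests

headers = {"Authorization": "Bearer sk-or-..."}  # placeholder key
balance = requests.get(
    "https://openrouter.ai/api/v1/credits", headers=headers, timeout=30
).json()
print("Credit balance:", balance)
```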
API key management endpoints support programmatic key creation and revocation — particularly useful for teams that automate environment provisioning. A deployment script can create a scoped key for a new staging environment, configure it with spending limits, and revoke it during teardown, all without manual dashboard interaction.
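A sketch of that provisioning flow, using the key-management paths from the endpoint table above; the request fields and the returned key identifier are hypothetical and need checking against the live docs:

```python
# Hypothetical key lifecycle: create a scoped staging key, then revoke it.
import requests

BASE = "https://openrouter.ai/api/v1"
admin = {"Authorization": "Bearer sk-or-..."}  # a key authorized to manage keys

# Create a scoped key for a new staging environment.
created = requests.post(
    f"{BASE}/auth/keys",
    headers=admin,
    json={"name": "staging", "limit": 25.0},  # hypothetical fields
    timeout=30,
).json()

# ... staging environment runs ...

# Revoke the key during teardown.
requests.delete(f"{BASE}/auth/keys/{created['id']}", headers=admin, timeout=30)
```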
> "We migrated our entire API integration layer to OpenRouter by changing one base URL and one API key. Three lines of configuration, and our application gained access to Claude, Gemini, and Llama alongside GPT-4 — each with fallback routing that keeps our production services running even when individual providers experience degradation. The endpoint documentation made the migration a one-afternoon task."
>
> — Fatima Adebayo, Head of AI, Beacon Digital (Atlanta, GA)
Frequently Asked Questions About API Endpoints
What is the base URL for OpenRouter API requests?
All OpenRouter API requests use the base URL https://openrouter.ai/api/v1. This single entry point serves every supported endpoint — chat completions, model listing, credits, usage, and key management. Replace the base URL in your existing OpenAI-compatible client and use your OpenRouter API key to authenticate.
Does OpenRouter support the same request format as OpenAI?
Yes, the chat completions endpoint accepts standard OpenAI-compatible request bodies with messages, model, temperature, max_tokens, and all other standard parameters. OpenRouter extends the format with optional fields for provider routing, fallback configuration, and middleware transforms — all of which are backward-compatible.
How do I handle errors from the OpenRouter API?
OpenRouter returns standard HTTP status codes with JSON error bodies containing a code string and human-readable message. 400 indicates a malformed request, 401 signals an invalid or expired API key, 429 means the rate limit has been exceeded, and 500 reflects a server-side issue. Implement exponential backoff for 429 and 5xx responses.
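A minimal backoff sketch for those retryable status codes:

```python
# Retry POSTs on 429 and 5xx with exponentially growing sleeps.
import time
import requests

def post_with_retry(url: str, headers: dict, body: dict, max_tries: int = 5):
    for attempt in range(max_tries):
        resp = requests.post(url, headers=headers, json=body, timeout=60)
        if resp.status_code not in (429, 500, 502, 503, 504):
            return resp
        time.sleep(2 ** attempt)  # 1s, 2s, 4s, 8s, ...
    return resp
```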
Can I switch models without changing my API endpoint URL?
Absolutely — change only the model parameter in your request body and the same endpoint routes to whichever model you specify. No new endpoint URLs, no additional authentication setup, and no client library changes are needed when you want to test or deploy with a different model from the OpenRouter catalog.
Start Building with the OpenRouter API
Generate your API keys and send your first request in under five minutes. Every endpoint is ready for production workloads.
Get Started Now