Available on all Portkey plans.
Portkey’s /v1/messages endpoint accepts the Anthropic Messages API format and routes to any of 3000+ models across all major providers. Tools built natively on the Messages format — like Claude Code and the Claude Agent SDK — work with any backend model through Portkey without modification.

Why Messages API

  • Write once, run anywhere — Any SDK or tool built on the Anthropic Messages format works instantly. No rewrites.
  • Switch providers with one string — Change the model parameter to route to a different provider. Request format and response shape stay identical.
  • Full gateway features — Fallbacks, load balancing, caching, and observability work transparently across all providers.

Quick Start

Use the Anthropic SDK with Portkey’s base URL. The @provider/model format routes requests to the correct provider.
import anthropic

client = anthropic.Anthropic(
    api_key="PORTKEY_API_KEY",
    base_url="https://api.portkey.ai"
)

message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain quantum computing in simple terms"}]
)

print(message.content[0].text)
max_tokens is required. See the Model Catalog for all supported provider and model strings.

Switching Providers

Change the model string to route to any provider. Everything else stays the same.
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}]
)
The SDK code, request format, and response shape are identical across all providers. Portkey translates the Messages format to each provider’s native API. See Provider Support for how this works.

Migrate in 2 Lines

Already using the Anthropic SDK? Point it at Portkey:
client = anthropic.Anthropic(
    api_key="PORTKEY_API_KEY",       # Replace your Anthropic key with your Portkey API key
    base_url="https://api.portkey.ai"  # Point at Portkey
)
All existing Messages API calls work as-is. Use the @anthropic-provider/ prefix to keep routing to Anthropic, or switch the model string to any other provider.

Text Generation

System Prompt

Set a system prompt with the top-level system parameter:
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    system="You are a pirate. Always respond in pirate speak.",
    messages=[{"role": "user", "content": "Say hello."}]
)
The system parameter also accepts an array of content blocks for prompt caching:
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    system=[
        {"type": "text", "text": "You are an expert on this topic..."},
        {"type": "text", "text": "Here is the reference material...", "cache_control": {"type": "ephemeral"}}
    ],
    messages=[{"role": "user", "content": "Summarize the key points"}]
)

Streaming

Stream responses with the SDK’s messages.stream() helper or stream=True on messages.create (equivalently, "stream": true on raw HTTP requests).
with client.messages.stream(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a haiku about AI"}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
Portkey normalizes all streaming responses to the Anthropic SSE event format, regardless of which provider handles the request.
Events are emitted in this sequence for every streaming response:
Event                 Description
message_start         Opens the message with metadata (id, model, initial usage)
content_block_start   Opens a content block — type: "text" for text, type: "tool_use" for tool calls
content_block_delta   Incremental content — text_delta for text, input_json_delta for tool input
content_block_stop    Closes a content block
message_delta         Closes the message with stop_reason (end_turn, max_tokens, tool_use) and final usage
message_stop          Final event signaling stream completion
Example content_block_delta event:
event: content_block_delta
data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hello"}}
Example message_delta event:
event: message_delta
data: {"type": "message_delta", "delta": {"stop_reason": "end_turn", "stop_sequence": null}, "usage": {"output_tokens": 42}}

Multi-turn Conversations

Build conversations by passing the full message history. Messages must alternate between user and assistant roles.
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "My name is Alice."},
        {"role": "assistant", "content": "Hello Alice! How can I help you?"},
        {"role": "user", "content": "What is my name?"}
    ]
)

print(message.content[0].text)  # "Your name is Alice."

Generation Parameters

Parameter        Type     Description
max_tokens       integer  Required. Maximum tokens in the response
temperature      float    Sampling temperature (0–1). Higher = more creative
top_p            float    Nucleus sampling threshold (0–1)
top_k            integer  Top-K sampling. Anthropic native only — silently dropped on adapter providers
stop_sequences   array    Stop strings. Translated to stop for adapter providers
stream           boolean  Enable streaming responses
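These parameters compose freely in a single call (a sketch; on adapter providers stop_sequences is translated to stop, as noted above):
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    temperature=0.7,
    top_p=0.95,
    stop_sequences=["###"],
    messages=[{"role": "user", "content": "Write a tagline for a coffee shop"}]
)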

Tool Use

Define tools with name, description, and input_schema:
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "What's the weather in San Francisco?"}],
    tools=[{
        "name": "get_weather",
        "description": "Get current weather for a location",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
        }
    }]
)

for block in message.content:
    if block.type == "tool_use":
        print(f"Tool: {block.name}, Input: {block.input}")

Tool Results

Pass tool results back in a user message with tool_result content blocks to continue the conversation:
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What's the weather in Paris?"},
        {"role": "assistant", "content": [
            {"type": "tool_use", "id": "tool_123", "name": "get_weather", "input": {"location": "Paris"}}
        ]},
        {"role": "user", "content": [
            {"type": "tool_result", "tool_use_id": "tool_123", "content": '{"temp": "22°C", "condition": "sunny"}'}
        ]}
    ],
    tools=[{
        "name": "get_weather",
        "description": "Get weather for a location",
        "input_schema": {"type": "object", "properties": {"location": {"type": "string"}}, "required": ["location"]}
    }]
)

print(message.content[0].text)
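Putting both halves together, a typical loop calls the model, executes any requested tools, and feeds results back until the model stops asking for them. A sketch, where execute_tool is a hypothetical dispatcher for your own functions:
def run_tool_loop(messages, tools):
    while True:
        message = client.messages.create(
            model="@anthropic-provider/claude-sonnet-4-5-20250514",
            max_tokens=1024,
            messages=messages,
            tools=tools
        )
        if message.stop_reason != "tool_use":
            return message  # model produced a final answer
        # Echo the assistant turn, then answer every tool_use block
        messages.append({"role": "assistant", "content": message.content})
        results = [
            {"type": "tool_result", "tool_use_id": block.id,
             "content": execute_tool(block.name, block.input)}  # execute_tool: hypothetical
            for block in message.content if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})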
For MCP-based tool use, see Remote MCP.

Vision

Send images using content blocks. Supports both URLs and base64-encoded data.
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image", "source": {"type": "url", "url": "https://example.com/image.jpg"}},
            {"type": "text", "text": "Describe this image"}
        ]
    }]
)

print(message.content[0].text)
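For base64 data, the image source block takes type: "base64" with media_type and data fields (a sketch reading a local file):
import base64

with open("photo.jpg", "rb") as f:
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {"type": "image", "source": {"type": "base64", "media_type": "image/jpeg", "data": image_data}},
            {"type": "text", "text": "Describe this image"}
        ]
    }]
)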

Structured Output

Use output_config to constrain responses to a JSON schema. Portkey maps this to response_format for adapter providers.
message = client.messages.create(
    model="@openai-provider/gpt-4.1",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Extract name and age from: Alice is 30 years old."}],
    extra_body={
        "output_config": {
            "format": {
                "type": "json_schema",
                "schema": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "age": {"type": "integer"}
                    },
                    "required": ["name", "age"]
                }
            }
        }
    }
)
output_config is a Portkey extension to the Messages API format. Only json_schema is supported — json_object is not available via the adapter.
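The constrained output arrives as JSON text in the first content block; parse it as usual (sketch):
import json

data = json.loads(message.content[0].text)
print(data["name"], data["age"])  # e.g. Alice 30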

Extended Thinking

Two mechanisms control model reasoning:
  • thinking — Anthropic-native. Pass directly to Anthropic Claude models. Silently dropped on adapter providers.
  • output_config.effort — Cross-provider. Works across Anthropic, OpenAI o-series, and Gemini 2.5. Portkey maps it to each provider’s native reasoning format.
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=16000,
    thinking={"type": "enabled", "budget_tokens": 10000},
    messages=[{"role": "user", "content": "Analyze the implications of quantum computing on cryptography"}]
)

for block in message.content:
    if block.type == "thinking":
        print(f"Thinking: {block.thinking[:200]}...")
    elif block.type == "text":
        print(f"Response: {block.text}")
When using Anthropic’s thinking parameter, max_tokens must exceed budget_tokens. See Thinking Mode for provider-specific effort mappings.
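For the cross-provider path, pass output_config.effort via extra_body, the same way output_config is passed for structured output above. A sketch: the model slug is illustrative, and we assume effort takes the usual reasoning-effort levels such as "low" / "medium" / "high":
message = client.messages.create(
    model="@openai-provider/o3",  # illustrative slug for an OpenAI o-series model
    max_tokens=4096,
    messages=[{"role": "user", "content": "Analyze the implications of quantum computing on cryptography"}],
    extra_body={"output_config": {"effort": "high"}}  # mapped to reasoning_effort on adapters
)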

Prompt Caching

Use cache_control on system prompts, messages, and tool definitions to cache frequently used content.
message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    system=[{
        "type": "text",
        "text": "You are an expert analyst. Here is a very long reference document...",
        "cache_control": {"type": "ephemeral"}
    }],
    messages=[{"role": "user", "content": "Summarize the key points"}]
)
Cached content reduces latency and cost. Cache usage is reflected in the usage object of the response.
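On Anthropic responses, cache activity is reported alongside the regular token counts (a sketch; field names per Anthropic’s usage object):
# Cache writes and cache hits are counted separately from normal input tokens
print(message.usage.cache_creation_input_tokens)  # tokens written to the cache
print(message.usage.cache_read_input_tokens)      # tokens served from the cache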
cache_control is Anthropic-native and is stripped when routing to adapter providers. See Anthropic prompt caching for details.

Provider Support

Portkey handles the Messages API in two ways depending on the provider:
  • Native providers — Requests pass through directly. All Anthropic-specific features work (thinking, cache_control, top_k, etc.).
  • Adapter providers — Portkey translates between Messages format and the provider’s native Chat Completions format. See Parameter Compatibility for what is and isn’t supported.
The response always comes back in Anthropic Messages format, regardless of which provider handles the request.
Native providers: Anthropic, AWS Bedrock (Claude models)
Adapter providers: OpenAI, Azure OpenAI, Google Gemini, Google Vertex AI, AWS Bedrock (non-Claude), Mistral AI, Groq, Together AI, and all other providers

Parameter Compatibility

Portkey’s Messages adapter translates requests to each provider’s Chat Completions format for non-native providers. Unsupported parameters are silently dropped — no error is returned.
Parameters translated for adapter providers:
Messages API param                     Adapter equivalent                        Notes
max_tokens                             max_completion_tokens
stop_sequences                         stop
system                                 First message with role: "system"         String or array both handled
tools[].input_schema                   tools[].function.parameters               Format converted
tool_choice: {type: "auto"}            "auto"
tool_choice: {type: "any"}             "required"
tool_choice: {type: "tool", name: X}   {type: "function", function: {name: X}}
metadata.user_id                       user
temperature, top_p, stream             Direct pass-through
output_config.format (json_schema)     response_format                           Portkey extension; json_object not supported
output_config.effort                   reasoning_effort                          Portkey extension for cross-provider reasoning
Parameters silently dropped on adapter providers:
  • thinking — Anthropic-native; use output_config.effort for cross-provider reasoning control
  • top_k — no Chat Completions equivalent
  • cache_control — stripped during message transformation
  • container, mcp_servers, service_tier, anthropic_beta
Provider-specific parameters (e.g. Gemini’s safety_settings, Bedrock guardrail configs) cannot be passed through the Messages adapter. Use the provider’s native integration or Chat Completions instead.
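As an example of the translation in action, forcing a tool call through an adapter provider uses the Messages-format tool_choice, which Portkey maps to the Chat Completions equivalent per the table above (a sketch):
message = client.messages.create(
    model="@openai-provider/gpt-4.1",
    max_tokens=1024,
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "name": "get_weather",
        "description": "Get weather for a location",
        "input_schema": {"type": "object", "properties": {"location": {"type": "string"}}, "required": ["location"]}
    }],
    tool_choice={"type": "any"}  # translated to "required" for Chat Completions
)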

Using with Portkey Features

The Messages API works with all Portkey gateway features. Pass a config via header alongside any Anthropic SDK call:
import anthropic

client = anthropic.Anthropic(
    api_key="PORTKEY_API_KEY",
    base_url="https://api.portkey.ai",
    default_headers={
        "x-portkey-config": "pp-config-xxx"  # Config with fallbacks, load balancing, etc.
    }
)

message = client.messages.create(
    model="@anthropic-provider/claude-sonnet-4-5-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)

API Reference
