AI Gateway Now Supports OpenAI Responses API

AI Gateway now supports OpenAI's Responses API. You can use the OpenAI SDK you already know, point it at AI Gateway, and route requests to models from all providers through a single interface.

The Responses API is an alternative to the Chat Completions API and relies on the exact same routing format that AI Gateway uses across all its endpoints. Compared to Chat Completions, Responses API is designed around a flatter input/output format and has built in reasoning support.

All of the functionality in the Responses API was already accessible through AI Gateway via the AI SDK and Chat Completions API, but you can now use the Responses API directly. TypeScript and Python support are both available.

What you can do

Text generation and streaming: Send prompts, get responses, stream tokens as they arrive.
Tool calling: Define functions the model can invoke with structured arguments, feed results back for multi-turn workflows.
Structured output: Constrain responses to a JSON schema for reliable parsing.
Reasoning: Control thinking effort with a single parameter across providers.
Provider routing: Switch between OpenAI, Anthropic, Google, and more by changing the model string.

Getting started

To begin, install the OpenAI SDK for either TypeScript or Python. Then, initialize the client by setting your base URL to the AI Gateway endpoint and providing your API key.

Setup for TypeScript:

Python works the same way:

Text generation

You can send a prompt and receive a text response from any supported model. To switch providers, change the model string from openai/gpt-5.4 to any other models through AI Gateway following the creator/model format.

Streaming

For interactive interfaces, you can stream tokens as they generate. The Responses API uses server-sent events to deliver the output in real time.

Tool calling

You can define specific functions that the model can invoke during a conversation. When the model needs external data, it returns a function call instead of standard text.

Your application executes that function and feeds the results back into a follow-up request to continue the interaction.

Structured output

You can enforce structured outputs to ensure the model returns data that matches your exact schema requirements.

Reasoning

The Responses API introduces configurable reasoning capabilities for complex tasks. You can adjust the reasoning parameters to control how much time the model spends processing before it generates a final answer. The effort parameter accepts none, minimal, low, medium, high, or xhigh. AI Gateway maps this to provider-specific reasoning settings, so you don't need to learn each provider's API.

Explore the AI Gateway documentation to learn more about the integration. You can also view the complete list of supported models and providers.

More information

Read the Responses API documentation.
View all supported models.