OpenAI API - Complete Guide
Version: Production Ready
Package: [email protected]
Last Updated: 2026-01-20
Status
Production Ready:
- Chat Completions API (GPT-5, GPT-4o, GPT-4 Turbo)
- Embeddings API (text-embedding-3-small, text-embedding-3-large)
- Images API (DALL-E 3 generation + GPT-Image-1 editing)
- Audio API (Whisper transcription + TTS with 11 voices)
- Moderation API (11 safety categories)
- Streaming patterns (SSE)
- Function calling / Tools
- Structured outputs (JSON schemas)
- Vision (GPT-4o)
- Both Node.js SDK and fetch approaches
Table of Contents
- Quick Start
- Chat Completions API
- GPT-5 Series Models
- Streaming Patterns
- Function Calling
- Structured Outputs
- Vision (GPT-4o)
- Embeddings API
- Images API
- Audio API
- Moderation API
- Error Handling
- Rate Limits
- Common Mistakes & Gotchas
- TypeScript Gotchas
- Production Best Practices
- Relationship to openai-responses
Quick Start
Installation
npm install [email protected]
Environment Setup
export OPENAI_API_KEY="sk-..."
Or create .env file:
OPENAI_API_KEY=sk-...
First Chat Completion (Node.js SDK)
import OpenAI from 'openai';
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});
const completion = await openai.chat.completions.create({
  model: 'gpt-5',
  messages: [
    { role: 'user', content: 'What are the three laws of robotics?' }
  ],
});
console.log(completion.choices[0].message.content);
First Chat Completion (Fetch - Cloudflare Workers)
// `env` is the bindings object that Cloudflare Workers passes to the fetch handler.
const response = await fetch('https://api.openai.com/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${env.OPENAI_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'gpt-5',
    messages: [
      { role: 'user', content: 'What are the three laws of robotics?' }
    ],
  }),
});
const data = await response.json();
console.log(data.choices[0].message.content);
Chat Completions API
Endpoint: POST /v1/chat/completions
The Chat Completions API is the core interface for interacting with OpenAI's language models. It supports conversational AI, text generation, function calling, structured outputs, and vision capabilities.
Supported Models
GPT-5 Series (Released August 2025)
- gpt-5: Full-featured reasoning model with advanced capabilities
- gpt-5-mini: Cost-effective alternative with good performance
- gpt-5-nano: Smallest/fastest variant for simple tasks
GPT-4o Series
- gpt-4o: Multimodal model with vision capabilities
- gpt-4-turbo: Fast GPT-4 variant
GPT-4 Series (Legacy)
- gpt-4: Original GPT-4 model (deprecated - use gpt-5 or gpt-4o)
Basic Request Structure
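{
  model: string,
  messages: Message[],
  reasoning_effort?: string,       // GPT-5 series only
  verbosity?: string,              // GPT-5 series only
  temperature?: number,            // not supported by GPT-5 (see GPT-5 Limitations)
  max_tokens?: number,             // older models; GPT-5 and o-series expect max_completion_tokens
  max_completion_tokens?: number,
  stream?: boolean,
  tools?: Tool[],
}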
Response Structure
{
  id: string,
  object: "chat.completion",
  created: number,
  model: string,
  choices: [{
    index: number,
    message: {
      role: "assistant",
      content: string | null,   // may be null when the model returns tool_calls
      tool_calls?: ToolCall[]
    },
    finish_reason: string
  }],
  usage: {
    prompt_tokens: number,
    completion_tokens: number,
    total_tokens: number
  }
}
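The fields above can be read defensively, since a response that carries tool calls may have no text content. The following is a minimal sketch under that assumption; `summarizeCompletion` is a hypothetical helper (not part of the SDK) and `sampleCompletion` is a hand-written object for illustration, not real API output.

```javascript
// Hypothetical helper: pulls the commonly used fields out of an object
// shaped like the response structure above. It makes no network calls.
function summarizeCompletion(completion) {
  const choice = completion.choices[0];
  return {
    // Fall back to an empty string when the message has no text content.
    text: choice.message.content ?? '',
    // Fall back to an empty list when no tools were called.
    toolCalls: choice.message.tool_calls ?? [],
    finishReason: choice.finish_reason,
    totalTokens: completion.usage.total_tokens,
  };
}

// Hand-written sample for illustration only.
const sampleCompletion = {
  id: 'chatcmpl-example',
  object: 'chat.completion',
  created: 0,
  model: 'gpt-5',
  choices: [{
    index: 0,
    message: { role: 'assistant', content: 'Hello!' },
    finish_reason: 'stop',
  }],
  usage: { prompt_tokens: 10, completion_tokens: 5, total_tokens: 15 },
};

const summary = summarizeCompletion(sampleCompletion);
console.log(summary.text, summary.totalTokens);
```

Keeping the extraction in one place means the rest of the application never has to check for a missing `content` or `tool_calls` field.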
Message Roles & Multi-turn Conversations
Three core roles: system (sets behavior), user (input), assistant (model responses).
Important: The API is stateless, so send the full conversation history with every request. For stateful conversations, use the openai-responses skill.
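Because the API keeps no state between calls, the caller owns the history. The sketch below shows one way to track it; `createConversation`, `addMessage`, and `buildRequest` are hypothetical helper names, not SDK functions, and the assistant reply is a placeholder rather than real model output.

```javascript
// Hypothetical helpers (not part of the openai SDK) that keep the full
// message history on the caller's side.
function createConversation(systemPrompt) {
  return [{ role: 'system', content: systemPrompt }];
}

// Returns a new array so earlier snapshots of the history stay untouched.
function addMessage(history, role, content) {
  return [...history, { role, content }];
}

// Builds the body for the next request: the model plus every prior turn.
function buildRequest(model, history) {
  return { model, messages: history };
}

let history = createConversation('You are a helpful assistant.');
history = addMessage(history, 'user', 'What are the three laws of robotics?');
// After each API call, append the assistant reply before the next user turn.
history = addMessage(history, 'assistant', '(model reply goes here)');
history = addMessage(history, 'user', 'Who wrote them?');

const request = buildRequest('gpt-5', history);
console.log(request.messages.length); // 4
```

The resulting `request` object can be passed to `openai.chat.completions.create` or serialized as the fetch body.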
GPT-5 Series Models
GPT-5 models (released August 2025) introduce reasoning and verbosity controls.
GPT-5.2 (Released December 11, 2025)
Latest flagship model:
- gpt-5.2: 400k context window, 128k output tokens
- xhigh reasoning_effort: New level beyond "high" for complex problems
- Compaction: Extends context for long workflows (via API endpoint)
- Pricing: $1.75 input / $14 output per million tokens (1.4x GPT-5.1 pricing)
const completion = await openai.chat.completions.create({
  model: 'gpt-5.2',
  messages: [{ role: 'user', content: 'Solve this extremely complex problem...' }],
  reasoning_effort: 'xhigh',
});
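To make the GPT-5.2 pricing figures concrete, here is a rough cost estimate computed from a response's `usage` object. It assumes the two quoted figures are USD per million input tokens and per million output tokens respectively; `estimateCostUSD` and `GPT_5_2_PRICING` are hypothetical names introduced for illustration.

```javascript
// Rates taken from the pricing bullet above. Assumption: the first figure
// applies to input tokens and the second to output tokens.
const GPT_5_2_PRICING = { inputPerMillion: 1.75, outputPerMillion: 14 };

// Hypothetical helper: estimates the cost of one request from its usage counts.
function estimateCostUSD(usage, pricing = GPT_5_2_PRICING) {
  const inputCost = (usage.prompt_tokens / 1_000_000) * pricing.inputPerMillion;
  const outputCost = (usage.completion_tokens / 1_000_000) * pricing.outputPerMillion;
  return inputCost + outputCost;
}

const cost = estimateCostUSD({ prompt_tokens: 2_000_000, completion_tokens: 1_000_000 });
console.log(cost.toFixed(2)); // "17.50"
```

This ignores any caching discounts, so treat the result as an upper-bound sketch rather than a billing calculation.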
GPT-5.1 (Released November 13, 2025)
Warmer, more intelligent model:
- gpt-5.1: Adaptive reasoning that varies thinking time dynamically
- 24-hour extended prompt caching: Faster follow-up queries at lower cost
- New developer tools: apply_patch (code editing), shell (command execution)
BREAKING CHANGE: GPT-5.1/5.2 default to reasoning_effort: 'none' (vs GPT-5 defaulting to 'medium').
O-Series Reasoning Models
Dedicated reasoning models (separate from GPT-5):
| Model | Released | Purpose |
|---|---|---|
| o3 | Apr 16, 2025 | Successor to o1, advanced reasoning |
| o3-pro | Jun 10, 2025 | Extended compute version of o3 |
| o3-mini | Jan 31, 2025 | Smaller, faster o3 variant |
| o4-mini | Apr 16, 2025 | Fast, cost-efficient reasoning |
const completion = await openai.chat.completions.create({
  model: 'o3',
  messages: [{ role: 'user', content: 'Complex reasoning task...' }],
});
Note: O-series may be deprecated in favor of GPT-5 with reasoning_effort parameter.
reasoning_effort Parameter
Controls thinking depth (GPT-5/5.1/5.2):
- "none": No reasoning (fastest) - GPT-5.1/5.2 default
- "minimal": Quick responses (Note: May not be available - Issue #1690)
- "low": Basic reasoning
- "medium": Balanced - GPT-5 default
- "high": Deep reasoning
- "xhigh": Maximum reasoning (GPT-5.2 only)
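Since the default differs between GPT-5 and GPT-5.1/5.2, it can help to set the value explicitly rather than rely on it. The sketch below encodes only the defaults and the "xhigh" restriction listed above; `resolveReasoningEffort` is a hypothetical helper, and matching models by name prefix is an assumption about how model IDs are formed.

```javascript
// Levels as listed above, from least to most reasoning.
const EFFORT_LEVELS = ['none', 'minimal', 'low', 'medium', 'high', 'xhigh'];

// Hypothetical helper: returns the effort to send for a given model,
// applying the documented default when the caller did not choose one.
function resolveReasoningEffort(model, requested) {
  // Assumption: model IDs for each generation share a name prefix.
  const isGpt52 = model.startsWith('gpt-5.2');
  const isGpt51 = model.startsWith('gpt-5.1');

  if (requested === undefined) {
    // GPT-5.1/5.2 default to 'none'; GPT-5 defaults to 'medium'.
    return isGpt51 || isGpt52 ? 'none' : 'medium';
  }
  if (!EFFORT_LEVELS.includes(requested)) {
    throw new Error(`Unknown reasoning_effort: ${requested}`);
  }
  if (requested === 'xhigh' && !isGpt52) {
    throw new Error("reasoning_effort 'xhigh' is listed for GPT-5.2 only");
  }
  return requested;
}

console.log(resolveReasoningEffort('gpt-5'));            // medium
console.log(resolveReasoningEffort('gpt-5.2'));          // none
console.log(resolveReasoningEffort('gpt-5.2', 'xhigh')); // xhigh
```

Passing the resolved value as `reasoning_effort` keeps behavior stable when switching a request from `gpt-5` to a newer model.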
verbosity Parameter
Controls output detail (GPT-5 series):
- "low": Concise
- "medium": Balanced (default)
- "high": Verbose
GPT-5 Limitations
NOT Supported:
- temperature, top_p, logprobs parameters
- Stateful Chain of Thought between turns
Alternatives: Use GPT-4o for temperature/top_p, or openai-responses skill for stateful reasoning
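When one code path serves both GPT-5 and older models, the unsupported parameters can be dropped before the request is sent. This is a minimal sketch, assuming GPT-5 family models are identified by the `gpt-5` name prefix; `stripUnsupportedParams` is a hypothetical helper, not an SDK function.

```javascript
// Parameters listed above as not supported by GPT-5.
const GPT5_UNSUPPORTED = ['temperature', 'top_p', 'logprobs'];

// Hypothetical helper: returns a copy of the request params with the
// unsupported keys removed when the target model is in the GPT-5 family.
function stripUnsupportedParams(params) {
  const cleaned = { ...params };
  // Assumption: GPT-5 family model IDs start with 'gpt-5'.
  if (params.model.startsWith('gpt-5')) {
    for (const key of GPT5_UNSUPPORTED) {
      delete cleaned[key];
    }
  }
  return cleaned;
}

const gpt5Params = stripUnsupportedParams({
  model: 'gpt-5',
  messages: [{ role: 'user', content: 'Hi' }],
  temperature: 0.2,
  top_p: 0.9,
});
console.log(Object.keys(gpt5Params)); // [ 'model', 'messages' ]
```

Requests for other models, such as `gpt-4o`, pass through unchanged, so `temperature` and `top_p` still apply there.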
Streaming Patterns
Enable with stream: true for token-by-token delivery.
Node.js SDK