Codex With Open Source Models: Ollama, OSS Mode, and Local Providers Guide

Step-by-step guide to running open-source models in OpenAI Codex: Ollama quick launch, codex --oss, config.toml profiles, LM Studio, and vLLM. Covers GLM-5.2, Kimi-K2.7-Code, gpt-oss:120b, wire_api = "responses", troubleshooting, and June 2026 limits.

Yash Thakker · 10 min read
Tags: OpenAI Codex, Ollama, Open Source Models, GLM-5.2, Kimi K2.7, AI Coding

On June 17, 2026, Tibo Sottiaux (@thsottiaux, OpenAI Codex) posted a reminder that surprised developers who still treat Codex as GPT-only:

Reminder that you can use the Codex App, CLI and SDK with any open source model, not just with OpenAI models.

The post passed 1.6M+ views within a day, largely because it reframes Codex from "OpenAI's coding agent" to a portable agent harness that can point at local weights, Chinese frontier models, or your own vLLM endpoint.

Ollama immediately showed what that looks like in practice: ollama launch codex and ollama launch codex-app with GLM-5.2 and Kimi-K2.7-Code—the same open models many teams adopted during the Fable 5 suspension.

This guide covers official OSS mode, Ollama integration, what works today, and what the community says is still broken.


TL;DR

| Question | Answer |
| --- | --- |
| Docs | Codex config advanced: OSS mode |
| Ollama docs | docs.ollama.com/integrations/codex |
| Quick launch | ollama launch codex · ollama launch codex-app |
| Manual CLI | codex --oss · codex --oss -m glm-5.2 |
| Context min | 64k tokens (Ollama recommendation) |
| Wire API | wire_api = "responses" (required) |
| Models cited | GLM-5.2, Kimi-K2.7-Code, gpt-oss:120b |
| Confirmed by | @thsottiaux (OpenAI Codex), @ollama |
| Known bugs | Desktop model picker hides custom providers |
| GPT-only features | Computer use, browser automation (reported) |

Prerequisites

Before pointing Codex at open weights, confirm:

| Requirement | Why |
| --- | --- |
| Codex CLI | npm install -g @openai/codex (or use the Codex Desktop App) |
| Ollama 0.30+ | Profile v2 support for ollama launch codex (Codex 0.134+) |
| 64k+ context | Ollama docs: Codex agent loops need large context windows |
| VRAM / RAM | Model-dependent; see model table below |
| ~/.codex/config.toml | User-level config (not project .codex/ for provider auth) |

Config location:

  • macOS / Linux / WSL: ~/.codex/config.toml
  • Windows: %USERPROFILE%\.codex\config.toml

Project-scoped .codex/config.toml cannot override model_provider or model_providers — those must live in user config per OpenAI docs.


Pick Your Setup Path

| Path | Best for | Command |
| --- | --- | --- |
| A. Ollama launch | Fastest start | ollama launch codex |
| B. --oss flag | One-off sessions | codex --oss -m <model> |
| C. Profile config | Daily driver, switch GPT ↔ OSS | codex --profile ollama-launch |
| D. LM Studio | GUI-first local inference | oss_provider = "lmstudio" |
| E. Custom provider | vLLM, Unsloth, OpenRouter OSS | [model_providers.my_api] |

Path A: Ollama Quick Launch (Recommended)

The lowest-friction path. Ollama manages Codex profiles, model catalog, and provider wiring.

Step 1 — Install and pull a model

# Install Codex CLI
npm install -g @openai/codex

# Pull a coding-capable model (examples)
ollama pull glm-5.2
ollama pull kimi-k2.7-code
ollama pull gpt-oss:120b

Step 2 — Launch Codex through Ollama

ollama launch codex          # CLI
ollama launch codex-app      # Desktop app

What happens under the hood (Ollama docs):

  • Refreshes the model catalog for Codex
  • Creates ~/.codex/ollama-launch.config.toml (profile v2 — separate from base config)
  • Keeps [model_providers.ollama-launch] in ~/.codex/config.toml
  • Invokes Codex with --profile ollama-launch

Configure without launching:

ollama launch codex --config

Remove Ollama-managed profile:

ollama launch codex --restore

Profile v2 migration (Codex 0.134+)

If you see:

--profile ollama-launch cannot be used while config.toml contains legacy
[profiles.ollama-launch] or profile = "ollama-launch"

Fix: Update Ollama to v0.30+. Older ollama launch codex wrote legacy [profiles.*] tables Codex no longer accepts. Profile settings now belong in ~/.codex/ollama-launch.config.toml, not nested under [profiles] in the main config.


Path B: Manual --oss Flag

For ad-hoc sessions without Ollama's launcher:

# Default OSS provider (oss_provider in config.toml, usually ollama)
codex --oss

# Specific model
codex --oss -m gpt-oss:120b

# Ollama cloud-hosted variant
codex --oss -m gpt-oss:120b-cloud

Set default provider in ~/.codex/config.toml:

# Default local provider used with `--oss`
oss_provider = "ollama"   # or "lmstudio"

Ensure Ollama is running (ollama serve) and the model is pulled before launching.
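
Both conditions can be checked from a script before starting a session. This sketch assumes Ollama's local model listing endpoint (/api/tags) on the default port 11434 and treats a name without a tag as ":latest"; the function names are illustrative only.

```python
import json
from urllib.request import urlopen


def normalize(tag: str) -> str:
    """Treat a bare model name as its ':latest' tag."""
    return tag if ":" in tag else f"{tag}:latest"


def model_is_pulled(tags_payload: dict, model: str) -> bool:
    """Check a parsed /api/tags response for the requested model."""
    wanted = normalize(model)
    names = (entry.get("name", "") for entry in tags_payload.get("models", []))
    return any(normalize(name) == wanted for name in names)


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Fetch the list of local models; raises URLError if Ollama is not running."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

Calling model_is_pulled(fetch_tags(), "gpt-oss:120b") covers both checks: a connection error means the server is down, and False means the model still needs an ollama pull.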


Path C: Persistent Profile Config (Power Users)

For teams that switch between GPT for planning and OSS for execution across sessions.

Base provider in ~/.codex/config.toml

[model_providers.ollama-launch]
name = "Ollama"
base_url = "http://localhost:11434/v1/"
wire_api = "responses"

Critical: wire_api = "responses". Codex uses OpenAI's Responses API, not legacy Chat Completions. Endpoints that only expose /v1/chat/completions will fail or need a proxy (CC Switch community tool cited by developers).
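
It can help to know exactly which URL a provider entry resolves to, so you can test that path on your server directly. The mapping below is an assumption for illustration (the "chat" value and the path joining are not taken from this guide), not a description of Codex internals.

```python
# Assumed mapping from a wire_api value to the path appended to base_url.
WIRE_API_PATHS = {
    "responses": "responses",
    "chat": "chat/completions",
}


def request_url(base_url: str, wire_api: str = "responses") -> str:
    """Join base_url and the API path, tolerating a trailing slash."""
    return f"{base_url.rstrip('/')}/{WIRE_API_PATHS[wire_api]}"


print(request_url("http://localhost:11434/v1/"))
# http://localhost:11434/v1/responses
```

If a request to the printed URL returns 404 while the chat/completions path answers, the server is Chat Completions-only and needs a proxy.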

Profile overlay in ~/.codex/ollama-launch.config.toml

model = "glm-5.2"
model_provider = "ollama-launch"
model_catalog_json = "/Users/you/.codex/ollama-launch-models.json"

Then:

codex --profile ollama-launch
codex exec --profile ollama-launch "fix the failing test in src/auth"

Profiles (Codex 0.134+): Each profile is a separate TOML file at ~/.codex/<profile-name>.config.toml with top-level keys — not nested [profiles.name] in the base config. Switch with --profile profile-name.

GPT profile for comparison

Create ~/.codex/gpt.config.toml:

model = "gpt-5.5"
model_reasoning_effort = "high"
approval_policy = "on-request"

Switch between the two:

codex --profile gpt           # OpenAI weights
codex --profile ollama-launch # Local OSS

Path D: LM Studio

Codex reserves built-in provider IDs ollama and lmstudio. For LM Studio:

oss_provider = "lmstudio"

Start LM Studio's local server, then:

codex --oss

Same 64k context and Responses API requirements apply — verify LM Studio exposes a Responses-compatible endpoint or use a proxy.


Path E: Custom Providers (vLLM, Unsloth, OpenRouter)

For self-hosted vLLM or API routers serving open models:

model = "your-model-id"
model_provider = "local_vllm"

[model_providers.local_vllm]
name = "Local vLLM"
base_url = "http://localhost:8000/v1"
wire_api = "responses"
requires_openai_auth = false
env_key = "LOCAL_API_KEY"   # optional; use dummy if none

Launch:

codex --profile local_vllm
# or one-off:
codex --config model_provider='"local_vllm"' --config model='"your-model-id"'

Verify model ID: curl http://localhost:8000/v1/models

Reserved IDs you cannot use for custom providers: openai, ollama, lmstudio. Pick a unique name like local_vllm or openrouter_oss.

To route open models through OpenRouter while keeping Codex's harness, define a custom provider pointing at OpenRouter's base URL with your API key — same pattern as Fusion API but with a single OSS model instead of the fusion panel.


Recommended Models (June 2026)

| Model | Ollama tag | Strength | Context note |
| --- | --- | --- | --- |
| GLM-5.2 | glm-5.2 | Reasoning, post-Fable coding | Pull + verify 64k+ |
| Kimi K2.7-Code | kimi-k2.7-code | Agentic coding, SWE-bench | Large MoE; check VRAM |
| gpt-oss:120b | gpt-oss:120b | OpenAI open weights in Ollama | Official OSS stack pairing |
| DeepSeek V3 / R1 | varies | Reasoning, math | Popular self-host choice |
| Qwen3-Coder | varies | Fast coding slices | Good on 24GB GPUs |

Pair with our closed vs open source comparison when picking a GPT replacement.

Hardware rule of thumb: Coding agents need enough VRAM for the model plus headroom for long context. If the model stutters or truncates mid-task, reduce parallel tool calls or switch to a smaller quant.
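
The "model plus headroom" rule can be turned into back-of-the-envelope arithmetic: quantized weight size plus the key/value cache for the context window. The sketch below uses the standard transformer KV-cache estimate; the layer, head, and dimension numbers are placeholders rather than published specs for any model in the table, and runtime overhead and activations are ignored.

```python
def weight_bytes(params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights at a given quantization."""
    return params * bits_per_weight / 8


def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_value: int = 2) -> int:
    """KV cache size: one key and one value tensor per layer (hence the factor 2)."""
    return 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_value


# Hypothetical 120B-parameter model at 4-bit with a 64k (65,536-token) context.
weights = weight_bytes(120e9, 4)               # 60e9 bytes
cache = kv_cache_bytes(36, 8, 64, 65_536)      # 4,831,838,208 bytes
print(f"{(weights + cache) / 1e9:.1f} GB")     # 64.8 GB
```

Dropping to a smaller quant shrinks the first term, while a shorter context shrinks the second, which is why both levers appear in the rule of thumb.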


What OSS Mode Actually Means

OpenAI's advanced Codex configuration docs describe OSS mode and local providers: you point the Codex App, CLI, or SDK at an OpenAI-compatible or otherwise configured third-party base URL instead of the default GPT endpoints.

That is structurally different from "run a chat UI on Llama." Codex brings:

  • Agent loop with tool execution
  • Repo-aware coding workflows
  • SDK embedding for custom products

The model underneath becomes pluggable—same harness, different weights.

What stays the same with OSS:

  • Agent loop, tool execution, sandbox
  • Slash commands (/plan, /goal, /review)
  • Project memory via AGENTS.md / .codex/config.toml
  • Skills support (if your OSS model handles tool calls reliably)

What may degrade:

  • Tool-calling reliability (model-dependent)
  • Reasoning quality on hard agentic tasks
  • Features explicitly gated to GPT (below)

First Session Workflow

Once configured, a typical OSS Codex session:

cd your-repo
codex --profile ollama-launch   # or: ollama launch codex

# Inside Codex TUI:
/init                            # generate AGENTS.md if missing
/permissions                     # set sandbox: read-only → workspace-write
/model                           # confirm local model selected
"Fix the auth test in src/login.test.ts using TDD"

Tips for OSS models:

  1. Smaller tasks — vertical slices, not "refactor the entire app"
  2. /plan first — OSS models benefit from explicit planning (Matt Pocock's /to-prd pattern works in any agent)
  3. /compact often — local models hit context limits faster than GPT-5.x
  4. Verify tool output — weaker models may hallucinate file paths or skip tests

Codex App (Desktop) With OSS

@ollama (June 17, 2026) added desktop support:

ollama launch codex-app

Same profile wiring as CLI. Known issue: @trashpandaemoji and others report the Desktop model picker does not list external providers even when config is correct—you may need to launch via Ollama integration or CLI until OpenAI fixes the UI.

No OpenAI API key required for local model inference. You still install the Codex client from OpenAI; only the inference endpoint changes.


Troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| Legacy profile error | Old [profiles.*] in config | Run ollama launch codex --restore, update Ollama to 0.30+, relaunch |
| 404 on /v1/responses | Chat Completions-only server | Use a server that exposes /v1/responses, or put a proxy such as CC Switch in front of it |
| Model not in picker | Desktop UI bug | Launch via ollama launch codex-app or CLI --profile |
| Context overflow / truncation | Model context too small | Use 64k+ models; /compact; smaller tasks |
| Tool calls fail silently | OSS model weak at function calling | Try gpt-oss:120b, GLM-5.2, or Kimi K2.7-Code |
| Auth / sign-in prompt | requires_openai_auth default | Set requires_openai_auth = false on custom provider |
| Hybrid GPT + OSS fails | Protocol mismatch in one session | Use separate profiles/sessions (below) |

Reset Ollama integration:

ollama launch codex --restore
ollama launch codex --config    # regenerate profile

Community Limitations (June 2026)

The viral tweet surfaced practical friction. Treat these as reported, not official roadmaps:

1. Desktop model picker bug

@trashpandaemoji and others: Codex Desktop does not show external provider models in the picker when using custom providers—config works, UI does not.

2. Computer use and browser need GPT

@0xSero, @rodasjateno: Computer use and Chrome/browser capabilities appear GPT-locked. Hacky workarounds exist; not first-class for OSS endpoints.

3. Responses vs Chat Completions gap

@Jason_Young1231: Many third-party APIs expose only Chat Completions, not the OpenAI Responses API that Codex expects. Tools like CC Switch try to bridge the gap.

4. No hybrid orchestration in one session

@FilipBaturan: Using GPT as architect and OSS as executor fails because Codex expects uniform tool-calling protocols across the session.

@EatMyTarts17: Cannot combine image generation, planning, and OSS subagents in a single Codex session today.

5. Local vs cloud switching

@SongbeiYing: Switching flexibly between local and cloud models inside the app remains awkward—power users lean on CLI config.

6. Anthropic-style APIs

@SuryavirKapur asked about Anthropic-compatible endpoints. These are not part of the default OSS path; Codex is built around OpenAI's protocol.


Codex OSS vs Claude Code Local

| Dimension | Codex + OSS mode | Claude Code |
| --- | --- | --- |
| Default stack | OpenAI Responses + tools | Anthropic Messages + tools |
| Local models | Documented OSS providers | Community/Ollama patterns vary |
| Open models post-Fable | Ollama GLM/Kimi launch | Opus 4.8 fallback; API to others |
| Maturity (Jun 2026) | New; picker/browser gaps | Mature; 90+ slash commands |

Neither replaces the other—they compete as agent harnesses. Codex OSS mode matters if you already standardized on Codex SDK or want one agent client for GPT and GLM/Kimi.


When to Use OSS vs GPT in Codex

| Use OSS (local) | Stay on GPT |
| --- | --- |
| Routine bug fixes, tests, refactors | Browser automation, computer use |
| Air-gapped / no API spend | Hardest agentic coding (SWE-bench gap) |
| Fable 5 unavailable fallback | Multimodal (screenshots, design review) |
| Privacy-sensitive codebases | Hybrid architect + worker in one session |
| Experimenting with GLM/Kimi | Production CI with /review at scale |

Pragmatic pattern: --profile gpt for planning and architecture reviews; --profile ollama-launch for implementation slices—separate sessions, not one hybrid thread (until Filip Baturan's use case is supported).
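
The split can be scripted so each phase always lands on the intended profile. This is a sketch only: the phase names are invented for illustration, the profile names are the ones used in this guide, and each call starts its own codex exec session rather than sharing a thread.

```python
import subprocess

# Hypothetical phase names mapped to the profiles defined earlier in this guide.
PROFILE_BY_PHASE = {
    "plan": "gpt",
    "review": "gpt",
    "implement": "ollama-launch",
}


def codex_exec_argv(phase: str, prompt: str) -> list[str]:
    """Build the `codex exec --profile <name> "<prompt>"` command for a phase."""
    return ["codex", "exec", "--profile", PROFILE_BY_PHASE[phase], prompt]


def run_phase(phase: str, prompt: str) -> int:
    """Run one phase as a separate Codex session and return its exit code."""
    return subprocess.run(codex_exec_argv(phase, prompt), check=False).returncode
```

For example, run_phase("implement", "fix the failing test in src/auth") would invoke the local profile, while a "plan" phase would go to the GPT profile in a separate process.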


Why OpenAI Did This Now

Timing aligns with:

  • Export-control turbulence around US frontier models (Fable 5 ban)
  • Open-weight coding models (Kimi K2.7, GLM-5.2) matching closed APIs on benchmarks
  • Developer demand for local-first agents (Headroom, Ollama ecosystem)

Codex without model lock-in is OpenAI's answer to "what if GPT is unavailable or too expensive?"—while keeping the harness proprietary.


Summary

Codex is no longer GPT-only. Official OSS mode plus Ollama's launch codex makes GLM-5.2, Kimi-K2.7-Code, and gpt-oss first-class targets for the same agent loop many teams used only with GPT-5.x.

Fastest path: ollama pull glm-5.2 → ollama launch codex → /init → ship.

Power-user path: wire_api = "responses" provider in ~/.codex/config.toml + profile overlay → codex --profile ollama-launch.

The June 2026 reality is messier than @thsottiaux's tweet: model picker bugs, GPT-gated browser tools, Responses API requirements, profile v2 migration, and no clean multi-model orchestration in one session. For local coding on open weights today, it works. For full Codex desktop parity, expect friction.


Related Reading

Setup paths are drawn from the OpenAI Codex advanced config docs, the Ollama Codex integration docs, and posts by @thsottiaux and @ollama on X (June 17, 2026).
