What is Ideogram 4.0?

Ideogram 4.0 is Ideogram's first open-weight text-to-image foundation model, released June 3, 2026. It is a 9.3B-parameter flow-matching DiT trained from scratch — not a fine-tune of FLUX or SDXL. It ships with native 2K resolution, best-in-class in-image text rendering, bounding-box layout control, and JSON-first prompting. Weights are on Hugging Face; hosted access is at ideogram.ai and via the Ideogram API.

How do I run Ideogram 4.0 locally?

Clone github.com/ideogram-oss/ideogram4 and run pip install . from the repository root. Accept the license gate on Hugging Face (ideogram-ai/ideogram-4-nf4 or ideogram-4-fp8) and authenticate with hf auth login. Then run python run_inference.py --prompt "your prompt" --output out.png --quantization nf4 --magic-prompt-key "$IDEOGRAM_API_KEY". The NF4 checkpoint fits on a single 24GB CUDA GPU and is the checkpoint with Diffusers support.

How does the Ideogram 4.0 API work?

Get an API key at developer.ideogram.ai, then POST to https://api.ideogram.ai/v1/ideogram-v4/generate with an Api-Key header and a JSON body containing text_prompt or json_prompt. Pricing is per image with no subscription: Turbo $0.03, Default $0.06, Quality $0.10. The default rate limit is 10 in-flight requests.
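One way to respect the 10 in-flight request limit from client code is to cap concurrency on your side. The following is a minimal sketch using only the Python standard library; `run_bounded` and `fake_generate` are hypothetical names for illustration, not part of any Ideogram SDK, and the stand-in function would be replaced by a real API call.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 10  # default rate limit quoted in this guide


def run_bounded(generate, prompts, max_in_flight=MAX_IN_FLIGHT):
    """Call generate(prompt) for each prompt with at most max_in_flight calls running at once.

    Results come back in the same order as prompts.
    """
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        return list(pool.map(generate, prompts))


if __name__ == "__main__":
    # Stand-in for a real API call, so the sketch runs offline.
    def fake_generate(prompt):
        return f"image for: {prompt}"

    for line in run_bounded(fake_generate, ["a red poster", "a blue poster"]):
        print(line)
```

The pool size is the only throttle here; if your account has a different limit, pass it as `max_in_flight`.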

Why does Ideogram 4.0 use JSON prompts?

Ideogram 4.0 was trained exclusively on structured JSON captions, not plain prose. JSON gives explicit control over style, color palette, spatial layout (bbox coordinates), and in-image text. Plain-text prompts still work via magic-prompt — a free LLM expansion step that converts casual prompts into the JSON schema the model expects.

How does Ideogram 4.0 compare to FLUX and GPT Image?

On Design Arena, Ideogram 4.0 is the top open-weight model and trails only proprietary GPT and Gemini models overall. ContraLabs blind typography tests rated it first 47.9% of the time vs Gemini 3.1 Flash Image (30%), FLUX.2 max (15.5%), and Grok Imagine (15%). At 9.3B params it beats much larger open models on text rendering benchmarks.

Can I use Ideogram 4.0 commercially?

The open-weight NF4/FP8 checkpoints on Hugging Face are released under a non-commercial license. Commercial and enterprise licensing tiers are listed at ideogram.ai/licensing. The hosted API is billed separately from the Ideogram app subscription.

Ideogram 4.0: Open Image Model — How to Run & API Guide (2026) | explainx.ai Blog

Capability	Status
Transparent background cutouts	Available via Background Remover API
Editable text + movable image layers	Follow-up 4.0 release
Branded assets (typography, palette, logo fidelity)	Scheduled

Component	Detail
Backbone	34-layer single-stream Diffusion Transformer (DiT) — text and image tokens in one unified sequence
Text encoder	Qwen3-VL-8B-Instruct — hidden states from 13 intermediate layers concatenated
Training objective	Flow matching
Guidance	Dual-branch classifier-free guidance (independent positive/negative refinement)
Training data format	Structured JSON captions exclusively
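To make the "flow matching" and "classifier-free guidance" rows concrete, here is a toy sketch of how a flow-matching sampler is commonly structured: a model predicts a velocity, and sampling integrates that velocity from noise toward data. This is not Ideogram's sampler. It uses standard single-negative guidance rather than the dual-branch variant listed above (whose details the source does not give), and the function names and the t=0 (noise) to t=1 (data) convention are assumptions.

```python
def cfg_velocity(v_cond, v_uncond, scale):
    """Standard classifier-free guidance: move the unconditional velocity toward the conditional one."""
    return [u + scale * (c - u) for c, u in zip(v_cond, v_uncond)]


def euler_sample(x0, velocity_fn, steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed-size Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    noise = [0.0, 0.0]
    target = [1.0, -2.0]

    # Toy "model": on a straight-line path the ideal velocity is constant (target - noise).
    def toy_velocity(x, t):
        v_cond = [b - a for a, b in zip(noise, target)]
        v_uncond = [0.0, 0.0]
        return cfg_velocity(v_cond, v_uncond, scale=1.0)

    print(euler_sample(noise, toy_velocity, steps=8))
```

With a guidance scale of 1.0 the guided velocity equals the conditional one; larger scales extrapolate past it, which is the usual trade of diversity for prompt adherence.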

Benchmark	Result
Design Arena (overall)	Top open-weight model; trails only proprietary GPT and Gemini
Design Arena (open-weight only)	#1 by a commanding margin
ContraLabs typography (1st-place win rate)	47.9%
ContraLabs "would use in client work"	3.55 / 5
LMArena text-to-image	Top open-weight lab, top-5 overall
7Bench (layout control)	Better than all closed-source models tested
Internal human-preference (design + photography)	#2 overall — behind only GPT Image 2 medium

python

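import os

import requests

# Read the key from the environment rather than hard-coding it.
api_key = os.environ["IDEOGRAM_API_KEY"]

response = requests.post(
    "https://api.ideogram.ai/v1/ideogram-v4/generate",
    headers={"Api-Key": api_key},
    json={
        "text_prompt": "A poster for a summer design conference with bold sans-serif typography",
        "rendering_speed": "DEFAULT",
        "aspect_ratio": "ASPECT_16_9",
    },
    timeout=120,  # avoid hanging forever on a stalled connection
)
# Fail loudly on 4xx/5xx instead of hitting a KeyError below.
response.raise_for_status()

# The first entry in "data" holds the generated image's URL.
image = response.json()["data"][0]
print(image["url"])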
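The request body carries either `text_prompt` or `json_prompt`. A small helper can keep that choice explicit before anything is sent. This is a sketch under assumptions: that exactly one of the two fields should be present, and that `json_prompt` is passed as a nested JSON object rather than a string. Neither is confirmed by this guide, and `build_generate_body` is a hypothetical name.

```python
import json

RENDERING_SPEEDS = {"TURBO", "DEFAULT", "QUALITY"}


def build_generate_body(text_prompt=None, json_prompt=None, rendering_speed="DEFAULT"):
    """Return a request body with exactly one of text_prompt or json_prompt set."""
    if (text_prompt is None) == (json_prompt is None):
        raise ValueError("pass exactly one of text_prompt or json_prompt")
    if rendering_speed not in RENDERING_SPEEDS:
        raise ValueError(f"unknown rendering_speed: {rendering_speed!r}")
    body = {"rendering_speed": rendering_speed}
    if text_prompt is not None:
        body["text_prompt"] = text_prompt
    else:
        # Assumption: the structured caption is sent as a nested object.
        body["json_prompt"] = json_prompt
    return body


if __name__ == "__main__":
    body = build_generate_body(text_prompt="A poster for a summer design conference")
    print(json.dumps(body, indent=2))
```

The returned dict can be passed as the `json=` argument of `requests.post`.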

bash

curl -X POST https://api.ideogram.ai/v1/ideogram-v4/generate \
  -H "Api-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "text_prompt": "A poster for a summer design conference",
    "rendering_speed": "TURBO"
  }'

typescript

const res = await fetch("https://api.ideogram.ai/v1/ideogram-v4/generate", {
  method: "POST",
  headers: {
    "Api-Key": "<your-api-key>",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    text_prompt: "A poster for a summer design conference",
    rendering_speed: "DEFAULT",
  }),
});

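// Surface HTTP errors before trying to read the image data.
if (!res.ok) {
  throw new Error(`Ideogram API request failed: ${res.status} ${await res.text()}`);
}

const { data } = await res.json();
console.log(data[0].url);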

Rendering speed	Price per image	Use case
TURBO	$0.03	Rapid prototyping, A/B testing
DEFAULT	$0.06	Daily production work
QUALITY	$0.10	Final delivery assets
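Because pricing is flat per image, budgeting a batch is simple arithmetic. The sketch below uses the three prices from the table; `estimate_cost` is a hypothetical helper, and `Decimal` is used so the cents add up exactly.

```python
from decimal import Decimal

# Prices per image, copied from the table above (USD).
PRICE_PER_IMAGE = {
    "TURBO": Decimal("0.03"),
    "DEFAULT": Decimal("0.06"),
    "QUALITY": Decimal("0.10"),
}


def estimate_cost(counts):
    """Return the total USD cost for a mapping of rendering speed -> image count."""
    total = Decimal("0")
    for speed, n in counts.items():
        if n < 0:
            raise ValueError(f"negative image count for {speed}")
        total += PRICE_PER_IMAGE[speed] * n
    return total


if __name__ == "__main__":
    # 200 Turbo drafts plus 20 Quality finals.
    print(estimate_cost({"TURBO": 200, "QUALITY": 20}))
```

For example, 200 Turbo drafts ($6.00) plus 20 Quality finals ($2.00) comes to $8.00.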

bash

export IDEOGRAM_API_KEY="your_key_from_developer.ideogram.ai"

python run_inference.py \
  --prompt "a ginger cat wearing a tiny wizard hat reading a spellbook" \
  --output out.png \
  --quantization "nf4" \
  --magic-prompt-key "$IDEOGRAM_API_KEY"

bash

python run_inference.py \
  --prompt "a campaign poster with clean sans-serif typography" \
  --output poster.png \
  --quantization "nf4" \
  --height 2048 \
  --width 2048 \
  --sampler-preset V4_QUALITY_48 \
  --magic-prompt-key "$IDEOGRAM_API_KEY"
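For batch jobs it can be convenient to assemble the same command from Python. The sketch below builds the argument list using only the flags that appear in this guide; `build_inference_command` is a hypothetical helper, and which flags the script treats as optional is an assumption.

```python
import shlex
import sys


def build_inference_command(prompt, output, quantization="nf4", height=None,
                            width=None, sampler_preset=None, magic_prompt_key=None):
    """Return the argv list for run_inference.py, using only flags shown in this guide."""
    cmd = [sys.executable, "run_inference.py",
           "--prompt", prompt,
           "--output", output,
           "--quantization", quantization]
    optional = [("--height", height), ("--width", width),
                ("--sampler-preset", sampler_preset),
                ("--magic-prompt-key", magic_prompt_key)]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    cmd = build_inference_command(
        "a campaign poster with clean sans-serif typography",
        "poster.png", height=2048, width=2048, sampler_preset="V4_QUALITY_48",
    )
    # Print instead of running, so the sketch works outside the cloned repo.
    # To execute from the repo root: subprocess.run(cmd, check=True)
    print(shlex.join(cmd))
```

Passing a list to `subprocess.run` avoids shell quoting problems with prompts that contain quotes or spaces.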

bash

export HIVE_TEXT_MODERATION_KEY="..."
export HIVE_VISUAL_MODERATION_KEY="..."

python run_inference.py \
  --prompt "an isometric illustration of a tiny city floating in the clouds" \
  --output out.png \
  --quantization "nf4" \
  --magic-prompt-key "$IDEOGRAM_API_KEY" \
  --hive-text-key "$HIVE_TEXT_MODERATION_KEY" \
  --hive-visual-key "$HIVE_VISUAL_MODERATION_KEY"

Checkpoint	Quantization	Hardware	Diffusers
ideogram-4-nf4	NF4	CUDA (24GB)	Yes
ideogram-4-fp8	FP8	All	No

json

{
  "high_level_description": "A clean business card layout for a tech startup.",
  "style_description": {
    "aesthetics": "minimal, professional, geometric",
    "lighting": "even, diffuse studio lighting",
    "medium": "graphic_design",
    "art_style": "flat vector design, generous whitespace, sans-serif typography",
    "color_palette": ["#FFFFFF", "#F0F0F0", "#333333", "#0066FF", "#00CC88"]
  },
  "compositional_deconstruction": {
    "background": "A solid off-white card surface with subtle paper texture.",
    "elements": [
      {
        "type": "text",
        "text": "ACME TECH",
        "desc": "Bold dark grey sans-serif company name across the upper third."
      },
      {
        "type": "text",
        "text": "[email protected]",
        "desc": "Small blue sans-serif contact email near the bottom."
      }
    ]
  }
}

Field	Required	Purpose
`high_level_description`	Strongly recommended	One- or two-sentence summary
`style_description`	Optional	Aesthetics, lighting, medium, color palette
`compositional_deconstruction`	Required	Background + spatial elements
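A quick pre-flight check on hand-written captions can catch missing fields before an image is generated. This sketch checks only what the table and the example above show: the three top-level fields and a `#RRGGBB` palette. Treating `#RRGGBB` as the only accepted color format is an assumption, and `validate_caption` is a hypothetical helper, not part of the Ideogram tooling.

```python
import re

HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")


def validate_caption(caption):
    """Return a list of problems found in a caption dict; an empty list means none were found."""
    problems = []
    if not isinstance(caption.get("compositional_deconstruction"), dict):
        problems.append("error: compositional_deconstruction is required and must be an object")
    if not caption.get("high_level_description"):
        problems.append("warning: high_level_description is strongly recommended")
    style = caption.get("style_description") or {}
    for color in style.get("color_palette", []):
        if not HEX_COLOR.match(str(color)):
            problems.append(f"warning: color_palette entry {color!r} is not #RRGGBB")
    return problems


if __name__ == "__main__":
    caption = {
        "high_level_description": "A clean business card layout for a tech startup.",
        "style_description": {"color_palette": ["#FFFFFF", "blue"]},
        "compositional_deconstruction": {"background": "off-white card", "elements": []},
    }
    for problem in validate_caption(caption):
        print(problem)
```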

Config	Registry key	Backend
`Ideogram4MagicPromptV1`	`ideogram-4-v1`	Ideogram hosted API (free)
`ClaudeOpusMagicPromptV1`	`claude-opus-v1`	OpenRouter
`ClaudeSonnetMagicPromptV1`	`claude-sonnet-v1`	OpenRouter

Endpoint	Purpose
`POST /v1/ideogram-v4/magic-prompt`	Convert plain text → structured `json_prompt`
`POST /v1/ideogram-v4/describe`	Upload a reference image → structured JSON prompt (optionally preserves bboxes)
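A call to the magic-prompt endpoint might be assembled as follows. This is a sketch only: it assumes the endpoint lives on the same `api.ideogram.ai` host as generate, accepts the same `Api-Key` header, and takes the plain text in a `text_prompt` field. The request body and response shape are not specified in this guide, so check the API reference before relying on them.

```python
import json
import os
import urllib.request

# Assumption: same host as the generate endpoint.
MAGIC_PROMPT_URL = "https://api.ideogram.ai/v1/ideogram-v4/magic-prompt"


def build_magic_prompt_request(text_prompt, api_key):
    """Build (but do not send) a POST request for the magic-prompt endpoint."""
    # Assumption: the plain-text prompt goes in a "text_prompt" field.
    body = json.dumps({"text_prompt": text_prompt}).encode("utf-8")
    return urllib.request.Request(
        MAGIC_PROMPT_URL,
        data=body,
        method="POST",
        headers={"Api-Key": api_key, "Content-Type": "application/json"},
    )


if __name__ == "__main__":
    key = os.environ.get("IDEOGRAM_API_KEY", "<your-api-key>")
    req = build_magic_prompt_request("a ginger cat wearing a tiny wizard hat", key)
    print(req.get_method(), req.full_url)
    # To send it: urllib.request.urlopen(req, timeout=60), then json.load() the response.
```

Building the request separately from sending it keeps the sketch runnable offline and makes the assumed fields easy to adjust.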

Capability	Endpoint family	Notes
Generate	`/v1/ideogram-v4/generate`	Text or JSON prompt → image
Transparent backgrounds	v4 endpoints	Native alpha cutouts
Edit with prompt	v3 endpoints	Describe changes in plain language
Remix	v3 endpoints	Reimagine with `image_weight` control
Reframe	v3 endpoints	Extend to new aspect ratio
Remove background	v4 endpoints	Clean cutout in one call
Layerized text	v3 endpoints	Pull editable text layers
Custom models	Training + generate	Fine-tune on brand assets
Upscale	Upscale endpoint	Raise resolution for delivery
Magic-prompt	`/v1/ideogram-v4/magic-prompt`	Plain text → JSON caption
Describe	`/v1/ideogram-v4/describe`	Image → JSON caption

Surface	Best for	Trade-off
Ideogram app	Hands-on creation, iteration, editing	Subscription credits; no programmatic access
API	Production pipelines, product integration, agents	Per-image cost; ephemeral URLs
Local (CLI)	Fine-tuning, research, air-gapped use, unlimited generation	Needs a 24GB GPU; magic-prompt still needs an API key (free)
ComfyUI	Node-based visual workflows	Requires ComfyUI 0.24.0+ and `image_ideogram4_t2i.json` template
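Since API image URLs are ephemeral, pipelines usually copy each result to their own storage right after generation. A minimal standard-library sketch follows; `save_image` is a hypothetical helper, and how long the URLs stay valid is not stated in this guide.

```python
import shutil
import urllib.request
from pathlib import Path


def save_image(url, dest, timeout=60):
    """Stream the file at url to dest and return the path that was written."""
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url, timeout=timeout) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)  # copy in chunks, not all in memory
    return dest


if __name__ == "__main__":
    import sys

    # Usage: python save_image.py <image-url> [output-path]
    if len(sys.argv) >= 2:
        target = sys.argv[2] if len(sys.argv) > 2 else "out.png"
        print(save_image(sys.argv[1], target))
```

In practice the URL would come from `data[0].url` in the generate response.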

Release date	June 3, 2026
Parameters	9.3B
Architecture	Flow-matching DiT, single-stream, Qwen3-VL-8B text encoder
Max resolution	2048×2048 (multiples of 16, aspect ratios up to 6:1)
Open weights	Hugging Face: ideogram-ai/ideogram-4-nf4, ideogram-4-fp8 · code: github.com/ideogram-oss/ideogram4
Checkpoints	ideogram-4-nf4 (24GB GPU) · ideogram-4-fp8
API endpoint	`POST https://api.ideogram.ai/v1/ideogram-v4/generate`
API pricing	Turbo $0.03 · Default $0.06 · Quality $0.10 per image
Prompt format	JSON-first (plain text via magic-prompt expansion)
GitHub stars	2,100+ (as of June 2026)
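The resolution constraints in the table can be checked before a job is queued. The sketch below reads "2048×2048" as a per-side maximum and "up to 6:1" as a limit on long side divided by short side. Both readings are assumptions, and `check_size` and `snap_down` are hypothetical helpers.

```python
MAX_SIDE = 2048   # assumed per-side maximum
STEP = 16         # sides must be multiples of 16
MAX_ASPECT = 6    # assumed limit on long side / short side


def check_size(width, height):
    """Return a list of violated constraints; an empty list means the size looks valid."""
    problems = []
    for name, value in (("width", width), ("height", height)):
        if value <= 0 or value % STEP:
            problems.append(f"{name}={value} is not a positive multiple of {STEP}")
        if value > MAX_SIDE:
            problems.append(f"{name}={value} exceeds {MAX_SIDE}")
    long_side, short_side = max(width, height), min(width, height)
    if short_side > 0 and long_side > MAX_ASPECT * short_side:
        problems.append(f"aspect ratio exceeds {MAX_ASPECT}:1")
    return problems


def snap_down(value):
    """Round value down to a multiple of 16, clamped to the range [16, 2048]."""
    return max(STEP, min(MAX_SIDE, (value // STEP) * STEP))


if __name__ == "__main__":
    print(check_size(2048, 2048))
    print(check_size(1000, 1000))
    print(snap_down(1000))
```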

Ideogram 4.0: Open-Weight Image Generation — How to Run, API & JSON Prompts (2026)

Quick reference

What Ideogram 4.0 ships today

1. Text rendering at production fidelity

2. Bounding-box layout control

3. Photoreal output at 2K

Layer-based roadmap

Architecture: a specialized foundation, not a unified multimodal model

Benchmarks: where Ideogram 4.0 ranks

How to run Ideogram 4.0 via the API

Step 1: Get an API key

Step 2: Generate your first image

API pricing and speed tiers

How to run Ideogram 4.0 locally (CLI)

Prerequisites

Step 1: Clone and install

Step 2: Accept the license gate and authenticate

Step 3: Generate with plain-text prompt

Step 4: Max quality settings

Optional: safety screening with Hive

Model checkpoints

JSON prompting: the format that matters

Why JSON-only training?

The caption schema (three top-level fields)

Magic-prompt: JSON without writing JSON

Bounding-box layout and color palettes

Spatial control with bbox

Color palette conditioning

API endpoints beyond generate

When to use API vs local vs the app

Enterprise and commercial licensing

Summary
