LFM2.5-230M is Liquid AI's smallest open-weight foundation model, released June 25, 2026. At 230 million parameters it is built on the LFM2 architecture for fast inference on CPUs, NPUs, and GPUs — targeting lightweight agentic workloads on phones, robots, home automation, and network devices. Base (LFM2.5-230M-Base) and post-trained (LFM2.5-230M) checkpoints are on Hugging Face.

How fast is LFM2.5-230M on edge hardware?

Liquid AI reports 213 tokens per second decode speed on a Samsung Galaxy S25 Ultra (Qualcomm Snapdragon Gen4 CPU) and 42 tok/s on a Raspberry Pi 5 CPU. The model delivers the highest prefill and decode throughput in its class on both platforms while keeping the smallest memory footprint among comparable small models.

What is LFM2.5-230M good at — and what should you avoid?

Liquid AI recommends it for large-scale data extraction pipelines and lightweight on-device agentic workloads — instruction following, structured extraction, and tool use (BFCL benchmarks). It explicitly does not recommend the model for reasoning-heavy tasks: advanced math, code generation, or creative writing. For frontier reasoning at small scale, see specialist models like VibeThinker-3B instead.

How was LFM2.5-230M trained?

Pre-training on 19 trillion tokens, including a 32K context extension phase. Post-training uses three stages: (1) supervised fine-tuning with distillation from LFM2.5-350M, (2) direct preference optimization (DPO), and (3) multi-domain reinforcement learning. The recipe preserves flexibility for downstream fine-tuning while delivering strong out-of-the-box tool-use and extraction capability.

What inference runtimes support LFM2.5-230M?

Day-one support across llama.cpp (GGUF for edge), MLX (Apple Silicon), vLLM and SGLang (GPU serving), and ONNX (cross-platform accelerators). Liquid AI also ships an internal GPU inference stack for low-latency enterprise deployments benchmarked against SGLang-served competitors.

How is Liquid AI using LFM2.5-230M in robotics?

As an early demo, Liquid AI deployed LFM2.5-230M on a Unitree G1 humanoid running entirely on-device on its onboard NVIDIA Jetson Orin. After a quick fine-tune, the model acts as a skill-selection layer: it takes natural-language instructions and decomposes them into tool calls that invoke pre-trained low-level skills from NVIDIA's SONIC framework — e.g. timed walking at target velocity and one-legged kneel sequences.

LFM2.5-230M: Liquid AI Edge Agent Model — 213 tok/s on Phone CPU | explainx.ai Blog

explainx.ainewsletter3.5k

workshops ↗

LFM2.5-230M: Liquid AI Edge Agent Model — 213 tok/s on Phone CPU | explainx.ai Blog | explainx.ai

On June 25, 2026, Liquid AI released LFM2.5-230M — its smallest foundation model yet, and one of the clearest 2026 statements about where the edge-AI market is heading: not bigger models in the cloud, but fast, open-weight models that run agentic tool loops on the device you already have.

Liquid AI's framing on X (@liquidai) and in the official blog post is explicit: LFM2.5-230M is built to run anywhere — cloud GPUs, phone CPUs, Raspberry Pi boards, and robot onboard computers — and to power data extraction pipelines and lightweight on-device agentic workloads, not frontier math or long-form creative writing.

TL;DR

Spec	LFM2.5-230M
Parameters	230M (smallest in LFM2.5 family)
Architecture	LFM2 (Liquid Foundation Model v2)
Pre-training	19T tokens + 32K context extension
Post-training	SFT (distilled from LFM2.5-350M) → DPO → multi-domain RL
Variants	LFM2.5-230M-Base, LFM2.5-230M (post-trained)
Availability

Model	GPQA Diamond	MMLU-Pro	IFEval	IFBench	Multi-IF
LFM2.5-230M	25.41	20.25	71.71	38.40	37.70
LFM2.5-350M	30.64	20.01	76.96	40.69	44.92
LFM2-350M	27.58	19.29	64.96	18.20	32.92
Granite 4.0-H-350M	22.32	13.14	61.27	17.22	28.70
Qwen3.5-0.8B (Instruct)	27.41	37.42	59.94	22.87	41.68
Gemma 3 1B IT	23.89	14.04	63.49	20.33	44.25

Model	CaseReportBench	BFCLv3	BFCLv4	τ²-Bench Telecom	τ²-Bench Retail
LFM2.5-230M	22.51	43.26	21.03	5.26	13.68
LFM2.5-350M	32.45	44.11	21.86	18.86	17.84
LFM2-350M	11.67	22.95	12.29	10.82	5.56
Granite 4.0-H-350M	12.44	43.07	13.28	13.74	6.14
Qwen3.5-0.8B (Instruct)	13.83	35.08	18.70	12.57	6.14

Platform	Hardware	Decode throughput
Samsung Galaxy S25 Ultra	Qualcomm Snapdragon Gen4 (CPU)	213 tok/s
Raspberry Pi 5	ARM CPU	42 tok/s

Runtime	Use case
llama.cpp	GGUF checkpoints for Raspberry Pi, phones, embedded
MLX	Apple Silicon (Mac, iPhone via future MLX ports)
vLLM / SGLang	GPU-accelerated production serving
ONNX	Cross-platform deployment across diverse accelerators

Model	Params	Strength	Weakness
LFM2.5-230M	230M	Speed, tool use, extraction, edge agents	Math, code, creative writing
MiniCPM5-1B	1B	Broad open-model intelligence at 0.5GB	Heavier than 230M for pure tool loops
VibeThinker-3B	3B	AIME 94.3, frontier verifiable reasoning	Too large for Pi-class real-time agents
Gemma 4 E2B	2B	Multimodal on-device (vision + speech)	Different deployment path (LiteRT)

Post	Connection
Gemma 4 + Open Duck Mini	On-device robot demo on Pi 5 and Jetson Orin
MiniCPM5-1B	Another open small-model breakthrough at sub-1B scale
VibeThinker-3B	Opposite end: frontier reasoning in a compact model
AI Model Quantization Guide	How sub-billion models run on phones and edge boards
NVIDIA N1X at Computex 2026	On-device AI compute trend on consumer hardware

LFM2.5-230M: Liquid AI's 230M Model Built to Run Agents on Phones and Robots

TL;DR

Related posts

Gemma 4 Powers Open Duck Mini: Meet Autumn, the On-Device AI Robot Duck

MiniCPM5-1B: The Tiny 1B Model That's Crushing 2B+ AI Models

Ternlight: 7 MB Embedding Model That Runs in the Browser (WASM SIMD Guide)

Why Liquid AI Built a 230M Model

Training Recipe

Stage 1: Supervised fine-tuning with distillation

Stage 2: Direct preference optimization (DPO)

Stage 3: Multi-domain reinforcement learning

Benchmarks: Beats Models Twice Its Size — on the Right Tasks

Knowledge and instruction following

Tool use and data extraction

CPU Speed: 213 tok/s on a Phone, 42 tok/s on a Pi

Inference Ecosystem: Day-One Support

Unitree G1 Demo: Natural Language → Robot Skills

Where LFM2.5-230M Fits in the Small-Model Landscape

Get Started

Summary

TL;DR

Related posts

Gemma 4 Powers Open Duck Mini: Meet Autumn, the On-Device AI Robot Duck

MiniCPM5-1B: The Tiny 1B Model That's Crushing 2B+ AI Models

Ternlight: 7 MB Embedding Model That Runs in the Browser (WASM SIMD Guide)

Why Liquid AI Built a 230M Model

Training Recipe

Stage 1: Supervised fine-tuning with distillation

Stage 2: Direct preference optimization (DPO)

Stage 3: Multi-domain reinforcement learning

Benchmarks: Beats Models Twice Its Size — on the Right Tasks

Knowledge and instruction following

Tool use and data extraction

CPU Speed: 213 tok/s on a Phone, 42 tok/s on a Pi

Inference Ecosystem: Day-One Support

Unitree G1 Demo: Natural Language → Robot Skills

Where LFM2.5-230M Fits in the Small-Model Landscape

Get Started

Related explainx.ai coverage

Summary