What is Apodex-1.0-mini?

Apodex-1.0-mini is a 35B-A3B open-weight model from Apodex AI, fine-tuned from Qwen3.5-35B-A3B for verification-centric deep research. Released under Apache 2.0 on Hugging Face (apodex/Apodex-1.0-mini), it powers Deep Research mode at apodex.ai and ranked #1 on the FutureX live prediction leaderboard at 59.17 on June 29, 2026 — ahead of Claude-Sonnet-4.6 (56.32), DeepSeek-V4-Pro (53.58), and GPT-5.5 (52.51).

What is FutureX and why does Apodex topping it matter?

FutureX (futurex.live) is a live benchmark for future prediction — questions whose answers are not written down yet. Apodex submitted experimental prediction harnesses built on Apodex-1.0-mini and held #1 three times over four weeks, with two entries at 59.17 and 58.42 on the June 29 leaderboard. It validates that open 35B weights plus a research harness can beat much larger closed models on evidence synthesis under uncertainty.

How is Apodex different from a normal chat LLM?

Apodex is positioned as a Self-Evolving Heavy-Duty Solver — a verification-centric agent team, not a single-turn chatbot. The lightweight mode runs Apodex-1.0-mini as a standard ReAct tool agent. Heavy-duty mode (Apodex-1.0-H) dispatches specialized sub-agents for retrieval and verification, routes findings through a shared evidence pool, and uses a global verifier before committing answers. Every claim is intended to trace to an auditable evidence chain.

What are Apodex-1.0-mini's deep-research benchmark scores?

On the public deep-research suite (ReAct setup, per Hugging Face model card): BrowseComp 71.5, BrowseComp-ZH 80.6, HLE-Text 46.8, DeepSearchQA 82.2. The heavier Apodex-1.0-H agent team reaches 90.3 BrowseComp and 94.4 DeepSearchQA — SOTA vs open and closed frontier systems in Apodex's published tables. Evaluations block benchmark-hosting sites to reduce leakage.

Can I self-host Apodex-1.0-mini?

Yes. Weights are on Hugging Face. Apodex recommends SGLang or vLLM with qwen3 reasoning and qwen3_coder tool parsers, 262K context, temperature 1.0. Full 35B-A3B deployment docs show tensor-parallel size 8 for datacenter GPUs. Community GGUF quantizations exist for smaller setups via Ollama/LM Studio links on the model card.

How does Apodex-1.0-mini compare to Agents-A1?

Both are ~35B MoE models on Qwen-family bases targeting agentic search. Agents-A1 (InternScience, June 30) emphasizes heterogeneous long-horizon benchmarks like GAIA 96 and Seal-0 SOTA. Apodex emphasizes verification-centric deep research and FutureX future prediction — BrowseComp 71.5 for mini vs Agents-A1's 75.51 on vendor tables, but different harnesses and goals. Apodex also ships open AgentHarness eval code and smaller 0.8B–4B SFT variants.

Apodex 1.0-mini — #1 FutureX, 35B Open Research | explainx.ai Blog

explainx.ainewsletter3.5k

workshops ↗

Apodex 1.0-mini — #1 FutureX, 35B Open Research | explainx.ai Blog | explainx.ai

June 29, 2026 — 9:31 PM: Apodex posted on X that since shipping Apodex 1.0, its open 35B Apodex-1.0-mini has been outperforming models many times its size on FutureX — three #1 finishes in four weeks, with fresh scores of 59.17 (#1) and 58.42 (#2), ahead of Claude-Sonnet-4.6 (56.32), DeepSeek-V4-Pro (53.58), and GPT-5.5 (52.51).

Apodex calls itself a Self-Evolving Heavy-Duty Solver — built for questions whose answer isn't written down yet. That is future prediction, geopolitics, market moves, and open research problems where retrieval plus verification beats param count.

TL;DR — what people asked on X

Question	Answer
What model is it?	Apodex-1.0-mini — 35B-A3B MoE, Qwen3.5 base, Apache 2.0
Where to try?	apodex.ai — Deep Research mode uses Apodex-1.0-mini

Component	Role
Apodex-1.0-mini	Open 35B-A3B weights — standard ReAct tool agent
Apodex-1.0-H	Heavy-duty mode — async agent team, shared evidence pool, global verifier
Apodex-1.0-0.8B / 2B / 4B SFT	Smaller open models trained on deep-research SFT data alone
AgentHarness	Open eval repo — reproduce BrowseComp, DeepSearchQA, HLE, etc.
AgentOS	Task-agnostic runtime for building and evaluating agent workflows
apodex.ai API	Product surface — Deep Research powered by mini

Rank (Jun 29)	Score	Model / harness
#1	59.17	Apodex harness (Apodex-1.0-mini)
#2	58.42	Apodex harness (Apodex-1.0-mini)
—	56.32	Claude-Sonnet-4.6
—	53.58	DeepSeek-V4-Pro
—	52.51	GPT-5.5

Model	BrowseComp	BrowseComp-ZH	HLE-Text	DeepSearchQA
Apodex-1.0-mini	71.5	80.6	46.8	82.2
Apodex-1.0-4B-SFT	48.8	63.5	32.9	69.9
Apodex-1.0-2B-SFT	27.9	35.0	18.2	49.9
Apodex-1.0-0.8B-SFT	13.9	10.7	11.2	25.8

	Apodex-1.0-mini	Agents-A1
Ship date	Apodex 1.0 + FutureX post Jun 29	ModelScope Jun 30
Base	Qwen3.5-35B-A3B	`qwen3_5_moe`
Pitch	Verification-centric deep research	Heterogeneous long-horizon agent
BrowseComp	71.5 (mini ReAct)	75.51 (vendor table)
Standout	FutureX #1, DeepSearchQA 82.2	GAIA 96, IFEval 94.8 SOTA
Open eval	AgentHarness	`Agents-A1/evaluation`
Product	apodex.ai Deep Research	ModelScope / HF weights

bash

python3 -m sglang.launch_server \
  --model-path apodex/Apodex-1.0-35B-A3B \
  --tp 8 \
  --host 0.0.0.0 \
  --port 1234 \
  --context-length 262144 \
  --tool-call-parser qwen3_coder \
  --reasoning-parser qwen3

Parameter	Value
`temperature`	1.0
`top_p`	0.95
`repetition_penalty`	1.05
`max_context_length`	262144
`max_tokens`	32768

Concern	Notes
Harness vs model	FutureX scores reflect prediction harness + mini, not bare chat completion
Heavy vs mini gap	90.3 BrowseComp is Apodex-1.0-H team mode — not downloadable as one file
Benchmark leakage	Apodex blocks hosts; other labs may not — compare under AgentHarness when possible
Sonnet 4.6 naming	Verify exact API snapshot and harness parity on futurex.live
Self-evolving claim	Marketing term — read tech report for what actually updates (data, verifiers, policies)

Apodex 1.0-mini: 35B Open Model Tops FutureX — Beats Sonnet 4.6 and GPT-5.5

TL;DR — what people asked on X

Related posts

Agents-A1: InternScience 35B MoE Agent Model — Long-Horizon Search, GAIA 96, and vLLM Setup

LM Studio Bionic: Open-Model Agent for Code and Work Projects

Hermes WebUI: The Self-Hosted AI Agent Interface That Remembers Everything (2026 Complete Guide)

What Apodex 1.0 actually ships

Verification-first design

FutureX — why the June 29 leaderboard matters

Deep-research benchmark table (standard ReAct)

Base model lineage — Qwen3.5-35B-A3B

Apodex vs Agents-A1 — two 35B agent launches, same week

How to run Apodex-1.0-mini

SGLang (recommended)

vLLM

Recommended sampling (agentic tasks)

Reproduce public benchmarks

Wire to harnesses

Product — Deep Research at apodex.ai

Skepticism to keep

Bottom line

TL;DR — what people asked on X

Related posts

Agents-A1: InternScience 35B MoE Agent Model — Long-Horizon Search, GAIA 96, and vLLM Setup

LM Studio Bionic: Open-Model Agent for Code and Work Projects

Hermes WebUI: The Self-Hosted AI Agent Interface That Remembers Everything (2026 Complete Guide)

What Apodex 1.0 actually ships

Verification-first design

FutureX — why the June 29 leaderboard matters

Deep-research benchmark table (standard ReAct)

Base model lineage — Qwen3.5-35B-A3B

Apodex vs Agents-A1 — two 35B agent launches, same week

How to run Apodex-1.0-mini

SGLang (recommended)

vLLM

Recommended sampling (agentic tasks)

Reproduce public benchmarks

Wire to harnesses

Product — Deep Research at apodex.ai

Skepticism to keep

Bottom line

Related on explainx.ai