
Tencent Hunyuan HY-World 2.0: 3D world models, WorldMirror 2.0, and open-source plan

HY-World 2.0 from Tencent Hunyuan: multi-modal 3D worlds (3DGS/meshes) vs pixel-only video world models, WorldMirror 2.0 reconstruction, pipeline roadmap—GitHub, Hugging Face, install notes.

May 6, 2026 · 4 min read · Yash Thakker
Tags: Tencent, Hunyuan, World models, 3D, Gaussian Splatting, WorldMirror

HY-World 2.0 is Tencent Hunyuan’s open multi-modal world model stack. It ingests text, single-view images, multi-view images, and video, and targets persistent 3D outputs: meshes, 3D Gaussian Splatting (3DGS) representations, and point clouds, rather than yet another MP4. The team positions it as “building a playable world” versus “watching a movie that ends.”

This post summarizes the public GitHub README and docs as of early May 2026; weights, APIs, and benchmarks should be re-checked on the repo and DOCUMENTATION.md before you freeze a reproduction.

Hosted demo (vendor): 3d-models.hunyuan.tencent.com/world. The README notes that demand can be high.


TL;DR

| Topic | Takeaway |
| --- | --- |
| Core pitch | 3D assets (3DGS / mesh / points) with engine import, vs non-editable video world models |
| Reconstruction (shipping) | WorldMirror 2.0: multi-view / video → 3D, ~1.2B params, HF weights, Python API + CLI + Gradio |
| Generation (roadmap) | Four-stage pipeline: HY-Pano 2.0 (panorama) → WorldNav (trajectory) → WorldStereo 2.0 (expansion) → WorldMirror 2.0 + 3DGS learning |
| Open today | Technical report, WorldMirror 2.0 code & checkpoints per README April 16, 2026 news block |
| Not open yet | Full world generation inference, HY-Pano 2.0, WorldStereo 2.0, WorldNav (all listed coming soon) |

Two capabilities: generation vs reconstruction

World generation (per README): turn text or a single image into a navigable scene via the staged pipeline above—panorama, planning, stereo expansion, then composition with WorldMirror 2.0 and 3DGS training.

World reconstruction: WorldMirror 2.0 is the feed-forward workhorse—one forward pass estimates depth, surface normals, camera parameters, point clouds, and 3DGS-style attributes from multi-view stills or casual video, with flexible resolution (README cites 50K–500K pixels).
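To make the resolution range concrete, here is a minimal sketch that rescales an image size so its total pixel count lands inside a budget. It assumes the README's 50K–500K figure refers to total pixels per image (width × height); the function name `fit_to_pixel_budget` and the rounding strategy are illustrative choices, not part of the repo.

```python
import math

MIN_PIXELS = 50_000   # lower bound cited by the README (assumed: width * height)
MAX_PIXELS = 500_000  # upper bound cited by the README (same assumption)

def fit_to_pixel_budget(width, height, min_pixels=MIN_PIXELS, max_pixels=MAX_PIXELS):
    """Return (new_width, new_height) with the aspect ratio roughly preserved
    and the total pixel count moved inside [min_pixels, max_pixels]."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    pixels = width * height
    if pixels > max_pixels:
        # Shrink: rounding down keeps the product at or below max_pixels.
        scale = math.sqrt(max_pixels / pixels)
        return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))
    if pixels < min_pixels:
        # Enlarge: rounding up keeps the product at or above min_pixels.
        scale = math.sqrt(min_pixels / pixels)
        return math.ceil(width * scale), math.ceil(height * scale)
    return width, height  # already inside the budget
```

Whether the pipeline resizes inputs itself or expects pre-sized images is something to confirm in DOCUMENTATION.md; a helper like this is only useful for planning memory and preprocessing.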


Architecture (high level)

The README diagrams a staged generation pipeline, HY-Pano 2.0 → WorldNav → WorldStereo 2.0 → WorldMirror 2.0 + splatting, that turns a text prompt or a single RGB image into a composed 3D world. Technical details live in the team's report (linked from the repo); this article does not reproduce proprietary figures.


Open-source plan (checklist from README)

| Item | Status in README |
| --- | --- |
| Technical report | Released |
| WorldMirror 2.0 code & checkpoints | Released |
| Full world generation inference (WorldNav + composition) | Planned |
| HY-Pano 2.0 weights & code | Planned (HunyuanWorld 1.0 noted as interim) |
| WorldStereo 2.0 weights & code | Planned (WorldStereo as interim) |
| WorldNav | Planned |

Treat checkboxes as intent; license, export rules, and GPU support still gate real adoption.
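For planning purposes, the checklist above can be kept as data next to your own code so that a status change is a one-line edit. The sketch below is purely illustrative: the list layout and the helper name `missing_stages` are assumptions, and the statuses are transcribed from this article's summary rather than read from the repo.

```python
# Stage order follows the generation pipeline described in the README;
# statuses reflect the checklist as summarized in this article.
PIPELINE_STAGES = [
    ("HY-Pano 2.0", "panorama", "planned"),
    ("WorldNav", "trajectory", "planned"),
    ("WorldStereo 2.0", "expansion", "planned"),
    ("WorldMirror 2.0", "composition + 3DGS learning", "released"),
]

def missing_stages(stages=PIPELINE_STAGES):
    """Names of stages that still block end-to-end generation."""
    return [name for name, _role, status in stages if status != "released"]

print(missing_stages())
```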


Getting started with WorldMirror 2.0

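The README’s minimal Python usage:

from hyworld2.worldrecon.pipeline import WorldMirrorPipeline

# Load the WorldMirror 2.0 checkpoint (model ID as given in the README)
pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')
# Run feed-forward reconstruction on a folder of input images
result = pipeline('path/to/images')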

Optional priors (camera / depth) are passed as paths; the repo points to a prior preparation guide in DOCUMENTATION.md.
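Loading a ~1.2B-parameter checkpoint only to discover an empty input folder is a waste of time, so a small pre-check can be worth having. The following is a sketch under stated assumptions: `list_images`, `reconstruct`, and the accepted suffix set are hypothetical names and choices, and only the import path, `from_pretrained`, and the pipeline call come from the README snippet above.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumption: formats the pipeline accepts

def list_images(folder):
    """Return sorted image paths in `folder`; raise if none are found."""
    paths = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if not paths:
        raise FileNotFoundError(f"no images found in {folder}")
    return paths

def reconstruct(folder, model_id="tencent/HY-World-2.0"):
    """Validate the input folder, then run the README's two-line example."""
    list_images(folder)  # fail early, before any weights are loaded
    # Imported lazily so the check above works without the package installed.
    from hyworld2.worldrecon.pipeline import WorldMirrorPipeline
    pipeline = WorldMirrorPipeline.from_pretrained(model_id)
    return pipeline(str(folder))
```

The structure of the returned `result` is not described in this article; consult DOCUMENTATION.md before relying on specific fields.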

CLI (single GPU):

python -m hyworld2.worldrecon.pipeline --input_path path/to/images

Multi-GPU uses torchrun with --use_fsdp --enable_bf16. Important operational constraint: input image count ≥ GPU count (e.g. 8 images for 8 processes).
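The image-count constraint is easy to trip over in batch jobs, so it can help to check it before launching. This sketch builds a launch command only when the constraint holds. The script flags `--input_path`, `--use_fsdp`, and `--enable_bf16` come from the README as quoted here; the helper name and the exact `torchrun` arrangement (one process per local GPU, module launch via `-m`) are assumptions, so compare against the README before use.

```python
def build_multi_gpu_command(input_path, num_images, num_gpus):
    """Return a torchrun argument list, or raise if the image/GPU constraint fails."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if num_images < num_gpus:
        raise ValueError(
            f"{num_images} images cannot be spread over {num_gpus} GPUs; "
            "the README requires image count >= GPU count"
        )
    return [
        "torchrun",
        f"--nproc_per_node={num_gpus}",  # assumption: one process per local GPU
        "-m", "hyworld2.worldrecon.pipeline",
        "--input_path", str(input_path),
        "--use_fsdp",
        "--enable_bf16",
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the arguments have been verified against the repo.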

Gradio:

python -m hyworld2.worldrecon.gradio_app

Environment: a conda environment with Python 3.10, CUDA 12.4, torch 2.4.0 (cu124 wheels), pip install -r requirements.txt, and FlashAttention (either a v3 build or the pip install flash-attn route).
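Version mismatches are a common source of install failures, so a quick comparison against the versions listed above can save a debugging session. This is a minimal sketch: the helper names are hypothetical, the expected values are copied from this article, and whether nearby versions also work is not something the article can confirm.

```python
EXPECTED = {"python": (3, 10), "torch": "2.4.0", "cuda": "12.4"}  # from the list above

def check_environment(python_version, torch_version, cuda_version, expected=EXPECTED):
    """Return a list of human-readable mismatches (empty list means all match)."""
    problems = []
    if tuple(python_version[:2]) != expected["python"]:
        problems.append(
            f"Python {python_version[0]}.{python_version[1]} found, 3.10 expected"
        )
    # torch reports versions like "2.4.0+cu124"; compare the part before "+".
    base_torch = torch_version.split("+")[0] if torch_version else None
    if base_torch != expected["torch"]:
        problems.append(f"torch {torch_version} found, {expected['torch']} expected")
    if cuda_version != expected["cuda"]:
        problems.append(f"CUDA {cuda_version} found, {expected['cuda']} expected")
    return problems

def current_versions():
    """Collect versions from the running interpreter; torch is optional."""
    import sys
    try:
        import torch
        return sys.version_info, torch.__version__, torch.version.cuda
    except ImportError:
        return sys.version_info, None, None
```

Calling `check_environment(*current_versions())` inside the target environment returns the mismatches as a list of strings.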


Benchmarks (as reported—verify in the report)

The README includes tables for:

  • WorldStereo 2.0 — camera metrics and single-view-generated reconstruction on Tanks-and-Temples / MipNeRF360 vs baselines such as SEVA, Gen3C, Lyra, FlashWorld.
  • WorldMirror 2.0 — point map accuracy / completeness on 7-Scenes, NRGBD, DTU at low / medium / high inference resolutions, with and without prior injection; comparisons include Pow3R and MapAnything under varying prior conditions.

Rule of thumb: read the technical report for protocol detail—leaderboard numbers without split / preprocessing context mislead buyers and paper reviewers alike.
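For readers new to the point map metrics mentioned above, the sketch below shows one common definition: accuracy as the mean distance from each predicted point to its nearest ground-truth point, and completeness as the same measure in the opposite direction. This is an assumption about the general idea, not the report's protocol; alignment, thresholds, and mean versus median aggregation may all differ there. The brute-force search is for illustration on tiny inputs only.

```python
import math

def mean_nearest_distance(source, target):
    """Mean distance from each source point to its nearest target point."""
    if not source or not target:
        raise ValueError("both point sets must be non-empty")
    return sum(min(math.dist(p, q) for q in target) for p in source) / len(source)

def accuracy_and_completeness(predicted, ground_truth):
    """Lower is better for both values."""
    accuracy = mean_nearest_distance(predicted, ground_truth)      # pred -> GT
    completeness = mean_nearest_distance(ground_truth, predicted)  # GT -> pred
    return accuracy, completeness

# One predicted point that sits exactly on one of two ground-truth points:
# perfectly accurate, but half of the surface is missed.
print(accuracy_and_completeness([(0, 0, 0)], [(0, 0, 0), (1, 0, 0)]))
```

The toy case shows why the two numbers are reported together: a sparse prediction can score well on accuracy while leaving much of the scene uncovered.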


Why teams care (strategic, not hype)

Game / sim / robotics: Persistent 3D fits Unreal / Unity / Isaac pipelines better than frame dumps. One-time reconstruction cost plus cheap real-time rendering matches interactive RL and digital-twin workflows—if export and license terms align.

Caution: World generation end-to-end is not fully open yet; most hackers will live in WorldMirror reconstruction until WorldNav / HY-Pano 2.0 / WorldStereo 2.0 ship.


Primary sources

  • Repository: github.com/Tencent-Hunyuan/HY-World-2.0
  • Documentation: DOCUMENTATION.md (English) · DOCUMENTATION_zh.md (中文)
  • Model hub: README cites WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0') — confirm the exact Hugging Face card from the repo’s Model Zoo table
  • Product page: 3d-models.hunyuan.tencent.com/world

Citation (from README)

@article{hyworld22026,
  title={HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds},
  author={Team HY-World},
  journal={arXiv preprint},
  year={2026}
}

HY-World 2.0 is a fast-moving research release. Treat this ExplainX article as May 6, 2026 orientation text—validate LICENSE, weights, and CLI flags on the official repository before production use.
