Models.MoE
A lightweight landing page that keeps track of the latest open-source MoE LLMs.
Qwen3-VL (Qwen)
Vision-language MoE focused on image/video understanding, temporal reasoning, and GUI/agent tasks. Available in Instruct and reasoning-enhanced Thinking editions.
Native 256K context, extendable to ~1M according to the docs. Primary tiers: 235B and 30B.
Qwen3-VL-235B-A22B-Instruct
Large-capacity multimodal MoE — Instruct edition for general VLM tasks, tool use, and agent pipelines.
Open on Hugging Face
Qwen3-VL-30B-A3B-Instruct
Mid-size multimodal MoE — Instruct edition for production VLM and prototyping.
Open on Hugging Face
Qwen3-VL-235B-A22B-Thinking
Large-capacity multimodal MoE — Thinking edition for stronger step-by-step visual reasoning and long-horizon video.
Open on Hugging Face
Qwen3-VL-30B-A3B-Thinking
Mid-size multimodal MoE — Thinking edition balancing compute with robust visual reasoning.
Open on Hugging Face
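As a quick usage sketch (not an official recipe): the Instruct weights above can be queried through any OpenAI-compatible endpoint, for example one started locally with vLLM. The base URL, API key, and image URL below are placeholder assumptions.

```python
# Minimal sketch: call a locally served Qwen3-VL Instruct model through an
# OpenAI-compatible endpoint (e.g. started with `vllm serve Qwen/Qwen3-VL-30B-A3B-Instruct`).
# The base_url, api_key, and image URL are placeholders, not values from this page.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen3-VL-30B-A3B-Instruct",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            {"type": "text", "text": "Describe the trend shown in this chart."},
        ],
    }],
    max_tokens=256,
)
print(response.choices[0].message.content)
```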
DeepSeek-V3.2 (DeepSeek-AI)
Experimental branch introducing DeepSeek Sparse Attention (DSA) to validate long-context training and inference efficiency.
Training setup aligned with V3.1-Terminus, with similar public benchmark performance; native 128K context.
DeepSeek-V3.1 (DeepSeek-AI)
Open-source MoE with dual "thinking/non-thinking" prompt templates; enhanced long-context, tool use, and agent capabilities.
671B total / 37B active; native 128K context; the improved Terminus variant offers more stable language consistency and better agent metrics.
DeepSeek-V3.1-Terminus
Iterated release reducing code-switching and odd characters; improves agent benchmarks such as BrowseComp and SWE-bench.
Open on Hugging Face
DeepSeek-V3.1
Primary weights; supports thinking/non-thinking and tool-use templates; MIT License; noted as 671B total / 37B active / 128K context.
Open on Hugging Face
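A minimal sketch of driving the dual templates from Hugging Face transformers follows; the `thinking` keyword mirrors the mode switch described on the model card, but treat the exact argument name as an assumption and check the tokenizer config of the release you download.

```python
# Minimal sketch: render DeepSeek-V3.1's chat template in non-thinking and
# thinking mode with Hugging Face transformers. The `thinking` kwarg is assumed
# to be the template's mode switch; verify against the downloaded tokenizer config.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V3.1")
messages = [{"role": "user", "content": "Summarize the trade-offs of sparse attention."}]

for thinking in (False, True):
    prompt = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        thinking=thinking,  # assumed kwarg for the thinking/non-thinking switch
    )
    print(f"--- thinking={thinking} ---")
    print(prompt)
```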
Qwen3-Next (Qwen)
Next-gen architecture: Hybrid Attention (Gated DeltaNet + Gated Attention) with high-sparsity MoE for ultra-long context and high throughput.
80B total / 3B active; native 256K, extendable to ~1.01M; significantly higher inference throughput.
Qwen3 (Qwen)
Latest-generation text LLM family spanning Dense and MoE; offers both Instruct and Thinking variants with strong agent capabilities and multilingual performance.
Native 256K context (some weights extend to 1M); multiple sizes for local and cloud deployment.
Qwen3-235B-A22B-Instruct-2507
235B MoE Instruct primary weights; long-context (256K, with community PR/discussion showing 1M support).
Open on Hugging Face
Qwen3-30B-A3B-Instruct-2507
30B-class MoE Instruct primary weights; strong dialogue and tool use; the model card notes a non-thinking mode plus a 1M-token config.
Open on Hugging Face
Qwen3-4B-Instruct-2507
Lightweight Instruct primary weights; native ~256K context; good for consumer GPUs and edge.
Open on Hugging Face
Qwen3-235B-A22B-Thinking-2507
235B MoE Thinking primary weights; supports <think>…</think> reasoning traces; 1M-token support noted in community discussions.
Open on Hugging Face
Qwen3-30B-A3B-Thinking-2507
30B-class Thinking variant; model card highlights improved reasoning and long-context upgrades.
Open on Hugging Face
Qwen3-4B-Thinking-2507
4B Thinking variant; README details qualitative reasoning improvements and usage.
Open on Hugging Face
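For local experimentation, the lightweight 4B Instruct weights are the easiest entry point. A minimal sketch with Hugging Face transformers follows; it assumes a recent transformers release and enough GPU or CPU memory for a 4B model.

```python
# Minimal sketch: run Qwen3-4B-Instruct-2507 locally with Hugging Face transformers.
# Assumes a recent transformers version and enough memory for a 4B model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-4B-Instruct-2507"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "In one sentence, what is a mixture-of-experts model?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```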
gpt-oss (OpenAI)
Open-weight MoE family from OpenAI with native MXFP4 quantization on MoE layers; designed for local and cloud use.
Two sizes: ~120B (117B total / ~5.1B active) fits on a single 80GB GPU; ~20B (21B total / ~3.6B active) runs in ~16GB of VRAM.
DeepSeek-R1 (DeepSeek-AI)
Open reasoning-focused (RL-enhanced) MoE series; aimed at complex math and code reasoning tasks.
671B total / 37B active; native 128K context; provides a thinking mode (<think>…</think>).
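Since DeepSeek-R1 (like the Thinking variants above) emits its reasoning inside <think>…</think> tags, downstream code usually separates the trace from the final answer. A minimal, framework-agnostic sketch follows; the sample response string is made up.

```python
# Minimal sketch: split a <think>…</think> reasoning trace from the final answer
# in a DeepSeek-R1-style response. The example text is invented for illustration.
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no <think> block is found."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()

reasoning, answer = split_reasoning(
    "<think>2 + 2 is a basic sum, so the result is 4.</think>The answer is 4."
)
print("reasoning:", reasoning)
print("answer:", answer)
```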