DeepSeek releases V4-Pro and V4-Flash on Hugging Face: 1.6T-parameter open-weight models with 1M context and MIT license

Jun 7, 2026 · published Jun 14, 2026

DeepSeek has released DeepSeek-V4-Pro and DeepSeek-V4-Flash on Hugging Face, its most capable open-weight models to date. The V4-Pro variant is a 1.6 trillion total parameter mixture-of-experts model with 49 billion active parameters and a full one-million-token context window, released under the MIT license and available for public download.
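To put the headline figures in perspective, the short Python sketch below works out what fraction of the model is active per token and roughly how much storage the weights alone would need. The parameter counts come from the announcement; the storage precisions are assumptions for illustration, since the article does not state the format of the released checkpoints, and the estimate ignores KV cache, activations, and runtime overhead.

```python
# Back-of-the-envelope sizing from the figures quoted in the announcement.
TOTAL_PARAMS = 1.6e12   # total parameters (from the article)
ACTIVE_PARAMS = 49e9    # parameters active per token (from the article)

# Assumed storage precisions, in bytes per parameter. These are illustrative;
# the article does not say which format the released weights use.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "4-bit": 0.5}


def active_fraction(total=TOTAL_PARAMS, active=ACTIVE_PARAMS):
    """Share of the parameters that participate in each token's forward pass."""
    return active / total


def weight_terabytes(bytes_per_param, total=TOTAL_PARAMS):
    """Storage needed for the weights alone, in decimal terabytes."""
    return total * bytes_per_param / 1e12


if __name__ == "__main__":
    print(f"active fraction: {active_fraction():.2%}")
    for name, width in BYTES_PER_PARAM.items():
        print(f"{name}: ~{weight_terabytes(width):.1f} TB of weights")
```

Under these assumptions, only about 3% of the parameters are active per token, but the full set of weights still has to be stored somewhere, which is why "self-hostable" here implies a multi-node deployment rather than a single workstation.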

What's new

DeepSeek-V4-Pro is built around a hybrid attention architecture that combines Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA), designed to sustain performance across extremely long contexts without compute cost growing in proportion to context length. The model also introduces manifold-constrained hyper-connections and was trained with the Muon optimizer.

Key benchmark results:

MMLU-Pro: 87.5%
LiveCodeBench: 93.5% — one of the highest publicly reported scores on this benchmark
GSM8K: 92.6%
Codeforces: 3206 rating

The model supports three reasoning effort modes: Non-think (fast inference), Think High, and Think Max. This tiered reasoning structure mirrors the approach DeepSeek pioneered with R1-series models, now integrated into the V4 generation.

DeepSeek-V4-Flash is a smaller companion variant optimized for speed and cost-efficient inference. Exact parameter counts for the Flash variant were not disclosed at launch.

Both models are available on Hugging Face under the MIT license, meaning they can be used commercially, fine-tuned, and redistributed without royalties.
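For readers who want to inspect the release before committing to a multi-terabyte download, the following is a minimal sketch using the `huggingface_hub` package's `snapshot_download` function. The repository URL is the one listed under Corroborating sources; the file patterns and the local directory name are assumptions, since the article does not describe the repository layout.

```python
from urllib.parse import urlparse

# Model page cited in the article's sources.
MODEL_URL = "https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro"


def repo_id_from_url(url):
    """Return the 'owner/name' repo id from a Hugging Face model URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a model URL: {url}")
    return "/".join(parts[:2])


def fetch_metadata(repo_id, local_dir="./deepseek-v4-pro-meta"):
    """Download only small text files (configs, README, license).

    The patterns below are an assumption about the repo layout; they are
    chosen to skip the weight shards, which are far too large to fetch
    casually. The local_dir name is hypothetical.
    """
    # Imported lazily so the URL helper works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=["*.json", "*.md", "LICENSE*"],
        local_dir=local_dir,
    )


if __name__ == "__main__":
    repo = repo_id_from_url(MODEL_URL)
    print("repo id:", repo)
    print("metadata saved to:", fetch_metadata(repo))
```

Dropping the `allow_patterns` argument would fetch the entire repository, weights included, so it is worth checking available disk space first.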

Context

DeepSeek has moved faster than most Western labs on open-weight model releases. The V3 generation, first released in late 2024, posted unexpectedly strong benchmark results and triggered significant debate about the compute efficiency of Chinese AI labs. V4 continues that pattern, adding a 1M-token context window and explicitly targeting coding and agentic workloads.

DeepSeek's technical report for V4 is titled "DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence," signaling that the primary engineering focus was sustaining intelligence at scale across long contexts, not just maximizing top-line benchmark scores.

The lab also continues to release under MIT, maintaining a policy of broad permissive access that distinguishes it from labs like Mistral (which gates larger models) or Meta (whose Llama models ship under a custom community license with use restrictions).

Why it matters

A 1.6T-parameter open-weight model with a 1M-token context window under the MIT license is a significant data point in the open vs. closed frontier debate. Enterprise developers who previously had to rely on proprietary APIs for long-context coding and agentic tasks now have a self-hostable alternative whose reported benchmark scores are competitive with those of closed models.

The LiveCodeBench score of 93.5% is particularly striking — LiveCodeBench tests models against real competition problems released after most models' training cutoffs, making it harder to game through memorization. A score in this range puts DeepSeek-V4-Pro alongside or ahead of several closed-source frontier models on coding tasks.
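The contamination argument above comes down to a date filter: a problem only counts as evidence of generalization if it was published after the model's training data was collected. The sketch below illustrates that idea only; the problem names, release dates, and cutoff are hypothetical placeholders, not LiveCodeBench data or DeepSeek's actual cutoff, and this is not the benchmark's own tooling.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Problem:
    name: str
    released: date


def uncontaminated(problems, training_cutoff):
    """Keep only problems published strictly after the training cutoff."""
    return [p for p in problems if p.released > training_cutoff]


# Hypothetical problems and cutoff, for illustration only.
PROBLEMS = [
    Problem("problem-a", date(2025, 11, 2)),
    Problem("problem-b", date(2026, 1, 15)),
    Problem("problem-c", date(2026, 3, 9)),
]
ASSUMED_CUTOFF = date(2025, 12, 31)

if __name__ == "__main__":
    for p in uncontaminated(PROBLEMS, ASSUMED_CUTOFF):
        print(p.name, p.released.isoformat())
```

A score computed only over the filtered set cannot be explained by the model having seen those exact problems during training, which is what makes date-windowed evaluation harder to game.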

For the wider AI ecosystem, V4's release further shortens the lag between closed-source frontier capabilities and what is available to the open-weight community.

Corroborating sources

Huggingface.co
https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro
“DeepSeek-V4-Pro-Max...achieves top-tier performance in coding benchmarks and significantly bridges the gap with leading closed-source models on reasoning and agentic tasks.”
