
DeepSeek Releases Preview of DeepSeek V4 Flash and V4 Pro Large Language Models
Key Takeaways
- Preview versions DeepSeek-V4-Pro and DeepSeek-V4-Flash released.
- 1,000,000-token context windows offered across both versions.
- Aims to compete with OpenAI, Google, and Anthropic.
DeepSeek V4 Preview Launch
Chinese AI lab DeepSeek released preview versions of its newest large language model, DeepSeek V4, on Friday, presenting two options—DeepSeek V4 Flash and DeepSeek V4 Pro—as an update to last year’s V3.2 model and the accompanying R1 reasoning model.
“China’s DeepSeek has unveiled the latest versions of its signature artificial intelligence-powered chatbot, a year after its flagship model sent shockwaves through the global tech scene”
TechCrunch says both V4 models are “mixture-of-experts models with context windows of 1 million tokens each,” describing the design as activating only a fraction of the model’s parameters for each task to lower inference costs.

In TechCrunch’s account, the Pro model has “a total of 1.6 trillion parameters (49 billion active),” while the smaller V4 Flash has “284 billion parameters (13 billion active).”
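The parameter figures above imply only a small slice of each model runs per token, which is the point of the mixture-of-experts design. As a rough, illustrative check (using only the totals TechCrunch reports, nothing from DeepSeek directly):

```python
# Back-of-envelope check of the active-parameter fractions implied by the
# figures TechCrunch reports for the two V4 preview models (in billions).
models = {
    "DeepSeek-V4-Pro":   {"total_b": 1600, "active_b": 49},
    "DeepSeek-V4-Flash": {"total_b": 284,  "active_b": 13},
}

for name, p in models.items():
    frac = p["active_b"] / p["total_b"]
    print(f"{name}: {frac:.1%} of parameters active per token")
# DeepSeek-V4-Pro:   3.1% of parameters active per token
# DeepSeek-V4-Flash: 4.6% of parameters active per token
```

In other words, under these reported numbers the Pro model would route each token through roughly 3% of its weights, which is how a 1.6-trillion-parameter model can still be comparatively cheap to serve.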
The Verge similarly frames the release as a preview of V4 on Friday, saying DeepSeek claims the open-source model can compete “toe-to-toe with leading closed-source systems” from US rivals including Anthropic, Google, and OpenAI.
DW adds that DeepSeek said the V4 “features an ultra-long context of one million words,” and it describes V4 as available in a pro version and a cheaper flash version.
Le Figaro ties the long-context claim to a concrete capability, saying DeepSeek says the model can process “one million characters at once” and that “by feeding it hundreds of pages of text” it can answer questions about the entire corpus “without forgetting anything from the beginning to the end.”
Across outlets, DeepSeek’s positioning is consistent: it is offering an open-source preview, with V4 Pro and V4 Flash pitched as cost-effective ways to match or narrow gaps with leading systems.
Performance Claims and Pricing
DeepSeek’s V4 preview is built around a set of benchmark and cost claims that multiple outlets repeat with specific figures and comparisons.
TechCrunch says DeepSeek claims the new V4 models have almost “closed the gap” with current leading models, both open and closed, on reasoning benchmarks, while also stating that V4 “seem[s] to fall slightly behind frontier models in knowledge tests.”

It reports that DeepSeek says its V4-Pro-Max “outperforms its open-source peers across reasoning benchmarks,” and that it “outstrips OpenAI’s GPT-5.2 and Gemini 3.0 Pro on some tasks.”
For coding competition benchmarks, TechCrunch quotes DeepSeek saying the V4 models’ performance is “comparable to GPT-5.4,” while it also notes that knowledge tests show a lag behind “OpenAI’s GPT-5.4 and Google’s latest Gemini 3.1 Pro.”
TechCrunch also provides a detailed pricing comparison, saying the smaller V4 Flash costs “$0.14 per million input tokens and $0.28 per million output tokens,” and the larger V4 Pro costs “$0.145 per million input tokens and $3.48 per million output tokens.”
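Per-million-token rates are easiest to compare on a concrete request. The sketch below uses the prices as TechCrunch reports them; the request sizes are hypothetical examples, not figures from any outlet:

```python
# Cost of a single request at the per-million-token prices TechCrunch
# reports for the V4 preview models: (input $, output $) per million tokens.
PRICES = {
    "V4 Flash": (0.14, 0.28),
    "V4 Pro":   (0.145, 3.48),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the quoted per-million-token rates."""
    in_rate, out_rate = PRICES[model]
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Hypothetical example: a full 1M-token context plus a 5,000-token answer.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 1_000_000, 5_000):.4f}")
```

At these reported rates, input costs for the two models are nearly identical, so the gap on a long-context request comes almost entirely from the Pro model’s much higher output price.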
CNBC adds that DeepSeek’s V4 preview is “a serious flex,” quoting Neil Shah of Counterpoint Research saying it offers “lower inference costs than previous models,” and it describes inference costs as “the computational and financial expenses of running a trained AI model to generate outputs.”
The Verge adds that DeepSeek highlights compatibility with domestic Huawei technology, and it notes that DeepSeek has not disclosed V4’s training costs or what hardware it was trained on.
US Alarm on Distillation
The DeepSeek V4 preview arrives alongside a US government warning about how Washington says foreign entities are extracting capabilities from American AI models.
“Chinese artificial intelligence startup DeepSeek on Friday released a preview version of its long-awaited V4 large language model, allowing users to test its new capabilities and features”
Asia Times reports that the White House Office of Science and Technology Policy (OSTP) said on Thursday, April 23, that “foreign entities, principally based in China, are engaged in deliberate, industrial-scale campaigns to distill US frontier AI models.”
It quotes Michael Kratsios, assistant to the president for science and technology and director of the OSTP, saying “Leveraging tens of thousands of proxy accounts to evade detection and using jailbreaking techniques to expose proprietary information, these coordinated campaigns systematically extract capabilities from American AI models.”
Asia Times adds that Kratsios said “Models developed from surreptitious, unauthorized distillation campaigns like this do not replicate the full performance of the original,” while “They do, however, enable foreign actors to release products that appear to perform comparably on select benchmarks at a fraction of the cost.”
The memorandum described in Asia Times says the Trump administration would “share intelligence with US AI companies on attempts by foreign actors to carry out unauthorized, industrial-scale distillation,” and it would “enable closer coordination across the private sector to counter such activities.”
DW frames the same US-China dispute by quoting the White House allegation that Chinese entities were engaging in “industrial-scale distillation campaigns to steal American AI.”
TechCrunch notes that DeepSeek has been accused by Anthropic and OpenAI of “distilling,” or essentially copying, their AI models.
DeepSeek’s Methods and Technical Approach
While US officials warn about distillation, DeepSeek’s own disclosures described in Asia Times focus on how it says it trained V4 using distillation techniques and a specific method called On-Policy Distillation (OPD).
Asia Times says DeepSeek, a Zhejiang-based company, has been explicit about its methods, noting that “In late January 2025, it said it used knowledge distillation techniques to train its V3 model.”

It explains the process as “often likened to a student learning by asking a teacher many questions and absorbing the answers,” and it says DeepSeek’s research paper published on Friday described an advanced approach using “a technique known as On-Policy Distillation (OPD)” to train V4, drawing on the outputs of 10 separate “teacher” models.
Asia Times adds that OPD works by letting a model “first generate its own responses before consulting multiple teachers to refine and correct them, accelerating the learning cycle.”
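The loop Asia Times describes can be sketched in miniature. Everything below is invented for illustration: the “student” is a single number, the “teachers” are noisy functions, and the update rule is a plain moving average. It is not DeepSeek’s training code, only a toy showing the on-policy shape of the process, where the student answers first and teachers correct that answer rather than supplying one from scratch:

```python
# Toy sketch of an on-policy distillation loop: (1) the student generates
# its own response, (2) several imperfect teachers refine that response,
# (3) the student is nudged toward the averaged correction.
import random

random.seed(0)

TRUE_ANSWER = 10.0
N_TEACHERS = 10  # mirrors the "10 separate teacher models" in the report

def teacher_correct(student_answer: float) -> float:
    """A stand-in teacher: pulls the student's own answer partway toward
    the truth, with noise to model an imperfect teacher."""
    return 0.5 * student_answer + 0.5 * TRUE_ANSWER + random.gauss(0, 0.1)

student = 0.0  # the "student model" reduced to one number
for step in range(50):
    answer = student                                        # on-policy: student answers first
    corrections = [teacher_correct(answer) for _ in range(N_TEACHERS)]
    target = sum(corrections) / N_TEACHERS                  # teachers refine it
    student += 0.3 * (target - student)                     # student learns from the fix

print(f"student's answer after training: {student:.2f}")    # converges near 10
```

The key difference from ordinary distillation is in step (1): because corrections are applied to the student’s own outputs, the feedback stays on the trajectory the student actually produces, which is what the report credits with “accelerating the learning cycle.”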
In the same account, DeepSeek’s claims about V4 performance are quoted, including that “Through the expansion of reasoning tokens, DeepSeek-V4-Pro-Max demonstrates superior performance relative to GPT-5.2 and Gemini-3.0-Pro on standard reasoning benchmarks.”
TechCrunch provides a parallel technical framing by describing the mixture-of-experts approach and the “context windows of 1 million tokens each,” while it also reports that DeepSeek says its V4 models are more efficient and performant than V3.2 due to “architectural improvements.”
CNBC adds that DeepSeek’s V4 is open-source and that it allows developers to “download the code, run it locally and modify it in most cases,” and it says DeepSeek optimized V4 for agent tools such as “Anthropic's Claude Code and OpenClaw.”
Global Reactions and Market Stakes
The V4 preview is being received as part of a broader AI competition between the US and China, with outlets describing both technical implications and geopolitical fallout.
“DeepSeek, the Chinese AI star, launches a brand-new model”
Al Jazeera says DeepSeek launched preview versions of DeepSeek-V4-Pro and DeepSeek-V4-Flash on Friday, and it quotes DeepSeek’s announcement that V4-Pro “beats all rival open models for maths and coding” while trailing only “Google’s Gemini 3.1-Pro, a closed model, for world knowledge.”

Al Jazeera also repeats DeepSeek’s estimate that V4 falls “marginally short” of OpenAI’s GPT‑5.4 and Gemini 3.1-Pro, “suggesting a developmental trajectory that trails state-of-the-art frontier models by approximately 3 to 6 months.”
It adds that DeepSeek’s arrival prompted blowback in some countries, saying “Multiple US states, Australia, Taiwan, South Korea, Denmark and Italy introduced bans or other restrictions on DeepSeek-R1 shortly after its release,” citing privacy and national security concerns.
DW similarly notes that DeepSeek has been accused of improper and illegal conduct by the United States and by its American competitors, and it says Beijing rejected what it called “the baseless allegations,” adding that China “attaches great importance to the protection of intellectual property rights.”
CNBC connects the stakes to markets and chip supply, quoting that Huawei confirmed its Ascend AI processors can support DeepSeek’s V4 model, while it says it remains unclear how extensively Huawei’s chips were used in training compared with Nvidia.
CNBC also reports that after DeepSeek announced its V4 release, shares of Chinese contract chip manufacturers rose in Hong Kong, with SMIC and Hua Hong Semiconductor surging 9% and 15%, respectively.