DeepSeek V3.2 Released: Impressive Performance and Cost Advantage


On December 1, DeepSeek unexpectedly launched version 3.2. It is available to all users and has also been uploaded to various open-source communities for local deployment. According to official test results, DeepSeek V3.2’s reasoning capabilities are now comparable to those of OpenAI’s GPT-5 at a much lower cost, a combination that has excited many users.

Stronger Reasoning at a Lower Cost

DeepSeek V3.2 comes in two versions: the free version available on the DeepSeek website, and DeepSeek V3.2-Speciale, which is offered through the API. The Speciale version has enhanced reasoning capabilities and is designed primarily to explore the limits of the model’s reasoning ability.

V3.2-Speciale uses an enhanced “long-thinking” mode and incorporates the theorem-proving capabilities of DeepSeek-Math-V2, which strengthens its instruction following, mathematical proof, and logical verification. In official tests, V3.2-Speciale’s reasoning benchmark scores rival those of the latest Gemini 3.0 Pro.

DeepSeek also tested V3.2-Speciale on problems from the finals of competitions such as IMO 2025 (International Mathematical Olympiad), CMO 2025 (Chinese Mathematical Olympiad), ICPC World Finals 2025 (International Collegiate Programming Contest), and IOI 2025 (International Olympiad in Informatics), where it achieved gold-medal-level results.

Notably, in the ICPC and IOI tests, its results were comparable to those of the second-place and tenth-place human competitors, respectively, showing a significant advance in programming ability. In comparative tests, DeepSeek V3.2-Speciale outperformed GPT-5 High, catching OpenAI off guard.

According to the official technical documentation, the main breakthrough in DeepSeek V3.2 is the introduction of the DeepSeek Sparse Attention (DSA) mechanism, while the two-version lineup is meant to cover different reasoning needs.

The DSA mechanism addresses the efficiency problem of attention in large AI models at its root. Traditional attention computes relationships between every pair of elements in a sequence, so its cost grows quadratically with sequence length. DSA computes relationships only between the elements that matter most, significantly reducing the amount of computation required.
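To make the scaling argument concrete, the sketch below counts query-key score computations for a 128K-token sequence under full attention and under a generic top-k sparse scheme. The per-query budget `TOP_K = 2048` is an illustrative assumption, not a figure from this article, and the count ignores the cost of deciding which keys to keep, so the ratio should not be read as an end-to-end speedup.

```python
def dense_pairs(seq_len):
    """Full attention: every query position scores every key position."""
    return seq_len * seq_len


def sparse_pairs(seq_len, k):
    """Top-k sparse attention: each query scores at most k selected keys."""
    return seq_len * min(k, seq_len)


SEQ_LEN = 128 * 1024  # a 128K-token context
TOP_K = 2048          # assumed per-query budget (illustrative only)

dense = dense_pairs(SEQ_LEN)
sparse = sparse_pairs(SEQ_LEN, TOP_K)

print(f"dense:  {dense:,} scores")
print(f"sparse: {sparse:,} scores")
print(f"ratio:  {dense / sparse:.0f}x")  # ratio:  64x
```

The gap widens as the sequence grows, because the dense count scales with the square of the length while the sparse count scales linearly once the length exceeds the budget.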

Similar technology was hinted at in a paper DeepSeek released earlier this year, which described a new attention mechanism called NSA (Native Sparse Attention). However, NSA did not appear in subsequent DeepSeek model updates, leading to speculation that it was difficult to deploy.

Now, it appears that DeepSeek has found a better implementation. When handling long text, NSA worked like a library index: it quickly located the relevant area and retrieved information from there. DSA, in contrast, works like a search engine: it makes a quick pass over the full text and uses a “lightning indexer” to retrieve the relevant data by keyword, making it smarter and more precise while consuming fewer resources.
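The select-then-attend idea behind this analogy can be sketched in a few lines of plain Python. This is a toy illustration of generic top-k sparse attention for a single query, not DeepSeek's implementation: the `index_scores` argument stands in for whatever a cheap indexer would produce, and every function name here is hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dense_attention(query, keys, values):
    """Standard attention for one query: score every key, then mix the values."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def select_top_k(index_scores, k):
    """Positions of the k highest indexer scores, returned in sequence order."""
    ranked = sorted(range(len(index_scores)),
                    key=lambda i: index_scores[i], reverse=True)
    return sorted(ranked[:k])


def sparse_attention(query, keys, values, index_scores, k):
    """Run attention only over the k positions the (assumed) indexer ranked highest."""
    kept = select_top_k(index_scores, k)
    return dense_attention(query,
                           [keys[i] for i in kept],
                           [values[i] for i in kept])


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.5, 0.5]]
values = [[1.0], [2.0], [3.0], [4.0]]
index_scores = [dot(query, k) for k in keys]  # stand-in for a cheap indexer

print(select_top_k(index_scores, 2))  # [0, 3]
print(sparse_attention(query, keys, values, index_scores, 2))
```

When `k` equals the number of keys, `sparse_attention` returns the same result as `dense_attention`; shrinking `k` trades some fidelity for less work, which is why the quality of the indexer's ranking matters.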

With the DSA mechanism, the inference cost for a 128K-token sequence can be reduced by over 60%, inference speed can increase by approximately 3.5 times, and memory usage can fall by 70%, all while maintaining model performance. For long contexts, this substantially changes the cost of attention in large AI models.

Official data shows that, in tests on an H800 cluster, the cost per million tokens in the prefill phase dropped from $0.70 to about $0.20, and the cost in the decoding phase dropped from $2.40 to $0.80, potentially making DeepSeek V3.2 the lowest-cost model for long-text inference among its peers.
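The per-phase prices quoted above translate into percentage reductions as follows. The prices come from the article; the example workload at the end (100,000 prompt tokens and 2,000 generated tokens) is a made-up assumption used only to show how the two phases combine.

```python
def percent_reduction(before, after):
    """Percentage by which a price fell from `before` to `after`."""
    return (before - after) / before * 100


def job_cost(prefill_tokens, decode_tokens, prefill_price, decode_price):
    """Cost in dollars, with prices expressed per million tokens."""
    return (prefill_tokens * prefill_price + decode_tokens * decode_price) / 1_000_000


print(f"prefill: {percent_reduction(0.70, 0.20):.1f}% cheaper")  # prefill: 71.4% cheaper
print(f"decode:  {percent_reduction(2.40, 0.80):.1f}% cheaper")  # decode:  66.7% cheaper

# Hypothetical long-context request: 100K prompt tokens, 2K output tokens.
old = job_cost(100_000, 2_000, 0.70, 2.40)
new = job_cost(100_000, 2_000, 0.20, 0.80)
print(f"example request: ${old:.4f} -> ${new:.4f}")
```

Both phases land above the "over 60%" reduction cited for 128K sequences, with prefill falling slightly more than decoding.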

Not Just Thinking, But Using Tools

In addition to the DSA mechanism, another core upgrade in DeepSeek V3.2 is its ability to invoke tools during its thinking mode. The official statement indicates that the process of invoking and using tools requires no training, giving DeepSeek V3.2 enhanced general performance and better compatibility with user-created tools as an open-source model.

To validate DeepSeek V3.2’s new features, I designed some questions to test its response capabilities, starting with its performance in thinking mode:

Question: A is three years older than B, and B is two years older than C. In five years, A’s age will be exactly twice that of C. How old are the three individuals now?

Answer:

The answer is correct, but the key lies in the thought process:

DeepSeek verified the answer multiple times after calculating the result, considering whether the answer remained correct under different circumstances. Before outputting the final answer, DeepSeek conducted three rounds of answer verification.
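Readers who want to check the puzzle independently can brute-force its constraints. The sketch below is one way to verify the result, not a reproduction of DeepSeek's reasoning; the function name and the age limit are arbitrary choices.

```python
def solve_ages(max_age=120):
    """Return every (A, B, C) that satisfies the puzzle's three constraints."""
    solutions = []
    for c in range(max_age + 1):
        b = c + 2                    # B is two years older than C
        a = b + 3                    # A is three years older than B
        if a + 5 == 2 * (c + 5):     # in five years, A is twice C's age
            solutions.append((a, b, c))
    return solutions


print(solve_ages())  # [(5, 2, 0)]
```

The only solution has C aged 0, an edge case that is easy to dismiss as a mistake, which may be one reason a model would double-check it.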

While this may seem like a waste of computational power, repeated verification of this kind may be needed to keep responses accurate under the DSA mechanism, since a sparse architecture could make errors more likely than in models that use full attention.

I also designed a multi-step task chain:

Search for today’s temperature in Beijing.
Convert the temperature to Fahrenheit.
Invoke a tool to check if your conversion is correct.
Summarize in one sentence whether today is suitable for outdoor activities.

Note: You must decide when to invoke the tool, not complete it all at once.

Let’s look at DeepSeek’s thought process:

It understood the requirements of the question well and began to use search and math tools step by step to solve the problem, ultimately providing the answer:

Overall, the answer followed the steps correctly, and DeepSeek even chose a math tool on its own to confirm the conversion result. Oddly, however, the final answer omitted the one-sentence summary of whether today is suitable for outdoor activities. Nonetheless, the thought process shows that DeepSeek can indeed decide autonomously which tools to use.

In comparison, another AI faced with the same question understood the requirement to “invoke tools” but simply searched for the corresponding data and filled in the answer directly:

In fact, the tool invocation tutorial in DeepSeek’s thinking mode also features similar questions, demonstrating how to improve the final answer quality through multi-turn dialogue and invoking multiple tools.

You can think of it this way: previously, when you asked a question, DeepSeek relied on memory (its model parameters) to assemble an answer. Now it can break the problem down into sub-questions, use different tools (search, math, programming, and so on) to solve each one, and then integrate the results into a complete response.
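The decompose-then-call pattern can be illustrated with the temperature task from earlier. The sketch below is a toy stand-in, not DeepSeek's tool-calling API: the tool names and registry are hypothetical, the search tool is a stub that returns a placeholder value rather than real weather data, and the order of steps is hard-coded where a model would decide it for itself.

```python
def celsius_to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32


def search_temperature(city):
    """Hypothetical search tool; returns a placeholder, not real weather data."""
    return 0.0


def verify_conversion(celsius, fahrenheit):
    """Hypothetical math tool: checks the result by inverting the formula."""
    return abs((fahrenheit - 32) * 5 / 9 - celsius) < 1e-9


# Hypothetical registry mapping tool names to callables.
TOOLS = {"search": search_temperature, "verify": verify_conversion}


def run_task(city):
    """Run the sub-tasks in order and record each step in a trace."""
    trace = []
    celsius = TOOLS["search"](city)
    trace.append(("search", celsius))
    fahrenheit = celsius_to_fahrenheit(celsius)
    trace.append(("convert", fahrenheit))
    trace.append(("verify", TOOLS["verify"](celsius, fahrenheit)))
    return trace


for step, result in run_task("Beijing"):
    print(step, result)
```

The final step of the original task, the one-sentence summary, is left out here because it depends on the model's own judgment rather than on a tool.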

Due to time constraints, I didn’t design more challenging questions to test DeepSeek, but interested users can log on to the DeepSeek website to try it out themselves.

The Strongest Open Source? OpenAI and Google May Face Challenges

Is DeepSeek V3.2 powerful? It certainly is, but it does not have a dramatic lead: test results show it is competitive with GPT-5 High and Gemini 3.0 Pro. However, when a model matches GPT-5 and Gemini 3.0 Pro across multiple authoritative benchmarks, costs one-third as much to run as mainstream models or even less, and is released fully open-source, it can significantly impact the entire market. This is the fundamental logic behind DeepSeek’s ability to disrupt the industry.

Previously, there was a prevailing notion in the industry: “Open-source models are always eight months behind closed-source models.” That conclusion was always debatable, and the release of DeepSeek V3.2 clearly puts the debate to rest. DeepSeek remains committed to full open-source releases, and the introduction of DSA, which significantly reduces costs and improves long-text capabilities, effectively transforms open-source models from ‘followers’ into ‘challengers’ that force closed-source giants to adapt.

More importantly, the cost revolution brought by DSA will have a significant impact on the commercialization of large AI models, since both training and inference remain expensive. A figure like “costs reduced by 60%” applies not only to operating costs but also to initial deployment costs, meaning that even small enterprises can use DeepSeek to train stronger models.

Once inference for long-text interactions is cheap enough, advanced AI applications (agents, automated workflows, long-chain reasoning, etc.) will no longer be limited to the enterprise market and can be promoted to consumers. That could accelerate the trend of “AI tools replacing traditional software” and allow AI to truly penetrate everyday use at the operating system level.

For the average user, this may seem like just one more free and useful model. But in a few months to half a year, you may notice a qualitative improvement in AI experiences across hardware and software, likely thanks to DeepSeek’s contributions.