Token usage up more than tenfold! Doubao Large Model 2.1 goes live, and Seedance 2.5 is expected to launch officially in early July.
Volcano Engine has launched three models in rapid succession, targeting the production-level AI market across the board with a fast release cadence and an aggressive pricing strategy.
On Tuesday, Volcano Engine officially released the Doubao large model 2.1 series, including the flagship Doubao-Seed-2.1-Pro and the lightweight Doubao-Seed-2.1-Turbo, with the API fully available on Volcano Ark starting today. Meanwhile, the company announced that the video generation model Seedance 2.5 will be officially released in early July, and the Doubao audio generation model 1.0 has begun invitation testing at the same time. Together, these mark the extension of the Doubao ecosystem from language understanding to multimodal content production.
Doubao large model 2.1 Pro is priced at 6 RMB per million input tokens and 30 RMB per million output tokens; for coding and agent scenarios, the combined cost drops to just 1.96 RMB per million tokens, directly targeting enterprise production environments. Volcano Engine has also launched the continuously updated Doubao-Seed-Evolving, which receives updates 2 to 4 times per month, allowing enterprises to obtain the latest model capabilities without changing their API endpoints.

At the event, Volcano Engine president Tan Dai disclosed the latest figures: as of June this year, daily token calls to the Doubao large model have exceeded 180 trillion, more than ten times the level of a year earlier. Meanwhile, Volcano Engine ranks first in China's public cloud MaaS market with a 49.5% share.
This product lineup will directly influence the domestic enterprise AI procurement landscape. Doubao large model 2.1 has already been integrated by partners such as WPS, Dedao, and Unity, and is planned to reach hundreds of millions of Doubao users. In several recognized benchmarks, Doubao large model 2.1 Pro's performance on coding and agent tasks approaches, or even surpasses, that of top international models such as OpenAI GPT-5.5 and Anthropic Claude Opus 4.7.
Coding Capability Crosses Production-Level Threshold
Doubao large model 2.1 Pro demonstrates abilities comparable to international flagship models in multiple recognized programming benchmarks. On Terminal Bench, Doubao large model 2.1 Pro is comparable to Claude Opus 4.7, able to complete full engineering tasks end-to-end in command line environments; on the long-form software development benchmark SWE-Pro, its performance is close to GPT-5.5.
On NL2Repo-Bench, a benchmark for turning natural language into repository-level code, Doubao large model 2.1 Pro surpasses GPT-5.5. On the scientific computing code evaluation SciCode, Doubao large model 2.1 Pro scores 59.8, exceeding both Claude Opus 4.7 and GPT-5.5. This test covers real scientific problems in mathematics, physics, chemistry, biology, and materials science, making it one of the high-value benchmarks in AI for Science.

In crowdsourced developer testing, over 60% of developers judged Doubao large model 2.1 Pro's output quality on real coding tasks to be superior to that of Claude Opus 4.6. Volcano Engine also disclosed a chip design RTL case: Doubao large model 2.1 Pro ran continuously for nearly 18 hours, went through 9 rounds of iteration, and generated 6 core modules and 1,303 lines of RTL code. The design passed simulation, testing, and synthesis checks, and was finally validated on a handwritten digit recognition task, completing a production-level code delivery.
Agent Capability Leaps, Covering High Economic Value Tasks
In general agent capability, Doubao large model 2.1 Pro achieved the highest score on OpenAI's GDPval benchmark, which covers real-world tasks of economic value across 9 industries and 44 professions. On the Agents' Last Exam (ALE) benchmark, newly released in June 2026, Doubao large model 2.1 Pro surpassed Claude Opus 4.7. The benchmark covers 13 industry clusters and over 1,000 real tasks of high economic value. Because it was released so recently, it is difficult to optimize for specifically, so it offers a more genuine measure of how the model generalizes to new scenarios.
In tool invocation, Doubao large model 2.1 Pro outperformed both Claude Opus 4.7 and GPT-5.5 on the MCP-Atlas benchmark, performing more stably when working with real MCP servers and a variety of tools. Volcano Engine presented a typical use case: a developer used the model to coordinate more than 500 agents working collaboratively, which triggered over 1,000 tool calls and ultimately built more than 100 uniquely shaped buildings on a large 3D city map.
Multimodal Understanding Maintains Global Lead
In image understanding, Doubao large model 2.1 has surpassed GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro on MMMU-Pro and several other leaderboards, achieving global SOTA. In video sequence understanding, Doubao large model 2.1 Pro is far ahead of Gemini 3.1 Pro on the widely cited TOMATO and LVBench benchmarks.

For GUI agents, Doubao large model 2.1 Pro's desktop capabilities are close to those of Claude Opus 4.7, while its mobile capabilities are significantly stronger and surpass GPT-5.5 across the board, achieving global SOTA. Volcano Engine demonstrated an end-to-end video editing use case: Doubao large model 2.1 Pro processes more than two hours of video in a single pass and automatically handles voice-over script generation, precise segment location, audio synthesis, and subtitle output, with no human intervention required at any stage.
Seedance 2.5 and Audio Model Expand Territory
According to WallstreetCN, the Doubao video generation model Seedance 2.5 is currently in the final stage of internal testing and is expected to be officially released in early July. The new model supports single videos of up to 30 seconds with greatly improved shot coherence. It also accepts up to 50 reference materials of any modality as joint input, which the company says is the most of any model worldwide. In addition, it offers more flexible and controllable video editing, aiming to further improve creators' efficiency and output quality.
On the same day, Volcano Engine officially released the Doubao audio generation model 1.0 (Doubao-Seed-Audio 1.0). It supports multimodal inputs such as text and reference audio, and can generate complete audio works with multi-role dialogue, background music, and environmental sound effects, eliminating the multi-track editing, alignment, mixing, and other post-production steps of traditional workflows. The model can create an audio piece of up to 2 minutes in one pass, and can extend a piece while keeping its tone consistent by using reference input. The API is open for invitation testing on Volcano Ark from today, and there are plans to connect it to products such as Jianying, Jimeng, and Tomato.

Pricing Strategy and Large-Scale Commercial Deployment
Doubao large model 2.1's pricing balances flagship performance against the needs of large-scale deployment. The Pro version costs 6 RMB per million input tokens and 30 RMB per million output tokens, with the input price dropping to just 1.2 RMB per million tokens on a cache hit. The Turbo version is similar in capability to the Pro version at half the price, making it better suited to high-frequency scenarios. In combined coding and agent workloads, the actual cost of the Pro version is compressed to just 1.96 RMB per million tokens.
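To make the arithmetic behind these prices concrete, the following is a minimal Python sketch of a cost estimate based on the Pro list prices quoted above. It assumes billing is linear per token and that cached input tokens are charged at the cache-hit rate; the function names and the sample traffic mix are hypothetical, and Volcano Engine's actual billing rules and the workload behind the 1.96 RMB figure are not given in the source.

```python
# Illustrative cost estimate from the Pro list prices quoted in the article.
# Assumption: cost is linear per token, and cached input tokens are billed
# at the cache-hit rate. Real billing rules may differ.

PRO_INPUT_RMB_PER_M = 6.0          # uncached input, RMB per million tokens
PRO_CACHED_INPUT_RMB_PER_M = 1.2   # input on a cache hit, RMB per million tokens
PRO_OUTPUT_RMB_PER_M = 30.0        # output, RMB per million tokens


def estimate_cost_rmb(input_tokens, cached_input_tokens, output_tokens):
    """Estimate the cost in RMB of a workload (hypothetical helper)."""
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    uncached_input_tokens = input_tokens - cached_input_tokens
    total = (
        uncached_input_tokens * PRO_INPUT_RMB_PER_M
        + cached_input_tokens * PRO_CACHED_INPUT_RMB_PER_M
        + output_tokens * PRO_OUTPUT_RMB_PER_M
    )
    return total / 1_000_000


def blended_rmb_per_million(input_tokens, cached_input_tokens, output_tokens):
    """Average price per million tokens over input and output combined."""
    total_tokens = input_tokens + output_tokens
    if total_tokens == 0:
        raise ValueError("workload contains no tokens")
    cost = estimate_cost_rmb(input_tokens, cached_input_tokens, output_tokens)
    return cost / total_tokens * 1_000_000


if __name__ == "__main__":
    # Hypothetical agent-style mix: 98% input (95% of it cached), 2% output.
    rate = blended_rmb_per_million(980_000, 931_000, 20_000)
    print(f"Blended price: {rate:.2f} RMB per million tokens")
```

Under this hypothetical mix the blended price comes out at roughly 2.01 RMB per million tokens, which shows how a workload dominated by cached input can land far below the 6 RMB and 30 RMB list prices. It is only an illustration of the mechanism, not a reconstruction of the quoted 1.96 RMB figure.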
On the integration side, Doubao large model 2.1 is fully compatible with mainstream frameworks such as Claude Code and Codex, and is already available in developer tools such as TRAE, TRAE WORK, and Button. Among partners, WPS says the model has formed a stable, usable pipeline for core office tasks such as PPT generation and spreadsheet delivery; Dedao reports zero violations in following business rules and enforcing key prohibitions; and Unity says the model's ceiling on script logic tasks exceeds that of top models. Volcano Engine states that Doubao products will soon integrate Doubao large model 2.1 Pro, serving office and productivity scenarios for hundreds of millions of users.