ByteDance Releases Doubao 2.0: Inference Cost Cut by an Order of Magnitude, Competing Directly with GPT-5 and Gemini 3
ByteDance's Doubao large model has officially entered its 2.0 phase with a systematic upgrade aimed at the Agent era. **The new version delivers performance comparable to GPT-5.2 and Gemini 3 Pro while cutting inference costs by about an order of magnitude,** making it a more competitive option for executing complex tasks in large-scale production environments.

On February 14, ByteDance announced that the Doubao 2.0 series comprises three general-purpose Agent models (Pro, Lite, and Mini) and a specialized Code model. **The flagship Doubao 2.0 Pro is benchmarked directly against GPT-5.2 and Gemini 3 Pro: it achieves industry-leading results on most visual understanding benchmarks and has won gold medals in math competitions such as the IMO and CMO and in programming contests such as the ICPC.**

The full series is now available. Doubao 2.0 Pro has been integrated into the "Expert" mode of the Doubao App, desktop client, and web version, while the Code model has been built into the AI programming product TRAE. Volcano Engine has simultaneously launched API services for enterprises and developers.

Analysts believe that **because large-scale inference and long-chain generation in complex real-world tasks consume massive numbers of tokens, Doubao 2.0's cost advantage will become a key competitive edge.** The release marks an important step for ByteDance in the commercial application of large models.

## Multimodal abilities reach world-class standards

Doubao 2.0 comprehensively upgrades its multimodal capabilities, with strong results in visual reasoning, perception, spatial reasoning, and long-context understanding.

In **dynamic scene understanding**, the model leads on key benchmarks such as TVBench and even surpasses human scores on EgoTempo, showing superior stability in capturing changes, actions, and rhythm.
For **long video scenarios**, Doubao 2.0 outperforms other top models in most evaluations and excels on several streaming, real-time video Q&A benchmarks. This lets it serve as an AI assistant for real-time video stream analysis, environmental perception, proactive error correction, and emotional companionship, shifting interaction from passive Q&A to active guidance and making it suitable for companion scenarios such as fitness and fashion.

## Reasoning ability rivals top models, with a clear cost advantage

Doubao 2.0 Pro strengthens long-tail domain knowledge: it scores higher than GPT-5.2 on SuperGPQA, ranks first on HealthBench, and posts overall science results on par with Gemini 3 Pro and GPT-5.2.

In reasoning and Agent evaluations, the model won gold medals in the IMO and CMO math Olympiads and in the ICPC programming contest, and it outperformed Gemini 3 Pro on PutnamBench. On HLE-text (Humanity's Last Exam), Doubao 2.0 Pro scored a record 54.2 points, and it also performed well in tool-use and instruction-following tests.

**More importantly,** ByteDance states that **Doubao's token pricing has been reduced by about an order of magnitude while its results remain comparable to those of the industry's top large models. This cost advantage becomes critical in large-scale inference and long-chain generation scenarios.**

Using the OpenClaw framework and the Doubao 2.0 Pro model, ByteDance has built an intelligent customer service Agent in Feishu. The Agent can call various skills to handle customer conversations, proactively create group chats to ask human colleagues for help with difficult problems, schedule home repair appointments for customers, and follow up after repairs with product recommendations.
## Code model boosts developer efficiency

Doubao 2.0 Code is built on the 2.0 base model and optimized for programming scenarios, with improved codebase comprehension and application generation and stronger error correction in Agent workflows. **The model is now live as a built-in model in the China version of TRAE, with support for image understanding and reasoning.**

In practice, developers using TRAE with Doubao 2.0 Code can build the basic architecture and scenes of the "TRAE Spring Festival Town · Year of the Horse Temple Fair" interactive project with a single round of prompts and complete the entire project in five rounds.

> The project includes 11 NPCs powered by large language models that chat naturally in character, greet customers, and haggle on the spot, while AI tourists autonomously decide which stalls to visit, what to buy, and what to say.

The related prompts and materials have been open-sourced on GitHub for developers to try.

Doubao 2.0 Pro is currently available to end users through the "Expert" mode of the Doubao App, desktop client, and web version; for enterprises and developers, Volcano Engine has launched API services for the Doubao 2.0 series. ByteDance says it will continue to iterate on its models for real-world scenarios and explore the limits of intelligence.

Risk Warning and Disclaimer

The market has risks, and investment should be cautious. This article does not constitute individual investment advice, nor does it take into account the particular investment objectives, financial situation, or needs of any individual user. Users should consider whether any opinions, viewpoints, or conclusions in this article suit their specific circumstances. Any investment made on this basis is at your own risk.