Qujing Technology raises hundreds of millions in Pre-A round, seizing the token infrastructure market.
On May 20, AI token production service provider Qujing Technology announced the completion of a Pre-A financing round worth several hundred million RMB. The round was co-led by Xinglian Capital and Huakong Technology, with existing shareholder Hillhouse Venture investing again and Honghui Capital, Tianhao Energy, and other institutions participating. According to reports, the funds will go mainly toward computing power reserves and the construction of underlying inference systems.
At a time when primary-market investors are growing more cautious about AI, an early-stage round of this size sends a clear signal: market attention is shifting from large-scale model training toward inference efficiency and commercial deployment.
To understand the business logic of Qujing Technology, we must trace the technical lineage of its founding team. CEO Ai Zhiyuan holds a Ph.D. from Tsinghua University and has long been engaged in distributed system optimization and parallel computing. Before starting the company, he served as a core R&D supervisor at Sangfor.
This background, which combines academic low-level systems engineering with hands-on IT deployment for government and enterprise customers, led Qujing Technology to stay out of the cash-burning, fiercely competitive race to build large models and to enter the market through underlying computing power scheduling instead.
High-concurrency inference for large models is at heart a complex distributed systems engineering problem. The team's accumulated experience in parallel storage and computing maps directly onto its current focus on low-latency inference.
As AI moves into enterprise production environments, buyers in the enterprise (B2B) market no longer judge a service only on model parameter count and usability; they now also weigh the stability of API calls and the overall cost.
Over the past two years, the industry has promoted MaaS (Model-as-a-Service), while Qujing Technology has taken a differentiated approach with TaaS (Token-as-a-Service), treating tokens as the key production element linking models and costs.
The greatest pain point in enterprise AI inference is a severe imbalance in resource usage: traditional inference pipelines rely heavily on GPU memory while leaving CPUs and large-capacity system memory underused, so overall hardware utilization is often below 20% and a great deal of computing power is wasted.
In response to this pain point, Qujing Technology adopts an engineering approach of few models and deep optimization. Public information shows it has launched the “Liuhe” heterogeneous inference architecture and the “Mooncake” technology, which trades storage for computation. Both focus on restructuring the KV Cache mechanism: by expanding the cache pool, they greatly reduce reliance on costly GPUs.
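The idea of enlarging the cache pool to save GPU work can be illustrated with a toy two-tier prefix cache. This is only a minimal sketch under stated assumptions, not Qujing Technology's implementation: the class `TieredKVCache`, its slot counts, and the `fake_prefill` function are all hypothetical, and real systems cache tensors per token block rather than strings per prompt. It shows that a larger, cheaper host-memory tier behind a small GPU tier turns repeated prefill computations into cache hits.

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy two-tier prefix cache: a small 'GPU' tier backed by a larger 'host' tier.

    Hypothetical illustration only; names and sizes are assumptions.
    """

    def __init__(self, gpu_slots: int, host_slots: int):
        self.gpu = OrderedDict()   # prefix -> cached KV data, in LRU order
        self.host = OrderedDict()  # entries spilled from the GPU tier
        self.gpu_slots = gpu_slots
        self.host_slots = host_slots
        self.stats = {"gpu_hit": 0, "host_hit": 0, "miss": 0}

    def _put_host(self, key, value):
        self.host[key] = value
        self.host.move_to_end(key)
        while len(self.host) > self.host_slots:
            self.host.popitem(last=False)  # drop the least recently used entry

    def _put_gpu(self, key, value):
        self.gpu[key] = value
        self.gpu.move_to_end(key)
        while len(self.gpu) > self.gpu_slots:
            old_key, old_value = self.gpu.popitem(last=False)
            self._put_host(old_key, old_value)  # spill instead of discarding

    def get_or_compute(self, prefix, compute):
        if prefix in self.gpu:
            self.gpu.move_to_end(prefix)
            self.stats["gpu_hit"] += 1
            return self.gpu[prefix]
        if prefix in self.host:
            value = self.host.pop(prefix)  # promote back to the GPU tier
            self.stats["host_hit"] += 1
            self._put_gpu(prefix, value)
            return value
        self.stats["miss"] += 1
        value = compute(prefix)  # the expensive prefill pass
        self._put_gpu(prefix, value)
        return value


def fake_prefill(prefix):
    # Stand-in for the GPU prefill computation (assumption for the demo).
    return f"kv({prefix})"


cache = TieredKVCache(gpu_slots=1, host_slots=4)
for prompt_prefix in ["a", "b", "a", "b"]:
    cache.get_or_compute(prompt_prefix, fake_prefill)
print(cache.stats)  # {'gpu_hit': 0, 'host_hit': 2, 'miss': 2}
```

With `host_slots=0`, the same four requests would all be misses, so the host tier halves the number of prefill computations in this toy workload.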
The attention Qujing Technology has drawn from investors rests on a clear industry logic.
On one hand, domestic large model capabilities are improving rapidly, and applications such as AI Agents and multimodal systems are moving from concept to implementation, driving exponential growth in token consumption.
In March 2026, leading vendors Tencent Cloud, Alibaba Cloud, and Baidu Intelligent Cloud successively raised prices for AI computing power services, with increases of more than 460% for some models. This upstream supply-demand imbalance has opened a clear commercial opportunity for infrastructure service providers focused on inference optimization.
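To put the percentage in perspective, an increase of 460% means the new price is 5.6 times the old one, not 4.6 times. The short calculation below makes this concrete. The baseline price is a hypothetical figure chosen only for the arithmetic, not a reported one.

```python
def price_multiplier(increase_pct: float) -> float:
    """Convert a percentage increase into a multiplier on the old price."""
    return 1 + increase_pct / 100


# Hypothetical baseline price per unit of service (assumption, not a reported figure).
old_price = 10.0
new_price = old_price * price_multiplier(460)
print(round(price_multiplier(460), 2), round(new_price, 2))
```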
On the other hand, Qujing Technology has already built a customer base of some scale.
The company currently provides inference services to enterprise clients such as Zhipu GLM via its ATaaS platform, processing close to one trillion tokens per day. Sustained operation under high-concurrency workloads has given it a delivery capability that can scale.
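For a sense of scale, the daily figure can be converted into an average sustained rate. The sketch below takes one trillion tokens per day as an upper bound (the reported volume is "approaching" that level) and assumes load is spread evenly over 24 hours, which real traffic is not, so peak rates would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_tokens_per_second(tokens_per_day: float) -> float:
    """Average throughput implied by a daily token volume, assuming even load."""
    return tokens_per_day / SECONDS_PER_DAY


rate = average_tokens_per_second(1e12)
print(f"{rate:,.0f} tokens per second on average")  # 11,574,074 tokens per second on average
```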
The funds from this round will mainly be used to expand computing power reserves and build the underlying inference system. With computing power prices rising, building large-scale inference capacity ahead of demand should give the company a stronger voice in the market.
However, Qujing Technology still faces multiple challenges.
The AI Infra sector is crowded, with startups such as SiliconFlow and Wuwen Xinqiong (Infinigence AI) competing alongside cloud providers like Alibaba Cloud, Huawei Cloud, and ByteDance's Volcano Engine, which offer full-stack AI Infra capabilities.
Qujing Technology needs to build strong enough barriers in both technical depth and customer stickiness to differentiate itself from the large cloud vendors while the window of opportunity remains open.
Overall, Qujing Technology has positioned itself at the point where the industry is shifting from “model training is king” to “inference delivery at scale”. Its Tsinghua-rooted technical heritage, “few models, deep optimization” engineering approach, and experience serving major clients make up its current core competitiveness.
However, with the industry developing rapidly and the competitive landscape still unsettled, whether the company can turn its technological advantage into sustained business growth remains to be seen.
Risk Warning and Disclaimer
The market has risks; investment requires caution. This article does not constitute personal investment advice, nor does it take into account the specific investment objectives, financial status, or needs of individual users. Users should consider whether any opinions, views, or conclusions in this article fit their particular situation. If you invest accordingly, you assume full responsibility.