700 Million Yuan in the Computing Power Gap: Who Is Paying for Infinigence AI’s “Middleware” Business?
```
On May 7, domestic AI infrastructure service provider Infinigence AI announced the completion of a new round of financing amounting to over 700 million yuan.
This round was jointly led by Hangzhou Gaoxin Jin Investment Group and Huiyuan Capital, with participating investors including Guoxing Capital, Qinhuai Data, GF Qianhe, Lihe Qingtong, Zhongbao Investment, AEF NextGen, Tengrui Capital, Kai Lite, CITIC Construction Investment Capital, and Kuande Intelligent Learning Laboratory. Existing shareholders Junlian Capital, Shanghai Guotou Futeng, and Yuanzhi Future also made additional investments.
Against an industry backdrop of supply-demand imbalance in computing power and fragmented underlying chip architectures, this financing reveals the long-underestimated commercial value of the "AI infrastructure service" track.
The company defines its business model as that of a "computing power operator": it operates computing power resources, and the software platform it has developed is its tool for scheduling those resources efficiently.
Currently, the core pain point faced by the large model industry is the "M×N" adaptation problem: upstream, there are N types of AI chips with different architectures and ecosystems; downstream, there are M types of large models with varied structures.
Because the operator libraries and compilation environments of different hardware are mutually incompatible, model vendors face high time and R&D costs when migrating from one type of computing hardware to another.
Infinigence AI is targeting this "software-hardware decoupling" middleware gap. Its core products are the Agentic MaaS large model service platform and a supporting hardware-software co-optimization toolchain.
If we see large models as the "electricity" of modern industry and chips as "generators," then Infinigence AI acts as the "substation" and "smart grid."
By pooling and virtualizing heterogeneous computing power, Infinigence AI shields upper-layer applications from underlying hardware differences and gives them standardized computing interfaces. It earns revenue from token throughput, computing power leasing, and private deployment services.
Infinigence AI was founded in May 2023, in step with the AI boom. With a clear commercial path, the company quickly attracted capital, and with the newly announced round, its cumulative financing now nears 2.2 billion yuan.
The list of investors behind this new round of over 700 million yuan points to a logic of tight industrial collaboration.
On one hand, the lead investment by Hangzhou Gaoxin Jin Investment Group clearly signals an intent to build out industrial infrastructure. As intelligent computing centers are built at large scale across regions, local governments need not only physical data centers but also technical operators capable of activating these heterogeneous computing power assets and raising overall utilization of computing equipment.
On the other hand, there is upstream and downstream industry binding: the co-investors include Qinhuai Data (an IDC data center service provider), Kai Lite (a provider of video and image display control hardware), and other companies with physical operating businesses. This shows that Infinigence AI's business scope is extending toward physical infrastructure and terminal hardware, with the aim of building stronger upstream and downstream collaboration across the computing supply chain.
Investors' concentrated interest is also due to the company's solid technical foundation.
It is understood that Infinigence AI's core team comes from the NICS-EFC laboratory of Tsinghua University's Department of Electronic Engineering. The founder, Wang Yu, and Xia Lixue, a mentor-mentee pair, have long-standing data and technical accumulation in deep learning hardware-software co-optimization, EDA (electronic design automation), and AI chip architecture.
Investors' heavy bet on Infinigence AI rests on its sustained delivery against core operating metrics.
In terms of execution accuracy, its Agentic MaaS platform stays more than 99.9% aligned with the original model's precision while supporting complex toolchains. For computational efficiency, the platform improves overall system throughput by 2 to 3 times, reduces system latency by 50%, and keeps first-token latency within 500 milliseconds.
In addition, the company's growth in business scale has validated strong demand in the computing power scheduling market.
On a macro level, as of March 2026, China's daily token call volume had surpassed 140 trillion, up more than 40% from the end of last year. Riding this industry tailwind, the daily token call volume on Infinigence AI's large model service platform had grown more than 20-fold by the end of April this year compared with the end of last year.
This surge reveals a fundamental restructuring of the industry's billing logic.
As price wars push large model API prices toward marginal cost, the market's procurement benchmark for computing power is shifting from billing by "GPU rental hours" to "token throughput and generation efficiency," the core of token economics.
In this new paradigm, Infinigence AI's business model can close the loop: rather than earning hardware resale margins, it uses operator reconstruction and model compression to deliver far more effective tokens than the industry average for the same hardware depreciation. This aggressive cost reduction and efficiency optimization is the underlying financial logic supporting its financing of more than 700 million yuan.
After this round, industry attention will focus on how Infinigence AI grows into its valuation and further expands its business boundaries.
Infinigence AI has proposed an "AI productivity formula," defined as the product of "intelligent scale × token production efficiency × token value transformation," signaling an attempt to elevate the industry narrative from "computing power scheduling" to a distinctly Chinese "token economics."
In actual business implementation, AI infrastructure providers still have several visible hurdles to clear.
The primary challenge comes from ecosystem consolidation among hardware manufacturers. As leading companies such as NVIDIA keep strengthening the CUDA ecosystem, and domestic chip giants successively launch native integrated hardware-software solutions, third-party middleware vendors must prove their irreplaceability through deeper work on underlying compilation and operator reconstruction, or risk being reduced to a mere "pipe" by upstream and downstream manufacturers.
A longer-term variable is the decentralization of computing architecture.
As demand for computing power shifts from the cloud alone to the edge and even to physical AI, the complexity of computing networks will grow exponentially. Infinigence AI has already made moves here: its "Infinity" terminal intelligent system provides an integrated "terminal model + terminal software + terminal IP" solution.
Future in-vehicle and on-robot terminals will face even stricter limits on power consumption, heat dissipation, and latency. Whether this star company can schedule computing power efficiently across cloud, edge, and terminal will be the key test of whether it can truly become the "Android system" of the AI era.
Risk warning and disclaimer: The market involves risk, so investment requires caution. This article does not constitute personal investment advice and does not consider specific users' unique investment goals, financial status, or needs. Users should consider whether any opinions, viewpoints, or conclusions herein suit their particular circumstances. Investments made accordingly are at one’s own risk.
```