Huawei’s τ-Scaling: Chip Competition Shifts from "Node" to "Time"

```

Huawei has changed the narrative around chip performance. The industry’s usual comparison has been who reaches the next process node first; “τ-scaling” moves the metric from “how many nanometers” to “how much time”. Transistor switching, signal propagation, data access by compute units, and system communication are all brought into a single time-optimization framework.

On May 25, Huawei semiconductor head He Tingbo published a paper detailing the Huawei chip technology that has gone viral. Its core judgment can be summarized in one sentence: nodes are not being phased out, but what lies beyond the node (packaging, interconnects, memory bandwidth, protocol stacks, and system architecture) is moving to a more prominent position.

Huawei also disclosed three key pieces of information: over the past six years, 381 chips have been designed and mass-produced using this approach; the new generation of Kirin chips due this fall will be the first to adopt LogicFolding; and by 2031, high-end chips designed on this roadmap are expected to reach transistor densities equivalent to a 1.4nm process.

This path is not just about phone chips. On phones, the focus is on compressing time within a single SoC; for AI, it is on communication latency across thousands or tens of thousands of chips. The market’s real focus should not be only the next-generation Kirin’s benchmark scores, but whether advanced packaging, hybrid bonding, 3D design tools, memory-logic co-design, and system interconnect enter a phase of validation and expansion.

Nodes haven’t exited, but the node alone can no longer explain performance growth

For decades, the chip industry’s main line was straightforward: shrink transistors, pack more devices into each unit of area, and raise frequency; for a long time, power and cost were diluted along the way as well. Advanced process technology thus became the hardest performance metric in the race.

τ-scaling targets a different problem: even if transistors keep shrinking, much of the time spent inside a chip is not spent in the transistors themselves. Signals take time to travel along wires, compute units take time to move data, and chips take time to communicate with one another. Geometric scaling solves “making it smaller”; τ-scaling tries to solve “running faster, waiting less”.

Huawei’s framework covers four layers: device, circuit, chip, and system. Rather than tweaking a single circuit module, it folds delays at every level into the optimization target. This means the center of value no longer sits only in front-end manufacturing; packaging, interconnect, memory, and system architecture all carry more weight.

This is the key to “replacing geometric scaling with time scaling”. “Replacing” does not mean advanced process technology is no longer needed, only that performance improvement cannot bet solely on the next node.

LogicFolding: a Kirin breakthrough at a fixed node

In engineering terms, the most convincing example of τ-scaling is the Kirin 2026, which enters mass production this fall.

LogicFolding breaks the physical boundaries of the traditional planar layout: it splits digital, analog, and memory circuits across multiple vertically stacked active layers and connects them with ultra-fine-pitch hybrid bonding, greatly shortening signal propagation distance along critical paths.

Measured results show transistor density rising from 155 million/mm² to 238 million/mm² within a single generation, up 55% and equivalent to three years of geometric scaling. Power efficiency of the SoC’s performance cores is up 41%, maximum frequency is up nearly 13%, and the main CPU core returns to 3.1 GHz. On the SRAM side, operating frequency is up more than 40%; on representative processor cores, clock buffer count is down more than 50%, clock skew is down 25%, and wire length is about 30% shorter.

Huawei’s own assessment is that the Kirin 2026 implementation is “deliberately conservative”: the hybrid bonding pitch is 1.5 microns, and folding is applied selectively, only along critical paths. According to the roadmap, Kirin’s maximum CPU frequency is expected to rise to 3.39 GHz in 2027, reach 3.71 GHz in 2028, and pass 4 GHz in 2029; transistor density is expected to exceed 400 million/mm² before 2031, matching 1.4nm-class process levels. He Tingbo’s paper calls the roadmap “feasible and economically viable”.

This is not “bypassing lithography machines,” but breaking performance down into its sources

Reading τ-scaling as “bypassing lithography machines” misses the point. The background Huawei has stated publicly is that geometric scaling is approaching physical limits, returns on cost are weakening, and performance gains can no longer rely solely on more advanced nodes.

This means advanced processes remain important, but are no longer the only variable. Internal circuit efficiency, data movement distance, memory access speed, and system communication latency may all become new sources of performance.

In other words, the biggest question used to be “who gets the next node first”; now we also need to ask: who can optimize node, packaging, interconnect, memory, and system organization together?

This shift affects the industry’s division of labor. Advanced packaging, hybrid bonding, 3D toolchains, memory interfaces, and system interconnect, once supporting roles, now move closer to the main line. They no longer just “assemble chips” or “connect chips” but contribute directly to performance.

AI system bottlenecks are even more of a “time issue” than those in phones

Phone chips deal with time inside a single chip; AI systems deal with time across whole groups or cabinets of chips. The larger the model and the compute scale, the more prominent the cost of moving data between chips, memory, and interconnect networks.

Huawei’s public framework mentions UnifiedBus, which aims to compress system communication latency through unified memory addressing and native memory semantics. This concerns not single-chip performance but system-level data scheduling efficiency.

Applied to SuperPoD-like systems, the direction is clear: raising single-chip speed is only the first step, and larger gains may come from compressing latency across the whole compute system. The bottleneck in AI computing is often not “too little compute” but “compute left waiting for data”.

This is why τ-scaling has more room to run in AI scenarios. When data movement and communication delays are high enough, system-level optimization may deliver more than a single-point process upgrade.

The market is watching not concepts, but three rounds of delivery

The roadmap is already out; market focus will quickly shift to delivery.

First, the mass production of Kirin 2026 this fall is the first externally verifiable milestone for τ-scaling: how much independently verifiable performance and efficiency data LogicFolding delivers in mass-produced products will be the first public test of the framework’s credibility. Second, whether Huawei discloses its full methodology and engineering details to drive broader industry collaboration. Third, the response of the industry chain: capacity expansion plans, order trends, and customer validation in advanced packaging, hybrid bonding, and 3D toolchains will be key signals of whether this roadmap can become industry consensus.

From now to 2035, full validation of τ-scaling spans three levels: phones address time optimization within a chip, AI addresses time optimization across thousands of chips, and the industry addresses the shift in value from manufacturing to packaging, interconnect, and system architecture. The direction is set; product and supply chain delivery are the core pricing variables for the coming years.

Risk warning and disclaimer: The market involves risk and investment should be cautious. This article does not constitute personalized investment advice and does not consider the specific investment goals, financial status, or needs of individual users. Users should consider whether any opinions, viewpoints, or conclusions herein fit their particular circumstances. Investment based on this article is at your own risk.
```