From Algorithms to Manufacturing Capability: Zhiyuan’s Mid-Game Assessment of the Humanoid Track
```

The atmosphere of the embodied intelligence track in 2026 is completely different from two years ago.
Recently, at the Annual Conference on Humanoid Robots and Embodied Intelligence Standardization held in Beijing, Peng Zhihui, co-founder of Zhiyuan Robotics, made a clear judgment: “There are now over 140 domestic humanoid robot manufacturers, with 330 products launched. The industry has moved from laboratory showmanship and demo displays into the second half: engineering competition and scenario-based competition.”
This statement can actually be seen as a redefinition of the industry's stage.
Over the past few years, humanoid robots gained attention mostly through videos. Whichever robot walked most like a human, ran most steadily, did backflips, or scaled walls would go viral on social media and attract investment. That was the era of "showcase mode." But in this speech, Peng Zhihui repeatedly emphasized a different term: "deployment mode."
“From 2024 to early 2025, everyone was still competing to see whose robot walked straighter and more naturally. Now, the flexibility of the main body has reached a practical stage, and the next competition is about who is better at getting real work done, not just domestically but also against top overseas companies, to see who can truly reach ‘deployment mode.’”
When a tech industry shifts from "can it move?" to "can it work?", it steps from the conceptual phase into the engineering phase. For investors, this signals a change in risk structure.
If it used to be a competition of algorithms and financing, now it’s a competition of system engineering capability.
Peng Zhihui was candid in his opening: “The entire embodied intelligence industry is still exploring together. No single company can give the right answer alone. We have to do the right thing at the right time.” The remark sounds humble, but it carries an implicit judgment: the technical window has opened, the era of single-point breakthroughs is over, and the era of integration has begun.
Why now?
Peng Zhihui gives a simple answer: “The root cause is breakthroughs in AI technology.” He breaks the past decade of AI evolution into three stages: perception intelligence brought by deep learning, cognitive intelligence brought by large models, and now physical intelligence, driven by the combination of AI and robots.
“We scaled digital AI in the past few years; now the challenge is to scale physical AI, moving from the digital world to the physical world.”
It’s a longer, tougher road.
In the digital world, "when code fails you can reboot"; but in the physical world, "there is physical cost and failure cost." When a robot falls, it may mean hardware damage and cash flow drain. Thus, he proposes an engineering paradigm—"one body, three intelligences."
The so-called "one body" refers to the robot main body.
“The main body is the constraint interface of AI in the real world,” Peng Zhihui emphasized. “The real physical world is full of friction, collision, deformation, errors, aging, and noise. Main body design isn’t just piling up hardware; it is a synthesis of reliability engineering, supply chain engineering, and safety engineering.”
For capital markets, this pulls the track logic from "algorithm stories" back to "manufacturing capability."
He breaks down core components quite clearly: “The two most important current components are: joints, which decide the upper limit of movement capability; and dexterous hands, which decide the upper limit of manipulation. These two components make up most of the total machine cost.”
This is almost a cost structure diagram.
In the industry's early days, actuator approaches varied: hydraulic drives, linear actuators, and high-speed, high-rigidity schemes all coexisted.
But “since 2023, solutions have started converging on a new type of joint.” He even offers an analogy: “The hardware tech for humanoid robots is quite similar to that of new energy vehicles: the core is the so-called 'three-electric system.'”
The difference lies in complexity. Auto motors have relatively simple working conditions, but robots “require high dynamics, high-frequency forward/reverse rotation, with dozens or even hundreds of degrees of freedom.” The spec differences between joints are huge—torque requirements for fingers and thighs are totally different.
“If you design bespoke specs for each joint, it’ll be a disaster for mass production,” Peng Zhihui said bluntly.
Zhiyuan’s solution is serialization and standardization. “We consolidated the joints across five major product series and nearly 10 product models into 8 joint series. These 8 joint designs are used across all products and meet the needs of every body part. That’s the benefit of standardization.”
When a company begins discussing "series planning" instead of individual product performance, it’s preparing for scale.
The challenge of dexterous hands is even more complex. “On one hand, you must fit 10–20 degrees of freedom into a space smaller than a human hand; on the other, you need extremely high-dimensional sensing, especially touch.”
Peng Zhihui gives an intuitive judgment: “About 80% of tasks humans do well—and traditional automation doesn’t—are strongly related to touch.” Assembly workers judge success by the “click” sound; how to digitize such experience is the industry bottleneck.
“For vision, we first had standardized sensors, then standardized data formats, then standardized datasets, and finally an explosion of algorithms,” Peng Zhihui stated, “but for tactile sensors, the tech path hasn’t converged and there aren’t standards yet.”
For investors, what does this mean? It means once touch becomes standardized, it will bring new turning points in tech and cost.
If "one body" is the torso, the "three intelligences" (motor, interactive, and operational) are the soul.
Motor intelligence has made leaps in the past two years. Peng Zhihui summarized: “First, paradigms shifted from model-driven to reinforcement learning; second, simulation frameworks spread, allowing massive parallel training; third, joint tech converged, reducing control difficulty.” The combined benefit greatly improved dynamic performance.
But movement is only the foundation. Peng Zhihui pointed out, “Interactive intelligence provides emotional value, operational intelligence provides productivity value.”
Interactive intelligence heavily reuses advances from large models, but “the robot of the future can’t just understand voice commands; it has to see your emotions, understand your tone, even predict your intentions.” In his view, emotional value “is bigger than many imagine,” which is why robots in Spring Festival Gala performances prompt such widespread discussion.
What truly determines commercial value is operational intelligence.
To lower training barriers, Zhiyuan launched the “Lingchuang Platform.” “We have simplified the motion training workflow until it is as easy as posting on TikTok: upload a video, and the platform automatically completes keypoint detection, motion transfer, training, and deployment,” Peng Zhihui said. This means the industry will move from the researchers' "development mode" to a mass-participation "creation mode," and ultimately to a low-cost "deployment mode."
Deployment is the core term running through the whole speech.
On scenario selection, he suggested a “lay eggs along the way” strategy. “We break tasks down along two dimensions: scenario complexity and task complexity. Scenario complexity is a constraint, not a source of value; task complexity reflects value.”
Autonomous driving performs simple tasks in complex settings, while humanoid robots are currently better suited to “complex tasks in simple settings,” such as high-DOF operations in structured factories.
“Eventually, autonomous driving and embodied intelligence will both move to complex tasks in complex environments,” Peng Zhihui said, “but for now, we must choose practical paths.” This is phased pragmatism, not a romantic endgame.
In the latter part of the speech, Peng Zhihui used a vivid analogy to make the humanoid path’s logic clear: “Computer use is the humanoid interface for the digital world; humanoid robots are the universal interface for the physical world.”
In theory, AI directly generating underlying code would be more efficient, but real-world software ecosystems are designed for mouse and keyboard. Therefore, interface operation is the most universal path.
Similarly, the height of door handles, the dimensions of stairs, and the shapes of tools are all designed for human bodies. “Since environments are built around humans, for AI to maximize universality and compatibility, the end form will most likely resemble a human,” Peng Zhihui said. “It might not be the most efficient, but it is definitely the most compatible.”
This statement actually answers the capital market’s most common question: Why must it be humanoid? Because it’s the interface.
Finally, he defined the industry mid-stage as “infrastructure, not single-point products.” “The key to scaling physical AI is standardizing closed-loop data, reliability engineering and maintainability,” Peng Zhihui said, “We have to run fast, but also run steady.”
When a robot company starts emphasizing data governance, evaluation systems, maintenance experience and standard co-building, its focus is no longer product release, but the industrial system.
For investors, 2026 may not be the year of explosion, but the year of differentiation.
Showcase mode ends, deployment mode begins; the emotional premium fades, and pricing shifts toward engineering capability. Joint yield, series planning, operating hours in real scenarios, and data feedback efficiency will gradually replace the difficulty of stage performances as the core of valuation.
Moving from digital AI to physical AI is a far longer industrial migration. Peng Zhihui said at the end: “Standardization is not just a set of technical norms, but a catalyst for industrial deployment.”
When the industry starts talking standards, it’s preparing for scale. And scale is never shouted out—it’s slowly built, joint by joint, dexterous hand by dexterous hand, supply chain by supply chain, maintenance system by maintenance system.
Risk Warning and Disclaimer: Markets carry risk; invest with caution. This article does not constitute personal investment advice and does not take into account the particular investment objectives, financial situation, or needs of individual users. Users should consider whether any opinions, views, or conclusions in this article suit their specific circumstances. Any investment made on this basis is at your own risk.
```