AI developing AI: the most important and most dangerous technology in the world, and a goal of many AI giants.
A capability that could change the trajectory of technological development is quietly taking shape within AI labs. When AI systems begin autonomously developing more advanced AI, humanity’s understanding and control of technological evolution will face unprecedented challenges.
According to Wind Trading Desk, a workshop report released in January 2026 by the Center for Security and Emerging Technology (CSET) finds that this process has already begun and may accelerate in the coming years, bringing "major strategic surprises."
OpenAI has publicly announced plans to create “real automated AI researchers” by March 2028. The report finds that leading AI companies already use their most advanced models internally to accelerate research and development, often before those models are released externally.
A machine learning researcher who attended the workshop revealed that on carefully selected tasks, AI models can complete in 30 minutes work that previously took him several hours. As model capabilities improve, the range of R&D tasks that can be automated keeps expanding.
The core risks of this technology are twofold: First, human supervision of the AI research and development process will decrease; second, the speed of AI capability improvement may exceed humanity’s ability to respond.
The report warns that in the most extreme scenarios, AI-driven technological improvements could create a self-reinforcing loop, leading to a "capability explosion" in which productivity leaps from 10 times that of humans to 100 times and then 1,000 times. AI systems would fully dominate the R&D process, human participation would approach zero, and the resulting system capabilities could far exceed those of humans.
Some leading figures in the AI field have warned that this could result in "irreversible loss of human control over autonomous AI systems," even "large-scale loss of life and the marginalization or extinction of humanity."
Although experts disagree widely on the likelihood of these extreme scenarios, a key point of consensus at the workshop was that these scenarios could indeed happen and warrant preventive action now.
Because the competing views rest on different assumptions about how AI R&D works, new empirical data may not be enough to resolve the disagreement. This means it may be difficult to detect or rule out extreme "intelligence explosion" scenarios in advance.
Leading Companies Already Use AI to Assist AI Development
Automated AI R&D is no longer a theoretical concept. The workshop found that top AI companies are already using their best models to help build better ones, and that AI’s contribution to R&D grows over time. Each newer, more advanced model that researchers receive can take on more tasks that previously required humans.
Engineering tasks, especially programming, are currently where AI provides the greatest value. Although the precise productivity gain from AI-assisted programming is unclear, technical staff at leading AI companies spend a large share of their time working with AI tools.
In public materials, Anthropic has described how new data scientists on its infrastructure team give Claude Code the entire codebase to get up to speed quickly, and how its security engineering team uses Claude Code to analyze stack traces and documentation, resolving problems that used to take 10 to 15 minutes about three times faster.
Besides coding assistants, AI systems also assist AI R&D in various other ways. For example, the “LLM-as-judge” paradigm has been integrated into many aspects of AI research. This technique uses large language models to evaluate AI-generated outputs, performing tasks that previously required human judgment, and is now widely used in training data filtering, safety training, and scoring problem solutions.
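To make the “LLM-as-judge” idea concrete, here is a minimal sketch of judge-based training data filtering. It is an illustration under stated assumptions, not any company's actual pipeline: the `Judge` interface, `filter_by_judge`, and `toy_judge` are hypothetical names, and `toy_judge` is a crude word-overlap stand-in for what would in practice be a call to a large language model.

```python
from typing import Callable, List, Tuple

# Hypothetical interface (assumption): a judge maps (prompt, candidate) to a
# score in [0, 1]. In a real system this would wrap a language model call.
Judge = Callable[[str, str], float]


def filter_by_judge(
    samples: List[Tuple[str, str]], judge: Judge, threshold: float = 0.5
) -> List[Tuple[str, str]]:
    """Keep only the (prompt, candidate) pairs scored at or above threshold."""
    return [(p, c) for p, c in samples if judge(p, c) >= threshold]


def toy_judge(prompt: str, candidate: str) -> float:
    """Stand-in for an LLM judge (assumption): scores a candidate by how many
    words it shares with the prompt, capped at 1.0. Not a real quality signal."""
    prompt_words = {w.lower().strip("?.,") for w in prompt.split()}
    answer_words = {w.lower().strip("?.,") for w in candidate.split()}
    if not answer_words:
        return 0.0
    overlap = len(prompt_words & answer_words)
    return min(1.0, overlap / 2)


samples = [
    ("What is gradient descent?", "Gradient descent is an iterative optimizer."),
    ("What is gradient descent?", "I like turtles."),
]
kept = filter_by_judge(samples, toy_judge)
print(f"kept {len(kept)} of {len(samples)} samples")
```

The point of the design is that the judge is just a swappable function: the filtering loop stays the same whether the score comes from a person, a heuristic, or a model, which is why the paradigm slots so easily into data filtering, safety training, and solution scoring.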
An employee from a leading AI company who attended the workshop described using internal AI tools to generate about 1,000 new reinforcement learning environments to train future models—far more than he could create alone.
Explosion or Stagnation: Two Contrasting Expectations
The future trajectory of automated AI R&D is the core issue of the report. How high will the degree of automation be? How fast will progress be? How will it affect society? Workshop participants with strong opinions tend to fall into two groups: those expecting rapid progress toward high automation and advanced capability, and those expecting slower progress that plateaus at a relatively early stage.
The report describes several possible development dynamics:
1. Productivity Multiplier Model (Explosive) - Assumes the share of AI R&D automated by AI rises steadily, with productivity climbing from 120% of human R&D to 10 times, 100 times, and 1,000 times. As improvements compound, progress accelerates further, human involvement and understanding drop to zero, and AI system capability far exceeds that of humans.
2. Productivity Multiplier Model (Decaying) - Suggests that although AI R&D is increasingly automated, the scientific output from a given level of input (such as compute) is insufficient to drive further compounding improvement. Automation keeps increasing, but capability plateaus at a relatively early stage.

3. Amdahl’s Law Model - Argues that AI can automate only certain areas of AI R&D (for example, coding and running experiments can be automated, but proposing new research projects or operating data centers cannot). Even if automation speeds up parts of the R&D process, overall progress is still constrained by the activities AI cannot automate, so full automation is unattainable.

4. Expanding Pie Chart Model - Suggests that as AI automates certain R&D activities, human researchers repeatedly discover that continued progress requires new types of contributions that AI systems cannot yet automate. AI R&D could progress rapidly, but humans remain the core of the research process.
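
The bottleneck argument in the Amdahl’s Law model can be made concrete with the standard Amdahl formula, speedup = 1 / ((1 - p) + p / s), where p is the fraction of work that can be accelerated and s is how much faster that fraction runs. The sketch below is only an illustration: the 80% automatable share is an assumed figure for the example, not a number from the report.

```python
def amdahl_speedup(automatable_fraction: float, task_speedup: float) -> float:
    """Overall speedup when only part of the work is accelerated (Amdahl's law).

    automatable_fraction: share of R&D effort that AI can take over, in [0, 1].
    task_speedup: how many times faster the automated share runs (> 0).
    """
    if not 0.0 <= automatable_fraction <= 1.0:
        raise ValueError("automatable_fraction must be between 0 and 1")
    if task_speedup <= 0:
        raise ValueError("task_speedup must be positive")
    remaining_human_work = 1.0 - automatable_fraction
    accelerated_work = automatable_fraction / task_speedup
    return 1.0 / (remaining_human_work + accelerated_work)


# Assumed for illustration only: 80% of R&D effort is automatable.
ASSUMED_AUTOMATABLE_FRACTION = 0.8

for s in (10, 100, 1000):
    overall = amdahl_speedup(ASSUMED_AUTOMATABLE_FRACTION, s)
    print(f"automated part {s}x faster -> overall {overall:.2f}x")
```

Under that assumed 80% share, the overall speedup never reaches 5x no matter how fast the automated portion becomes, because the remaining 20% of human work sets the ceiling at 1 / (1 - 0.8).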

The divergent expectations of which dynamics will dominate are closely related to contrasting answers about the “shape of the AI progress curve”: How fast will AI R&D progress? Will progress accelerate due to compounding improvements or slow down due to diminishing returns? What is the likelihood AI capability will reach the levels of top human AI researchers? If AI matches expert human capability, where are the performance limits for different tasks beyond that point? Are there bottlenecks that could impede AI R&D progress?
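The contrast between compounding improvements and diminishing returns can be illustrated with a toy simulation. This is a sketch under invented assumptions, not a model from the report: each R&D round multiplies the productivity multiplier by `1 + gain * decay**t`, the starting value of 1.2 echoes the 120% figure above, and the `gain` and `decay` values are arbitrary.

```python
from typing import List


def simulate(initial: float, gain: float, decay: float, rounds: int) -> List[float]:
    """Return the productivity multiplier after each R&D round.

    Round t multiplies productivity by (1 + gain * decay**t).
    decay = 1.0 keeps the per-round gain constant (compounding growth);
    decay < 1.0 shrinks the gain every round (diminishing returns).
    """
    levels = [initial]
    for t in range(rounds):
        levels.append(levels[-1] * (1.0 + gain * decay ** t))
    return levels


# All parameters below are illustrative assumptions.
explosive = simulate(initial=1.2, gain=0.5, decay=1.0, rounds=20)
decaying = simulate(initial=1.2, gain=0.5, decay=0.5, rounds=20)

print(f"explosive after 20 rounds: {explosive[-1]:.1f}x")
print(f"decaying after 20 rounds:  {decaying[-1]:.2f}x")
```

With these made-up parameters, the first run passes a 1,000-fold multiplier within 20 rounds, while the second levels off below 3x. Early rounds of the two runs look similar, which is one way to see why short-run data may not settle which curve applies.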
A key finding of the report is: It is very difficult to use empirical evidence in advance to arbitrate between two conflicting views of AI R&D automation—one expects rapid progress leading to highly advanced AI systems (“superintelligence”), the other expects slower progress and a plateau before reaching human-level performance in some key domains.
Both views rest on assumptions that let their holders explain why, even when contrary evidence is observed, events will eventually return to the trajectory they expect. For example, one side may point to existing bottlenecks, while the other may argue that these are temporary and that rapid improvement will follow once they are resolved.
An Urgent Need to Build Monitoring Indicators
Even though interpreting new evidence presents challenges, participants unanimously agree that collecting and understanding metrics for the trajectory of automated AI R&D will be very valuable. Existing empirical evidence (including current benchmark assessments) is insufficient to measure, understand, and predict the trajectory of automated AI R&D.
The report suggests focusing on three categories of indicators.
The first category is metrics for broad AI capabilities, including executing tasks that take humans a long time, carrying out “messy tasks” (with imprecise specifications, lots of context, and a need to interact with people or dynamic systems), and absorbing new facts, skills, and ideas instantly. Apart from the time-horizon measurements tracked by the nonprofit METR, almost no current metrics capture progress in these abilities.
The second category is dedicated AI R&D benchmarks, arranged in a “ladder” by increasing complexity: software and hardware engineering (coding, debugging, performance optimization, etc.), conducting experiments (implementation, data collection, and analysis), creative ideation (proposing experiments and identifying key points), strategy and leadership (determining direction, prioritization, and coordination).
Only after the upper-tier tasks are automated can AI R&D become fully automated, but we may not see extensive data on upper-tier progress until shortly before full automation. Currently, there are no benchmarks for the top two tiers.
The third category is signs of AI R&D automation progress inside leading AI companies, including allocation of R&D spending, employment patterns in R&D, scale and complexity of tasks delegated to AI systems, gaps between internally deployed and publicly released cutting-edge AI models, measurements of AI R&D progress, and qualitative impressions from AI researchers.
Transparency Becomes the Policy Core
Given the high uncertainty about the trajectory of automated AI R&D, improving access to relevant empirical evidence is a valuable near-term policy goal. At present, anyone interested in empirical evidence on automated AI R&D must rely heavily on voluntary information releases from leading AI companies. While companies do choose to publish some relevant data, it is often scattered and incomplete.
Reasons include: companies often lack motivation to invest heavily in collecting information; even when companies do gather information, the data may be sensitive (commercially or otherwise); and companies may share information selectively to support narratives that attract investment.
A few laws and regulatory measures related to transparency in frontier AI development have recently been adopted (most notably, the EU’s General-Purpose AI Code of Practice and California’s Transparency in Frontier Artificial Intelligence Act, SB 53). However, these measures have so far created barely any transparency on automated AI R&D indicators.
Policy options proposed by the report include disclosure of key indicators (voluntary or mandatory, to governments or to the public), targeted whistleblower protections, and attention to other policy implications.
On risk management, the report notes that several AI companies have incorporated automated AI research and development capabilities as triggers for enhanced safety measures in their safety frameworks, but these frameworks are still at an early stage. Policymakers should consider whether and how to regulate internal deployment, not just external deployment, when designing broad regulatory frameworks.
Advanced automation of AI R&D would also increase the importance of computing power advantages for companies and countries. If AI R&D is highly automated, access to computing power could be a key factor determining how much an organization can accelerate its AI research. From the perspective of preparing for this possibility, compute controls could allow the US and its allies to slow competitors’ development of large-scale automated AI R&D capabilities.
Jack Clark, former OpenAI policy director and Anthropic co-founder, pointed out that if AI R&D allows AI systems to evolve 100 times faster than human-built systems, “then you’ll ultimately end up in a world with time travelers who are accelerating away from everyone else.” This could result in power rapidly shifting to the systems and organizations able to move faster. As long as such acceleration cannot be ruled out, automated AI R&D may be the most existentially important technological development on Earth.