Bernstein Senior Analyst: This is the first true "chip supercycle" of my career; "bottlenecks" are the real wealth machines.

```

Bernstein star analyst Stacy Rasgon believes that, with AI infrastructure investment surging toward 4.4% of U.S. GDP, the semiconductor industry is entering an unprecedented, genuine "supercycle."

On June 21, the Tech Surge Deep Tech Podcast, which focuses on frontier technologies, released its latest interview: Celesta Capital founding managing partner Michael Marks spoke at length with well-known Bernstein chip analyst Stacy Rasgon.

In a conversation lasting nearly an hour, the two covered AI-driven semiconductor revenue growth, the shift from AI training to inference, capacity bottlenecks across every segment of the supply chain, and the long-term sustainability of industry growth.

(Left: Michael Marks; Right: Stacy Rasgon)

Unlike most Wall Street analysts, Rasgon holds a PhD from MIT and is an engineer by training, which leads him to focus on proven physical laws and capital flows.

Rasgon states plainly that the semiconductor industry is experiencing the largest demand surge of his career. Total industry revenue surpassed $800 billion last year, and this year it is racing toward the $1.3 trillion mark.

Rasgon remarked in the interview:

Throughout my career, I have always heard the word 'supercycle.' This may be the first real one I have ever seen. The only thing we’re hearing now is that no one has enough computing power.

Rasgon emphasized that the market's focus is shifting from "model training" to "AI inference," which is the key to commercialization. Meanwhile, capacity bottlenecks are spreading from GPUs to HBM memory, semiconductor equipment, and even electricity supply.

Looking ahead, custom chips (ASICs), as represented by Broadcom, and Nvidia’s GPUs will coexist over the long term in an ever-expanding market, jointly absorbing this seemingly bottomless demand for computing power.

The supply chain's "whack-a-mole" game: AI is dragging the whole industry along

As AI's seemingly bottomless demand for compute opens up, the market is showing a peculiar "whack-a-mole" effect: capacity bottlenecks are popping up one after another along the supply chain.

Rasgon analyzes this phenomenon in detail:

Everything is being pulled along by this insatiable demand for AI computing power. In my career, I have never seen anything on this scale. The situation has spread from accelerators to memory, semiconductor manufacturing equipment, networking and optical devices, power semiconductors, and now even CPUs are in short supply.

Take memory as an example. The industry is in its strongest upcycle ever, with prices doubling every quarter. The main driver is HBM (High Bandwidth Memory). Rasgon shared a key data point:

In the silicon area of an AI chip, over 85% may be HBM.

Even more critical is the "trade ratio" issue. He said:

Due to yield loss in stacking technology and space taken by the logic die, making 1GB of high bandwidth memory requires about four times the silicon area of standard DRAM.

This means that even if fabs ramp up expansion frantically, the actual increase in memory capacity (in bits) will still be highly limited.

This extreme demand has even unexpectedly benefited weaker players. Discussing Intel’s server CPU business, Rasgon noted bluntly that server demand is now so strong that even Intel’s profit margins are rising:

Demand is so strong that they are even selling off inventory that had previously been written off and thrown in a corner like garbage. Customers now say: 'We don’t care, we’ll take it—please just sell it to us.'

Turning Point: "You can’t make any money just training models"

Even with hundreds of billions of dollars pouring in, the market’s biggest worry remains: is this growth sustainable, and how much upside is really left?

Rasgon points directly to “Inference” as the breakthrough. He stressed that much of the money previously went into large-model training, but that is not the end-game of commercialization. Rasgon said:

You can’t make any money just training models... You have to be able to use the models, and that is inference.

This shift is already showing up in striking numbers from startups. Rasgon cited figures in the interview showing that the annualized revenue run-rate of companies like Anthropic is climbing almost vertically:

Last December it was $9 billion, this January $14 billion, and recently (April) already $30 billion.

Moreover, with Nvidia’s recent acquisition of Groq, the segmentation of inference demand is becoming increasingly clear. Rasgon points out that not all "tokens" are of equal value.

For certain inference tasks that need extremely low latency and extremely fast responses, custom chips or specially designed inference architectures are often more economical than general-purpose GPUs.

Custom ASICs and GPUs are not a zero-sum game

With inference demand booming, custom chips (ASICs) are challenging the absolute dominance of GPUs. Broadcom has become the biggest beneficiary of this trend.

When Rasgon mentioned Broadcom, he said:

Before all of this started, Broadcom used to say that semiconductors were a mature industry, only providing low to mid-single-digit growth. But now everything has exploded. (Broadcom) says next year they expect to make $100 billion in AI revenue.

Why do major cloud service providers insist on developing their own ASICs? Rasgon believes it’s not just about performance optimization, but about having leverage in negotiations in the face of Nvidia’s sky-high 75% gross margin. Rasgon said:

At the very least, when you sit down with Jensen Huang to negotiate next year’s contract, you want a card in your pocket.

But Rasgon stressed that this is not a zero-sum game: if ASICs take a larger share, it will be because the overall pie is getting bigger.

For massive, stable, in-house workloads, ASICs can offer a lower total cost of ownership; but if the model architecture changes, the programmability and flexibility of GPUs are irreplaceable. Rasgon's view:

The real question is: is the opportunity in front of us still growing? If it’s big enough, both will thrive; if not, everyone is in trouble.

The ultimate ceiling of the future: the grid might not hold up

When asked about what risks the market may be ignoring, Rasgon shifted the focus from code and chips back to the physical world infrastructure—electricity.

Cloud giants’ capex is expected to reach about $600 billion this year. If infrastructure spending eventually reaches the $3 to $4 trillion a year that Nvidia anticipates, today’s energy system will not be able to keep up.

Rasgon shared a model he had previously built:

Do we really have enough electricity to do this? The grid might not hold up. America’s electric capacity would need to grow by about 5% annually in the next decade. But as far as power equipment analysts are concerned, a 5% annual growth rate is simply impossible.

This means the next wave of AI innovation and bottleneck-breaking will inevitably come in areas like power generation, cooling, and nuclear energy. As he has always believed:

Never underestimate human ingenuity; if there’s money to be made, engineers will always find a way.

Overall, as long as AI demand doesn’t collapse off a cliff, the "supercycle" of the semiconductor industry supply chain will continue, and capital markets must keep up with the constantly moving "capacity bottlenecks" in every segment.

The following is the full podcast (AI-assisted translation):

Michael Marks: There’s a word that gets thrown around in the chip industry every few years—supercycle. Most of the time, it's overhyped. Sometimes, it’s real. Right now, the numbers are hard to dispute: the four major cloud giants are expected to spend about $600 billion this year, most of it on AI infrastructure. Last year, the semiconductor industry’s revenue passed $800 billion and is moving toward $1.3 trillion.

My guest today is Stacy Rasgon, a well-known chip industry analyst at Bernstein. Unlike most Wall Street analysts, he is an engineer—MIT PhD, involved in chip manufacturing equipment R&D before writing his first report. This background deeply affects the way he interprets the industry—less speculation, more focus on proven physical laws and capital flows.

Today we’ll dig into: is the AI cycle truly different? Where are the bottlenecks? When the dust settles, who will actually profit?

Michael Marks: AI is driving huge numbers and is always in the news spotlight, something that's never happened before. Is this a real change or just another, bigger cycle?

Stacy Rasgon: That’s the key question. It's funny—I've heard "supercycle" my entire career, but maybe this is the first time I’m actually witnessing a real one.

There are several types of semiconductor cycles:

Supply cycle: Most common in memory. When supply tightens, prices rise, so everyone brings new capacity online. But when that comes online, demand has already dipped—since customers often over-order when things are in short supply, causing serious mismatches. These cycles usually last about four years, peak to trough.

Inventory cycle: Semiconductors sit at the far end of the supply chain, so even small swings in end demand get magnified. If demand slips, customers cut orders and run down inventory, so chip shipments drop below real demand; when they rebuild inventory, shipments suddenly outpace demand. These swings usually last a few quarters.

Product cycle: If I’m a mobile chip supplier, whether I win the next generation iPhone design directly impacts my revenue volatility.

What we’re seeing now seems to be a demand cycle, and I’ve never seen anything like it in scale or speed before.

More notably, the explosion in AI semiconductors is now pulling the whole industry up. Nearly all semiconductor sub-sectors are doing well, whether measured by stock price or earnings. You can clearly trace the path: from accelerators to memory, to semiconductor equipment, to networking and optical, to power semiconductors, and now even CPUs. Everything is being pulled along by that voracious demand for AI compute.

The only consensus now is: nobody has enough compute.

Michael Marks: Memory has always been a cyclical product. Now makers have strong demand and are sold out far in advance. Can this last for years? There are hardly any new entrants, and capacity is fixed.

Stacy Rasgon: Let’s talk memory—this may be the strongest memory cycle ever. Just 18 months ago, we were coming off the worst memory downturn since the dot-com bubble.

You mentioned fewer players, yes. Two or three decades ago, there were more than 30 memory manufacturers. Today, depending on the sector, it's 3-6. Last downturn, they hung on despite losses, but during the dot-com bust, there were big bankruptcies. The current structure is much healthier.

Right now, memory prices are doubling every quarter, demand is red-hot, and the root cause is AI.

There are two main memory types, DRAM (dynamic random access memory) and NAND (flash memory): DRAM is like your PC’s RAM, used for system operation; NAND is like the storage chips in your phone, used for storing photos and data.

AI chips use a special DRAM: HBM (high-bandwidth memory), stacking multiple DRAM dies together. Each chip needs a huge HBM package.

If you count the total silicon area in an AI chip, HBM can be over 85%.

Even more, there’s a “trade ratio” issue: stacking reduces yields, and you need to reserve space for connections. To build the same memory capacity, HBM needs about four times the silicon area of standard DRAM. So even a fab expansion yields only a limited increase in actual bits.

As long as AI demand keeps rising, this situation won’t change. If it collapses, everyone is toast, but I don’t see that happening.

Michael Marks: You’ve watched this industry a long time—is innovation catching up to relax supply pressure, or is demand just too strong?

Stacy Rasgon: Innovation is always happening. For 60 years, the semiconductor industry’s main theme was Moore’s Law: every two years, you double chip density at the same cost, basically getting twice the compute for the same price or the same compute for half the price, with higher performance and lower power.

But Moore’s Law broke down over a decade ago. It doesn’t mean innovation stopped, but cost advantage is gone. Now, transistors cost more—not less. That’s a new reality.

Some thought that would kill the industry. But it actually triggered a renaissance, because now if you want better performance, you have to pay for it, not just get it for free from Moore’s Law.

Broadcom’s co-founder, then-CTO Henry Samueli raised this in 2012, showing TSMC’s per-transistor cost curve, and that 28nm was the cost floor. His point: this is the first time in 40 years the sector is being rational.

The end of Moore’s Law didn't stop innovation; it just moved it:

New transistor structures: device architecture keeps evolving.

Chiplets: chips are divided into smaller pieces by function and process technology, then packaged together, getting around the single-die reticle limit (around 830mm²).

Advanced packaging: Nvidia’s Blackwell GPU is actually two GPU dies plus loads of HBM, all integrated on an interposer for maximum area and performance.

These methods are not cheap, but as long as clients have a reason to pay, they will—hence why Nvidia can maintain a 75% margin.

Michael Marks: If things start to slow, what signals would we see?

Stacy Rasgon: A few things:

First, hyperscaler (cloud giant) capex numbers. The latest earnings show spend still rising. Honestly, by the time this dips, it might be too late to react.

Second, AI application company revenue. Here there's a bigger issue: the gap between model training and usage. Anthropic, for example, was at a $30B run-rate as of April (up from $14B in January, $9B in December, and single billions a year before)—this curve is nearly vertical, showing real paid usage.

Third, hyperscaler cloud revenue and its acceleration.

Fourth, Asian data points: wafer order numbers, CoWoS (TSMC advanced packaging) bookings, capacity saturation at each supply chain link.

All these show positive signals. The general consensus: we’re still compute constrained.

Michael Marks: I care more about long-term structural changes than short-term stock moves. On the supply chain, this time is different: the scale of cloud vertical integration is unprecedented. They design chips, build data centers, arrange the financing structures, and control access to the cloud. For decades the industry moved away from deep vertical integration; now it seems to be going the other way. What does this mean?

Stacy Rasgon: Frankly, cloud hyperscalers developing their own chips isn’t new. Google’s TPU (Tensor Processing Unit) is now in its 8th generation after 14 years of development, with a 9th generation coming; Amazon’s homegrown Trainium (for training) and Inferentia (for inference) chips have been around for five or six years; Amazon’s Graviton server CPUs have six to eight years of history.

So cloud providers vertically integrating to chips isn’t new, but the scale is much greater.

Further upstream, Apple is the classic: moving their entire PC line from Intel x86 to home-grown ARM, modem (baseband) chips, Bluetooth, Wi-Fi, all moving in-house—over ten years doing it.

Even for data centers, Google doesn’t buy straight from Dell, but white-labels: they specify, ODMs (original design manufacturers) build it, using companies like Broadcom for standard chips. Very deep hardware know-how.

Michael Marks: What drives vertical integration? Solving bottlenecks, margin, or defense?

Stacy Rasgon: All three. Bottlenecks matter, but the hyperscalers don't manufacture chips themselves, so the bottlenecks don't go away. More important is performance optimization: their gigantic, steady in-house workloads mean they know exactly what they need, and custom chips are a perfect fit. There is also a competitive angle: when negotiating next year’s prices with Nvidia, it’s good to have an alternative in your pocket.

Michael Marks: The industry is now shifting to inference, which is effectively a reshuffle. Can you discuss this?

Stacy Rasgon: Most spending now is on training: fixing the billions or trillions of parameters in a model, which requires huge amounts of capital and GPUs.

People ask me: when does spending shift from training to inference? I always say: the sooner, the better, because the problem is: training models itself isn't profitable. Nvidia makes money selling chips, but building a model doesn't earn revenue—you have to use it, and that’s inference.

We’re starting to see this shift. Now the exciting thing is not just chatbot or video gen, but agentic inference: models actually carrying out real-world tasks. The most apparent application: coding. A lot of Anthropic's revenue growth, for example, comes from agent-based programming, where the AI helps or writes code. Clearly, people are paying.

Now even CPU supply is tight—another proof of the inference shift, since inference also eats up lots of compute, CPU and otherwise.

Of course, ROI and economics are still debated. We’re in a heavy investment phase, high capex, burning money on paper. But as revenue scales fast, I’m increasingly optimistic.

Michael Marks: I have heard people are shocked when they see the bills.

Stacy Rasgon: Yeah, it’s ironic—cutting staff, spending more on AI, and finding that spending on AI tokens costs more than the jobs AI replaces. Of course, if AI truly boosts efficiency, it could make sense.

Michael Marks: There’s a crop of inference-focused startups now, like Groq, Cerebras, SambaNova, and Tenstorrent, coexisting with Nvidia and Broadcom. Where do you see them going? I suspect the giants will acquire them.

Stacy Rasgon: Funny you say that—Nvidia just bought Groq. At GTC, they spoke about it, which made things clearer to me.

I used to have a simple view: a token is a token, all the same. Tokens are the basic data units—like a word or bit of info, model’s input/output. Compute is often sold per million tokens.

But Jensen Huang said something ordinary-sounding but actually profound: not all tokens are equal; some carry way higher value—especially those requiring ultra-low latency and ultra-fast response. If you’re a new cloud provider, offering low-latency capacity earns much higher returns, that’s what Groq is targeting.

GPUs aren't always the best for all use cases—ultra-quick, ultra-low latency workloads can do better on specialized inference chips. That’s the logic behind buying Groq.

And Jensen acknowledges that GPUs aren’t perfect; he’s big enough to admit where there are gaps to fill.

Michael Marks: In the PC era, from 50 companies we ended up with three; same for mobile. AI feels like it’s consolidating too—hence this vertical integration trend.

Michael Marks: Semiconductor equipment rarely gets talked about. The stocks have risen, but not as much as stars like Nvidia. What's going on?

Stacy Rasgon: Actually, they're up a lot. Lam Research (etch equipment) went from $70 to three or four times that. Applied Materials doubled, and ASML roughly doubled too.

They just haven’t jumped tenfold like memory or cloud names. Memory ASP (average selling price) may soar tenfold, but equipment ASP can’t.

More importantly, equipment is in a constrained boom: tools are being built, but the fabs aren’t ready. You need to build the fab and set up the cleanrooms before bringing in the gear. These are still under construction and arrive next year, so WFE (wafer fab equipment) spending, while strong, is bottlenecked by how fast factories can physically be built.

This acts as a built-in buffer: physical limits on expansion speed reduce risk of boom/bust. If cleanroom capacity were unlimited, shipments would be far higher. This natural constraint lowers over-expansion risk.

Michael Marks: Will semiconductor equipment see a long upcycle filling the years-long backlog?

Stacy Rasgon: If AI demand doesn’t collapse, yes. If it does, everyone’s toast, but there’s no point entertaining that for now.

Michael Marks: What is Broadcom doing to profit so much in this AI wave? Many know the name, but not the details.

Stacy Rasgon: Broadcom’s management team is great, very strong execution.

Before the AI wave, Broadcom was a diversified company—it’s actually Avago acquiring the old Broadcom, keeping the name, ticker is still AVGO.

The original Broadcom did iPhone RF filters, Bluetooth/Wi-Fi chips, set-top box and cable modem chips, storage chips, and above all, networking—custom networking chips and merchant silicon from acquisitions.

Avago then moved even more into software: bought CA Technologies (mainframe), Symantec enterprise security, and more recently VMware.

Prior to AI, they were about 60% semis, 40% software, with very high margins. CEO Hock Tan always said semis were a mature, mid-single-digit sector—cash rich, but low multiples.

Broadcom had always done custom (ASIC) networking and compute chips—Google’s TPU was a 14-year joint project. But before the AI explosion, it was small, maybe $1-$2B.

Then, ChatGPT launched in November 2022, Nvidia had its earth-shaking earnings, revenue leaped. Google and the rest jumped on.

In H2 2023, Hock Tan started talking up AI. Frankly, you could sense he didn’t yet know how big it’d get—but he honestly reported what was happening in networking and compute.

Then it exploded.

On the last earnings call, Hock Tan said next year Broadcom expects $100 billion in AI revenue alone. The entire company was only a few tens of billions in turnover before—now just AI could surpass $100B, it’s mind-blowing.

The background: Nvidia dominates, but many large users—usually developing their own chips—want deeper customization for their workload, for cost (TCO) and efficiency. And with Nvidia’s 75% margin, it pays to have a backup at the negotiation table.

GPU vs ASIC is a false dichotomy. ASICs suit huge, stable workloads: huge, to amortize the design cost; stable, because a change in architecture means new chips, whereas GPUs are flexible and programmable.

Right now, ASICs’ share of AI chips is in the low teens in percentage terms; will that rise to 25% or even 30%? In a bigger pie, probably. But I don't see an outright replacement of GPUs.

The real key: is the pie big enough? If so, both thrive; if not, both suffer.

Michael Marks: Let’s talk about Intel—especially foundry. Global foundry capacity is tight, geopolitics, TSMC’s pivotal role make this a key battleground. What’s your view on Intel Foundry?

Stacy Rasgon: Stepping back, the foundry push is Pat Gelsinger’s strategy—I think the direction is fine. For national security and industry, we do need Intel foundry back. But Pat’s execution fell short.

He ramped up headcount by 21,000, then had to lay them all off. Worse, he arrived all upbeat, making it sound flawless.

Now, Lip-Bu Tan’s approach is the right playbook: come in and reset market expectations instead of hyping; fix the cost structure; say “I’m happy to be back, but it’ll take time.” That’s the right tone.

I’ve known Tan for years and respect him greatly. He’s doing the right things. He's not a miracle worker, but he's excellent, and just what Intel needs. He has a deep technical background and knows how to run a foundry: he ran Cadence and was on its board for 20 years; he has been deeply involved in many startups, most of them big TSMC clients, so he knows what customers want; and he has a vast network and can pick up the phone to call TSMC’s CC Wei or Jensen Huang anytime.

Still, it’s a tough fight, but at least Tan isn’t sugarcoating it.

Michael Marks: Intel now has a very interesting investor base: the US government, Nvidia, and SoftBank are all shareholders. What does that mean for Intel as a public company?

Stacy Rasgon: Honestly, for a long time, Intel’s balance sheet was a real risk. The market worried about whether the company would hold together, and some thought a breakup might be necessary. Now that concern is basically gone; these investments provided critical support.

Not that I personally love government ownership, but it reassures other investors. When even Trump tweets about it, that boosts confidence.

If Intel pushes the foundry pivot, they’ll need tons of cash. From that angle, state and strategic investment is a clear positive.

Michael Marks: You raised a point I care about, balance sheet health, which not many people watch. Intel’s debt load was a real issue, with maturities looming, but no one discussed it. Now, big strategic investors improve their balance sheet and reduce risk.

Stacy Rasgon: On the positive side, Intel’s new products are pretty good too. The 18A process and Panther Lake look very competitive: yields are better than expected, and performance is strong. If volume ramps, that’s a good sign.

On server CPU, they’re still behind, and Tan admits it: it’ll be Coral Rapids before they can challenge, so that’s around 2028.

The environment helps though: server CPU demand is extremely strong. Last quarter, Intel was selling previously written-off, corner-of-the-warehouse chips—zero on the books, but customers will buy anything available. If that continues, Intel may benefit. Luck, strength, either way, same outcome.

Michael Marks: Everyone’s bullish. What risks are underappreciated?

Stacy Rasgon: ROI remains an open debate. Demand is huge, but how will all this capex eventually monetize? I don’t think it’s settled yet.

I’m encouraged to see revenue bases growing, but the overall business model is still unclear. Right now it’s an intense investment phase; long-term profitability is still to be established.

Also, consolidation among foundation model companies is worth watching: there are so many right now that some are bound to fail. When that happens, can the compute they leave behind be absorbed smoothly? That’s an open question.

But the core constraint is energy. Jensen Huang once said we may spend $3-4 trillion annually on infrastructure. It sounded wild, but now we’re near $1T, so $3T may not be far-fetched. But the question is: where does the electricity come from?

18-24 months ago, I studied this with a power equipment specialist: if Nvidia’s prediction comes true, how fast would the US grid need to grow? Result: about 5% new generation capacity per year for a decade.

He reacted like I had two heads—power sector just doesn’t do 5% annual growth. This means local generation, on-site power, becomes indispensable. Even restarting Three Mile Island is on the table.

Of course, never underestimate human ingenuity—engineers are smart, and if there’s profit, they’ll find a way.

Michael Marks: As a VC, what market gaps do you see that need more entrepreneur attention?

Stacy Rasgon: Energy efficiency is critical—everyone faces power constraints.

I’m also really happy to see more semiconductor startups. Ten years ago, I tried to do a VC landscape piece for semiconductors, and it fizzled—there was barely anything to write about. Only CVC (corporate venture capital) like Intel Capital, no true private VC. It’s simple: chips are hard, costly, take years, whereas SaaS is easy.

So regardless of how these chip startups pan out, I’m genuinely happy to see the semiconductor startup ecosystem come alive again. Hard tech is cool again, and it's the right time.

Michael Marks: If I have you back in a year and the world develops as you expect, what will it look like? Who’s up? Who’s down? What did you get right or wrong?

Stacy Rasgon: I hope the industry will be bigger and better.

But what I most want is to see real inference revenue and usage—beyond engineers, do people like you and me really use AI daily and gain value? If that’s visible, that’ll be very exciting.

Michael Marks: I look forward to having you back!

Stacy Rasgon: I’ve done this 18 years. My rule has always been: as long as it’s interesting, I’ll keep going. Once it’s not, I’ll find something new. So far, I love it every day.

Michael Marks: Great, thank you for coming!

Risk Disclaimer: The market has risks, please invest cautiously. This article does not constitute personal investment advice and does not take into account the specific investment objectives, financial situation, or needs of any particular user. Users should consider whether any opinions, viewpoints, or conclusions in this article are suitable for their circumstances. Invest accordingly, at your own risk.
```