Compute “Strictly Stratified”? Big Companies Keep Supply In-House, Small Companies Go Without, and Silicon Valley Faces a New GPU “Supply Cutoff”

Microsoft, Amazon, and other cloud giants are prioritizing internal teams and major clients when allocating NVIDIA GPUs, leaving small and medium AI startups struggling to obtain chips. The battle for computing resources is setting off a new structural crisis in Silicon Valley.

According to The Information, this round of shortages has affected many AI startups backed by top firms such as Sequoia Capital, Founders Fund, General Catalyst, and Andreessen Horowitz. General Catalyst’s managing partner Hemant Taneja has sent a questionnaire to founders in the firm's portfolio asking about their access to computing power. The survey stated: "We've heard from a lot of people that computing power, especially GPU access, is one of the biggest bottlenecks this year."

The tightened supply has directly driven up rental prices, boosting cloud service providers’ profit margins while significantly raising operating costs for startups. Meanwhile, Microsoft Azure has explicitly told internal staff that clients should expect long waiting times at least until the end of 2026. This reshaping of the computing landscape is profoundly affecting the entire AI startup ecosystem.

History repeats itself, but this time the intensity is greater

This round of GPU shortages closely resembles what happened in early 2023, when cloud providers also pulled computing resources from their cloud services to prioritize internal teams and core clients like OpenAI. Venture capital firms such as Andreessen Horowitz and Index Ventures ultimately had to build their own GPU resource pools to meet urgent needs within their portfolios.

The current situation, however, is more severe. The Information points out that explosive demand for AI coding tools is exacerbating the shortage; large AI developers like Anthropic are seeing a surge in demand for computing power, further squeezing the resources available to smaller clients.

Another structural factor aggravating the shortage is that the two-to-three-year cloud service contracts signed by many AI startups are expiring, giving cloud providers the opportunity to raise prices or reallocate computing resources to buyers willing to pay more.

Microsoft: "Use it or lose it," with clear tiered allocation

Microsoft’s allocation of computing power follows a clear tiered system. According to a Microsoft employee familiar with the matter, Azure divides its clients into three tiers:

- Tier 1 includes the roughly 1,000 clients with the highest cloud spending, who enjoy priority access;
- Tier 2 consists of clients with significant spending and dedicated sales representatives;
- Tier 3 encompasses smaller enterprises whose relationships are managed through CDW and other Microsoft distributor partners.

As for chip access, Microsoft has recently started requiring customers who want NVIDIA Blackwell chips to commit to renting at least 1,000 chips for a minimum of one year, with a contract value in the tens of millions of dollars. Even for older-generation NVIDIA chips, customers must wait weeks or even months.

More noteworthy is Microsoft’s "use it or lose it" policy: for clients paying on demand for GPU access, Microsoft tracks their usage, and if a server sits idle for even a few hours, access may be revoked. Startups that receive free computing credits through the "Microsoft for Startups" program face the same rule: failing to fully utilize the chips will lead to GPU access being withdrawn.

Startups: Price hikes, ghosting, unable to compete with major clients

The experience of Krea, an AI image generation startup, is representative. Established four years ago and having raised $83 million from Andreessen Horowitz, Bain Capital Ventures, and others, the company signed a six-month contract six months ago for hundreds of NVIDIA Blackwell chips at $2.80 per chip per hour.
However, when Krea recently sought more servers to train new models from scratch, the situation changed abruptly. Co-founder and CEO Victor Perez said some cloud service representatives simply would not answer calls; when they finally called back, prices had risen sharply, and they demanded a three-year contract before they would discuss anything further. "Some just disappeared, some said there’s no supply, others tried to get us to accept extremely harsh terms," Perez said. Eventually, Krea signed a one-year contract at $3.70 per chip per hour, up 32% from its previous contract.

Meanwhile, another startup founder seeking to rent a tightly clustered group of nearly 1,000 GPUs said NVIDIA representatives told him last week that finding such a cluster at major cloud providers is extremely difficult, and that the daily rental cost would exceed $70,000.

Data from GPU cloud service provider Lightning AI confirms the supply-demand tension: the company currently lists about 40,000 GPUs online, but pending orders from about 40 clients collectively seek about 400,000 GPUs. CEO Will Falcon said prices have risen by more than 25% in the past six months, from around $1.60 per hour to over $2 per hour, and sometimes even higher.

Some founders opt for "building their own" instead of cloud

Faced with long waits and rising rental costs, some startup founders are considering bypassing cloud service providers by purchasing their own GPUs.

Collin McLelland, founder of AI agent startup Collide, said his company is considering spending around $500,000 to purchase NVIDIA GPUs for self-hosting. Collide completed a $14 million seed round last year and focuses on developing AI agent products for the oil and gas industry. McLelland plans to rent space from a data center or cloud provider to host the company's own GPUs, avoiding the uncertainties and delays of the rental model.

"For us, not having computing power when we need it is the greatest risk," McLelland said. "Most people are just afraid of hardware.
I’ve owned oil wells, so I’m numb to this sort of thing." Though the short-term cost of buying GPUs is much higher than renting, he believes that over multiple years the overall cost could be lower, and that ownership would eliminate the company's dependence on cloud providers.

Cloud vendors profit, but ecosystem risks emerge

For cloud service providers, the supply shortage has brought a long-awaited improvement in profits. Some cloud vendors had previously faced pressure on the profitability of their GPU business, but the current imbalance between supply and demand allows them to raise rental prices and widen their margins.

The long-term impact on the AI startup ecosystem, however, should not be underestimated. The concentration of computing resources among major clients means that small and medium startups will face higher barriers and greater uncertainty in model training and product iteration.

General Catalyst is exploring ways to help its portfolio companies access GPU resources, such as creating shared computing pools or negotiating directly on behalf of startups. The approach resembles what venture firms did in 2023 when they built their own GPU pools, and it reflects how computing access has become an unavoidable structural challenge in the AI investment ecosystem.
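
The pricing figures quoted in this article can be cross-checked with simple arithmetic. The short Python sketch below does so using only numbers reported above; the function names are illustrative, and the cluster calculation assumes exactly 1,000 GPUs running 24 hours a day at exactly $70,000, whereas the article says "nearly 1,000" and "exceed," so that result is only a rough floor.

```python
def pct_increase(old: float, new: float) -> float:
    """Percentage increase from an old price to a new price."""
    return (new - old) / old * 100.0


def implied_hourly_rate(daily_cost: float, gpus: int) -> float:
    """Per-GPU hourly rate implied by a daily cluster cost (assumes 24h use)."""
    return daily_cost / (gpus * 24)


# Krea: $2.80 -> $3.70 per chip per hour
krea_increase = pct_increase(2.80, 3.70)

# Lightning AI: about $1.60 -> about $2.00 per hour
lightning_increase = pct_increase(1.60, 2.00)

# Lightning AI: about 400,000 GPUs requested vs. about 40,000 listed
demand_to_supply = 400_000 / 40_000

# Assumed round numbers: 1,000 GPUs at $70,000 per day
cluster_rate = implied_hourly_rate(70_000, 1_000)

print(f"Krea price increase:         {krea_increase:.1f}%")
print(f"Lightning AI price increase: {lightning_increase:.1f}%")
print(f"Demand-to-supply ratio:      {demand_to_supply:.0f}x")
print(f"Implied cluster rate:        ${cluster_rate:.2f} per GPU-hour")
```

Under those assumptions, the Krea increase works out to about 32%, the Lightning AI increase to 25%, pending demand to roughly ten times listed supply, and the quoted cluster cost to a little under $3 per GPU-hour, all consistent with the figures reported above.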