"Lobster" Gives Memory a Major "Lease on Life"?

```

Agentic AI tools, with OpenClaw as the leading example, are pushing the demand logic of the memory market into a new paradigm. According to Zhuifeng Trading Desk, Morgan Stanley's latest report, released on March 18, argues that as AI moves from "thinking" to "executing," DRAM will replace HBM as the hardest chip bottleneck to overcome in AI infrastructure, and the memory cycle will therefore last far longer than expected.

Channel surveys show that in Q2 2026, server DDR5 DRAM prices are expected to rise by more than 50% quarter-on-quarter, with some Chinese hyperscale cloud vendors bidding even higher; DDR4 contract prices are expected to increase by 40%-50%, and enterprise NAND SSD quotes are expected to rise by at least 40%-50%. Morgan Stanley believes that the memory upcycle is currently at its mid-stage and that supply is tighter than previously judged: "Wall Street's earnings forecasts will have to catch up with reality."

This judgment is already reflected directly in target price adjustments: SK Hynix's 2026 and 2027 EPS forecasts are raised by 24% and 32% respectively, and its target price is raised from 1.1 million KRW to 1.3 million KRW, implying 43% upside from the current price; the target price for Samsung Electronics common stock is raised to 251,000 KRW. Both stocks keep their "overweight" rating.

Morgan Stanley's core assessment is that the market is accustomed to linear thinking, while the capability of the AI agent layer is expanding exponentially. As AI shifts from "generating answers" to "completing tasks," memory demand will step up accordingly, and this shift has only just begun to accelerate.

"Doing things" consumes more memory than "thinking things"

The logical starting point of Morgan Stanley's report is a seemingly simple yet deeply insightful judgment: “Doing things requires more DRAM than thinking things.”

The traditional large language model (LLM) workflow is a GPU-dominated linear process: receive a question, batch-process all input tokens (the prefill stage), then generate the reply token by token (the decode stage); the CPU is responsible for converting the results into text output. In this process, GPU compute is the decisive bottleneck, and DRAM only needs to assist with buffer reads and writes.

The advent of agentic AI completely changes this logic. Take OpenClaw as an example: this open-source, self-hosted AI assistant can connect to more than 50 messaging platforms, such as WhatsApp, Telegram, Slack, and Signal, and holds system-level permissions for browser automation, file manipulation, command-line execution, and API calls. It does not "answer questions" but "completes tasks": searching the Internet, reading documents, invoking external tools, and executing code, ultimately delivering the results of a multistep, collaboratively generated set of actions.

The core technical implication of this paradigm shift is that workflows have expanded from a single GPU inference step to multistep orchestration, tool invocation, and coordination, with CPU compute time often contributing more to overall latency than GPU time. At the same time, multiple agents must continuously share context, offload KV caches (key-value caches), and store and retrieve every intermediate result, so memory moves from the back end of the compute chain to the position of core bottleneck.

OpenClaw: The Extreme Magnifier of Memory Demand

Morgan Stanley conducted a detailed quantitative analysis of OpenClaw's memory demand and concluded that in agent tools of this kind, DRAM trumps everything else, with other hardware constraints taking a back seat.

The tool has two distinct operating modes:

Lightweight gateway mode (remotely calling external APIs such as Claude or GPT-4): even here, the bottleneck is no longer the GPU or CPU but the DRAM usage of the Node.js runtime. In practice, a minimum of 2GB of DRAM is needed, with 4GB recommended for production-grade stability.

Local model mode (loading and running the AI model directly on the device): both system DRAM and GPU memory (HBM) become constraints. Morgan Stanley recommends 32GB of system DRAM; running models with 7-8 billion parameters requires an additional 8GB of GPU memory, models with 13-70 billion parameters need 16-24GB, and large models such as Llama 3 70B and Qwen 72B need more than 80GB.

The report emphasizes that a lack of memory does not merely degrade performance but causes outright crashes: JavaScript throws "heap out of memory" errors, leading to installation failures and runtime interruptions. This highlights that memory is a hard constraint in agent scenarios: insufficient memory means not "slow" but "dead".

Compute Bottleneck Shift: From HBM to System Memory

The memory demand features of OpenClaw are a microcosm of a larger structural change.

Morgan Stanley points out that AI compute bottlenecks are undergoing a systemic migration: from compute itself to data movement, from HBM to system memory (DRAM), and the whole memory hierarchy is evolving from HBM-centric to a multi-tier structure that combines HBM, DRAM, and NVMe NAND SSD.

One technical driver of this change is the rapid expansion of long-context requirements. The KV cache grows linearly with the number of tokens and, under distributed inference (prefill-decode disaggregation), must be transmitted over the network, greatly increasing the CPU's I/O management burden. Retrieval-augmented generation (RAG), context management, and other core agent operations all involve intensive memory I/O.

Market corroboration is also clear. According to Morgan Stanley, both Intel and AMD have recently confirmed that high-core server processors are facing material supply shortages; AMD EPYC CPU’s share of total server CPU revenue has exceeded 40% for the first time, and cloud instances deployed with EPYC grew over 50% YoY. Nvidia launched its standalone Vera CPU, reached a multi-year agreement with Meta, and for the first time, deployed standalone CPUs at scale to support personal agent operations.

Price Acceleration: Mid-Cycle, with Room to Run

This structural transformation is already reflected in pricing.

For DRAM, a limited number of Q2 2026 server DDR5 spot deals have transacted at 50% above the previous quarter, and hyperscale cloud vendors have accepted this price, with some Chinese cloud vendors bidding even higher. By the end of February, 64GB RDIMM contract prices had climbed to $910-$920, about 20% higher than the Q1 average of $800. Q2 prices for LPDDR and consumer-related DRAM are expected to rise by at least 40%-50%; DDR4 contract prices are expected to rise 40%-50%. HBM3E prices, previously expected to drop 20%-25%, have instead turned to a single-digit percentage increase for ASIC renewals.

As for NAND, enterprise SSD prices are expected to rise 40%-50% quarter-on-quarter in Q2; consumer products are expected to see an increase of no less than 60%, with eSSD prices potentially doubling again in certain scenarios.

Morgan Stanley believes that year-on-year price growth is still accelerating and that the upcycle is still in its mid-stage. Once the market adjusts its earnings forecasts to reflect the current unprecedented capacity constraints, the stocks concerned have significant room to recover; the potential for higher capital returns could further support outperformance.

~~~~~~~~~~~~~~~~~~~~~~~~

The content above is from Zhuifeng Trading Desk.


Risk Disclaimer and Exemption Clause

Investing involves risk; proceed with caution. This article does not constitute individual investment advice and does not take into account the specific investment objectives, financial situation, or needs of individual users. Users should consider whether any opinions, viewpoints, or conclusions in this article suit their particular situation. Any investment made on this basis is at your own risk.
```
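
A supplementary note on the article's point that the KV cache grows linearly with the number of tokens: the Python sketch below is a back-of-the-envelope estimate, not a figure from the Morgan Stanley report. It uses the standard sizing rule (one key tensor and one value tensor per layer, per KV head, per token). The `ModelShape` values (32 layers, 8 KV heads, head dimension 128, 2-byte values) are illustrative assumptions for a hypothetical mid-sized model, and the names `ModelShape` and `kv_cache_bytes` are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelShape:
    """Hypothetical transformer dimensions (assumed for illustration only)."""
    n_layers: int
    n_kv_heads: int
    head_dim: int
    bytes_per_value: int = 2  # assumes 16-bit (fp16/bf16) cache entries


def kv_cache_bytes(shape: ModelShape, n_tokens: int, batch_size: int = 1) -> int:
    """Estimate KV cache size: a key and a value vector per layer, head, and token."""
    per_token = (
        2  # one key tensor plus one value tensor
        * shape.n_layers
        * shape.n_kv_heads
        * shape.head_dim
        * shape.bytes_per_value
    )
    return per_token * n_tokens * batch_size


if __name__ == "__main__":
    shape = ModelShape(n_layers=32, n_kv_heads=8, head_dim=128)
    for n_tokens in (8_192, 32_768, 131_072):
        gib = kv_cache_bytes(shape, n_tokens) / 2**30
        print(f"{n_tokens:>7} tokens -> {gib:.1f} GiB of KV cache per sequence")
```

Under these assumed dimensions each token costs 128 KiB of cache, so quadrupling the context length quadruples the cache for every concurrent sequence. That is the mechanism behind the article's claim that long contexts and multi-agent context sharing push pressure from GPU memory out into system DRAM and SSD tiers.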