SK Group Chairman Chey Tae-won announced at Computex that SK Hynix will double wafer capacity over the next five years to relieve a memory shortage he said could persist until 2030. Jensen Huang underlined the pressure by stopping at the SK Hynix booth and writing “Please Make More” on an HBM4E wafer display. SK Hynix currently holds 58% of the global HBM market, and its capacity decisions set the supply ceiling for the entire AI infrastructure build.
The doubling announcement signals confidence in sustained demand, but a five-year timeline means the scarcity economics of the next two to three years are largely locked in. A fire at SK Hynix’s Cheongju plant the same day hospitalized seven workers and forced the evacuation of 3,600 employees before it was contained, a reminder of how concentrated advanced memory production remains. Watch whether Samsung’s competing HBM4E ramp and the risk of an 18-day union strike shift the near-term balance.
Bloomberg’s report from the Computex floor has the Chairman’s full remarks and shortage context.
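For a rough sense of why a five-year doubling leaves the near term tight, the arithmetic can be sketched as below. The smooth-compounding assumption is ours, not SK Hynix's: real fab capacity arrives in steps as lines come online, and the function name is purely illustrative.

```python
def implied_annual_growth(multiple: float, years: float) -> float:
    """Constant yearly growth rate that reaches `multiple` after `years`.

    Assumes smooth compounding, which is a simplification: actual wafer
    capacity is added in discrete steps, not at a steady rate.
    """
    return multiple ** (1.0 / years) - 1.0


# Doubling (2x) over five years, per the announcement.
rate = implied_annual_growth(2.0, 5)
print(f"implied growth: {rate:.1%} per year")

# Capacity relative to today at the end of each year under that assumption.
for year in range(1, 6):
    print(year, round((1 + rate) ** year, 2))
```

Under that assumption capacity grows just under 15% a year, so it is only about one and a half times today's level after three years.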
Microsoft named Baseten as a distribution partner for MAI-Thinking-1 at Build 2026 (June 2, San Francisco). The structural detail: Baseten holds model weights for optimization, while fine-tuned checkpoints remain under customer control and are not visible to Microsoft. The weight-holding arrangement positions Baseten as a customization and governance layer between hyperscaler model providers and enterprise end users — differentiated in a market where most inference providers are neutral pass-throughs.
Jensen Huang declared Marvell “the next potential trillion-dollar company” at Computex while Nvidia disclosed a $2 billion strategic investment, and the endorsement rests on concrete financials: Q1 FY27 revenue of $2.418B (+28% YoY) and Q2 guidance of $2.7B (+35%). The same day, Marvell launched the Teralynx T100, described as the industry’s first 102.4 Tbps AI-optimized switch chip, positioning the company directly in the networking layer Nvidia needs to scale its AI factory buildout.
Cadence announced ChipStack AI Super Agent, billed as the first fully autonomous virtual engineer for chip verification, with Nvidia’s engineering teams already deploying it in production. The tool compresses chip verification from roughly five weeks to under 24 hours. If the throughput holds at scale, it reduces tape-out cycle time for every lab using it, and the savings compound across every product generation.
KV-cache efficiency gains compress inference costs. Researchers at Peking University submitted “Multi-Segment Attention: Enabling Efficient KV-Cache Management for Faster Large Language Model Serving” (arXiv:2606.02964, June 1). The system, AsymCache, combines a new attention mechanism for non-contiguous cached context with a cost-aware eviction policy. Results: time-to-first-token reduced 1.90–2.03× over baselines; time-per-output-token improved 1.62–1.71×; in agent serving systems, average job latency dropped up to 18.1%. Memory pressure is the binding constraint on inference profitability at scale — advances that reduce KV-cache overhead translate directly to lower per-token costs for any provider running long-context or multi-turn workloads.
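To make "cost-aware eviction" concrete, here is a minimal sketch of one generic way such a policy can work: when the cache is full, evict the segment that is cheapest to recompute per token of memory freed. This is not AsymCache's actual policy, which the summary above does not specify; the class names, the scoring rule, and the cost numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    key: str
    tokens: int            # KV-cache slots the segment occupies
    recompute_cost: float  # hypothetical cost to rebuild it if evicted
    last_used: int = 0     # logical clock value of the most recent access


class CostAwareCache:
    """Toy KV-cache index with a cost-aware eviction rule (illustrative only)."""

    def __init__(self, capacity_tokens: int):
        self.capacity = capacity_tokens
        self.used = 0
        self.clock = 0
        self.segments: dict[str, Segment] = {}

    def touch(self, key: str) -> bool:
        """Record an access; return True on a cache hit."""
        self.clock += 1
        seg = self.segments.get(key)
        if seg is None:
            return False
        seg.last_used = self.clock
        return True

    def put(self, key: str, tokens: int, recompute_cost: float) -> list[str]:
        """Insert a segment and return the keys evicted to make room."""
        if tokens <= 0 or tokens > self.capacity:
            raise ValueError("segment must be non-empty and fit in the cache")
        self.clock += 1
        if key in self.segments:  # replacing an entry frees its old slots
            self.used -= self.segments.pop(key).tokens
        evicted = []
        while self.used + tokens > self.capacity:
            # Victim: lowest recompute cost per token freed; ties go to the
            # least recently used segment.
            victim = min(
                self.segments.values(),
                key=lambda s: (s.recompute_cost / s.tokens, s.last_used),
            )
            self.used -= victim.tokens
            del self.segments[victim.key]
            evicted.append(victim.key)
        self.segments[key] = Segment(key, tokens, recompute_cost, self.clock)
        self.used += tokens
        return evicted


cache = CostAwareCache(capacity_tokens=100)
cache.put("system_prompt", tokens=40, recompute_cost=400.0)
cache.put("tool_output", tokens=40, recompute_cost=40.0)
evicted = cache.put("chat_turn", tokens=40, recompute_cost=200.0)
print(evicted)  # ['tool_output']
```

A plain LRU policy would have evicted the older system prompt here; weighting by recompute cost keeps the expensive segment and drops the cheap one instead.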
Broadcom Q2 FY2026 earnings after close. First major AI chip supplier report post-Computex. Q1 AI revenue was $8.4B (+106% YoY). Watch guidance and any commentary on AI ASIC customer concentration and the Anthropic debt facility.
Astera Labs press conference at Computex on the Scorpio X-Series 320 Lane Smart Fabric Switch and optical connectivity details.