At Huawei Connect 2025 in Shanghai on September 18, Rotating Chairman Eric Xu delivered a keynote titled “Pioneering SuperPoD Interconnect: Leading a New Paradigm for AI Infrastructure.” He outlined Huawei’s strategy to challenge NVIDIA’s dominance in China’s AI market by 2028, leveraging advanced interconnection technology and massive scale to overcome semiconductor manufacturing limitations.
Chapter 1: Breaking the Chip Barrier
Huawei’s Mate 60 Pro, powered by the 7nm Kirin 9000S in 2023, marked a milestone under U.S. sanctions. AI chips, however, demand far more. NVIDIA’s H100 and Blackwell series, built on leading-edge TSMC process nodes, dominate globally. Huawei’s Ascend 910C, deployed in over 300 Atlas 900 A3 SuperPoD systems for more than 20 clients, lags in single-chip performance. Xu acknowledged: “China’s semiconductor manufacturing will trail for years, likely capped at 5nm without EUV lithography.” SMIC struggles with yield and capacity, while NVIDIA’s high-priced chips remain in demand. Huawei’s response is scale, not single-chip superiority.

The three-year AI chip plan (2026-2028) includes:
- Ascend 950 Series: Q1 2026 launches the 950 PR (optimized for prefill and recommendation) with 128GB of proprietary low-cost HBM (HiBL 1.0), 1.6 TB/s bandwidth, 1 PFLOPS FP8, and 2 PFLOPS FP4. Q4 2026 brings the 950 DT (decoding and training), with 144GB of memory and 4 TB/s bandwidth.
- Ascend 960: Q4 2027, doubling compute, memory, and interconnect ports over the 950, and supporting Huawei’s HiF4 precision format for enhanced inference.
- Ascend 970: Q4 2028, targeting 8 PFLOPS FP4, 4 TB/s bandwidth, and doubled memory. Huawei’s self-developed HBM line (HiBL 1.0 through HiZQ 2.0) reduces reliance on Samsung and SK Hynix.

On the CPU side, Kunpeng 950 processors, launching Q1 2026 with 96 cores/192 threads (high-performance) or 192 cores/384 threads (high-density), act as “commanders,” with one CPU per two Ascend chips. By 2028, Kunpeng 960 will exceed 256 cores, rivaling AMD EPYC and Intel Xeon.
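As a quick consistency check, the roadmap figures above can be tabulated and compared across generations (a Python sketch; the data structure and field names are my own illustration, and every number is taken from the announced specs):

```python
# Announced Ascend roadmap figures from the keynote, tabulated for comparison.
# Dict layout and field names are illustrative, not Huawei's.
roadmap = {
    # chip: (launch, FP4 PFLOPS, memory GB, bandwidth TB/s)
    "Ascend 950 PR": ("Q1 2026", 2, 128, 1.6),
    "Ascend 950 DT": ("Q4 2026", 2, 144, 4.0),
    "Ascend 970":    ("Q4 2028", 8, None, 4.0),  # memory "doubled"; exact GB not given
}

fp4_950 = roadmap["Ascend 950 PR"][1]
fp4_970 = roadmap["Ascend 970"][1]

# The 960 doubles compute over the 950, and the 970 doubles it again,
# consistent with the 2 -> 8 PFLOPS FP4 progression.
print(f"FP4 throughput growth, 950 -> 970: {fp4_970 / fp4_950:.0f}x")
```

The stated per-generation doubling (950 to 960 to 970) lines up with the 4x jump in FP4 throughput over the two generations.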
Huawei plans to double Ascend 910C production to 600,000 units next year. Chinese tech giants, ordered to halt NVIDIA purchases, are shifting to Huawei, with NVIDIA’s H20 and RTX Pro 6000D facing restrictions.
Chapter 2: Core Strategy—Super Nodes and Super Clusters
Huawei’s strategy hinges on “Super Nodes” and “Super Clusters” to offset chip-level deficits. Xu stated: “Since single-chip performance can’t match NVIDIA, we’ll scale clusters to meet domestic demand.”

The key is the Lingqu (Unified Bus) interconnect protocol, named after the ancient Chinese Lingqu Canal. It offers 2.1-microsecond latency, a 200-meter range, a claimed 100x reliability improvement, and terabyte-scale bandwidth, connecting heterogeneous components such as memory pools, SSDs, and NICs. Huawei plans to open-source Lingqu 2.0 to build an ecosystem.
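The quoted Lingqu figures invite a rough physics sanity check: light in optical fiber travels at about two-thirds of c, so signal propagation alone over the full 200-meter reach consumes roughly half of the 2.1-microsecond latency budget (a back-of-the-envelope sketch; only the latency and range figures come from the keynote):

```python
# Back-of-the-envelope check on the Lingqu latency/range figures.
# Only latency_us and distance_m come from the keynote; the fiber
# propagation speed is the standard ~2/3 c approximation.
c = 299_792_458          # speed of light in vacuum, m/s
v_fiber = c * 2 / 3      # approximate signal speed in optical fiber
distance_m = 200         # maximum quoted Lingqu reach
latency_us = 2.1         # quoted end-to-end latency

propagation_us = distance_m / v_fiber * 1e6
print(f"one-way propagation over {distance_m} m: {propagation_us:.2f} us")
print(f"share of the {latency_us} us budget: {propagation_us / latency_us:.0%}")
```

At maximum reach, roughly 1 microsecond of the 2.1-microsecond budget is pure propagation delay, leaving about half for switching and protocol overhead.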
- Atlas 950 Super Node: Q4 2026, integrating 8,192 Ascend 950 DT cards across 160 cabinets (128 compute, 32 communication), delivering 8 EFLOPS FP8, 16 EFLOPS FP4, and 16 PB/s of interconnect bandwidth across roughly 1,000 square meters of floor space. This corresponds to NVIDIA’s Scale Up (vertical expansion) dimension.
- Atlas 950 Super Cluster: also Q4 2026, combining 64 Super Nodes with over 520,000 Ascend cards, spanning 64,000 square meters and achieving 524 EFLOPS FP8 and 1 ZFLOPS FP4. This corresponds to NVIDIA’s Scale Out (horizontal expansion) dimension.
- Atlas 960 Super Cluster: 2027, scaling to “million-level” accelerator counts and 4 ZFLOPS FP4.
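The aggregate numbers above hang together arithmetically if each Ascend 950 DT delivers about 1 PFLOPS FP8, matching the per-chip roadmap figure (a hedged sanity check; Huawei did not publish this exact breakdown):

```python
# Cross-checking the announced Super Node / Super Cluster aggregates.
# Per-card FP8 of 1 PFLOPS is taken from the Ascend 950 roadmap figures;
# the breakdown itself is my reconstruction, not Huawei's.
cards_per_node = 8_192
nodes_per_cluster = 64
pflops_fp8_per_card = 1

node_fp8_eflops = cards_per_node * pflops_fp8_per_card / 1_000
cluster_cards = cards_per_node * nodes_per_cluster
cluster_fp8_eflops = cluster_cards * pflops_fp8_per_card / 1_000
cluster_fp4_zflops = cluster_fp8_eflops * 2 / 1_000  # FP4 rate is double FP8

print(node_fp8_eflops)     # ~8.19, matching "8 EFLOPS FP8" per Super Node
print(cluster_cards)       # 524,288, matching "over 520,000 Ascend cards"
print(cluster_fp8_eflops)  # ~524.3, matching "524 EFLOPS FP8"
print(cluster_fp4_zflops)  # ~1.05, matching "1 ZFLOPS FP4"
```

The 64 x 8,192 = 524,288 card count also explains the oddly precise “524 EFLOPS FP8” figure: it is simply the card count times 1 PFLOPS each.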
Xu claimed the Atlas 950 Super Node delivers 6.7x the compute of NVIDIA’s NVL144, 15x its memory, and 62x its interconnect bandwidth, while the Atlas 950 Super Cluster exceeds xAI’s Colossus by 1.3x. The more than 300 Atlas 900 systems already deployed lend weight to Huawei’s claim that it can execute at this scale.
Chapter 3: NVIDIA’s Strategy vs. Huawei’s Domestic Focus
NVIDIA’s Jensen Huang has outlined a three-dimensional strategy: Scale Up (single-server expansion), Scale Out (multi-server clusters), and Scale Across (global infrastructure). Huawei adopts the first two (Super Nodes for Scale Up, Super Clusters for Scale Out) but skips Scale Across, citing immature global AI infrastructure and geopolitical complexities.

In China, NVIDIA’s H20 is restricted and RTX Pro 6000D testing has stalled. Huawei’s CloudMatrix 384 (384 Ascend 910C cards) rivals NVIDIA’s GB200 NVL72, occasionally outperforming it. Alibaba, Baidu, and others are adopting Ascend, and DeepSeek’s R1 model, reportedly trained on Huawei chips, matches GPT-4 performance.

NVIDIA’s CUDA ecosystem remains a hurdle. Huawei’s open-source CANN framework is gaining traction, but developer migration is slow.
Chapter 4: Ecosystem and Challenges
Huawei is building an ecosystem with over 10,000 partners across the internet, telecom, and manufacturing sectors. At Connect 2025, Ascend Computing President Zhang Dingxuan announced open-source shared memory for SuperPoD chip pooling. The TaiShan 950 SuperPoD, built on Kunpeng, extends the approach to general-purpose computing, boosting GaussDB performance 2.9x over traditional mainframes.

Challenges persist: SMIC’s 5nm yield is limited, and Huawei’s self-developed HBM will take time to scale. Geopolitical risks, such as tighter U.S. sanctions, could disrupt the plan.
Conclusion: A New AI Era by 2028?
Xu concluded: “If we achieve chip mass production with viable yield and cost, and deploy the Lingqu interconnect successfully, replacing NVIDIA in China’s AI market is realistic and inevitable.”

By 2028, Huawei envisions million-scale Ascend clusters powering data centers, enabling models like DeepSeek-R2 to rival global leaders. While NVIDIA may dominate the West, Huawei could claim China’s AI throne, driven by scale, interconnect innovation, and domestic policy. The stage is set for a seismic shift in the AI landscape.



