// Architecture page: NVIDIA GPU Compute Network Reference Architecture
// Card-based technical articles. Clicking a card opens a slide-in canvas
// from the right; clicking the backdrop (or pressing Esc) closes it.

const { useState, useEffect } = React;

/* ---------- Scale profile data (GB200 NVL72, 4-rail) ---------- */
const GB200_PROFILES = [
  { gpus: 144, servers: 2, nics: 144, leaf: 1, tor: 2, gpuRack: 2, swRack: 1, power: 249.7 },
  { gpus: 216, servers: 3, nics: 216, leaf: 2, tor: 4, gpuRack: 3, swRack: 1, power: 380.0 },
  { gpus: 432, servers: 6, nics: 432, leaf: 4, tor: 8, gpuRack: 6, swRack: 2, power: 759.0 },
  { gpus: 864, servers: 12, nics: 864, leaf: 8, tor: 16, gpuRack: 12, swRack: 3, power: 1518.0 },
  { gpus: 1728, servers: 24, nics: 1728, leaf: 16, tor: 32, gpuRack: 24, swRack: 6, power: 3036.0 },
  { gpus: 3456, servers: 48, nics: 3456, leaf: 27, tor: 64, gpuRack: 48, swRack: 12, power: 6055.0 },
  { gpus: 6912, servers: 96, nics: 6912, leaf: 54, tor: 128, gpuRack: 96, swRack: 24, power: 12111.0 }];

/* ---------- Scale profile data (GB300 NVL72, TH5 800G, Spine/Leaf) ---------- */
const GB300_PROFILES = [
  { gpus: 144, servers: 2, nics: 144, spine: 3, leaf: 6, oOsfp: 144, oMpo: 288, oOsfp2: 144, uOsfp: 144, uMpo: 288, uOsfp2: 144 },
  { gpus: 216, servers: 3, nics: 216, spine: 3, leaf: 6, oOsfp: 216, oMpo: 432, oOsfp2: 216, uOsfp: 216, uMpo: 432, uOsfp2: 216 },
  { gpus: 432, servers: 6, nics: 432, spine: 6, leaf: 12, oOsfp: 432, oMpo: 864, oOsfp2: 432, uOsfp: 432, uMpo: 864, uOsfp2: 432 },
  { gpus: 864, servers: 12, nics: 864, spine: 12, leaf: 24, oOsfp: 864, oMpo: 1728, oOsfp2: 864, uOsfp: 864, uMpo: 1728, uOsfp2: 864 },
  { gpus: 1728, servers: 24, nics: 1728, spine: 24, leaf: 48, oOsfp: 1728, oMpo: 3456, oOsfp2: 1728, uOsfp: 1728, uMpo: 3456, uOsfp2: 1728 }];

/* ---------- Scale profile data (H200/B200 8-rail) ---------- */
const SCALE_PROFILES = [
  { gpus: 128, servers: 16, nics: 128, leaf: 0, tor: 1, gpuRack: 8, swRack: 1, power: 163.2 },
  { gpus: 256, servers: 32, nics: 256, leaf: 2, tor: 4, gpuRack: 16, swRack: 1, power: 339.5 },
  { gpus: 512, servers: 64, nics: 512, leaf: 4, tor: 8, gpuRack: 32, swRack: 2, power: 680.2 },
  { gpus: 1024, servers: 128, nics: 1024, leaf: 8, tor: 16, gpuRack: 64, swRack: 3, power: 1358.4 },
  { gpus: 2048, servers: 256, nics: 2048, leaf: 16, tor: 32, gpuRack: 128, swRack: 6, power: 2716.8 },
  { gpus: 4096, servers: 512, nics: 4096, leaf: 32, tor: 64, gpuRack: 256, swRack: 12, power: 5433.6 },
  { gpus: 8192, servers: 1024, nics: 8192, leaf: 64, tor: 128, gpuRack: 512, swRack: 24, power: 10867.2 }];

const fmt = (n) => n.toLocaleString('en-US');

/* ---------- Article 1: H200/B200 8-Rail Architecture body ---------- */
const ArticleH200B200 = () => <>
  {/* Executive Summary */}

Executive Summary

This reference architecture defines an 8-rail, non-blocking RoCEv2 compute fabric for NVIDIA H200 and B200 GPU servers, using 400G/OSFP end-to-end with a two-tier Leaf/Spine topology of 128×400G platform switches. The design is rail-aligned: each of a server's 8 GPU NICs lands on a dedicated TOR ("rail") so that same-rank collectives traverse a single switch hop, eliminating cross-rail oversubscription for NCCL ring/tree primitives at intra-pod scale.
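The non-blocking claim can be checked with a small port-budget sketch. It assumes that a "128×400G" switch exposes 128 ports of 400G and that each TOR splits them evenly, 64 down to NICs and 64 up to the Leaf layer; neither detail is stated explicitly here, and `fabricSize` is a hypothetical helper, not part of the page code.

```javascript
// Assumptions (not stated in the source): a "128×400G" switch has 128 ports,
// and a non-blocking TOR uses half of them as downlinks and half as uplinks.
const SWITCH_PORTS = 128;
const TOR_DOWNLINKS = SWITCH_PORTS / 2;

// Hypothetical sizing helper: TOR and Leaf switch counts for a given NIC count.
function fabricSize(nics) {
  // Everything fits on a single switch: no uplinks, so no Leaf layer.
  if (nics <= SWITCH_PORTS) return { tor: 1, leaf: 0 };
  const tor = Math.ceil(nics / TOR_DOWNLINKS);
  // 1:1 subscription: every downlink is matched by one uplink.
  const uplinks = tor * (SWITCH_PORTS - TOR_DOWNLINKS);
  // Leaf switches spend all of their ports on TOR uplinks.
  const leaf = Math.ceil(uplinks / SWITCH_PORTS);
  return { tor, leaf };
}

for (const nics of [128, 256, 512, 1024, 2048, 4096, 8192]) {
  console.log(nics, fabricSize(nics));
}
```

Under these assumptions the helper reproduces the `tor` and `leaf` columns of `SCALE_PROFILES` at every scale step, which suggests the table follows the same 1:1 port split.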

The architecture scales linearly from a 128-GPU pod (single TOR per rail) to an 8,192-GPU AI factory (16 pods × 8 rails) under a single addressing and routing domain, with deterministic optics and power budgets at every step. Total fabric power tracks at ~1.33 kW per GPU across the scale band (the `power` column of the scale profiles divided by GPU count); the one exception is the 128-GPU entry point, which has no Leaf layer and comes in near 1.28 kW per GPU.
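The per-GPU power figure can be recomputed from the scale profiles. The sketch below copies a few rows of `SCALE_PROFILES` and assumes the `power` field is in kW, since the data carries no unit; `kwPerGpu` is a hypothetical helper introduced only for this illustration.

```javascript
// Subset of the H200/B200 8-rail scale profiles, copied from SCALE_PROFILES.
// Assumption: `power` is in kW (the field has no unit in the source data).
const profiles = [
  { gpus: 128, power: 163.2 },
  { gpus: 256, power: 339.5 },
  { gpus: 512, power: 680.2 },
  { gpus: 1024, power: 1358.4 },
  { gpus: 8192, power: 10867.2 },
];

// Hypothetical helper: power per GPU, rounded to three decimals.
const kwPerGpu = ({ gpus, power }) => Math.round((power / gpus) * 1000) / 1000;

for (const p of profiles) {
  console.log(`${p.gpus} GPUs: ${kwPerGpu(p)} kW per GPU`);
}
```

The ratio stays within a few watts of 1.33 kW per GPU once a Leaf layer is present; only the single-switch 128-GPU profile sits noticeably lower.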

{/* Topology */}

Fabric Topology

Each Server Group consists of 64 GPU servers (512 GPUs) front-ended by 8 TOR switches, one per rail. NIC_i on every server in the group connects only to TOR_i, which forms the rail-aligned plane for rail i. The 8 TORs of a group uplink to the Leaf layer in a full mesh, with up to 16 groups (MAX) per pod under the 64-leaf footprint shown below.
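The rail-aligned mapping described above can be sketched as a lookup from a server and NIC index to a TOR. The constants come from the text (8 rails, 64 servers per group); the zero-based indexing, the `G<group>-TOR<rail>` naming scheme, and the `torFor` and `sameTor` helpers are assumptions made for illustration only.

```javascript
// Constants taken from the text: 8 rails, 64 servers per Server Group.
const RAILS = 8;
const SERVERS_PER_GROUP = 64;

// Hypothetical helper: map a zero-based global server index and NIC index
// to the TOR that NIC lands on. The TOR name format is illustrative.
function torFor(serverIndex, nicIndex) {
  if (!Number.isInteger(serverIndex) || serverIndex < 0) {
    throw new RangeError('serverIndex must be a non-negative integer');
  }
  if (!Number.isInteger(nicIndex) || nicIndex < 0 || nicIndex >= RAILS) {
    throw new RangeError(`nicIndex must be an integer in 0..${RAILS - 1}`);
  }
  const group = Math.floor(serverIndex / SERVERS_PER_GROUP);
  // Rail alignment: NIC i always lands on TOR i of the server's own group.
  return { group, rail: nicIndex, tor: `G${group}-TOR${nicIndex}` };
}

// Two endpoints share a TOR (single switch hop) only when they are in the
// same group and on the same rail; anything else crosses the Leaf layer.
function sameTor(a, b) {
  return torFor(a.server, a.nic).tor === torFor(b.server, b.nic).tor;
}

console.log(torFor(10, 3));
console.log(sameTor({ server: 0, nic: 2 }, { server: 63, nic: 2 }));
console.log(sameTor({ server: 0, nic: 2 }, { server: 64, nic: 2 }));
```

In this model, same-rail traffic between servers 0 and 63 stays on one TOR, while the same rail on server 64 belongs to the next group and has to go through the Leaf layer.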