What’s the actual host PCIe topology on 8-GPU HGX H100/H200/B200 servers?
Trying to nail down how independent the CPU-to-GPU PCIe paths really are on shipping 8-GPU SXM HGX boxes (H100, H200, B200), and the public documentation keeps half-answering.
A few things I’d like to confirm from people who have hardware in front of them or have benchmarked these:
1. Is the host upstream link to each PCIe switch a single Gen5 x16, or two Gen5 x16 ports landing on the same switch? Does it vary by OEM, differ from the NVIDIA reference design, or change by cloud deployment?
2. Does any commercial 8-GPU SXM design break the “two GPUs per PCIe switch” pattern, so each GPU has an independent host path?
3. For people on cloud bare-metal H100/H200 nodes: which provider gives you which topology? Output from nvidia-smi topo -m and lspci -tv would be incredibly useful if you can share it.
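If it makes sharing easier, here is a minimal sketch for capturing both outputs in one paste-able report. It assumes Python 3.7+ and that nvidia-smi and lspci are on PATH; the helper names (capture, collect) are just mine for illustration, and it only wraps the two commands mentioned above.

```python
import shutil
import subprocess

# The two commands named in the question; both must be installed on the node.
COMMANDS = [
    ["nvidia-smi", "topo", "-m"],
    ["lspci", "-tv"],
]


def capture(cmd):
    """Run cmd and return its stdout, or a short note if it is missing or fails."""
    if shutil.which(cmd[0]) is None:
        return f"[{cmd[0]} not found on PATH]\n"
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        return f"[{' '.join(cmd)} exited with {result.returncode}]\n{result.stderr}"
    return result.stdout


def collect(commands=COMMANDS):
    """Build one text report with a '$ command' header before each output."""
    parts = []
    for cmd in commands:
        parts.append(f"$ {' '.join(cmd)}\n{capture(cmd)}")
    return "\n".join(parts)


if __name__ == "__main__":
    print(collect())
```

Redirecting the printed report to a file and mentioning the provider and instance type alongside it would be plenty. lspci may show more device detail when run as root, but the tree structure is the part I care about.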
I have my own reading of the OEM block diagrams but I’d rather not anchor the thread on it. Mostly looking for ground truth from people who can actually run the commands or have direct vendor confirmation.
Thanks.