HPE AI Factory with NVIDIA expands for the era of agents with Vera Rubin NVL72 and 128-GPU rack configurations

Published Jun 16, 2026

Hewlett Packard Enterprise and NVIDIA announced an expansion of the HPE AI Factory program on June 16, 2026, targeting enterprise deployments of agentic AI workloads. The update adds support for NVIDIA's Vera Rubin NVL72, rack-scale configurations with up to 128 Rubin GPUs, and NVIDIA Spectrum-X networking, which delivers 1.6x higher networking performance than standard Ethernet.

What's new

Vera Rubin NVL72 support: HPE's systems now support the Vera Rubin NVL72 configuration, enabling enterprises to run frontier-scale models exceeding one trillion parameters on-premises or in colocation
HPE Compute XD700: Supports up to 128 Rubin GPUs per rack, enabling dense, high-throughput inference clusters
NVIDIA Vera CPU coming 2027: Paired with HPE ProLiant Compute DL394 Gen12 servers for integrated compute and AI acceleration
NVIDIA Spectrum-X networking: Delivers 1.6x higher networking performance versus standard Ethernet, addressing the inter-node bandwidth bottleneck in multi-node inference
Unleash AI program expansion: Nearly a dozen new AI software partners joined for the agentic era, expanding the validated software ecosystem
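
For a sense of what "exceeding one trillion parameters" means for hardware, the sketch below estimates the memory needed just to hold the model weights at a few common numeric precisions. It is back-of-the-envelope arithmetic, not a sizing tool: the function name and the precision table are illustrative, and the calculation ignores KV cache, activations, and runtime overhead, all of which add to the real footprint.

```python
# Rough weight-memory footprint for a model of a given parameter count.
# Bytes-per-parameter values follow from the width of each numeric format;
# KV cache, activations, and runtime overhead are deliberately ignored.

BYTES_PER_PARAM = {"fp16/bf16": 2.0, "fp8": 1.0, "4-bit": 0.5}


def weight_footprint_tb(num_params: float, bytes_per_param: float) -> float:
    """Return weight storage in terabytes (1 TB = 1e12 bytes)."""
    return num_params * bytes_per_param / 1e12


if __name__ == "__main__":
    params = 1e12  # one trillion parameters, the scale cited above
    for fmt, width in BYTES_PER_PARAM.items():
        print(f"{fmt}: {weight_footprint_tb(params, width):.1f} TB")
```

Even at reduced precision the weights alone run to hundreds of gigabytes or more, which is why models at this scale are spread across many GPUs in a rack rather than served from a single accelerator.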

Context

HPE AI Factory is a joint infrastructure program combining HPE's server portfolio with NVIDIA's compute, networking, and software stack. The program targets enterprises that want validated, turnkey AI infrastructure rather than assembling and integrating individual components themselves.

The shift to agentic AI changes infrastructure requirements meaningfully. Unlike batch inference or request-response AI workloads, agents are persistent, long-running processes that hold state, call tools, and interact with external systems continuously. This places different demands on hardware: lower latency, higher memory bandwidth, and sustained throughput rather than burst capacity.

Why it matters

Running inference on trillion-parameter models has historically meant calling hyperscaler APIs. The Vera Rubin NVL72 configurations bring that scale into enterprise infrastructure, which is relevant for organizations with data residency requirements, security mandates, or economics that favor owned compute over API consumption.

The networking improvements matter for multi-node inference, which is standard for the largest models. Spectrum-X's 1.6x improvement over standard Ethernet reduces the communication overhead that otherwise caps multi-GPU efficiency.

The planned 2027 arrival of the NVIDIA Vera CPU signals the longer-term roadmap commitment that enterprise buyers look for before making infrastructure investments.
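
To put the Spectrum-X figure in perspective, a faster network only speeds up the part of each inference step that is spent communicating between nodes. The sketch below applies an Amdahl's-law style estimate to that idea. It rests on two assumptions that do not come from the announcement: that "1.6x higher networking performance" translates directly into communication finishing 1.6x faster, and that communication does not overlap with compute. The communication shares used in the example are hypothetical inputs, not measured values.

```python
def end_to_end_speedup(comm_fraction: float, network_gain: float = 1.6) -> float:
    """Amdahl-style estimate of overall speedup when only communication gets faster.

    comm_fraction: share of step time spent on inter-node communication
                   (hypothetical input; not a figure from the announcement).
    network_gain:  factor by which communication speeds up
                   (1.6 is the Spectrum-X figure quoted above).
    """
    if not 0.0 <= comm_fraction <= 1.0:
        raise ValueError("comm_fraction must be between 0 and 1")
    if network_gain <= 0:
        raise ValueError("network_gain must be positive")
    # Compute time is unchanged; communication time shrinks by network_gain.
    return 1.0 / ((1.0 - comm_fraction) + comm_fraction / network_gain)


if __name__ == "__main__":
    for share in (0.1, 0.3, 0.5):  # illustrative communication shares
        print(f"comm share {share:.0%}: {end_to_end_speedup(share):.2f}x overall")
```

Under these assumptions the overall gain stays well below 1.6x unless communication dominates the step time, so the practical benefit depends on how communication-bound a given multi-node deployment actually is.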

Corroborating sources

NVIDIA Blog (blogs.nvidia.com)
https://blogs.nvidia.com/blog/hpe-ai-factory-agentic-enterprise/
“Enterprises are moving agentic AI from proof of concept to production — and the next generation of AI factories are built for the era of agents.”