At Google I/O, Google Cloud revealed a new A3 supercomputer virtual machine built for the demands of LLMs and generative AI. Each system pairs eight Nvidia H100 GPUs with fourth-generation Intel Xeon Scalable processors, two terabytes of host memory, and 3.6 TB/s of bisectional bandwidth among the eight GPUs via NVSwitch and NVLink 4.0, two Nvidia technologies designed to maximize throughput across many GPUs. The machines can deliver up to 26 exaFlops of compute, which could cut the time and cost of training larger machine learning models. The A3 VMs run on Google’s Jupiter data center networking fabric, which provides full-bandwidth reconfigurable optical connections with on-demand topology adjustment.
Customers can operate the A3 VMs themselves or consume them as a managed service, with the do-it-yourself route using Google Kubernetes Engine (GKE) and Google Compute Engine (GCE). For now, the A3 VMs are available only by signing up for a preview waitlist.
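For readers curious what the do-it-yourself Compute Engine route might look like, the sketch below shows the general shape of a `gcloud` VM-creation command. The machine type name, zone, image, and instance name are illustrative assumptions, not details from the announcement; consult Google Cloud's documentation for the actual values once A3 leaves preview.

```shell
# Hypothetical sketch only: provisioning an A3-class VM on Compute Engine.
# The machine type (a3-highgpu-8g), zone, and image below are assumptions,
# not confirmed by the announcement.
gcloud compute instances create my-a3-vm \
    --zone=us-central1-a \
    --machine-type=a3-highgpu-8g \
    --image-family=debian-11 \
    --image-project=debian-cloud
```

The same machine type would be selectable for GKE node pools via the managed-service route, again subject to preview-waitlist access.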





