Sean Derrington, Product Manager, Storage at Google Cloud, introduced the AI Hypercomputer at AI Infrastructure Field Day, highlighting Google Cloud's investments in making it easier for customers to consume and run their AI workloads. The focus is on infrastructure, with attention to the consumption model and optimized software. The AI Hypercomputer combines optimized software with purpose-built hardware, with storage, compute, and networking at its foundation.
A key announcement was Google Cloud Managed Lustre, a new offering built in partnership with DDN and based on its EXAScaler product. Managed Lustre provides a scalable parallel file system suited to AI and ML workloads, offering petabyte-scale storage, sub-millisecond latency, and throughput of up to a terabyte per second. Google Cloud also announced Anywhere Cache, which lets users keep data closer to accelerators to improve the performance of AI workloads. Anywhere Cache can cache up to a petabyte of data within a given zone and delivers bandwidth of up to 2.5 terabytes per second. Rapid Storage delivers up to 20 million QPS and up to 6 terabytes per second of throughput per bucket.
The presentation also touched on advancements in computing and networking. Google Cloud announced new A4 and A4 Ultra machines within its GPU portfolio in partnership with NVIDIA. It also announced its seventh-generation TPU, Ironwood, which offers significantly higher performance and memory than previous versions and is deployed as a cluster of over 9,200 chips delivering 42.5 exaflops of compute capacity. Networking improvements were also discussed, including Cloud WAN, a fully managed service that improves performance by up to 40%. Finally, GKE inference capabilities were mentioned, which use intelligent routing to improve AI serving.
Presented by Sean Derrington, Product Manager, Storage, Google Cloud. Recorded live in Santa Clara, California, on April 22, 2025, as part of AI Infrastructure Field Day. Watch the entire presentation at https://techfieldday.com/appearance/google-cloud-presents-at-ai-infrastructure-field-day-2/ or visit https://techfieldday.com/event/aiifd2/ for more information.