AI Infrastructure Field Day 2

43 Episodes

AI Infrastructure Field Day 2 is returning for a four-day event you won't want to miss, from April 22–25 in Santa Clara, offering live insights into the latest AI infrastructure advancements. After a full day at Google Cloud learning about Google’s AI hypercomputing systems, we'll continue with sessions from Juniper Networks, Solidigm, Phison, Netris, Nutanix, Aviz Networks, and Keysight. The event will cover critical AI infrastructure topics, including GPU and TPU acceleration, data management, liquid cooling, on-premises LLM solutions, networking automation, and AI data center optimization. Tune in and watch these presentations live on LinkedIn, on our website, or on Techstrong TV.

04:47

Learn What it Takes to Power AI at AI Infrastructure Field Day 2
Episode 1

AI Infrastructure Field Day 2 is returning for a four-day event you won't want to miss, from April 22–25 in Santa Clara, offering live insights into the latest AI infrastructure advancements. Starting with a full day at Google Cloud learning about Google’s AI hypercomputing systems, we'll be cont...
24:41

54. AI Should Become Boring - Tech Field Day Podcast
Episode 2

Mature technologies deliver business value by integration into boring production applications, so AI needs to be boring. This Tech Field Day Podcast episode features Max Mortillaro, Guy Currier, Jay Cuthrell, and Alastair Cooke. AI has frequently been in the public news, many organizations are bu...
08:26

Validating Frontend Networks to Optimize and Secure Low-Latency LLM Data Flow with Keysight
Episode 3

As large language models scale, new challenges emerge - not only in maximizing GPU performance but also in validating the infrastructure that fuels the data pipeline used for training. On the front end, this includes securely ingesting user data from distributed cloud and customer environments in...
38:41

Building Trust at Scale. How Crusoe Validates Network Infrastructure for AI Workloads with Keysight
Episode 4

In this session, Crusoe shares how they are actively testing frontend networks and inter-VM/host data transfers that feed their GPU clusters. By validating the performance, reliability, and scalability of its infrastructure early, Crusoe aims to identify and resolve issues internally, minimizing ...
13:00

Maximizing the Performance of AI Backend Fabric with Keysight
Episode 5

This session provides an overview of the Keysight AI (KAI) Data Center Builder solution and how it supports each phase of AI data center design and deployment with actionable data to improve performance and increase the reliability of AI clusters. The presentation explains how KAI Data Center Bui...
29:03

Demonstrating Keysight's AI Fabric Test Methodology
Episode 6

This session provides an overview of the Keysight AI fabric test methodology, demonstrating key findings and improvements achieved through automated testing and the search for optimal configuration parameters. Alex Bortek, Lead Product Manager at Keysight Technologies, introduces the Keysight AI ...
11:50

Introduction to the AI Hypercomputer with Google Cloud
Episode 7

Sean Derrington, Product Manager, Storage at Google Cloud, introduced the AI Hypercomputer at AI Infrastructure Field Day, highlighting Google Cloud's investments in making it easier for customers to consume and run their AI workloads. The focus is on infrastructure with consideration to the cons...
29:18

Storage Intelligence with Google Cloud
Episode 8

Manjul Sahay, Group Product Manager at Google Cloud Storage, presented on Storage Intelligence with Google Cloud, focusing on helping customers, both enterprises and startups, manage their storage effectively for AI applications. These customers often face challenges in managing storage at scale ...
24:48

AI Hypercomputer Cluster Toolkit with Google Cloud
Episode 9

Ilias Katsardis, Senior Product Manager for AI infrastructure at Google Cloud, presented on the AI Hypercomputer Cluster Toolkit, addressing the complexities of deploying AI infrastructure on Google Cloud's compute engine and GKE. He highlighted the challenges customers face when trying to quickl...
27:46

Google Kubernetes Engine and AI Hypercomputer with Google Cloud
Episode 10

Ishan Sharma, Group Product Manager in the Google Kubernetes Engine team, presented on GKE and AI Hypercomputer, focusing on industry-leading infrastructure, training quickly at mega scale, serving with lower cost and latency, economic access to GPUs and TPUs, and faster time to value. He emphasi...
34:34

Overview of Cloud Storage for AI, Lustre, GCSFuse, and Anywhere Cache with Google Cloud
Episode 11

Marco Abela, Product Manager at Google Cloud Storage, presented an overview of Google Cloud's storage solutions optimized for AI/ML workloads. The presentation addressed the critical role of storage in AI pipelines, emphasizing that an inadequate storage solution can significantly bottleneck GPU ...
29:28

Intro to Managed Lustre with Google Cloud
Episode 12

Dan Eawaz, Senior Product Manager at Google Cloud, introduced Managed Lustre with Google Cloud, a fully managed parallel file system built on DDN Exascaler. The aim is to solve the demanding requirements of data preparation, model training, and inference in AI workloads. Managed Lustre provides h...
25:36

The latest in high-performance storage, Rapid on Colossus with Google Cloud
Episode 13

Michal Szymaniak, Principal Engineer at Google Cloud, presented on Rapid Storage, a new zonal storage product within the cloud storage portfolio, powered by Google's foundational distributed file system, Colossus. The goal in designing Rapid Storage was to create a storage system that offers the ...
25:00

Analytics Storage and AI, Data Prep and Data Lakes with Google Cloud
Episode 14

Vivek Sarswat, Group Product Manager at Google Cloud Storage, presented on analytics storage and AI, focusing on data preparation and data lakes. He emphasized the close ties between analytics and AI workloads, highlighting key innovations built to address related challenges. The presentation dem...
29:45

AI hypercomputer and GPU acceleration with Google Cloud
Episode 15

Dennis Liu, a Product Manager at Google Cloud specializing in GPUs, presented on AI hypercomputer and GPU acceleration with Google Cloud. Liu covered Google Cloud's AI hypercomputer, from consumption models to purpose-built hardware. Focus was given to Google's cluster director for managing GPU f...
31:37

AI Hypercomputer and TPU (Tensor) acceleration with Google Cloud
Episode 16

Rose Zhu, a Product Manager at Google Cloud TPU, presented on TPUs for large-scale training and inference, emphasizing the rapid growth of AI models and the corresponding demands for compute, memory, and networking. Zhu highlighted the specialization of Google's TPU chips and systems, purpose-bui...
26:37

Cloud WAN Connecting networks for the AI Era with Google Cloud
Episode 17

This presentation by Aniruddha Agharkar, Product Manager at Google Cloud Networking, centers on Cloud WAN, Google's fully managed backbone solution designed for the enterprise era and powered by Google's planet-scale network. Customers have historically relied on bespoke networks using leased lin...
24:45

Secure and optimize AI and ML workloads with the Cross-Cloud Network with Google Cloud
Episode 18

Vaibhav Katkade, a Product Manager at Google Cloud Networking, presented on infrastructure enhancements in cloud networking for secure, optimized AI/ML workloads. Focusing on the lifecycle of AI/ML, encompassing training, fine-tuning, and serving/inference, and the corresponding network imperativ...
36:49

AI Unbound, Your Data Center Your Way with Juniper Networks
Episode 19

Praful Lalchandani, VP of Product, Data Center Platforms and AI Solutions at Juniper Networks, opened the presentation by highlighting the rapid growth of the AI data center space and its unique challenges. He noted that Juniper Networks, with its 25 years of experience in networking and security...
36:33

Maximize AI Cluster Performance using Juniper Self-Optimizing Ethernet with Juniper Networks
Episode 20

Vikram Singh, Sr. Product Manager, AI Data Center Solutions at Juniper Networks, discussed maximizing AI cluster performance using Juniper's self-optimizing Ethernet fabric. As AI workloads scale, high GPU utilization and minimized congestion are critical to maximizing performance and ROI. Junipe...
26:44

Securing AI Clusters, Juniper’s Approach to Threat Protection with Juniper Networks
Episode 21

AI clusters are high-value targets for cyber threats, requiring a defense-in-depth strategy to safeguard data, workloads, and infrastructure. Kedar Dhuru highlighted how Juniper's security portfolio provides end-to-end protection for AI clusters, including secure multitenant environments, without...
04:59

GPYOU: Building and Operating your AI Infrastructure with Juniper Networks
Episode 22

AI infrastructure is a critical but complex domain, and IT organizations face the pressure to deliver results quickly. Juniper Networks shows Juniper Apstra as a solution to streamline the management of AI data centers, providing proven designs. Kyle Baxter emphasizes the necessity of a robust ne...
23:28

Day 0: Designing your AI data center with Juniper Networks
Episode 23

Juniper Networks' presentation at AI Infrastructure Field Day focuses on designing AI data centers using Apstra, specifically emphasizing rail-optimized designs and highlighting Apstra's ability to create a fully functional network architecture in just minutes, incorporating native modeling for t...
14:31

Day 1: Managing your AI data center at scale with Juniper Networks
Episode 24

This presentation by Kyle Baxter focuses on how Juniper Networks' Apstra solution can manage AI data centers at scale. Apstra simplifies network configuration for AI/ML workloads by providing tools to assign virtual networks across numerous ports, an essential capability in environments with pote...
