Cisco Reference Architectures for AI Networking with the Nexus Dashboard
Cisco provides reference architectures for AI networking that scale from small 96-GPU clusters up to 32,000-GPU deployments. These designs, available on Cisco.com and Nvidia.com, are vendor-agnostic and support Nvidia, AMD, and Intel. The core goal is to simplify operations for customers: designs should remain easy to build at scale while preserving automation and end-to-end visibility. The Nexus Dashboard platform delivers this by streamlining the complex requirements of AI infrastructure.
The Nexus Dashboard simplifies AI networking management. Customers can quickly create AI fabrics, choosing between routed and VXLAN EVPN options, and the platform automatically applies best-practice configurations for lossless fabrics, including QoS, Explicit Congestion Notification (ECN), and Priority Flow Control (PFC). Advanced features such as Dynamic Load Balancing (DLB) can be activated in a few clicks. The platform discovers and onboards switches into the AI fabric, organizes them into scalable units, and provides guardrails against misconfiguration. Customers can manage their AI clusters alongside traditional data center and storage fabrics from a unified dashboard that offers clear topology views and inventory details for switches, interfaces, and connected GPUs.
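To make the lossless-fabric settings mentioned above more concrete, here is a minimal Python sketch of how ECN marking and PFC pausing typically relate to queue depth. It is an illustration only, not Cisco's implementation or the Nexus Dashboard's configuration model: the QueueProfile class, its field names, and the threshold values are all hypothetical, and the linear marking ramp is the generic WRED-style scheme rather than anything confirmed by the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueProfile:
    """Hypothetical per-queue thresholds, in kilobytes of buffered traffic."""
    ecn_min_kb: int             # below this depth, no packets are ECN-marked
    ecn_max_kb: int             # at or above this depth, every packet is marked
    pfc_xoff_kb: int            # at or above this depth, a PFC pause is requested
    max_mark_prob: float = 1.0  # marking probability approached just below ecn_max_kb

    def __post_init__(self):
        # ECN should signal congestion before PFC has to pause the link.
        if not 0 <= self.ecn_min_kb < self.ecn_max_kb <= self.pfc_xoff_kb:
            raise ValueError("expected 0 <= ecn_min < ecn_max <= pfc_xoff")


def ecn_mark_probability(depth_kb: float, profile: QueueProfile) -> float:
    """Probability of ECN-marking a packet at the given queue depth."""
    if depth_kb < profile.ecn_min_kb:
        return 0.0
    if depth_kb >= profile.ecn_max_kb:
        return 1.0
    span = profile.ecn_max_kb - profile.ecn_min_kb
    # Linear ramp between the minimum and maximum thresholds.
    return profile.max_mark_prob * (depth_kb - profile.ecn_min_kb) / span


def pfc_pause_needed(depth_kb: float, profile: QueueProfile) -> bool:
    """True when the queue is deep enough that the sender should be paused."""
    return depth_kb >= profile.pfc_xoff_kb


# Illustrative values only; these are not vendor defaults.
profile = QueueProfile(ecn_min_kb=150, ecn_max_kb=3000, pfc_xoff_kb=4000)
for depth in (100, 1575, 3000, 4000):
    print(depth, ecn_mark_probability(depth, profile), pfc_pause_needed(depth, profile))
```

The ordering check in the sketch reflects the usual design intent: ECN asks senders to slow down early, and PFC acts as a last-resort backstop to avoid drops. Choosing mutually consistent thresholds of this kind is the sort of best-practice detail the paragraph above says the platform applies automatically.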
Beyond setup and management, the Nexus Dashboard provides visibility into AI jobs and supports troubleshooting. Integration with workload managers such as Slurm lets users monitor AI jobs and correlate network performance with GPU and NIC issues. The dashboard offers an "at a glance" view of AI resources, highlighting anomalies and advisories. Users can drill down into a specific job to visualize its resource utilization and pinpoint performance bottlenecks. Detailed analytics cover Ethernet interface drops, CRC errors, and GPU-specific metrics, including temperature, utilization, and power. The platform generates job-specific topologies, identifies anomalies down to individual links and GPUs, and provides actionable insights for root-cause analysis and resolution. For customers who need to integrate with multi-vendor environments or custom automation workflows, the Nexus Dashboard also provides a comprehensive set of APIs that complement Cisco's broader AI Canvas for multi-domain orchestration.
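The job-to-network correlation described above can be pictured with a small Python sketch. It assumes, purely for illustration, that a workload manager supplies the list of hosts in a job and that per-interface counters are available as simple records. The LinkCounters type, the flag_job_links function, and the sample host and switch names are hypothetical and do not represent the Nexus Dashboard API or its data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkCounters:
    """Hypothetical snapshot of one switch interface facing a GPU host."""
    switch: str
    interface: str
    gpu_host: str
    crc_errors: int
    drops: int


def flag_job_links(job_hosts, links, crc_threshold=0, drop_threshold=0):
    """Return (link, reasons) pairs for links that serve the job and look unhealthy."""
    hosts = set(job_hosts)
    flagged = []
    for link in links:
        if link.gpu_host not in hosts:
            continue  # this link is not part of the job's topology
        reasons = []
        if link.crc_errors > crc_threshold:
            reasons.append("crc_errors")
        if link.drops > drop_threshold:
            reasons.append("drops")
        if reasons:
            flagged.append((link, reasons))
    return flagged


# Made-up sample data: one job running on two of three hosts.
links = [
    LinkCounters("leaf1", "Eth1/1", "gpu-node-01", crc_errors=0, drops=0),
    LinkCounters("leaf1", "Eth1/2", "gpu-node-02", crc_errors=12, drops=3),
    LinkCounters("leaf2", "Eth1/1", "gpu-node-03", crc_errors=40, drops=0),
]
for link, reasons in flag_job_links(["gpu-node-01", "gpu-node-02"], links):
    print(f"{link.switch} {link.interface} -> {link.gpu_host}: {', '.join(reasons)}")
```

Filtering counters by the job's host list first is what turns fabric-wide telemetry into a job-specific view: the errors on gpu-node-03 are real, but they are not relevant to this job and so are not reported.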
Presented by Meghan Kachhi, Technical Marketing Engineering Technical Leader, Cisco, and Richard Licon, Principal Technical Marketing Engineer, Cisco. Recorded live at AI Infrastructure Field Day in Santa Clara on January 28th, 2026. Watch the entire presentation at https://techfieldday.com/appearance/cisco-data-center-networking-presents-at-ai-infrastructure-field-day/ or visit https://techfieldday.com/event/aiifd4/ or https://www.cisco.com/ for more information.