Ethernet and its Evolution to Support Nokia AI Networking
25m
Ethernet continues to evolve to meet the performance and scaling demands of modern AI networking architectures, progressing from RoCEv2 toward innovations driven by the Ultra Ethernet Consortium (UEC). This presentation discusses these requirements and introduces UEC Specification 1.0, with a focus on scale-out AI designs and the core philosophies shaping its development. Key Ethernet capabilities defined in UEC 1.0, both already implemented and forthcoming, are highlighted to show how Ethernet is being optimized for large-scale AI workloads. Alfred Nothaft explains that the primary challenge in AI fabrics is congestion management, particularly during the synchronization phases of training, when thousands of GPUs simultaneously attempt to exchange massive amounts of gradient data. While legacy tools like Explicit Congestion Notification (ECN) and Priority Flow Control (PFC) provide basic notification and pause mechanisms, they are often insufficient for the speed and scale of current AI clusters.
The move toward UEC 1.0 represents a fundamental shift from network-centric congestion control to an end-node-centric philosophy. Under the RoCEv2 model, the network infrastructure is largely responsible for managing traffic flows and reacting to congestion. In contrast, UEC shifts the intelligence to the Network Interface Card (NIC) at the GPU endpoint. This allows for more granular, per-packet load balancing rather than traditional flow-based hashing, enabling the NIC to "spray" traffic across multiple paths and dynamically adjust based on real-time telemetry. Furthermore, the UEC transport (UET) is designed to be connectionless and includes native, hardware-level security and encryption from the outset, addressing data sovereignty and privacy concerns that were previously overlooked in backend fabrics.
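The contrast between flow-based hashing and per-packet spraying can be illustrated with a small sketch. This is not code from the UEC specification or from Nokia: the path count, the 5-tuple values, and the function names are hypothetical, and plain round-robin stands in for the telemetry-driven path selection a real NIC would perform.

```python
import zlib
from collections import Counter

NUM_PATHS = 4  # hypothetical number of equal-cost paths through the fabric


def flow_hash_path(flow, num_paths=NUM_PATHS):
    """Flow-based hashing: every packet of a flow maps to the same path."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_paths


def spray_paths(num_packets, num_paths=NUM_PATHS):
    """Per-packet spraying, with round-robin as a stand-in for adaptive selection."""
    return [i % num_paths for i in range(num_packets)]


def path_load(paths, num_paths=NUM_PATHS):
    """Count how many packets landed on each path."""
    counts = Counter(paths)
    return [counts.get(p, 0) for p in range(num_paths)]


# One large flow, as in a gradient exchange (illustrative 5-tuple values).
flow = ("10.0.0.1", "10.0.0.2", 4791, 4791, "UDP")
packets = 1000

hashed_load = path_load([flow_hash_path(flow)] * packets)
sprayed_load = path_load(spray_paths(packets))

print("flow-hashed load per path:", hashed_load)
print("sprayed load per path:   ", sprayed_load)
```

With hashing, the single large flow occupies one path while the others sit idle; with spraying, the same packets are spread evenly. Spraying delivers packets out of order, which is one reason the reordering and congestion logic moves to the endpoint NIC.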
UEC 1.0 introduces several sophisticated mechanisms to minimize job completion times. These include packet trimming, which truncates a packet to its header under congestion instead of dropping it, so that the source is signaled without losing the stream's context, and advanced in-band telemetry for precise congestion signaling. The specification also features link-layer retransmission to quickly recover from localized bit errors and credit-based flow control to meter traffic before it ever saturates the fabric. By leveraging Ethernet's vast ecosystem and rapid bandwidth scaling, with link speeds doubling every two years toward 1.6 terabits per second, Nokia and the UEC aim to provide a highly flexible, vendor-neutral alternative to proprietary interconnects, supporting everything from local scale-out clusters to geo-distributed scale-across environments.
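The idea behind credit-based flow control can be sketched as a toy simulation. Everything here is an assumption made for illustration (the class names, the buffer size, the per-tick rates); it does not reproduce the mechanism as defined in UEC 1.0. The sender starts with one credit per receiver buffer slot, spends a credit for each packet sent, and regains credits only as the receiver drains its buffer.

```python
from collections import deque


class Receiver:
    """Receiver with a fixed number of buffer slots (hypothetical model)."""

    def __init__(self, buffer_slots):
        self.buffer_slots = buffer_slots
        self.queue = deque()

    def accept(self, packet):
        if len(self.queue) >= self.buffer_slots:
            raise OverflowError("buffer overrun: sender ignored its credits")
        self.queue.append(packet)

    def drain(self, n):
        """Process up to n packets and return the number of credits freed."""
        freed = 0
        while self.queue and freed < n:
            self.queue.popleft()
            freed += 1
        return freed


class Sender:
    """Sender that transmits only while it holds credits."""

    def __init__(self, initial_credits):
        self.credits = initial_credits
        self.sent = 0
        self.stalled = 0

    def try_send(self, receiver, packet):
        if self.credits == 0:
            self.stalled += 1  # held back at the source, not dropped in the fabric
            return False
        self.credits -= 1
        receiver.accept(packet)
        self.sent += 1
        return True


def simulate(ticks, buffer_slots=4, offered_per_tick=3, drained_per_tick=2):
    rx = Receiver(buffer_slots)
    tx = Sender(initial_credits=buffer_slots)
    max_depth = 0
    for t in range(ticks):
        for i in range(offered_per_tick):
            tx.try_send(rx, (t, i))
        max_depth = max(max_depth, len(rx.queue))
        tx.credits += rx.drain(drained_per_tick)  # credits return to the sender
    return tx.sent, tx.stalled, max_depth


sent, stalled, max_depth = simulate(ticks=10)
print(f"sent={sent} stalled={stalled} max_queue_depth={max_depth}")
```

Although the sender offers more traffic than the receiver can drain, the queue never exceeds the buffer size: excess packets wait at the sender rather than overflowing downstream, which is the sense in which traffic is metered before it saturates the fabric.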
Presented by Alfred Nothaft, Senior Director SRPG Product Management. Recorded live at Networking Field Day 40 in San Jose on April 8, 2026. Watch the entire presentation at https://techfieldday.com/appearance/nokia-presents-at-networking-field-day-40/ or visit https://TechFieldDay.com/event/nfd40 or https://Nokia.com/ for more information.