AI’s Dominance in Cloud and Communications Infrastructure Trends in 2024
Among the many trends shaping cloud and communications infrastructure in 2024, Artificial Intelligence (AI) stands out. In networking especially, AI is set to change how infrastructure is built to support AI-driven applications.
AI workloads place demands on infrastructure that differ from those of conventional cloud workloads. Training large language models (LLMs) and similar applications requires ultra-low latency and exceptionally high bandwidth.
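To make the latency and bandwidth requirement concrete, here is a rough back-of-the-envelope sketch in Python. It assumes gradients are synchronized with a ring all-reduce and uses a simple cost model in which each step costs one hop latency plus the time to send one chunk. The function name, the model size, the GPU count, and the link speeds are all hypothetical values chosen for illustration, not figures from any vendor.

```python
def ring_allreduce_seconds(payload_bytes: float, num_gpus: int,
                           link_gbps: float, hop_latency_s: float) -> float:
    """Estimate one ring all-reduce with a simple latency + bandwidth model."""
    if num_gpus < 2:
        return 0.0  # nothing to synchronize with a single GPU
    steps = 2 * (num_gpus - 1)              # reduce-scatter, then all-gather
    chunk_bytes = payload_bytes / num_gpus  # each step sends one chunk per GPU
    link_bytes_per_s = link_gbps * 1e9 / 8  # convert Gb/s to bytes per second
    return steps * (hop_latency_s + chunk_bytes / link_bytes_per_s)


# Hypothetical workload: 7e9 parameters, 2-byte gradients, 64 GPUs.
payload = 7e9 * 2
for gbps in (100, 400):
    for latency_us in (2, 20):
        t = ring_allreduce_seconds(payload, 64, gbps, latency_us * 1e-6)
        print(f"{gbps} Gb/s, {latency_us} us per hop: {t:.3f} s per all-reduce")
```

Under these assumptions the bandwidth term dominates for a payload this large, while per-hop latency matters more as messages get smaller or the number of steps grows. Real training systems overlap communication with computation and use other collective algorithms, so this is only an order-of-magnitude guide.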
Generative AI (GenAI), which generates various outputs like text, images, and sounds from natural language queries, is propelling a shift towards highly distributed and accelerated computing platforms. These novel environments demand a sophisticated and robust foundational infrastructure that encompasses a broad spectrum of functionalities, ranging from chips to specialized networking components to distributed high-performance computing systems.
This heightened emphasis on networking has made it a pivotal component of the “AI stack.” Industry leaders such as Cisco have reflected this shift in their marketing and investor communications. Nvidia and Arista Networks have drawn substantial investor attention for their broad networking portfolios: Nvidia with its BlueField networking platform, and Arista as a supplier of key networking to AI providers such as Microsoft. The market also features several interesting private companies, discussed below.
The influence of AI on networking manifests in two primary dimensions: “Networking for AI” and “AI for Networking.” Organizations must develop infrastructure optimized for AI while also integrating AI into their existing infrastructure to automate and enhance operational efficiency.
In essence, AI pervades nearly every facet of cloud infrastructure while serving as the linchpin for a new era of computational and networking paradigms.
Network Infrastructure Tailored for AI
The development of infrastructure to support AI services is a complex endeavor, especially within the realm of networking. It demands substantial investments and meticulous engineering to minimize latency and maximize connectivity, rendering traditional enterprise and cloud infrastructure comparatively rudimentary.
According to Shekar Ayyar, CEO of cloud-native networking firm Arrcus, customers are increasingly exploring ways to interconnect multiple AI clusters, extending them to inference nodes and edges. This necessitates a fundamental transformation of the infrastructure stack to accommodate LLMs for GenAI. Notably, Nvidia stands out as a dominant player in this space, offering a comprehensive infrastructure stack for AI encompassing software, chips, data processing units (DPUs), SmartNICs, and networking solutions.
A pertinent debate concerns the role of InfiniBand, a specialized high-bandwidth technology commonly used in AI systems, versus the expanding adoption of Ethernet. Nvidia leads in InfiniBand, but it has also built Ethernet-based solutions. Ethernet’s appeal lies in its cost-effectiveness, although it requires software optimizations and integration with SmartNICs and DPUs to serve AI workloads. The Ultra Ethernet Consortium, whose members include industry giants such as Arista, Broadcom, Cisco, HPE, Microsoft, and Intel, is actively targeting this market segment. Private companies such as Arrcus and Enfabrica have also joined the consortium.
Prominent Startups Focusing on AI Networking
The emergence of Ethernet-based networking solutions as an alternative to InfiniBand presents numerous opportunities for nascent companies. Simultaneously, specialized AI service providers are carving a niche in constructing AI-optimized cloud environments.
Here are some noteworthy private enterprises making waves in this domain:
- Arrcus: Offers Arrcus Connected Edge for AI (ACE-AI), a solution leveraging Ethernet to support AI/ML workloads within datacenter clusters processing LLMs. Targeting communications service providers, enterprises, and hyperscalers, Arrcus provides a software-based approach to flexibly network compute resources for AI infrastructure, circumventing the constraints of switching hardware. Recently, Arrcus joined the Ultra Ethernet Consortium, focusing on high-performance Ethernet solutions for AI.
- DriveNets: Presents a Network Cloud-AI solution that deploys a Distributed Disaggregated Chassis (DDC) approach to interconnect GPUs from various brands in AI clusters via Ethernet. This highly scalable platform serves as a viable alternative to InfiniBand, showcasing improved job completion times in AI training clusters.
- Enfabrica: A startup founded in 2020, Enfabrica has developed an accelerated compute fabric switch (ACF-S) that streamlines AI processing by enhancing connections between network components and AI systems, thereby reducing latency and TCO for AI systems. Enfabrica’s impressive investor lineup underscores its potential to disrupt the AI networking landscape.
AI-Driven Observability and Automation
AI’s impact extends beyond the infrastructure itself to the way infrastructure tools are used, particularly for automation. Observability, the collection and analysis of data from IT systems, is being transformed by AI.
Leading the charge are companies like Kentik and Selector, leveraging AI and machine learning to monitor and analyze IT infrastructure data for capacity planning, cost management, and troubleshooting. Additionally, the integration of WebAssembly (Wasm) holds promise in streamlining cloud application deployment, with companies like Fermyon at the forefront of leveraging Wasm for enhanced efficiency.
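As a simple illustration of the kind of analysis applied to infrastructure telemetry, the Python sketch below flags link-utilization samples that deviate sharply from a rolling baseline. It is a generic rolling z-score check, not the method used by Kentik, Selector, or any other vendor; the function name, the window size, the threshold, and the sample data are all hypothetical.

```python
from statistics import mean, stdev


def flag_anomalies(samples, window=12, threshold=3.0, min_sigma=1.0):
    """Return indices of samples far from the mean of the preceding window."""
    flagged = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        # Floor the deviation so a flat baseline does not flag tiny changes.
        sigma = max(stdev(history), min_sigma)
        if abs(samples[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical link utilization in percent, one sample per five minutes.
utilization = [41, 43, 40, 42, 44, 41, 43, 42, 40, 44, 43, 41, 42, 93, 43, 41]
print(flag_anomalies(utilization))  # prints [13], the spike to 93 percent
```

Production observability platforms typically combine many signals and use more sophisticated models, but the core idea of comparing new telemetry against a learned baseline is the same.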
AI for Multicloud Networking
The rise of AI is fueling the demand for multicloud networking, necessitating seamless data exchange across diverse cloud environments. This trend also underscores the growing significance of edge data collection.
Networking firms like Aviatrix and Itential are pivotal players in this landscape, offering solutions for secure and integrated multicloud connectivity. AI’s role in optimizing networking operations and enhancing security features is paramount, as highlighted by industry experts.
In conclusion, AI’s pervasive influence on networking and infrastructure marks a defining theme for 2024, with industry players aligning their technological prowess to cater to this transformative trend. While the hype surrounding AI may subside, the enduring impact on networking and infrastructure deployments is poised to shape the future landscape significantly.