With the debut of the Cloud AI 100 Ultra, Qualcomm has expanded its presence in AI inference processing, building on its initial foray into cloud AI accelerators.
While top-tier system makers such as Lenovo, Hewlett Packard Enterprise (HPE), Inventec, Foxconn, Gigabyte, and Asus have long offered servers built around Qualcomm’s Cloud AI 100 accelerator family, the family is now gaining traction in the public cloud arena as well.
Amazon Web Services (AWS) recently introduced DL2q, its first Qualcomm-accelerated instance type, built around the Qualcomm Cloud AI 100. Although the accelerator is versatile enough for general inference applications, AWS emphasizes its suitability for developing advanced driver-assistance systems (ADAS) and related automotive applications, an area where Qualcomm is rapidly solidifying its foothold.
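For readers who want to experiment, the sketch below shows one way to launch a DL2q instance programmatically with boto3. The AMI ID and key pair are placeholders, and dl2q.24xlarge is the instance size AWS announced for this family; treat this as a starting point, not a complete deployment recipe.

```python
# Minimal sketch: launching an AWS DL2q (Qualcomm Cloud AI 100) instance.
# The AMI ID, key pair, and region are placeholders -- substitute your own.
import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder: pick a DL2q-compatible AMI
    InstanceType="dl2q.24xlarge",     # the Cloud AI 100-accelerated instance size
    KeyName="my-key-pair",            # placeholder key pair name
    MinCount=1,
    MaxCount=1,
)

print(response["Instances"][0]["InstanceId"])
```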
Qualcomm’s Cloud AI 100 Overview
In 2020, Qualcomm launched the Cloud AI 100 accelerator, specifically designed to enhance AI inference processing speed in cloud computing environments.
The Cloud AI 100 is tailored for inference, the phase of an AI application in which a trained model processes new data. Inference is crucial for tasks requiring rapid responses, such as speech recognition, language translation, image analysis, and real-time IoT data processing.
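To make the training/inference distinction concrete, here is a minimal, framework-level sketch of the inference step using ONNX Runtime on a placeholder model file. On Cloud AI 100 hardware the model would first be compiled through Qualcomm’s own toolchain (not shown here), but the load-model, feed-new-data, read-predictions pattern is the same.

```python
# Minimal sketch of the inference phase: a trained model answering new inputs.
# "model.onnx" is a placeholder; on Cloud AI 100 the model would first be
# compiled with Qualcomm's toolchain, but the overall pattern is identical.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx")  # load an already-trained model
input_name = session.get_inputs()[0].name

new_data = np.random.rand(1, 3, 224, 224).astype(np.float32)  # e.g., one image
outputs = session.run(None, {input_name: new_data})           # the inference step

print(outputs[0].shape)  # the model's predictions for the new input
```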
This accelerator strikes a balance between efficiency and performance, delivering the necessary horsepower for demanding AI workloads while showcasing a compelling total cost of ownership (TCO) proposition.
MLPerf 3.1 Advancements
In September 2023, Qualcomm’s Cloud AI 100 made significant strides when MLCommons released the MLPerf Inference v3.1 benchmark results.
The results highlighted substantial gains in performance and power efficiency, along with reduced latencies, for natural language processing (NLP) and computer vision networks running on the Qualcomm Cloud AI 100. Among the strongest showings was a 2U datacenter server platform equipped with 16 Qualcomm Cloud AI 100 PCIe Pro accelerators.
The MLPerf Inference v3.1 results underscored the Cloud AI 100’s efficacy in applications spanning edge and data center domains, showcasing its strength in key metrics such as inferences per second and inferences per second per watt (I/S/W).
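As a back-of-the-envelope illustration of how the perf-per-watt metric is computed, the sketch below simply divides throughput by sustained power. The numbers are hypothetical placeholders, not actual MLPerf submission figures.

```python
# Hypothetical I/S/W (inferences per second per watt) calculation.
# The throughput and power figures are illustrative placeholders,
# not actual MLPerf v3.1 submission results.
def inferences_per_second_per_watt(throughput_ips: float, power_watts: float) -> float:
    """Perf-per-watt: measured inference throughput divided by sustained power."""
    return throughput_ips / power_watts

throughput = 20_000.0  # hypothetical inferences/second for a full server
power = 1_200.0        # hypothetical sustained wall power in watts

print(f"{inferences_per_second_per_watt(throughput, power):.1f} I/S/W")  # -> 16.7 I/S/W
```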
Introduction of Cloud AI 100 Ultra
In November 2023, Qualcomm expanded its Cloud AI 100 lineup with the launch of the Qualcomm Cloud AI 100 Ultra, catering specifically to the demands of generative AI and large language models (LLMs).
The Ultra variant builds on the efficiency of its predecessors: it can handle models of up to 100 billion parameters on a single 150-watt card and up to 175 billion parameters with dual cards, and it scales to still larger models by combining multiple AI 100 Ultra units.
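A rough capacity calculation shows why low-bit quantization matters for figures like these. The sketch below estimates weight storage at several precisions; the precision choices are our own illustrative assumptions, not disclosed Qualcomm specifications.

```python
# Rough sizing of LLM weight storage at different numeric precisions.
# The precision assumptions are illustrative, not disclosed Qualcomm specs,
# and the estimate ignores activations and KV-cache memory.
def weight_footprint_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate storage for model weights alone, in gigabytes."""
    return params_billions * 1e9 * bytes_per_param / 1e9

for params in (100, 175):
    for label, bpp in (("FP16", 2.0), ("INT8", 1.0), ("INT4", 0.5)):
        print(f"{params}B @ {label}: ~{weight_footprint_gb(params, bpp):,.0f} GB")
# 100B weights at FP16 would need ~200 GB, but only ~50 GB at INT4,
# which is why quantization is key to fitting such models on one card.
```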
Despite its high performance, the Cloud AI 100 Ultra maintains the power efficiency characteristic of the family, crucial for cost reduction in data centers and sustainability in AI operations.
Conclusion
Qualcomm’s strategic investment in AI inference technology, exemplified by the Cloud AI 100 series, positions the company at the forefront of the evolving AI landscape, particularly in advancing AI capabilities to the edge. The introduction of the Cloud AI 100 Ultra further solidifies Qualcomm’s commitment to addressing the needs of complex AI tasks while ensuring operational efficiency.
By combining high-performance, energy-efficient inference capabilities with a comprehensive IP portfolio, Qualcomm distinguishes itself in the competitive AI market landscape, offering a compelling alternative to industry giants like AWS, Google, and Microsoft, as well as NVIDIA, AMD, and Intel.
Qualcomm’s Cloud AI 100 product line not only showcases its prowess in high-end AI inference markets but also underscores its potential to revolutionize AI processing across diverse sectors, from healthcare to automotive and beyond, setting a benchmark that competitors strive to match.