Cloudflare, the leading web security and content delivery network provider, is now offering developers access to AI through its expanded edge network. The network includes GPU-powered infrastructure and model-serving capabilities, allowing developers to leverage cutting-edge foundation models. Accessing Cloudflare's hosted models is as simple as making a REST API call.
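As a sketch of that REST call, the snippet below targets the `/ai/run/{model}` endpoint from Cloudflare's documentation; the account ID, API token, and exact response shape are placeholders and assumptions to verify against current docs.

```javascript
// Hedged sketch: invoking a Workers AI model over Cloudflare's REST API.
// ACCOUNT_ID and API_TOKEN are placeholders you must supply yourself.
const ACCOUNT_ID = "your-account-id";
const API_TOKEN = "your-api-token";

// Build the request URL for a given model; the /ai/run/{model} path
// follows Cloudflare's published REST scheme.
function aiRunUrl(accountId, model) {
  return `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${model}`;
}

// POST the input to the model endpoint and return the parsed JSON.
async function runModel(model, input) {
  const res = await fetch(aiRunUrl(ACCOUNT_ID, model), {
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(input),
  });
  return res.json();
}

// Example (requires valid credentials):
// runModel("@cf/meta/llama-2-7b-chat-int8", { prompt: "What is a CDN?" })
//   .then(console.log);
```

Because the call is plain HTTPS, the same request works from any language with an HTTP client.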
In 2017, Cloudflare introduced Workers, an edge computing platform that lets developers run code across Cloudflare's global edge locations. With a Worker, developers can manipulate HTTP requests and responses, issue parallel subrequests, and even respond directly from the edge. The API used by Cloudflare Workers follows the W3C Service Workers standard.
Cloudflare has now extended Workers with AI functionality through three new components:
- Workers AI provides serverless, GPU-powered inference on NVIDIA GPUs across Cloudflare's global network. Developers can focus on their applications rather than infrastructure management, paying only for the resources they consume.
- Vectorize, a vector database, supports use cases that combine pre-trained models with custom data by providing efficient, low-cost vector indexing and storage.
- AI Gateway lets organizations cache responses, control costs, and observe their AI deployments, regardless of where the models are hosted.
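Inside a Worker, these components come together through bindings. The sketch below assumes an AI binding named `env.AI` and the Llama 2 model identifier used in Cloudflare's examples; treat both, and the response shape, as assumptions to check against current documentation.

```javascript
// Pure helper: wrap a user prompt in a chat-style message list.
function buildMessages(prompt, system = "You are a helpful assistant.") {
  return [
    { role: "system", content: system },
    { role: "user", content: prompt },
  ];
}

// Hedged sketch of a Worker calling a hosted model through the AI
// binding; in a real Worker this object would be the module's default
// export, and the binding name is configured in wrangler.toml.
const worker = {
  async fetch(request, env) {
    const { prompt } = await request.json();
    // Run inference on Cloudflare's GPUs; no model hosting or GPU
    // management on the developer's side.
    const answer = await env.AI.run("@cf/meta/llama-2-7b-chat-int8", {
      messages: buildMessages(prompt),
    });
    return Response.json(answer);
  },
};
```

The binding keeps the inference call in-network, while the REST API remains available for clients outside Workers.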
Cloudflare has collaborated with industry players such as NVIDIA, Microsoft, Hugging Face, Databricks, and Meta on GPU infrastructure and foundation models. The platform also hosts embedding models that translate text into vectors; Vectorize stores, indexes, and queries those vectors, improving the relevance of large language model (LLM) responses and reducing hallucinations. AI Gateway adds observability, enforces rate limits, and caches repeated queries, improving application performance while reducing costs.
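A sketch of that embed-then-query flow follows; the embedding model name, the binding names (`env.AI`, `env.VECTORIZE`), and the response shape are assumptions based on Cloudflare's documentation at launch. The cosine-similarity helper illustrates the kind of metric a vector index typically ranks results by.

```javascript
// Pure helper: cosine similarity between two equal-length vectors,
// a common ranking metric for vector search.
function cosineSimilarity(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Inside a Worker with AI and Vectorize bindings (hypothetical names):
async function findRelated(env, text) {
  // Translate the text into a vector with a hosted embedding model.
  const { data } = await env.AI.run("@cf/baai/bge-base-en-v1.5", {
    text: [text],
  });
  // Ask the Vectorize index for the three closest stored vectors.
  return env.VECTORIZE.query(data[0], { topK: 3 });
}
```

Retrieved neighbors can then be fed back into an LLM prompt, which is how the vector store helps ground responses.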
The Workers AI model catalog hosts popular foundation models, including Meta's Llama 2, Stable Diffusion XL, and Mistral 7B. Cloudflare uses ONNX Runtime, Microsoft's cross-platform inference engine, to run models efficiently in resource-constrained environments; Microsoft uses the same technology to run foundation models in Windows.
Developers can write AI inference code in JavaScript and deploy it to Cloudflare's edge network, or invoke models through a simple REST API from any programming language. This makes it straightforward to integrate AI into web, desktop, and mobile applications.
Launched in September 2023 with initial inference capabilities, Workers AI is slated to expand to 100 cities worldwide by the end of the year, with near-universal coverage to follow.
By combining GPU-powered Workers AI, the Vectorize vector database, and AI Gateway for deployment management, Cloudflare stands out among CDN and network providers in bringing AI capabilities to the edge. The platform offers a broad model catalog and the efficiency of ONNX Runtime, backed by partnerships with industry leaders such as Meta and Microsoft.