**Simplify Private AI Model Deployments with OctoStack by OctoAI**

OctoAI (formerly known as OctoML) has introduced OctoStack, a comprehensive solution for deploying generative AI models within a company’s private infrastructure, whether on-premises or in a virtual private cloud from providers such as AWS, Google Cloud, Microsoft Azure, CoreWeave, Lambda Labs, Snowflake, and others.

Initially, OctoAI primarily focused on optimizing models for enhanced performance. Leveraging the Apache TVM machine learning compiler framework, the company later introduced its TVM-as-a-Service platform. This offering evolved into a complete model-serving solution that merged optimization capabilities with a DevOps platform. With the emergence of generative AI, OctoAI launched the fully managed OctoAI platform to assist users in serving and refining existing models. OctoStack essentially represents an extension of this platform tailored for private deployments.

Image Credits: OctoAI

Today, OctoAI’s CEO and co-founder, Luis Ceze, revealed that the platform counts over 25,000 developers and hundreds of paying customers using the service in production. While many of these customers are GenAI-native companies, there is a substantial market of traditional enterprises looking to adopt generative AI. Recognizing this opportunity, OctoAI is now targeting those enterprises with OctoStack.

Ceze highlighted the enterprise market’s transition from experimental phases to full-scale deployments. Enterprises are increasingly concerned about data security and are reluctant to send sensitive data to third-party APIs. Moreover, many enterprises have invested in their own compute resources, leading them to seek deployment options that give them greater control over their AI models.

OctoAI has been developing the architecture to support both its SaaS platform and self-hosted deployments. While the SaaS platform is optimized for Nvidia hardware, OctoStack is compatible with a broader range of hardware, including AMD GPUs and AWS’s Inferentia accelerator. This hardware diversity presents optimization challenges, but it also plays to OctoAI’s strengths.

Deploying OctoStack is streamlined for enterprises, with OctoAI providing ready-to-deploy containers and associated Helm charts. Developers interact with the same API regardless of whether they are targeting the SaaS product or an OctoStack deployment in their private cloud environment.
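To illustrate that single-API idea, here is a minimal sketch of a client that only swaps the base URL when moving from the hosted service to a private OctoStack install. The endpoint URLs, model name, request shape, and environment variable are assumptions for illustration, not OctoAI’s documented API.

```python
import os
import requests

# Hypothetical endpoints -- illustrative placeholders, not OctoAI's actual addresses.
SAAS_BASE_URL = "https://api.example-octoai.cloud/v1"
PRIVATE_BASE_URL = "https://octostack.internal.mycorp.example/v1"


def generate(prompt: str, base_url: str, model: str = "llama-3-8b-instruct") -> str:
    """Send a chat-style completion request; only base_url changes per target."""
    resp = requests.post(
        f"{base_url}/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OCTO_API_TOKEN']}"},
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 256,
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Point at the SaaS platform or a private OctoStack install by flipping the URL.
    print(generate("Summarize our Q1 security policy changes.", PRIVATE_BASE_URL))
```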

The primary use case for enterprises involves text summarization and retrieval-augmented generation (RAG) to let users interact with internal documents. Some companies are customizing these models to operate on their internal code bases, enabling them to run their own code generation models, akin to GitHub’s Copilot Enterprise offering.
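As a rough illustration of that retrieval-augmented pattern, the sketch below ranks internal documents by a naive keyword-overlap score and packs the top matches into a prompt. The toy corpus, scoring, and prompt format are simplified assumptions, not OctoAI’s implementation; the resulting prompt would then be sent to a model served inside the private deployment.

```python
from collections import Counter

# Toy corpus standing in for a company's internal documents.
DOCUMENTS = {
    "vacation-policy.md": "Employees accrue 1.5 vacation days per month of service.",
    "expense-policy.md": "Expenses above $500 require director approval before purchase.",
    "oncall-guide.md": "The on-call engineer rotates weekly and owns incident triage.",
}


def score(query: str, doc: str) -> int:
    """Naive relevance score: count of shared lowercase tokens."""
    q = Counter(query.lower().split())
    d = Counter(doc.lower().split())
    return sum(min(q[t], d[t]) for t in q)


def build_rag_prompt(question: str, top_k: int = 2) -> str:
    """Pick the top_k most relevant documents and prepend them to the question."""
    ranked = sorted(DOCUMENTS.items(), key=lambda kv: score(question, kv[1]), reverse=True)
    context = "\n\n".join(f"[{name}]\n{text}" for name, text in ranked[:top_k])
    return f"Answer using only the context below.\n\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    # The prompt would be sent to a model served by OctoStack,
    # e.g. via the generate() sketch shown earlier.
    print(build_rag_prompt("How many vacation days do I get each month?"))
```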

OctoStack lets enterprises run these models in a secure environment under their own control, making it easier to move the technology into production for their employees and customers. Dali Kaafar, founder and CEO of Apate AI, emphasized the importance of running customized models in a flexible, scalable, and secure environment, which he said OctoStack provides while meeting the performance and security requirements of Apate AI’s use case.
