IBM is broadening Ceph’s block and file capabilities and positioning it as a backend data store for AI workloads behind the Storage Scale parallel file system. Ceph is open-source, scale-out storage software that provides file, block, and object interfaces on top of an object store and is designed to be self-healing and self-managing. IBM gained Ceph through its $34 billion acquisition of Red Hat in 2019. Over a year ago, IBM moved the Ceph product from its Red Hat division to its storage unit and rebranded it as Storage Ceph. IBM Storage General Manager Denis Kennelly recently outlined IBM’s strategy for Ceph and its key focus areas.
Kennelly, who oversees all IBM storage products, said the company is gaining share in the storage hardware market, particularly with its high-end DS8000 arrays and FlashSystem all-flash arrays. Asked about Ceph sales, he confirmed that they are growing. IBM Storage’s primary focus areas are hybrid cloud, AI, and data recovery and resilience. Ceph plays a vital role in the hybrid cloud and AI areas by giving large language model processing systems access to unstructured data.
In data recovery and resilience, IBM’s position has been strengthened by the Cohesity-Veritas acquisition, which benefits its Storage Defender product, developed in collaboration with Cohesity. Kennelly stressed that the backup market continues to consolidate, driven by strategic partnerships and acquisitions.
On Ceph’s market role, Kennelly said it aligns with the increasing demand for software-defined storage. He pointed to Red Hat’s integration of Ceph with OpenShift and containers, and to IBM’s efforts to extend that integration into a comprehensive software-defined storage stack that runs on standard hardware from various vendors.
IBM has recently improved Ceph’s block storage capabilities by adding NVMe/TCP support and improving usability. Kennelly said that adding storage capacity in a Ceph environment is easier and scales better than in traditional SAN setups, which matters especially for AI projects that need substantial storage. He cited the collaboration between the WatsonX team and the Ceph team as evidence of IBM’s commitment to using Ceph for its generative AI platform, WatsonX.
Asked whether GPUDirect support might be added to Ceph, Kennelly expressed interest in exploring the option and noted that IBM Storage can already deliver data efficiently to GPU servers. He highlighted the integration of Storage Scale with Ceph, which uses Storage Scale’s GPUDirect support for accelerated data processing. Kennelly also praised Storage Scale’s AFM (Active File Management) layer, which provides scalable file system caching and allows data to be managed across distributed clusters and sites worldwide.
IBM’s ongoing benchmarking of Storage Scale has shown promising results, which the company expects to publish later this year. Combining Scale with Ceph is intended to improve data accessibility and query performance, in line with IBM’s strategy of using Storage Scale for fast, efficient data retrieval. Kennelly stressed the value of Storage Scale for quick queries, pointing to its performance advantages over traditional NFS solutions.
Kennelly closed on an optimistic note about the transformative potential of AI and about the role Ceph will play as a foundational data store, working alongside Storage Scale to support AI initiatives.