Stable Cascade: Efficient Training and Enhanced Prompt Compliance
By Emilia David, an AI journalist with expertise in technology, finance, and the economy.
Stability AI has introduced Stable Cascade, its latest image generation model, promising more speed and power than its predecessor, Stable Diffusion, which underpins many text-to-image AI applications.
Stable Cascade can generate images quickly, produce variations of an image it has created, and increase the resolution of existing pictures. Its editing features include inpainting and outpainting, which change only a targeted part of an image, and Canny edge support, which creates a new image from the edges of an existing one.
The new model is available on GitHub for academic research but not for commercial use. It arrives as tech giants such as Google and Apple release image generation models of their own.
Unlike Stability’s flagship Stable Diffusion models, Stable Cascade is not one large model but three separate models built on the Würstchen architecture. The first, stage C, compresses a text prompt into a compact latent representation, which stages A and B then decode into the finished image.
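The staged hand-off described above can be pictured with a toy sketch. This is only a schematic of the data flow: the three functions are hypothetical placeholders rather than real models, the grid sizes (24, 256, 1024) are illustrative assumptions, and the B-before-A order follows the public Würstchen description rather than anything stated here.

```python
# Illustrative only: the stage functions are hypothetical placeholders, and
# the grid sizes (24, 256, 1024) are assumptions chosen for this example.
from typing import NamedTuple


class Grid(NamedTuple):
    stage: str
    side: int  # width and height of a square grid

    @property
    def positions(self) -> int:
        """Number of spatial positions in the grid."""
        return self.side * self.side


def stage_c(prompt: str) -> Grid:
    """Turn the prompt into a small latent grid (placeholder, no real model)."""
    if not prompt:
        raise ValueError("prompt must not be empty")
    return Grid("C", 24)


def stage_b(latent: Grid) -> Grid:
    """Expand the small latent into a larger intermediate latent (placeholder)."""
    return Grid("B", 256)


def stage_a(latent: Grid) -> Grid:
    """Decode the intermediate latent into full-resolution pixels (placeholder)."""
    return Grid("A", 1024)


def run_cascade(prompt: str) -> list[Grid]:
    """Run the stages in order: C first, then B, then A."""
    c = stage_c(prompt)
    b = stage_b(c)
    a = stage_a(b)
    return [c, b, a]


if __name__ == "__main__":
    for grid in run_cascade("a lighthouse at dusk"):
        print(f"stage {grid.stage}: {grid.side}x{grid.side} = {grid.positions} positions")
```

Under these assumed sizes, the stage that responds to the prompt works on a grid with far fewer positions than the final image, which is the intuition behind the memory savings.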
Breaking a request into smaller pieces means the model needs less memory, fewer hours of training on scarce GPUs, and less time to run, while also improving both prompt alignment and aesthetic quality. Generating an image took about 10 seconds, compared with 22 seconds for the current SDXL model.
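To put the two timings in proportion, here is a small worked calculation. It uses only the 10-second and 22-second figures quoted above; the helper names are made up for this example.

```python
def speedup(baseline_seconds: float, new_seconds: float) -> float:
    """How many times faster the new timing is than the baseline."""
    return baseline_seconds / new_seconds


def time_saved_fraction(baseline_seconds: float, new_seconds: float) -> float:
    """Fraction of the baseline time that the new timing saves."""
    return (baseline_seconds - new_seconds) / baseline_seconds


SDXL_SECONDS = 22.0     # figure quoted for SDXL
CASCADE_SECONDS = 10.0  # figure quoted for Stable Cascade

if __name__ == "__main__":
    print(f"{speedup(SDXL_SECONDS, CASCADE_SECONDS):.1f}x faster")                  # 2.2x faster
    print(f"{time_saved_fraction(SDXL_SECONDS, CASCADE_SECONDS):.0%} less time")    # 55% less time
```

In other words, the quoted figures amount to roughly a 2.2x speedup, or a bit more than half the generation time saved per image.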
Stability AI, which helped popularize the Stable Diffusion approach, has faced several lawsuits alleging that it trained its models on copyrighted data without permission. Among them is a UK case brought by Getty Images that is scheduled to go to trial this December. The company began offering commercial licenses through a subscription model the previous December to support its ongoing research.