Artificial intelligence (AI) image generators are becoming increasingly powerful, but they rely on heavyweight large-scale models that typically run in the cloud. Now researchers say they have built a new system that can produce high-quality images in about a tenth of the processing steps.
The result, they say, is AI that is safer and more environmentally friendly than models running in power-hungry data centers, yet fast and efficient enough to run locally on a phone or laptop.
They outlined how the new model, SD3.5-Flash, works in a study uploaded to the preprint database arXiv on Sept. 25, 2025, and announced in a March 4 statement that Lenovo has licensed the model for integration into its upcoming on-device AI platform, meaning the system will soon be built into smartphones, tablets and laptops.
The goal is simple but ambitious: bring powerful generative AI out of remote data centers and onto the devices people actually use. That shift has environmental and privacy implications, and it could also make AI-based image generation faster than ever before.
Why most AI image generators are slow
Most modern text-to-image systems rely on a technique called diffusion. These AI models start with random noise (essentially a grid of pixels filled with random values) and gradually refine it into an image through a long series of steps.
This process typically takes 30 to 50 iterations to create a finished image, with each step requiring significant computational power. As a result, many popular AI image generation tools run on large clusters of graphics processing units (GPUs) on remote servers via the cloud, rather than locally on mobile phones or laptops.
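The iterative process described above can be sketched with a toy denoising loop. This is a deliberately simplified illustration, not the actual SD3.5 pipeline: the `denoise_step` function and the blank "target" image are stand-ins for a trained neural network predicting how to clean up the noise.

```python
import numpy as np

def denoise_step(image, step, total_steps):
    """One toy denoising iteration: nudge the noisy image toward a
    target (here, a blank canvas standing in for a 'clean' image)."""
    target = np.zeros_like(image)
    # Remove a growing fraction of the remaining noise at each step;
    # the final step removes whatever noise is left.
    alpha = 1.0 / (total_steps - step)
    return image + alpha * (target - image)

def generate(total_steps=50, size=8, seed=0):
    rng = np.random.default_rng(seed)
    image = rng.normal(size=(size, size))  # start from pure random noise
    for step in range(total_steps):        # 30-50 iterations in real systems
        image = denoise_step(image, step, total_steps)
    return image

img = generate()
```

In a real diffusion model, each of those 30 to 50 calls is a full forward pass through a large neural network, which is why the loop is so computationally expensive.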
Although this architecture is well suited to producing high-quality images, it also introduces practical limitations. The process is slow and energy-intensive, and cloud-based tools require users to send a prompt or image to a remote server and wait for a response.
In a new study, scientists set out to address that bottleneck. SD3.5-Flash significantly shortens the generation pipeline. The model can generate images in just four processing steps instead of dozens of iterations, the scientists say.
This is achieved by compressing the diffusion process into a more efficient format while preserving image quality. Essentially, the system learns how to "jump" through the refinement process in larger leaps rather than advancing step by step. Maintaining visual quality while reducing the number of steps is, the researchers note, the central technical challenge.
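The "larger leaps" idea can be sketched with the same kind of toy denoiser: a compressed four-step schedule traverses the same noise-to-image trajectory as a long one, just in bigger jumps per step. This is a hypothetical illustration only; in a real distilled model the big jumps must be learned from the original model, not produced by simply rescaling the schedule.

```python
import numpy as np

def toy_generate(num_steps, seed=0):
    """Toy denoiser: each step removes a fraction of the remaining noise.
    With num_steps=4, each step is a much bigger 'leap' than with 50."""
    rng = np.random.default_rng(seed)
    image = rng.normal(size=(8, 8))  # start from pure noise
    for step in range(num_steps):
        # Fraction of remaining noise removed this step; the final step
        # removes all of it, so fewer steps mean larger per-step jumps.
        image *= 1.0 - 1.0 / (num_steps - step)
    return image

slow = toy_generate(50)  # traditional many-step schedule
fast = toy_generate(4)   # compressed four-step schedule
```

In this linear toy both schedules land on the same result for free; the hard part in practice, and the focus of the study, is training a model whose four learned jumps preserve the visual quality of the original 30- to 50-step trajectory.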
“Our SD3.5-Flash model allows users to create images entirely on-device from text descriptions without exfiltrating any data from the hardware,” said Hmrishav Bandyopadhyay, a postdoctoral fellow at the University of Surrey who developed the model during an internship at Stability AI, in a statement. “Achieving this level of efficiency is technically difficult because the diffusion model must be compressed so that it can be executed in just a few steps while maintaining quality.”
Reducing the number of inference steps means the model requires far fewer computational resources, allowing it to run on consumer hardware.
Improving the privacy, speed, and sustainability of AI
Running generative AI locally rather than in the cloud can offer several benefits. The first is privacy. When AI models run entirely on-device, there is no need to send prompts and generated images to a remote server, reducing the risk of data leakage, interception, or misuse.
The second is speed. With fewer processing steps and no network delays, image generation can be nearly instantaneous.
Finally, there is an environmental perspective. Large-scale cloud AI models consume large amounts of energy and water through data center operations, but lightweight models running locally can significantly reduce those demands.
Yi-Zhe Song, director of the SketchX Lab at the University of Surrey, said the broader aim is to make AI more accessible and practical: “SD3.5-Flash puts powerful creative tools directly into the hands of users, while keeping their data private and reducing the energy demands associated with cloud processing.”
In the study, the team tested SD3.5-Flash against a traditional diffusion pipeline to measure whether the significant reduction in processing steps had an impact on image quality. They evaluated the system using standard benchmarks for generative models, such as image fidelity and how well the output matches text prompts. These metrics are widely used in machine learning research to compare different image generation approaches.
Tests on standard image generation benchmarks show that the model can deliver results comparable to traditional diffusion systems, despite cutting the number of processing steps from roughly 30 to 50 down to just four.
Most notably, this technology is already making its way into real products. Lenovo has licensed the model for integration into its upcoming Personal Ambient Intelligence Platform (called Qira), which aims to bring AI capabilities directly to consumer devices.
This could enable features such as on-device AI image generation on laptops, tablets, and smartphones without the need for an internet connection. In March, the company announced its first set of Qira-compatible devices, including a new concept device, suggesting the new AI system will reach consumers before long.
If the approach succeeds, it could lead to broader changes in how generative AI is delivered: rather than relying on centrally managed infrastructure, future AI tools may increasingly run locally at the edge, built directly into everyday devices. The researchers see this as part of a larger effort to make generative AI more efficient and practical.
Compressing large models without sacrificing quality remains an active area of research, but SD3.5-Flash suggests that the gap between powerful AI systems and consumer hardware may be closing quickly. As companies like Lenovo build the technology into more devices, the next wave of AI creativity tools may live not in the cloud, but in your pocket.