With so much money flowing into AI startups, it’s a good time to be an AI researcher with ideas to test. And if the idea is novel enough, it can be easier to secure the necessary resources as an independent company than inside a large research institute.
That’s the story of Inception, a startup developing diffusion-based AI models that just raised $50 million in seed funding. The round was led by Menlo Ventures, with participation from Mayfield, Innovation Endeavors, Microsoft’s M12 fund, Snowflake Ventures, Databricks Investment, and Nvidia’s venture arm NVentures. Andrew Ng and Andrej Karpathy provided additional angel funding.
The startup is led by Stanford professor Stefano Ermon, whose research focuses on diffusion models, which generate output through iterative refinement rather than word by word. These models power image-based AI systems such as Stable Diffusion, Midjourney, and Sora. Ermon, who has been working on these systems since before the AI boom took off, founded Inception to apply the same approach to a broader range of tasks.
Along with the funding, the company released a new version of its Mercury model, designed for software development. Mercury is already integrated into a number of development tools, including ProxyAI, Buildglare, and Kilo Code. Most importantly, Ermon said the diffusion approach gives Inception’s models an edge on two of the most important metrics: latency (response time) and compute cost.
“These diffusion-based LLMs are much faster and more efficient than what others are building today,” Ermon says. “It’s a completely different approach, and there’s a lot of innovation that can still be done.”
A little background helps in understanding the technical difference. Diffusion models are structurally different from the autoregressive models that dominate text-based AI services. Autoregressive models such as GPT-5 and Gemini work sequentially, predicting each next word or word fragment based on what came before. Diffusion models, best known from image generation, take a more holistic approach, incrementally refining the overall structure of a response until it matches the desired result.
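To make the contrast concrete, here is a toy Python sketch of the two generation schemes. Everything in it is made up for illustration: the tiny vocabulary and the predict_next_token and refine_all_positions stubs stand in for real model calls and are not Inception’s code or any actual API; only the control flow matters.

# Toy sketch of the two generation schemes described above.
import random

VOCAB = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "+", "<mask>"]

def predict_next_token(prefix: list[str]) -> str:
    """Stand-in for one autoregressive forward pass: pick the next token."""
    return random.choice(VOCAB[:-1])

def refine_all_positions(draft: list[str]) -> list[str]:
    """Stand-in for one diffusion refinement pass: update every position at once."""
    return [random.choice(VOCAB[:-1]) if tok == "<mask>" or random.random() < 0.3 else tok
            for tok in draft]

def generate_autoregressive(length: int) -> list[str]:
    # One forward pass per token, each depending on everything generated so far.
    out: list[str] = []
    for _ in range(length):
        out.append(predict_next_token(out))
    return out

def generate_diffusion(length: int, steps: int = 8) -> list[str]:
    # Start from a fully masked draft and iteratively refine the whole sequence.
    draft = ["<mask>"] * length
    for _ in range(steps):
        draft = refine_all_positions(draft)
    return draft

print(generate_autoregressive(8))
print(generate_diffusion(8))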
Conventional wisdom holds that autoregressive models are the right tool for text applications, and that approach has been hugely successful for recent generations of AI models. However, a growing body of research suggests that diffusion models may perform better when a model has to process large amounts of text or operate under data constraints. According to Ermon, those properties become a big advantage when working over large codebases.
The diffusion approach also provides flexibility in how hardware is used, an advantage that grows more important as AI’s infrastructure demands become clear. Where autoregressive models have to execute operations one after another, diffusion models can process many operations simultaneously, allowing significantly lower latency on complex tasks.
“Our benchmark is over 1,000 tokens per second, which is much higher than what’s possible using existing autoregressive technology, because our product is built to be parallel. It’s built to be really, really fast,” Ermon says.
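A back-of-the-envelope sketch shows why parallel refinement changes the latency picture. The per-pass time and the number of refinement steps below are illustrative assumptions, not Inception benchmarks; the point is only that autoregressive latency scales with the number of tokens, while diffusion latency scales with the number of refinement passes.

# Illustrative latency comparison with made-up numbers (not measured figures).
TOKENS = 1_000          # tokens to generate
PASS_MS = 20            # assumed wall-clock time for one model forward pass
DIFFUSION_STEPS = 50    # assumed number of whole-sequence refinement passes

# Autoregressive: one sequential forward pass per token.
ar_latency_s = TOKENS * PASS_MS / 1000

# Diffusion: each pass updates every position in parallel, so latency scales
# with the number of refinement steps rather than the number of tokens.
diff_latency_s = DIFFUSION_STEPS * PASS_MS / 1000

print(f"autoregressive: {ar_latency_s:.1f}s (~{TOKENS / ar_latency_s:.0f} tok/s)")
print(f"diffusion:      {diff_latency_s:.1f}s (~{TOKENS / diff_latency_s:.0f} tok/s)")

Under these assumed numbers the autoregressive loop takes 20 seconds (about 50 tokens per second) while the refinement loop takes 1 second (about 1,000 tokens per second); the real ratio depends entirely on hardware and step counts.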
