Inception, a new company based in Palo Alto, was founded by Stanford computer science professor Stefano Ermon. It has developed a novel AI model based on "diffusion" technology, which it calls a diffusion-based large language model, or DLM for short.
The spotlight in generative AI currently falls on two types of model: large language models (LLMs) and diffusion models. LLMs, built on the transformer architecture, are used primarily for text generation. Diffusion models, which power AI systems such as Midjourney and OpenAI’s Sora, are mainly used to generate images, video, and audio.
Inception’s DLM offers the capabilities of a traditional LLM, including code generation and question answering, but with what the company claims is significantly faster performance and lower computing costs. According to Ermon, the model uses diffusion to overcome the speed limitations inherent in LLMs.
Ermon told TechCrunch that he has long researched how to apply diffusion models to text generation in his Stanford lab. His hypothesis: LLMs are slow because they generate text sequentially, one token at a time, with each token depending on the ones before it. Diffusion models, by contrast, start with a rough estimate of the data they are generating and refine it all at once, which lets them generate and modify large blocks of text in parallel.
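To make the contrast concrete, here is a toy Python sketch of the two decoding patterns. It is not Inception's actual method; the vocabulary, target text, step count, and refinement probability are all invented for illustration, and a real model would predict tokens rather than copy a known answer.

```python
import random

random.seed(0)

VOCAB = ["the", "cat", "sat", "on", "a", "mat"]
TARGET = ["the", "cat", "sat", "on", "a", "mat"]  # stand-in for the text the model converges to

def autoregressive_generate():
    """Sequential decoding: one model call per token, so latency grows
    linearly with the length of the output."""
    out, calls = [], 0
    for i in range(len(TARGET)):
        out.append(TARGET[i])  # each token must wait on all previous ones
        calls += 1
    return out, calls

def diffusion_generate(steps=4, fix_prob=0.7):
    """Diffusion-style decoding: start from a noisy guess over the whole
    block, then refine every position in parallel on each pass."""
    block = [random.choice(VOCAB) for _ in range(len(TARGET))]  # rough first estimate
    calls = 0
    for _ in range(steps):
        # one model call updates ALL positions at once
        block = [TARGET[i] if random.random() < fix_prob else tok
                 for i, tok in enumerate(block)]
        calls += 1
    return block, calls

text, calls = autoregressive_generate()
print(f"autoregressive: {' '.join(text)}  ({calls} model calls)")
text, calls = diffusion_generate()
print(f"diffusion:      {' '.join(text)}  ({calls} model calls)")
```

The toy numbers matter less than the shape of the loops: the autoregressive path makes one call per token, while the diffusion path makes a fixed number of refinement passes over the whole block, which is where the claimed parallelism and speedup would come from.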
After persistent effort, Ermon and one of his students achieved a significant breakthrough, which they detailed in a research paper published last year. Recognizing the potential of the advance, Ermon founded Inception last summer, bringing on board two former students, UCLA professor Aditya Grover and Cornell professor Volodymyr Kuleshov, to co-lead the company.
While Ermon declined to share specifics about Inception’s funding, the Mayfield Fund is understood to have invested. Inception has already secured several customers, including unnamed Fortune 100 companies, by addressing their need for lower AI latency and faster responses.
"What we found is that our models can leverage the GPUs much more efficiently," Ermon stated, referring to the crucial chips used for running models in production. "I think this is a big deal. This is going to change the way people build language models."
Inception provides an API along with on-premises and edge-device deployment options, support for model fine-tuning, and a suite of out-of-the-box DLMs for various use cases. The company claims its DLMs can run up to 10 times faster than traditional LLMs while costing 10 times less.
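Inception has not published its API in this piece, so as a rough idea of what calling a hosted model of this kind typically looks like, here is a hypothetical Python sketch. The endpoint URL, model name, and payload fields below are all invented placeholders, not Inception's documented interface.

```python
import requests  # third-party HTTP library (pip install requests)

# Everything below -- URL, model name, and payload fields -- is an invented
# placeholder; Inception's real API may look entirely different.
response = requests.post(
    "https://api.example-inception.ai/v1/generate",    # hypothetical endpoint
    headers={"Authorization": "Bearer YOUR_API_KEY"},  # placeholder credential
    json={
        "model": "dlm-small-coder",  # invented model identifier
        "prompt": "Write a Python function that reverses a string.",
        "max_tokens": 256,
    },
    timeout=30,
)
print(response.json())
```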
"Our ‘small’ coding model is as good as OpenAI’s GPT-4o mini while more than 10 times as fast," a company spokesperson told TechCrunch. "Our ‘mini’ model outperforms small open-source models like Meta’s Llama 3.1 8B and achieves more than 1,000 tokens per second."