MIT has created an AI model capable of generating images 30 times faster

Soon you will be able to generate images in a snap.

Pedro Domínguez


Diffusion models, the technology behind today's popular image generators, produce high-quality images but need dozens of steps to do so. A group of MIT scientists has created an AI image generator that condenses the multi-step process of traditional diffusion models into a single step.


Working in a single step, the new AI generates images 30 times faster. The researchers achieve this by training a new model to mimic the behavior of the more complex original models that produce images, an approach known as a teacher-student model. The method, called “distribution matching distillation” (DMD), creates images far more quickly while maintaining their quality.

The method combines ideas from diffusion models, such as DALL-E 3 or Midjourney, with ideas from generative adversarial networks (GANs) to generate visual content in a single step, compared with the hundred steps of iterative refinement that current diffusion models require.

The DMD method has two components:

  • Regression loss: Anchors the mapping to ensure a coarse organization of the image space, which makes training more stable.
  • Distribution matching loss: Ensures that the probability of the student model generating a given image matches how often that image occurs in the real world.
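
To make the regression component concrete, here is a minimal toy sketch in Python. It is not the MIT team's code. The "teacher" is a hypothetical slow sampler that refines a number over 50 small steps, the "student" is a one-step linear map, and plain mean squared error stands in for whatever image-similarity measure the real method uses. All names (teacher_sample, student_sample, regression_loss, fit) are assumptions made for illustration.

```python
import random

def teacher_sample(z, steps=50):
    """Hypothetical slow teacher: refines z toward 2*z + 1 in many small steps."""
    x = z
    target = 2.0 * z + 1.0
    for _ in range(steps):
        x += 0.2 * (target - x)  # one small refinement step
    return x

def student_sample(z, w, b):
    """One-step student: a single linear map from noise to output."""
    return w * z + b

def regression_loss(pairs, w, b):
    """Mean squared error between student and teacher outputs on the same noise."""
    return sum((student_sample(z, w, b) - x) ** 2 for z, x in pairs) / len(pairs)

def fit(pairs, w=0.0, b=0.0, lr=0.05, epochs=500):
    """Plain gradient descent on the regression loss."""
    n = len(pairs)
    for _ in range(epochs):
        grad_w = sum(2.0 * (w * z + b - x) * z for z, x in pairs) / n
        grad_b = sum(2.0 * (w * z + b - x) for z, x in pairs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
# Precompute (noise, teacher output) pairs once, then train the student on them.
pairs = [(z, teacher_sample(z)) for z in noise]
w, b = fit(pairs)
print(f"student: x = {w:.3f} * z + {b:.3f}")
print(f"regression loss: {regression_loss(pairs, w, b):.2e}")
```

In this toy setting the student ends up reproducing in one call what the teacher needs 50 steps to compute. The pairs anchor each noise input to a specific output, which is the stabilizing role the article attributes to this loss.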

To do this, DMD uses two diffusion models as guides. They help the system tell generated images from real ones and make it possible to train the fast single-step generator. The speedup comes from training a new network to minimize the divergence between the distribution of its generated images and that of the training dataset used by traditional diffusion models.
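
The role of the two guiding models can be illustrated with a one-dimensional toy example in Python. This is a sketch under strong simplifying assumptions, not the actual DMD implementation. "Images" are single numbers, the real data is assumed to follow a Gaussian with mean 3.0 and standard deviation 0.5, and closed-form Gaussian scores (gradients of the log-density) stand in for the two diffusion models: one describes the real data, the other is refit at every step to whatever the generator currently produces. The difference between the two scores tells the generator which way to move its outputs. All names and constants below are hypothetical.

```python
import random

REAL_MEAN, REAL_STD = 3.0, 0.5  # assumed toy "real image" distribution

def gaussian_score(x, mean, var):
    """Score (gradient of the log-density) of a 1-D Gaussian at x."""
    return -(x - mean) / var

def train_one_step_generator(steps=300, batch=2000, lr=0.05, seed=0):
    rng = random.Random(seed)
    scale, shift = 1.0, 0.0  # one-step generator: x = scale * z + shift
    for _ in range(steps):
        zs = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        xs = [scale * z + shift for z in zs]
        # "Fake" score model: refit to the generator's current outputs.
        fake_mean = sum(xs) / batch
        fake_var = sum((x - fake_mean) ** 2 for x in xs) / batch
        grad_scale = 0.0
        grad_shift = 0.0
        for z, x in zip(zs, xs):
            # Fake score minus real score approximates the gradient of the
            # divergence between generated and real distributions at x.
            direction = (gaussian_score(x, fake_mean, fake_var)
                         - gaussian_score(x, REAL_MEAN, REAL_STD ** 2))
            grad_scale += direction * z  # chain rule: dx/dscale = z
            grad_shift += direction      # chain rule: dx/dshift = 1
        scale -= lr * grad_scale / batch
        shift -= lr * grad_shift / batch
    return scale, shift

scale, shift = train_one_step_generator()
print(f"generated std ~ {scale:.2f}, generated mean ~ {shift:.2f}")
```

With these settings the generator's spread and center should settle close to the real values of 0.5 and 3.0. No discriminator is trained at any point: the signal comes entirely from comparing the two score estimates.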

“Our key idea is to approximate the gradients that guide the improvement of the new model using two diffusion models,” says Tianwei Yin, an MIT PhD student in electrical engineering and computer science, CSAIL affiliate, and lead researcher on the DMD framework. “In this way, we distill the knowledge from the original, more complex model into the simpler and faster model, while avoiding the known problems of instability and mode collapse of GANs.”
