VLOGGER: the new AI model from Google capable of creating moving avatars

A group of researchers from Google has created VLOGGER, a new artificial intelligence tool that turns a still image into an animated, controllable avatar. This approach to video generation differs somewhat from OpenAI's Sora, but it could have many applications.


VLOGGER is an AI model capable of creating an animated avatar from a still image and maintaining the photorealistic appearance of the person in the photo in each frame of the final video. Similar things can already be done to some extent with tools like Pika Labs’ lip syncing, but this seems to be a simpler option that consumes less bandwidth.

The model also takes an audio file of the person speaking and controls the movement of the body and lips to reflect the natural way that person would move if they were the one saying the words. This includes creating head movements, facial expressions, gaze, blinking, as well as hand gestures and movements of the upper body without any reference beyond the image and audio.

Currently VLOGGER cannot be tested, as it is nothing more than a research project with several demonstration videos, but if it ever becomes a product it could be a new way of communicating in team collaboration apps like Slack or Teams.

VLOGGER is based on the diffusion architecture that powers text-to-image, video, and even 3D models, like MidJourney or Runway, but adds extra control mechanisms. To generate the avatar, VLOGGER follows a series of steps: first, it takes the audio and the image as input; next, it runs them through a 3D motion generation process; then a “temporal diffusion” model determines timing and movement; finally, the output is upscaled and converted into the final video.
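The order of these stages can be sketched in a few lines of Python. This is only a toy illustration of the data flow described above, not Google's code: every name here (`generate_motion`, `render_frames`, `upscale`, `animate`) is hypothetical, the "motion" is just the average loudness of each audio chunk, and no real diffusion model is involved.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MotionFrame:
    """Hypothetical per-frame motion record (stand-in for pose and expression)."""
    time_s: float
    energy: float


@dataclass
class VideoFrame:
    """Hypothetical video frame; only its timestamp and size are tracked."""
    time_s: float
    width: int
    height: int


def generate_motion(audio: List[float], sample_rate: int, fps: int) -> List[MotionFrame]:
    """Stage 1 (placeholder): derive one motion record per video frame from audio."""
    samples_per_frame = sample_rate // fps
    motion = []
    for i in range(len(audio) // samples_per_frame):
        chunk = audio[i * samples_per_frame:(i + 1) * samples_per_frame]
        energy = sum(abs(s) for s in chunk) / len(chunk)
        motion.append(MotionFrame(time_s=i / fps, energy=energy))
    return motion


def render_frames(image_size: Tuple[int, int], motion: List[MotionFrame]) -> List[VideoFrame]:
    """Stage 2 (placeholder for the temporal diffusion model): one frame per motion record."""
    width, height = image_size
    return [VideoFrame(m.time_s, width, height) for m in motion]


def upscale(frames: List[VideoFrame], factor: int) -> List[VideoFrame]:
    """Stage 3 (placeholder): increase the resolution of every frame."""
    return [VideoFrame(f.time_s, f.width * factor, f.height * factor) for f in frames]


def animate(image_size: Tuple[int, int], audio: List[float],
            sample_rate: int = 16000, fps: int = 25, factor: int = 2) -> List[VideoFrame]:
    """Chain the three stages in the order the article describes."""
    motion = generate_motion(audio, sample_rate, fps)
    low_res = render_frames(image_size, motion)
    return upscale(low_res, factor)


# One second of silent audio and a 128x128 source image.
video = animate((128, 128), [0.0] * 16000)
print(len(video), video[0].width, video[0].height)
```

The sample rate, frame rate, and upscaling factor are arbitrary values chosen for the example; the article does not state the ones VLOGGER uses.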

To train the model, a large multimedia dataset called MENTOR was needed, which contains 800,000 videos of different people speaking with each part of their face and body labeled at every moment.

