It’s still a text-to-image model, but Google is calling it a text-to-live image model. Unlike AI-generated videos that simply animate a static photo with some degree of motion, Imagen 2 can show different camera angles while maintaining consistency across the scene.
That said, the model can only output video clips, aka live images, at a low resolution of 640 x 360. Google is pitching Imagen 2 to enterprise customers, including marketers and creatives, who can quickly generate short clips for ads, campaigns, and more.
Apart from that, Google is using its SynthID technique to apply an invisible watermark on AI-generated clips and images. The company says SynthID can withstand edits and even compression. In addition, Google has also filtered the image generation model for safety and bias.
It must be noted that Google recently came under fire for refusing to generate images of white-skinned people. After the incident, Google paused image generation for humans, and even after two months, the company has not lifted the restriction on Gemini.
That said, Imagen 2 has been made generally available on Vertex AI for enterprise customers. It now also supports inpainting and outpainting: the ability to edit images with AI by expanding the borders or adding and removing parts of the image. OpenAI also brought image editing to DALL·E-generated images recently.
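For developers curious what this looks like in practice, here is a minimal sketch of an inpainting call through the Vertex AI Python SDK. The project ID, model version string, and file paths are placeholders rather than details confirmed in Google's announcement, so treat this as an illustration, not a definitive recipe.

```python
# A minimal sketch of inpainting with Imagen on Vertex AI.
# Assumes the google-cloud-aiplatform SDK is installed; the project ID,
# model version ("imagegeneration@006"), and file paths are placeholders.
import vertexai
from vertexai.preview.vision_models import Image, ImageGenerationModel

vertexai.init(project="your-project-id", location="us-central1")

model = ImageGenerationModel.from_pretrained("imagegeneration@006")

base = Image.load_from_file("product_photo.png")  # the image to edit
mask = Image.load_from_file("mask.png")           # marks the region to regenerate

# Regenerate only the masked region, guided by the prompt (inpainting).
images = model.edit_image(
    prompt="a sunlit beach in the background",
    base_image=base,
    mask=mask,
)
images[0].save("edited_photo.png")
```

Outpainting works the same way in spirit: the mask covers an area beyond the original borders, and the model fills in the expansion.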
While the Imagen 2 model can generate video clips of up to four seconds, I am not sure how it can compete with other text-to-video generators. Runway offers video generation up to 18 seconds at a much better resolution, and OpenAI recently introduced its groundbreaking Sora model. To compete with these models, Google has to come up with a far more powerful diffusion model.