Google has officially launched its latest text-to-image model, Imagen 4, which the company says renders text significantly better than its predecessor, Imagen 3. The new model promises higher-quality, more accurate images from text prompts, which could make it a useful tool for creators and marketers. Alongside Imagen 4, Google has introduced a premium version called Imagen 4 Ultra, which is designed to follow text prompts with even greater precision and costs a little more.
Both versions of the model will be available through a paid preview in the Gemini API, with limited free testing also accessible in Google AI Studio. The standard Imagen 4 model is priced at $0.04 per image, making it an affordable option for most tasks. The Imagen 4 Ultra model, which is aimed at users who need higher fidelity to the prompt, costs $0.06 per image, a 50 percent increase.
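To put the per-image prices in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration built on the two prices quoted above; the tier labels and the helper names `batch_cost` and `percent_increase` are hypothetical and are not Gemini API model identifiers or functions.

```python
from decimal import Decimal

# Per-image prices in USD, as quoted in the article.
# The keys are illustrative labels, not official API model IDs.
PRICE_PER_IMAGE = {
    "imagen-4": Decimal("0.04"),
    "imagen-4-ultra": Decimal("0.06"),
}


def batch_cost(tier: str, image_count: int) -> Decimal:
    """Return the cost in USD of generating image_count images on a tier."""
    return PRICE_PER_IMAGE[tier] * image_count


def percent_increase(base: Decimal, other: Decimal) -> Decimal:
    """Return how much larger other is than base, as a percentage."""
    return (other - base) / base * 100


if __name__ == "__main__":
    count = 1000
    standard = batch_cost("imagen-4", count)
    ultra = batch_cost("imagen-4-ultra", count)
    premium = percent_increase(
        PRICE_PER_IMAGE["imagen-4"], PRICE_PER_IMAGE["imagen-4-ultra"]
    )
    print(f"{count} images, standard: ${standard}")
    print(f"{count} images, Ultra:    ${ultra}")
    print(f"Ultra premium: {premium}%")
```

At these rates, a batch of 1,000 images would cost $40 on the standard model and $60 on Ultra, so the 50 percent premium becomes noticeable mainly at volume.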
Google has demonstrated the capabilities of Imagen 4 Ultra through a variety of images, including a whimsical three-panel comic depicting a small spaceship under attack by a gigantic blue space lizard, complete with sound effects like "Crunch!" and "Had!!". The example shows that the model can follow a detailed prompt, although the result, while decent, resembles a toon rendering from a 3D application.
Another prompt requested an image in the style of a vintage travel postcard for Kyoto, featuring iconic sights such as a pagoda under cherry blossoms and snow-capped mountains in the background. The output was accurate but somewhat generic, lacking the charm that could lift it beyond a simple representation. Other generated images included a hiking couple waving from atop a rock and an avant-garde fashion shoot, both of which adhered closely to their prompts but still had a distinctly machine-generated look.
While Imagen 4 shows noticeable improvements over its predecessor, it may not quite match market leaders like DALL-E 3 and Midjourney V7. Users seeking high-quality results may come away less impressed by Imagen 4, especially as the initial excitement surrounding AI art appears to be waning. Much of the generated content now seems to end up in spammy advertisements on social media and in low-quality placements at the bottom of articles.
In summary, Google’s Imagen 4 and Imagen 4 Ultra improve on Imagen 3, particularly in text rendering and prompt adherence. Even so, the overall reception may be tempered by growing fatigue with AI-generated art. As the market continues to evolve, it remains to be seen how these models will fare against the competition and with users.