On Wednesday, OpenAI made its upgraded image generation feature available in its API, enabling developers to integrate the technology into their own applications and services. The feature, which debuted for most ChatGPT users in late March, quickly gained traction for its ability to create realistic Ghibli-style images and “AI action figures.”
The launch of the new image generator has proven to be a double-edged sword for OpenAI. It has attracted millions of new sign-ups for ChatGPT, but it has also significantly strained the company's capacity. In the first week after the tool became available, more than 130 million ChatGPT users generated over 700 million images.
The image-generation capability in OpenAI’s API is powered by an AI model called gpt-image-1. The natively multimodal model can create images in a range of styles, follow custom guidelines, and draw on broad world knowledge when rendering text. Developers using gpt-image-1 can also generate multiple images at once.
OpenAI says gpt-image-1 has the same safety guardrails as image generation in ChatGPT. These safeguards prevent the model from producing content that violates the company's policies. Developers can control moderation sensitivity, setting it to “auto” for standard filtering or “low” for a less restrictive approach. The “low” setting restricts fewer categories of potentially age-inappropriate content, according to documentation OpenAI provided to TechCrunch.
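For developers, a request that combines batch generation with the moderation setting might look roughly like the sketch below. It assumes the official `openai` Python SDK, whose `images.generate` method accepts `n` and `moderation` for this model at the time of writing, and that the response carries base64-encoded image data. The helper name `generate_images` is hypothetical and is not taken from OpenAI's documentation.

```python
import base64
from typing import Any, List

# The two moderation settings named in the article.
ALLOWED_MODERATION = {"auto", "low"}


def generate_images(client: Any, prompt: str, n: int = 2,
                    moderation: str = "auto") -> List[bytes]:
    """Request n images from gpt-image-1 and return their raw bytes.

    `client` is assumed to be an `openai.OpenAI()` instance (or any object
    exposing a compatible `images.generate` method).
    """
    if moderation not in ALLOWED_MODERATION:
        raise ValueError(f"moderation must be one of {sorted(ALLOWED_MODERATION)}")
    if n < 1:
        raise ValueError("n must be at least 1")

    response = client.images.generate(
        model="gpt-image-1",
        prompt=prompt,
        n=n,                    # several images in one request
        moderation=moderation,  # "auto" (standard) or "low" (less restrictive)
    )
    # Assumption: each result exposes its image as a base64 string in `b64_json`.
    return [base64.b64decode(item.b64_json) for item in response.data]
```

With the SDK installed and an API key configured, this could be called as `generate_images(OpenAI(), "a watercolor fox", n=3, moderation="low")`. Passing the client in as an argument keeps the helper easy to exercise with a stub instead of a live API call.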
All images generated with gpt-image-1 are watermarked with C2PA metadata, allowing them to be identified as AI-generated on supported platforms and applications.
OpenAI charges for the tool by token: $5 per million input tokens for text, $10 per million input tokens for images, and $40 per million output tokens for images. That works out to approximately 2 cents, 7 cents, and 19 cents per generated image for low-, medium-, and high-quality square images, respectively.
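To see how the per-token rates add up for a single request, here is a minimal cost estimator. The rates come from the pricing above. The token counts in the usage example are hypothetical, since the article does not say how many tokens a given prompt or image consumes.

```python
# Published rates in US dollars per one million tokens.
USD_PER_MILLION_TOKENS = {
    "text_input": 5.00,
    "image_input": 10.00,
    "image_output": 40.00,
}


def estimate_cost_usd(text_input_tokens: int = 0,
                      image_input_tokens: int = 0,
                      image_output_tokens: int = 0) -> float:
    """Estimate the cost of one request from its token counts."""
    usage = {
        "text_input": text_input_tokens,
        "image_input": image_input_tokens,
        "image_output": image_output_tokens,
    }
    # Each category is billed independently at its own rate.
    return sum(
        tokens * USD_PER_MILLION_TOKENS[kind] / 1_000_000
        for kind, tokens in usage.items()
    )


# Hypothetical request: a 100-token text prompt yielding 4,000 image output tokens.
cost = estimate_cost_usd(text_input_tokens=100, image_output_tokens=4000)
print(f"${cost:.4f}")
```

Under those assumed counts the output tokens account for nearly all of the cost, which is consistent with the per-image price rising sharply with quality.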
Several prominent companies, including Adobe, Airtable, Wix, Instacart, GoDaddy, Canva, and Figma, are already using or experimenting with gpt-image-1. Figma’s design platform now lets users generate and edit images with the model, while Instacart is testing gpt-image-1 for images for recipes and shopping lists.
With gpt-image-1 now available through the API, developers can bring the image generation that drew millions of users to ChatGPT into their own applications.