Google has announced a substantial expansion of its Gemini AI model family. After months of updates, Gemini 2.5 Pro is officially leaving its preview phase and is now ready for developers to build on. Google has also unveiled a new high-efficiency model, Gemini 2.5 Flash-Lite, aimed at providing a cost-effective option for high-volume AI workloads.
Gemini 2.5, which debuted in 2025, brought major improvements over previous versions and positioned Google as a strong competitor to OpenAI and its widely used GPT models. The rollout, however, has been characterized by a series of previews and test builds as Google tunes its models for general availability, the state in which a model is considered stable enough for long-term development.
At the recent Google I/O event, Gemini 2.5 Flash moved out of preview. Gemini 2.5 Pro has now also reached general availability with its latest build, 06-05, which addresses several issues encountered in the earlier I/O build of 2.5 Pro.
All models in the Gemini 2.5 family feature adjustable thinking budgets, which give developers a way to keep control over their operational costs. For those particularly mindful of their budgets, Google is introducing Gemini 2.5 Flash-Lite, which has moved from an experimental phase to a preview stage. The model is meant for running large AI workloads without incurring prohibitive expenses.
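As a rough illustration of what an adjustable thinking budget looks like from the developer's side, the sketch below builds a JSON request body that caps the number of thinking tokens. The field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`) reflect our understanding of the Gemini API's REST request format and should be checked against Google's current documentation; the helper `build_request` and the budget value are hypothetical, and nothing is actually sent to the API.

```python
import json


def build_request(prompt: str, thinking_budget: int) -> dict:
    """Build a request body that limits how many tokens the model may spend thinking.

    Assumption: the field names below follow the Gemini API's REST format
    as we understand it; verify them against the official documentation.
    """
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


# Hypothetical usage: a small budget for a simple, high-volume task.
payload = build_request("Classify this support ticket as billing or technical.", 512)
print(json.dumps(payload, indent=2))
```

A lower budget trades some reasoning depth for lower cost and latency, which is the kind of control the thinking budget is meant to provide.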
Compared to Gemini 2.5 Flash, Flash-Lite offers a sizable cost advantage: it costs one-third as much for text, image, and video inputs and less than one-sixth as much for output tokens. This variant may not be available to regular users through the app, however, as it is designed for use cases where cost efficiency is paramount and usage is billed by the token.
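To make those ratios concrete, here is a minimal worked example. The Flash prices are hypothetical placeholders chosen for easy arithmetic, not Google's actual rates; only the one-third and one-sixth ratios come from the announcement, and the one-sixth output figure is used as an upper bound because Flash-Lite output is described as costing less than that.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


def workload_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the total cost in USD for a token-billed workload."""
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


# Hypothetical placeholder prices for Flash (NOT real rates).
FLASH = Pricing(input_per_million=3.0, output_per_million=12.0)

# Ratios from the announcement: one-third for input, at most one-sixth for output.
FLASH_LITE = Pricing(
    input_per_million=FLASH.input_per_million / 3,
    output_per_million=FLASH.output_per_million / 6,
)

# Example workload: 50M input tokens and 10M output tokens.
flash_cost = workload_cost(FLASH, 50_000_000, 10_000_000)
lite_cost = workload_cost(FLASH_LITE, 50_000_000, 10_000_000)

print(f"Flash: ${flash_cost:.2f}, Flash-Lite (upper bound): ${lite_cost:.2f}")
```

With these placeholder numbers the workload costs $270 on Flash and at most $70 on Flash-Lite. The overall saving depends on the mix of input and output tokens, since the two are discounted by different ratios.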
Both Gemini 2.5 Flash and Flash-Lite are also being used in Google Search. A Google spokesperson confirmed to Ars Technica that customized versions of these models are now running in features such as AI Overviews and AI Mode. The model used for a particular query depends on its suitability: complex queries may be handled by 2.5 Pro, while simpler ones could use Flash, or even Flash-Lite for basic searches.
Developers can try the Flash-Lite preview alongside the stable versions of Gemini 2.5 Flash and Gemini 2.5 Pro in Google AI Studio and Vertex AI. Users of the Gemini app will not see significant changes, as the final versions of 2.5 Pro and 2.5 Flash were already integrated into the app before this announcement. The Pro variant will also drop its preview label, as Flash did last month, although its functionality will remain unchanged.
Free users still get only limited access to Gemini 2.5 Pro in the app, while paying Pro users get a larger allowance of up to 100 prompts per day. The highest level of access to Gemini 2.5 Pro is reserved for AI Ultra subscribers.