After three months in preview, Google is officially launching Gemini 2.5 Pro, making its advanced reasoning model stable for both consumers and developers. The experimental version of Gemini 2.5 Pro debuted on March 25 for paying subscribers and developers, and just four days later Google surprised many by opening it up to free users as well.
Google followed up with a significant coding upgrade on May 6, just ahead of I/O, and one final update on June 5. As a result, the stable version launching today has "no changes from the June 5 preview version." The launch still marks a notable shift: 2.5 Pro will no longer carry the "preview" label in the Gemini app's model picker.
This release comes on the heels of Gemini 2.5 Flash becoming generally available in the Gemini app at I/O last month. Google positions the two models distinctly: 2.5 Pro is tailored for "Reasoning, math & code" prompts, while 2.5 Flash is designed for "Fast all-around help."
Free users of the Gemini app get "limited access" to 2.5 Pro. Google AI Pro subscribers get expanded access of up to 100 prompts per day, while Google AI Ultra offers the highest level of access.
Alongside the launch of Gemini 2.5 Pro, Google has also made the 2.5 Flash model generally available and stable for developers. Compared to the May 20 preview, rates have been updated: input now costs $0.30 per million tokens (up from $0.15), while output costs $2.50 per million tokens (down from $3.50). The previous price split between "thinking" and "non-thinking" output has been eliminated, leaving a single pricing tier.
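As a back-of-the-envelope illustration of what the new flat rates mean in practice, here is a minimal cost-estimate sketch in Python. The rates are the ones quoted above; the token counts are made up for the example:

```python
# Rough cost estimate for Gemini 2.5 Flash at the new flat rates.
# Rates are USD per million tokens, as quoted above; token counts are hypothetical.
INPUT_RATE = 0.30   # $ per 1M input tokens
OUTPUT_RATE = 2.50  # $ per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost of a single request, in dollars."""
    return (input_tokens / 1_000_000) * INPUT_RATE + (output_tokens / 1_000_000) * OUTPUT_RATE

# Example: a 10,000-token prompt producing a 2,000-token response
print(f"${estimate_cost(10_000, 2_000):.4f}")  # ≈ $0.0080
```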
Meanwhile, developers can now preview Gemini 2.5 Flash-Lite, which is optimized for high-volume, latency-sensitive tasks such as translation and classification where cost is critical. Google says Flash-Lite offers "lower latency than 2.0 Flash-Lite and 2.0 Flash" across a broad range of prompts. Thinking is disabled by default but can be enabled by setting a thinking budget, as sketched below.
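Here is a minimal sketch of opting into thinking on Flash-Lite, assuming the google-genai Python SDK and an API key in the environment. The model ID and budget value are illustrative, not confirmed names:

```python
# Sketch: enabling a thinking budget on Gemini 2.5 Flash-Lite (preview).
# Assumes the google-genai Python SDK and GEMINI_API_KEY set in the environment;
# the model ID below is illustrative and may differ from the actual preview name.
from google import genai
from google.genai import types

client = genai.Client()  # picks up the API key from the environment

response = client.models.generate_content(
    model="gemini-2.5-flash-lite-preview",  # hypothetical preview model ID
    contents="Classify this review as positive or negative: 'Great battery life.'",
    config=types.GenerateContentConfig(
        # Thinking is off by default on Flash-Lite; a non-zero budget turns it on.
        thinking_config=types.ThinkingConfig(thinking_budget=512),
    ),
)
print(response.text)
```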
The native tools available in this update include "Grounding with Google Search, Code Execution, and URL Context," along with function calling. Multimodal input and a 1 million-token context window round out the feature set; a rough sketch of configuring these tools follows.
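On the API side, these native tools are enabled per request. The snippet below is a sketch using the google-genai Python SDK, under the assumption that the tool class names match the SDK's types module; the model ID and prompt are illustrative, and whether every tool can be combined in a single request is not something this release note spells out:

```python
# Sketch: attaching native tools (Google Search grounding, URL context)
# to a Gemini 2.5 request via the google-genai Python SDK.
# Class names are assumed from the SDK's types module; the model ID is illustrative.
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash",  # or "gemini-2.5-pro"
    contents="Summarize today's top AI news and note your sources.",
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(google_search=types.GoogleSearch()),  # grounding with Google Search
            types.Tool(url_context=types.UrlContext()),      # let the model read linked pages
            # Code execution is enabled the same way:
            # types.Tool(code_execution=types.ToolCodeExecution()),
        ],
    ),
)
print(response.text)
```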
With today's launch, Gemini 2.5 Pro and 2.5 Flash are stable and 2.5 Flash-Lite is available in preview. Combined with the simplified pricing and tiered access, the 2.5 family now stands as a complete offering for developers and consumers alike.