Google’s Gemini AI model has completed the classic video game Pokémon Blue, which was released 29 years ago. Google CEO Sundar Pichai announced the news on X, where he celebrated the accomplishment, stating, “What a finish! Gemini 2.5 Pro just completed Pokémon Blue!”
The run took place on a livestream titled Gemini Plays Pokémon, created by Joel Z, a 30-year-old software engineer who is not affiliated with Google. Even so, Google executives have enthusiastically supported his project. Logan Kilpatrick, the product lead for Google AI Studio, previously remarked on Gemini’s progress, noting that it had earned its 5th badge in Pokémon while the next best AI model had only 3. Pichai joked that the company was working on “API, Artificial Pokémon Intelligence :)”.
But why is Pokémon the game of choice for showing off AI models? Back in February, the AI company Anthropic described the progress its Claude AI models had made in playing Pokémon Red. The company noted that Claude’s “extended thinking and agent training” provided significant advantages on unexpected tasks such as playing classic video games. Pokémon Red and Pokémon Blue are two versions of the same Game Boy title, first launched in 1996, which was the starting point of the enduring Pokémon franchise.
There is even a Twitch channel dedicated to Claude playing Pokémon, which Joel Z cited as an inspiration. While Gemini has now finished Pokémon Blue, Claude has yet to finish Pokémon Red. This raises the question: does Gemini outperform Claude at the game?
Joel Z has emphasized that direct comparisons between Gemini and Claude are not straightforward. On his Twitch channel, he urged viewers, “Please don’t consider this a benchmark for how well an LLM can play Pokémon. You can’t really make direct comparisons — Gemini and Claude have different tools and receive different information.”
Both AI models need help to navigate the game. Each runs inside an agent harness that supplies it with game screenshots and additional context. With that information, the model decides on its next action, which may involve calling on specialized agents, and then issues the corresponding commands, such as button presses.
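To make the idea of an agent harness more concrete, the loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used by Gemini Plays Pokémon or by Anthropic: `FakeEmulator`, `Observation`, `scripted_model` and `run_harness` are hypothetical names, the emulator is a stand-in that merely records button presses, and the model is replaced by a fixed function rather than a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List

# Buttons on the original Game Boy; anything else from the model is ignored.
VALID_BUTTONS = {"A", "B", "UP", "DOWN", "LEFT", "RIGHT", "START", "SELECT"}


@dataclass
class Observation:
    screenshot: bytes  # image of the current frame (empty in this sketch)
    context: str       # extra text the harness adds alongside the image


class FakeEmulator:
    """Hypothetical stand-in for a real emulator; it only records presses."""

    def __init__(self) -> None:
        self.pressed: List[str] = []

    def capture(self) -> Observation:
        note = f"buttons pressed so far: {len(self.pressed)}"
        return Observation(screenshot=b"", context=note)

    def press(self, button: str) -> None:
        self.pressed.append(button)


def scripted_model(obs: Observation) -> str:
    """Placeholder for the LLM call; a real harness would send obs to a model."""
    return "A"


def run_harness(
    emulator: FakeEmulator,
    choose_action: Callable[[Observation], str],
    max_steps: int,
) -> List[str]:
    """Observe, ask the model for a button, validate it, press it, repeat."""
    for _ in range(max_steps):
        obs = emulator.capture()
        button = choose_action(obs).strip().upper()
        if button not in VALID_BUTTONS:
            continue  # skip malformed model output instead of crashing
        emulator.press(button)
    return emulator.pressed


if __name__ == "__main__":
    print(run_harness(FakeEmulator(), scripted_model, max_steps=3))
```

Because the harness decides what goes into each observation and which outputs count as valid commands, two models running in different harnesses are effectively playing under different conditions, which is the reason Joel Z gives for not treating the streams as a benchmark.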
Joel Z acknowledged that there were several “dev interventions” that aided Gemini in completing Pokémon Blue, but he maintained that these were not instances of cheating. He explained, “My interventions improve Gemini’s overall decision-making and reasoning abilities. I don’t give specific hints — there are no walkthroughs or direct instructions for particular challenges like Mt. Moon.”
He added that the closest he came to a specific hint was telling Gemini that it needed to speak with a Rocket Grunt twice to obtain the Lift Key, a requirement that stems from a bug later corrected in Pokémon Yellow. He also noted, “Gemini Plays Pokémon is still actively being developed, and the framework continues to evolve.”
The completed run is a notable moment for Google’s Gemini AI model, and with the framework still in development, further experiments in AI-driven gameplay are likely to follow.