In recent months, a growing consensus has emerged among the AI industry's leading proponents that we are on the cusp of achieving artificial general intelligence (AGI): virtual agents capable of matching or exceeding human-level understanding and performance across a wide array of cognitive tasks. OpenAI is fostering expectations of a PhD-level AI agent that could operate autonomously, akin to a high-income knowledge worker, in the near future. Elon Musk predicts that we may have AI systems smarter than any individual human by the end of 2025. Anthropic CEO Dario Amodei suggests that while it may take slightly longer, it's plausible that AI will surpass human capabilities in nearly every domain by the end of 2027.
Last month, Anthropic showcased its project titled “Claude Plays Pokémon,” which serves as a milestone on the journey toward AGI. The company claims that this initiative demonstrates the potential of AI systems to tackle challenges with increasing competence, utilizing not just training but also generalized reasoning. Anthropic made headlines by highlighting how Claude 3.7 Sonnet, the latest iteration of its AI model, exhibited improved reasoning capabilities that allowed it to progress in the beloved Game Boy RPG in ways that earlier models could not. For context, Claude models from just a year ago struggled to navigate the game’s opening area, while Claude 3.7 Sonnet managed to collect multiple in-game Gym Badges with relatively few actions.
This significant improvement, according to Anthropic, stems from the model's ability to engage in “extended thinking,” enabling it to plan, remember objectives, and adapt when initial strategies fail—skills deemed essential for overcoming challenges posed by pixelated gym leaders. These capabilities, they argue, are also critical for solving real-world problems.
However, relative success compared to previous models does not equate to absolute mastery of the game. Since the unveiling of Claude Plays Pokémon, thousands of viewers on Twitch have watched Claude struggle to make consistent progress. Despite prolonged pauses for reasoning between moves, which let viewers follow the system's simulated thought process, Claude often retraces its steps through previously explored areas, gets stuck in dead ends, or repeatedly engages unhelpful non-playable characters (NPCs). These moments of distinctly sub-human performance cast doubt on claims that superintelligent machines are just around the corner.
Interestingly, Claude's ability to engage with Pokémon at all is noteworthy. AI systems designed for games like Go or Dota 2 typically rely on deep knowledge of game mechanics and strategies. However, for Claude Plays Pokémon, Anthropic developer David Hershey revealed that the model was not specifically trained for the game. Instead, it utilizes its broad understanding of the world to interpret video games. “Claude has a sense of Pokémon based on what it has read,” Hershey explained, noting that while Claude knows about Gym Badges and their locations, it struggles to interpret the low-resolution graphics of a Game Boy screen as effectively as a human.
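The setup Hershey describes, a general-purpose model driving a game through screenshots and button presses rather than game-specific training, can be pictured as a simple observe-reason-act loop. The sketch below is purely illustrative; every class and function name here is hypothetical and does not come from Anthropic's actual harness.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeBase:
    """Long-lived notes the agent carries between turns (hypothetical)."""
    notes: list = field(default_factory=list)

    def update(self, note: str) -> None:
        self.notes.append(note)

def play_step(observe, reason, act, kb: KnowledgeBase) -> str:
    """One turn of an observe-reason-act loop: capture the screen,
    let the model decide, press a button, and persist what it learned."""
    screenshot = observe()                 # e.g. a low-res Game Boy frame
    button, note = reason(screenshot, kb)  # "extended thinking" would happen here
    kb.update(note)                        # remember goals, map facts, etc.
    act(button)                            # A, B, Up, Down, ...
    return button
```

The point of the sketch is that nothing game-specific lives in the loop itself; all Pokémon knowledge comes from the model's reasoning step, which matches Hershey's account of how Claude approaches the game.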
Despite advancements in AI image processing, Claude’s limitations remain evident. Hershey remarked that Claude often attempts to interact with walls and encounters difficulties understanding basic navigation. These challenges underscore the differing strengths and weaknesses of AI compared to human players. For example, Claude excels in text-based interactions, successfully integrating knowledge to formulate battle strategies against opponents.
Beyond text and image interpretation, Claude faces memory-retention issues. Even with a context window of 200,000 tokens, Claude struggles to maintain a coherent knowledge base over extended gameplay sessions, which can lead it to forget crucial information or write incorrect data into its notes. Hershey explained that Claude can become fixated on erroneous coordinates, wasting time exploring the wrong areas instead of progressing. He noted, however, that Claude 3.7 Sonnet is better than earlier models at questioning its assumptions and trying new strategies, ultimately leading to moments of genuine progress.
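One common way agents cope with a fixed context window, and a plausible reading of the knowledge-base problem described above, is to compact the oldest notes into summaries once the whole set approaches the token budget. This is a generic sketch under assumed names, not a description of how Claude Plays Pokémon actually manages memory.

```python
def rough_token_count(text: str) -> int:
    # Crude stand-in for a real tokenizer: roughly one token per word.
    return len(text.split())

def compact_notes(notes: list[str], budget: int, summarize) -> list[str]:
    """Merge the oldest notes into summaries until the set fits the budget.

    `summarize` stands in for a model call that condenses two notes into one.
    """
    while sum(rough_token_count(n) for n in notes) > budget and len(notes) > 1:
        merged = summarize(notes[0], notes[1])  # collapse the two oldest entries
        notes = [merged] + notes[2:]
    return notes
```

A scheme like this trades detail for durability: old specifics get blurred into summaries, which is consistent with the forgetting and fixation behaviors described above.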
Observing the varying strategies and progress of Claude during gameplay reveals the complexities of AI development. While sometimes capable of constructing coherent strategies, Claude often falters, showcasing the ongoing challenges in AI reasoning. Hershey suggests that improving Claude's understanding of Game Boy graphics and expanding its context window could pave the way for more coherent long-term reasoning and learning.
Despite the current limitations of Claude's Pokémon performance, the project offers valuable insights into the state of AI research and the journey toward achieving human-level intelligence. Hershey acknowledges that while watching Claude struggle can give the impression of an AI model lacking direction, the instances of self-awareness and strategic thinking represent significant milestones in the evolution of AI.
As we look to the future, the progress of Claude and similar models signifies that we are inching closer to achieving artificial general intelligence. The advancements made thus far, though modest, suggest that the gap between AI capabilities and human-level reasoning is narrowing, offering a glimpse into the potential of AI systems in the years to come.