    Google’s Gemini panicked when playing Pokémon

    By Y U Raju, June 17, 2025

    AI companies are battling to dominate the industry, but sometimes, they’re also battling in Pokémon gyms.

    As Google and Anthropic both study how their latest AI models navigate early Pokémon games, the results can be as amusing as they are enlightening. This time, a Google DeepMind report says that Gemini 2.5 Pro resorts to panic when its Pokémon are close to death, which can cause “qualitatively observable degradation in the model’s reasoning capability.”

    AI benchmarking — or, the process of comparing the performance of different AI models — is a dubious art that often provides little context for the actual capabilities of a given model. But some researchers think that studying how AI models play video games could be useful (or, at the very least, kind of funny).

    Over the last several months, two developers unaffiliated with Google and Anthropic have set up respective Twitch streams called “Gemini Plays Pokémon” and “Claude Plays Pokémon,” where anyone can watch in real time as an AI tries to navigate a children’s video game from over twenty-five years ago.

    Each stream displays the AI’s “reasoning” process — or, a natural language translation of how the AI evaluates a problem and arrives at a response — giving us insight into the way that these models work.

    Image Credits: Google

    While the progress of these AI models is impressive, they are still not very good at playing Pokémon. It takes hundreds of hours for Gemini to reason through a game that a child could complete in a small fraction of that time.

    What’s interesting about watching an AI navigate a Pokémon game is not so much how long it takes to finish, but how it behaves along the way.

    “Over the course of the playthrough, Gemini 2.5 Pro gets into various situations which cause the model to simulate ‘panic,’” the report says.

    This state of “panic” can result in the model’s performance getting worse, as the AI may suddenly stop using certain tools at its disposal for a stretch of gameplay. While AI does not think or experience emotion, its actions mimic the way in which a human might make poor, hasty decisions when under stress — a fascinating, yet unsettling response.

    “This behavior has occurred in enough separate instances that the members of the Twitch chat have actively noticed when it is occurring,” the report says.

    Claude has also exhibited some curious behaviors in its journeys across Kanto. In one instance, the AI picked up on the pattern that when all of its Pokémon run out of health, the player character will “white out” and return to a Pokémon Center.

    When Claude got stuck in the Mt. Moon cave, it erroneously hypothesized that if it intentionally got all of its Pokémon to faint, then it would be transported across the cave to the Pokémon Center in the next town.

    However, that isn’t how the game works. When all of your Pokémon faint, you return to the Pokémon Center you used most recently, not the geographically nearest one. Viewers watched in horror as the AI essentially tried to kill itself in the game.

    Despite its shortcomings, there are a few ways in which the AI can outperform human players. As of the release of Gemini 2.5 Pro, the AI is able to solve puzzles with impressive accuracy.

    With some human assistance, the AI created agentic tools — prompted instances of Gemini 2.5 Pro geared toward specific tasks — to solve the game’s boulder puzzles and find efficient routes to reach a destination.

    “With only a prompt describing boulder physics and a description of how to verify a valid path, Gemini 2.5 Pro is able to one-shot some of these complex boulder puzzles, which are required to progress through Victory Road,” the report says.
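
The report excerpt does not say how the route-finding tool works internally. As a rough illustration only of what "finding an efficient route" on a tile map can mean, here is a minimal breadth-first search sketch in Python. The map format, the `shortest_route` name, and the `DEMO_MAP` layout are all hypothetical, not taken from the report or the streams.

```python
from collections import deque


def shortest_route(grid, start, goal):
    """Return a shortest list of (row, col) tiles from start to goal, or None.

    Assumption: grid is a list of equal-length strings where "#" is a wall
    and any other character is walkable. Movement is one tile up, down,
    left, or right, as in the overworld of early Pokémon games.
    """
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}  # maps each visited tile to its predecessor
    queue = deque([start])

    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the predecessor links back to the start, then reverse.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]

        r, c = cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != "#" and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))

    return None  # goal is unreachable


# Hypothetical 3x4 map: "." is walkable, "#" is a wall.
DEMO_MAP = [
    "..#.",
    ".##.",
    "....",
]

if __name__ == "__main__":
    print(shortest_route(DEMO_MAP, (0, 0), (0, 3)))
```

This sketch treats the map as static. A boulder puzzle is harder because pushing a boulder changes the map, so a search over it would have to track boulder positions as part of each state rather than the player's tile alone.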

    Since Gemini 2.5 Pro did a lot of the work in creating these tools on its own, Google theorizes that the current model may be capable of creating these tools without human intervention. Who knows, maybe Gemini will therapize itself into creating a “don’t panic” module.


