Close Menu

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    Brian Singerman’s new fund has a twist, and Peter Thiel as a big backer

    July 14, 2025

    Meta built its AI reputation on openness — that may be changing

    July 14, 2025

    Former Sequoia partner Matt Miller raises $355M for new fund — with Sequoia’s backing

    July 14, 2025
    Facebook X (Twitter) Instagram
    • Home
    • Technology
    • Gaming
    • Phones
    • Buy Now
    Facebook X (Twitter) Instagram Pinterest Vimeo
    My BlogMy Blog
    • Home
    • Features
      • Example Post
      • Typography
      • Contact
      • View All On Demos
    • Technology

      Is the Hyperloop Doomed? What Elon Musk’s Latest Setback Really Means

      March 10, 2022

      The Best Early Black Friday Deals on Gaming Laptops and Accessories

      March 10, 2022

      Apple Watch’s ECG Can Help Diagnose Heart Problem: Research

      January 19, 2021

      Simple Tips and Tricks to Take Care of Your Expensive DSLR Camera

      January 16, 2021

      Tech Study Reveals Effects of Mobile Technology on Professionals

      January 15, 2021
    • Typography
    • Phones
      1. Technology
      2. Gaming
      3. Gadgets
      4. View All

      Is the Hyperloop Doomed? What Elon Musk’s Latest Setback Really Means

      March 10, 2022

      The Best Early Black Friday Deals on Gaming Laptops and Accessories

      March 10, 2022

      Apple Watch’s ECG Can Help Diagnose Heart Problem: Research

      January 19, 2021

      Simple Tips and Tricks to Take Care of Your Expensive DSLR Camera

      January 16, 2021

      Game Development This Week: Save On Essential Tools and More

      November 19, 2022

      Riot Games Acquires a Wargaming Studio to Help With Live Game Development

      March 10, 2022

      Keep Talking and Nobody Explodes: A Boomer Gaming in VR

      March 12, 2021

      Hologate Announces New Plans for First Large Format World VR Arcade

      January 16, 2021
      8.9

      DJI Avata Review: Immersive FPV Flying For Drone Enthusiasts

      January 15, 2021
      8.9

      Bose QuietComfort Earbuds II: Noise-Cancellation Kings Reviewed

      January 15, 2021

      Thousands Of PC Games Discounted In New Black Friday Sale

      January 15, 2021

      Could Solar-Powered Headphones Be The Next Must-Have?

      January 15, 2021

      Will Using a VPN on Phone Helps Protect You from Ransomware?

      January 14, 2021

      Popular New Xbox Game Pass Game Being Review Bombed With “0s”

      January 14, 2021

      Google Says Surveillance Vendor Targeted Samsung Phones

      January 14, 2021

      Why Are iPhones More Expensive Than Android Phones?

      January 14, 2021
    • Buy Now
    Subscribe
    My BlogMy Blog
    Home»Uncategorized»Google launches ‘implicit caching’ to make accessing its latest AI models cheaper
    Uncategorized

    Google launches ‘implicit caching’ to make accessing its latest AI models cheaper

    Y U RajuBy Y U RajuMay 8, 2025No Comments3 Mins Read
    Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
    Share
    Facebook Twitter LinkedIn Pinterest Email


    Google is rolling out a feature in its Gemini API that the company claims will make its latest AI models cheaper for third-party developers.

    Google calls the feature “implicit caching” and says it can deliver 75% savings on “repetitive context” passed to models via the Gemini API. It supports Google’s Gemini 2.5 Pro and 2.5 Flash models.

    That’s likely to be welcome news to developers as the cost of using frontier models continues to grow.

    We just shipped implicit caching in the Gemini API, automatically enabling a 75% cost savings with the Gemini 2.5 models when your request hits a cache 🚢

    We also lowered the min token required to hit caches to 1K on 2.5 Flash and 2K on 2.5 Pro!

    — Logan Kilpatrick (@OfficialLoganK) May 8, 2025

    Caching, a widely adopted practice in the AI industry, reuses frequently accessed or pre-computed data from models to cut down on computing requirements and cost. For example, caches can store answers to questions users often ask of a model, eliminating the need for the model to recreate answers to the same request.

    Google previously offered model prompt caching, but only explicit prompt caching, meaning devs had to define their highest-frequency prompts. While cost savings were supposed to be guaranteed, explicit prompt caching typically involved a lot of manual work.

    Some developers weren’t pleased with how Google’s explicit caching implementation worked for Gemini 2.5 Pro, which they said could cause surprisingly large API bills. Complaints reached a fever pitch in the past week, prompting the Gemini team to apologize and pledge to make changes.

    In contrast to explicit caching, implicit caching is automatic. Enabled by default for Gemini 2.5 models, it passes on cost savings if a Gemini API request to a model hits a cache.

    Techcrunch event

    Berkeley, CA
    |
    June 5


    BOOK NOW

    “[W]hen you send a request to one of the Gemini 2.5 models, if the request shares a common prefix as one of previous requests, then it’s eligible for a cache hit,” explained Google in a blog post. “We will dynamically pass cost savings back to you.”

    The minimum prompt token count for implicit caching is 1,024 for 2.5 Flash and 2,048 for 2.5 Pro, according to Google’s developer documentation, which is not a terribly big amount, meaning it shouldn’t take much to trigger these automatic savings. Tokens are the raw bits of data models work with, with a thousand tokens equivalent to about 750 words.

    Given that Google’s last claims of cost savings from caching ran afoul, there are some buyer-beware areas in this new feature. For one, Google recommends that developers keep repetitive context at the beginning of requests to increase the chances of implicit cache hits. Context that might change from request to request should be appended at the end, the company says.

    For another, Google didn’t offer any third-party verification that the new implicit caching system would deliver the promised automatic savings. So we’ll have to see what early adopters say.





    Source link

    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    Previous ArticlePowerSchool paid a hacker’s ransom, but now schools say they are being extorted
    Next Article Instagram Threads is getting video ads
    Y U Raju

    Related Posts

    Uncategorized

    Brian Singerman’s new fund has a twist, and Peter Thiel as a big backer

    July 14, 2025
    Uncategorized

    Meta built its AI reputation on openness — that may be changing

    July 14, 2025
    Uncategorized

    Former Sequoia partner Matt Miller raises $355M for new fund — with Sequoia’s backing

    July 14, 2025
    Add A Comment
    Leave A Reply Cancel Reply

    Demo
    Top Posts

    2025 will be a ‘pivotal year’ for Meta’s augmented and virtual reality, says CTO

    June 6, 202544 Views

    Still no AI-powered, ‘more personalized’ Siri from Apple at WWDC 25

    June 9, 202543 Views

    XRobotics’ countertop robots are cooking up 25,000 pizzas a month

    June 9, 202542 Views
    Stay In Touch
    • Facebook
    • YouTube
    • TikTok
    • WhatsApp
    • Twitter
    • Instagram
    Latest Reviews
    85
    Featured

    Pico 4 Review: Should You Actually Buy One Instead Of Quest 2?

    thf0oJanuary 15, 2021
    8.1
    Uncategorized

    A Review of the Venus Optics Argus 18mm f/0.95 MFT APO Lens

    thf0oJanuary 15, 2021
    8.9
    Editor's Picks

    DJI Avata Review: Immersive FPV Flying For Drone Enthusiasts

    thf0oJanuary 15, 2021

    Subscribe to Updates

    Get the latest tech news from FooBar about tech, design and biz.

    Demo
    Most Popular

    2025 will be a ‘pivotal year’ for Meta’s augmented and virtual reality, says CTO

    June 6, 202544 Views

    Still no AI-powered, ‘more personalized’ Siri from Apple at WWDC 25

    June 9, 202543 Views

    XRobotics’ countertop robots are cooking up 25,000 pizzas a month

    June 9, 202542 Views
    Our Picks

    Brian Singerman’s new fund has a twist, and Peter Thiel as a big backer

    July 14, 2025

    Meta built its AI reputation on openness — that may be changing

    July 14, 2025

    Former Sequoia partner Matt Miller raises $355M for new fund — with Sequoia’s backing

    July 14, 2025

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    Facebook X (Twitter) Instagram Pinterest
    • Home
    • Technology
    • Gaming
    • Phones
    • Buy Now
    © 2025 ThemeSphere. Designed by ThemeSphere.

    Type above and press Enter to search. Press Esc to cancel.