Discussion
Unexpected €54k billing spike in 13 hours: Firebase browser key without API restrictions used for Gemini requests
luanmuniz: Unfortunately, this is just another story like this. One of these unexpected usage charges in the thousands appears every month, with the same automatic denial, too. This is one of the reasons I stopped using these kinds of pay-per-usage cloud services long ago. At best, I still use services with hard-bounded usage limits, like EC2 on AWS, where one instance can never go beyond 24h/day of usage and is always capped, with shutdowns when exceeded, plus limited credit cards. It's super frustrating that this is the only realistic way to deal with this issue, since all these stories end the same way: the cloud company says "f* you, we don't care, pay up," and legal fees are always expensive :(
patcon: That's fucking bonkers that nothing in the system could see this as unusual and worthy of throttling. The embarrassment that a company LITERALLY SELLING machine learning services and expertise cannot spot such a thing... This should have led them to handle it internally and refund it. Just... wow, Google.
trick-or-treat: This is GCP's revenue model, lol. Let's provide a (semi) generous free tier and trick people into accidentally going over it.
lukewarm707: i have seen this so many times... i'm thinking it's time we replaced api keys. some type of real time crypto payment maybe?
trick-or-treat: Prepaid only is a fantastic idea, especially for dumb-ass startups. Limiting your liability to $100 or so sounds like a big-ass W.
100ms: Implementing this in any meaningful manner quickly begins to look like every read becoming a globally synchronised write. Of course it doesn't have to be perfect, but even approximating perfection doesn't look much different. Also, can you imagine the kind of downtimes and complaints that would inevitably originate from a fully synchronous billing architecture?
embedding-shape: Considering the number of repositories on public GitHub with hard-coded Gemini API tokens in the shared source code (https://github.com/search?q=gemini+%22AIza%22&type=code), this hardly comes as a surprise. Google also has historically treated API keys as non-secrets, except with the introduction of the keys for LLM inference, then users are supposed to treat those secretly, but I'm not sure everyone got that memo yet. Considering that the author didn't share what website this is about, I'd wager they either leaked the key themselves via their frontend, or they shared their source code with the credentials still in it.
freedomben: Prepaid only is a fantastic idea, until your site goes (desirably) viral and then gets shut off right as traffic is picking up, or you grow steadily and forget to increase your deposit amount and suddenly production is down. Billing alerts are a much better solution IMHO.
drtz: > Are there recommended safeguards beyond ... moving calls server-side? This implies the API calls originated in the client, suggesting the client may have had the API key.
lerp-io: i love how cloud providers are ok with hackers sniffing API keys because they simply offload the cost to the user (especially good if it's a business they know will pay) and it counts as profit.
ajaystream: Firebase keys are client-visible by design — the enforcement is the restrictions you set on the key (referrer, bundle ID, API surface). A key with none set is a password pasted into your JS bundle. Nothing new there. What is new is the spend velocity. A scraped Maps key used to burn a few hundred bucks before anyone noticed. Gemini at ~10¢/call turns the same leak into five figures in under a day — billing alerts fire in hours, damage happens in minutes. Feels like the missing primitive is a hard per-key spend ceiling you opt into at key creation, below the project billing cap. Google has the signal; they just don't expose the control.
freedomben: There is some new stuff here, see https://news.ycombinator.com/item?id=47156925 for example.
mdrzn: Related: https://news.ycombinator.com/item?id=47156925
lukewarm707: there is no way to cap your billing on gcp. you can get notifications but that's it. i don't want to get throttled below my quota but some type of spend limit would be good.
freedomben: Oh please no. And the "alternatives" to API keys aren't going to help much either, they'll just add friction to getting started (as reference: see the pain involved in writing a script that hits gmail or calendar API)
singpolyma3: Um. What? In what world are API keys not secrets?
ckbkr10: there's not a single real gemini api key in the results
embedding-shape: Set up a watcher and you'll come across live ones eventually :)
dabedee: As others have said, this is a "feature" for Google, not a bug. There is no easy way to set a hard cap on billing for a project. I spent the better part of an hour trying to find it in the billing settings in GCP, only to land on reddit and figure out that you can set a budget alert to trigger a Pub/Sub message, which triggers a Cloud Function that disables billing for the project. Insanity.
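The workaround described above looks roughly like this as a Cloud Function sketch. Treat the details as assumptions to verify: `projects/my-project` is a placeholder, the `costAmount`/`budgetAmount` fields follow GCP's documented budget-notification JSON, and the actual billing call requires `google-api-python-client` plus billing-admin IAM permissions.

```python
import base64
import json


def should_disable_billing(notification: dict, threshold_ratio: float = 1.0) -> bool:
    """Decide whether reported spend has reached the configured budget."""
    return notification["costAmount"] >= notification["budgetAmount"] * threshold_ratio


def handle_budget_message(event: dict) -> bool:
    """Pub/Sub-triggered entry point. Budget notifications arrive as a
    base64-encoded JSON payload carrying costAmount/budgetAmount."""
    notification = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    if not should_disable_billing(notification):
        return False
    disable_billing("projects/my-project")  # hypothetical project name
    return True


def disable_billing(project_name: str) -> None:
    # Detaching the billing account stops all paid usage for the project.
    # Import is deferred so the pure logic above stays testable without GCP.
    from googleapiclient import discovery

    billing = discovery.build("cloudbilling", "v1")
    billing.projects().updateBillingInfo(
        name=project_name, body={"billingAccountName": ""}
    ).execute()
```

Note the catch discussed elsewhere in the thread: this only fires when the (lagged) notification arrives, so it bounds nothing by itself.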
mcccsm: AI and everything that derives from it follows GIGO principles. If the user is a dumbo, you'll see dumbo outcomes.
mcccsm: Two things that should be default on any GCP project touching generative-AI APIs: 1) API-key restrictions by HTTP referrer AND by API (`generativelanguage.googleapis.com` only), and 2) a billing budget with a Pub/Sub "cap" action, not just an email alert. Neither is on by default, and almost nobody sets them before shipping. 13 hours is actually fast for detection; most teams find out at end-of-month reconciliation.
827a: Important to note that, even to this day, Google's AI Studio Build Mode still recommends getting around this "client visible by design with very low enforceable protections" by publicly exposing an AI proxy with zero protection [1]. They don't care. [1] https://github.com/qudent/qudent.github.io/blob/master/_post...
oezi: > — billing alerts fire in hours, damage happens in minutes. And why do you need to use AI to tell us that? How much shorter could the prompt have been?
jb1991: You are getting downvoted, but the first thing I thought when I read the comment you replied to was that it was written by an LLM as well. It has all the stylings of one: word choice, sentence structure, phrasing, metaphors, etc.
xnorswap: Indeed, the exact same phrase threw me out of the comment and made me realise, "Wait, this is an LLM". The clincher is the capitalisation of Maps: who outside of someone consuming corporate brand guidelines would bother with that?
hilariously: This reads like 100% an LLM comment. "by design -- the enforcement". "Nothing new here - what is new is the". "A thing before anyone noticed - another thing, billing in hours, damage in minutes". "has the signal, doesn't expose the control". Every one of those "exposes the signal" to me.
rvnx: and the notifications can be delayed because the spending system is not updated in real time, so even if you have a Cloud Task triggering on spending to disable the project, it may be too slow and several thousand may already have been spent.
theli0nheart: Stop using ChatGPT to write your comments please.
dpkirchner: That's standard for Firebase apps. It's also recommended by Google (they describe the keys as "public by design").
microtonal: You can also have both, a cap and one or more billing alert levels below it. Some providers do this (e.g. IIRC Backblaze B2).
freedomben: Yes, in reality (and ideally) you can have both, but GP specifically said "Prepaid only", implying you can't have both (which is what I replied to).
JohnScolaro: > We had a budget alert (€80) and a cost anomaly alert, both of which triggered with a delay of a few hours. By the time we reacted, costs were already around €28,000. I had a similar experience with GCP where I set a budget of $100 and was only emailed 5 hours after exceeding it, by which time I was well over. It's mind-boggling that features like this aren't prioritized. Sure, it would probably make Google less money short term, but surely that's preferable to giving devs such a poor experience that they'd never recommend your platform to anyone again.
zanbezi: Exactly my thoughts, I cannot really understand how delayed alerts are acceptable... Have you managed to settle the cost with Google? What was the outcome?
alasano: My favorite Google LLM benchmark is asking Gemini models to create a script that fetches API usage (just request counts) for a project from GCP. 100% failure rate.
dminik: Try this one. Should remove most readme keys: https://github.com/search?q=gemini+%3D+%22AIza%22+AND+%28+la...
fg137: Flagged
fg137: Google's world. They explicitly tell you that API keys are not secrets. https://trufflesecurity.com/blog/google-api-keys-werent-secr...
comrade1234: Can you pre-load money into your account and have that be used until it's zero, at which time you have to load more? Deepseek does it this way.
ok123456: No, GCP is not prepay.
londons_explore: Which is a stupid idea for anything where billing is involved... Anyone on the internet can take that key and scrape the Google Maps API (faking the referer header) and cost you $$$$$. Google should simply have scoped keys by origin URL if they wanted things to be open like that.
ratsimihah: this is such a wall of shame haha
827a: I said this when this finding was originally posted and I'll say it again: This is by far the worst security incident Google has ever had, and that's why they aren't publicly or loudly responding to it. It's deeply embarrassing. They can't fix it without breaking customer workflows. They really, really want it to just go away and six months from now they'll complete their warning period to their enterprise contracts and then they can turn off this automated grant. Until then they want as few people to know about it as possible, and that means if you aren't on anyone's big & important customer list internally, and you missed the single 40px blurb they put on a buried developer documentation site, you're vulnerable and this will happen to you. Disgusting behavior.
tantalor: What does this have to do with security?
thedangler: Also, can't you tie a key to a domain or IP address to help stop unauthorized usage?
littlecranky67: Not if it's publicly called from Javascript, as your users' browsers will make those requests. You neither know their IP addresses, nor is the referer or origin header a safe choice, as it can be spoofed outside of a browser.
lucavice: If it's called from Javascript in the browser, it's not a secret API key....
duskdozer: Oh, wow.
Hamuko: Which cloud provider actually prioritises features that cut off your money supply? Because AWS sure as shit doesn't either.
benterix: Amazon, Microsoft and Google don't offer a hard cap. Most other/smaller public cloud providers do. The reasons are quite obvious.
CWwdcdk7h: Google doesn't allow disconnecting your credit card from your account unless you close it. That includes situations where you're just trying out the free tier.
benterix: > We had a budget alert (€80) and a cost anomaly alert, both of which triggered with a delay of a few hours> By the time we reacted, costs were already around €28,000> The final amount settled at €54,000+ due to delayed cost reportingSo much for the folks defending these three companies that refused to provide hard spending cap ("but you can set the budget", "you are doing it wrong if you worry about billing", "hard cap it's technically impossible" etc.)
villgax: Shirky’s principle at work is all
sillysaurusx: I know you're well within your rights to post this, but would you consider replacing your comment with something like "It's easy to find working keys on github if you search the appropriate terms"?Think of it this way: although you're not to blame, HN drives a lot of traffic to your preconfigured github search. There are also bad actors who browse HN; I had a Firebase charge of $1k from someone who set up an automated script to hammer my endpoint as hard as possible, just to drive the price up. Point being, HN readers are motivated to exploit things like what you posted.It's true that the github search is a "wall of shame", and perhaps the users deserve to learn the hard way why it's a good idea to secure API keys. But there's also no benefit in doing that. The world before and after your comment will be exactly the same, except some random Gemini users are harmed. (It's very unlikely that Google or Github would see your comment and go "Oh, it's time we do something about this right now".)
zozbot234: > Google also has historically treated API keys as non-secrets, except with the introduction of the keys for LLM inference, then users are supposed to treat those secretly. This was reported a long time ago, and was supposed to be fixed by Google via making sure that these legacy public keys would not be usable for Gemini or AI. https://news.ycombinator.com/item?id=47156925 https://ai.google.dev/gemini-api/docs/troubleshooting#google... "We are defaulting to blocking API keys that are leaked and used with the Gemini API, helping prevent abuse of cost and your application data." Why are we hearing about this again?
PunchyHamster: the topic is cost overruns. they still allow for cost overruns. What's so hard to comprehend?
alibarber: Forgive my ignorance, but what's the payoff for fraudsters in getting access to a generative AI service for a short-ish period of time before they get cut off? With EC2 / GCP credentials, I could understand going all out on bitcoin mining, but what are they asking the AI to do here that's worth setting up some kind of botnet or automation to sift the internet for compromised keys?
Illniyar: I think the logistics of calculating cost in real time are extremely hard. I don't think there is one big cloud service provider that has hard limits instead of alerts. As long as they revert the charge when notified of scenarios like this, and they historically have in many cases, it's fine. It's an acceptable workaround for a hard problem and a cost of doing business (just like credit cards accept a certain amount of fraud loss as part of business).
zulban: Ridiculous. They are clearly not trying at all. A hard wall preventing going over budget by 100x in a couple of hours is not some devilishly complicated distributed-systems problem. Don't toe the party line. Same reason Azure AI only has easy rate limits by minute, not by day or week or month. Open source proxy projects do it easily tho. Think about the incentives. Going over a hard cap by 3% would be a reasonable failure mode; 30,000% is not.
zarzavat: It's not a security incident because it makes Google money. It's extra revenue. They are embarrassed all the way to the bank.At some point, when it appeared 2 months ago on HN and they still did nothing about it, intentionality can be assumed.
bombcar: This is exactly it - and the normal "resolution" is a class-action lawsuit but no doubt their terms and conditions forbid that.However, anyone affected should probably pollute their docket with lawsuits anyway.
pwdisswordfishs: Generally no. Most giants and indies alike have been strongly opposed to implementing this feature for business reasons. (When you run across something that does let you do things that way, it's one of a handful of exceptions.) Their response is to tell you to set up budget alerts, which is not a solution, as described in this post.<https://en.wikipedia.org/wiki/The_purpose_of_a_system_is_wha...>
lxgr: This is presumably by design: how can it be the vendor's fault if your custom billing-protection implementation failed you at a critical time? A billing overshoot would be much harder to defend if the cap were a switch on their own dashboard.
Maxious: > The Gemini API supports monthly spend caps at both the billing account tier and project levels. These controls are designed to protect your account from unexpected overages, and the ecosystem to ensure service availabilityhttps://ai.google.dev/gemini-api/docs/billing#project-spend-...
rtkwe: The problem is it's specific to that API and defaults to uncapped so people who aren't using it and haven't heard about the issues with the Firebase API keys probably won't have set them.
zozbot234: Except that Google's own statements are very clear that "leaked" (i.e. public) API keys should not be able to access the Gemini API in the first place: "We have identified a vulnerability where some API keys may have been publicly exposed. To protect your data and prevent unauthorized access, we have proactively blocked these known leaked keys from accessing the Gemini API. ... We are defaulting to blocking API keys that are leaked and used with the Gemini API, helping prevent abuse of cost and your application data." https://ai.google.dev/gemini-api/docs/troubleshooting#google...
someothherguyy: Once upon a time, Google Maps loads were nearly free, and there was no way to restrict that key.
lxgr: API keys for Firebase. While Google really messed up here, I doubt they ever published anything claiming that no Google API keys at all are secrets.
imafish: OpenAI also worked like this last time I used it - not sure if that's changed.
lxgr: Public API keys are a thing. Arguably they are poorly named (it's really more of a client identifier), and modeling them as primarily a key instead of primarily as a non-secret identifier can go very wrong, as evidenced here.
time0ut: It is scary building on the public cloud as a solo dev or small team. No real safety net, possibly unbounded costs, etc. A large portion of each personal project I do is spent thinking about how to prevent unexpected costs, detect and limit them, and react to them. I used to just chuck everything onto a droplet or VPS, but a lot of the projects I am doing lately need services from Google or AWS. I tend to prefer GCP at this point because at least I can programmatically disconnect the billing account when they get around to tripping the alert.
Maxious: https://ai.google.dev/gemini-api/docs/billing#prepay
Maxious: > When your Prepay credit balance on the billing account hits $0, all API keys in all projects linked to that billing account will stop working simultaneously. Prepay credits apply only to Gemini API usage costs; you can't use them to pay for other Google Cloud services. https://ai.google.dev/gemini-api/docs/billing#prepay
jddj: Distillation maybe?
wongarsu: Cutting off at the exact cent is difficult, but a hard limit that triggers within one dollar of the actual limit should really be possible. If for some resources you can't sample measurements fast enough, you could weaken it to "triggers within one dollar or five minutes after cost overrun, whichever comes later". But LLM APIs are one of those cases where time isn't a factor; your only issue is that if you only check quota before each inference, a given query might bring you over.
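To make that last point concrete, here is a minimal sketch (illustrative only, not any provider's actual API) of a gate that reserves a call's worst-case cost before sending it, so concurrent requests can overshoot the cap by at most one call's cost rather than unboundedly:

```python
from dataclasses import dataclass, field
from threading import Lock


@dataclass
class SpendGate:
    """Approximate hard cap: reserve the worst case before each call,
    settle the real cost after. Overshoot is bounded by one call's
    worst-case cost, never unbounded."""

    limit_usd: float
    max_call_cost_usd: float  # worst-case cost of a single inference
    spent_usd: float = 0.0
    _lock: Lock = field(default_factory=Lock, repr=False)

    def try_reserve(self) -> bool:
        # Reserve the worst case up front so concurrent calls can't
        # collectively blow past the limit.
        with self._lock:
            if self.spent_usd + self.max_call_cost_usd > self.limit_usd:
                return False
            self.spent_usd += self.max_call_cost_usd
            return True

    def settle(self, actual_cost_usd: float) -> None:
        # Refund the difference between the reservation and the real cost.
        with self._lock:
            self.spent_usd -= self.max_call_cost_usd - actual_cost_usd
```

Before each inference call you would check `try_reserve()` and refuse to send the request when it returns False; `settle()` runs after the response, once the real token count is known.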
zotex: we love Amazon, Microsoft and Google being altruistic and making sure you're not burdened with too much money
Aurornis: Early Generative AI was popular with spammers before it became mainstream because it could be used to write infinite variations of spam messages. Making each message unique is more likely to bypass spam filters.There are also a lot of AI use cases that require a lot of token spend to brute force a problem. Someone might want to search for security exploits in a codebase but they don’t want to spend the $50,000 in tokens from their own money. Finding someone’s key and using it as hard as possible until getting locked out could move these projects forward.
janandonly: Yet another good reason to use a pre-paid service.There are many to choose from now, like Openrouter.com, PPQ.ai, and routstr.com.
whywhywhywhy: Why is the default uncapped, then, other than the hope of billing people who screw up or get exploited?
lxgr: I doubt most cloud providers are even technically ready for true prepaid billing (which requires things such as estimating and reserving funds prior to paid operations, corresponding real-time two-way interfaces instead of just eventually consistent billing event aggregation etc).In early mobile networks, the feature set for prepaid used to always lag behind, since real-time billing wasn't really a design consideration from the beginning.I suppose rather than taking on that extra work or offering a reduced feature set or by building something best-effort and taking financial responsibility for its failures, if cloud providers can just get away with making this the user's problem, why wouldn't they?
nurettin: I'd buy the technically-impossible angle. Even if you manage to get your microservices to sync every penny spent to your payment account in real time (impossible), you still have to waive the excess, losing some money every time someone goes past their quota.
chinathrow: Take them to court.
Leomuck: That's actually crazy. So I can build a project I love, that does good, but somehow get into a situation where I'm accidentally paying €30,000 (or €50,000) to a big tech company? How is that fair? I mean yes, as a software engineer you ought to reflect on all possible weaknesses, but there was a time when overlooking something meant something completely different from being down 30/50k. That is actually life-altering.
benoau: Your kid can do this in a smartphone game designated suitable for children, heavily optimized to exacerbate the possibility, and depending on where you live they can just choose not to refund you.When the FTC went investigating a decade-ish ago they found Facebook saying the quiet parts out loud: it was all extremely deliberate.
benterix: I invite you to look at the various solutions implemented by those public cloud providers that actually implemented this feature.
naturalauction: We had this exact same problem (the key initially wasn’t a secret but became a secret once we enabled Gemini API with no warnings).We managed to catch it somewhat early through alerting, so the damage was only $26k.We asked our Google cloud support rep for a refund - they initially came back with a no but now the case is under further consideration.I’d escalate this up the chain as much as possible.
whalesalad: they dgaf. i've been told anything over 10k requires sign off from the executive team
startages: Yeah, that's the main reason I never use services like Google Cloud if I don't have to: it's impossible to have a hard cap, and anyone pretending to be an expert here is just off. Google says they can't provide a hard cap because that would mean shutting down all your services... blah blah, but at least give users the option.
logankilpatrick: We have spend caps at the billing account level and the project level (developer set) in the Gemini API now. There is up to a 10 minute delay in processing everything but this should significantly mitigate the risk here: https://ai.google.dev/gemini-api/docs/billing#tier-spend-cap...By default, new Tier 1 paid accounts can only spend $250 in a given month.
walthamstow: I'm with you. And what do you even do when the quota is breached, nuke the resources? People will complain about that just as much as about overspends. I don't buy the 'evil corp screwing people' angle either. They are making farrr too much legit money to care about occasionally screwing people out of 20k or 50k.
johnmaguire: If I set a limit, and you cut off my service because I reached the limit, I would definitely not "complain just as much" as if I set a limit and you allowed me to spend past it.We're not talking about an EC2 or EBS volume here, this is access to an API.
walthamstow: Meh, you probably would complain. Maybe you forgot you set it. Now your project is taking off, making money, and it got nuked. Why aren't we talking about EC2 - is that not a cloud compute service? People have been complaining about cloud billing since long before LLMs. Anything to say about the technical problem of constantly monitoring many services against a project- or account-level limit?
drfloyd51: See also: Why is the default cap so low? I lost €78bojillion because my API stopped working.
jamespo: Monitoring could pick this up in minutes rather than how long this took to discover
adriand: You mean openrouter.ai. And yes, on reading this blog post, I immediately reviewed my API keys in OpenRouter to make sure that they were capped. My prod key was capped at $20/day (phew!) but my dev key had no cap, which I just updated. What a horrible story.
sillysaurusx: Back in 2020 I had a similar situation. I ended up getting charged $500 due to an overnight TPU training run using egress bandwidth across zones. Google support was surprisingly understanding after I explained the issue. They asked some clarifying questions, then said they could offer a one-time refund for this case. Since then I've been paranoid about accidentally doing it again. I don't know whether GCP would refund a second time.
genxy: GCP charging for interzone traffic is an interesting financial choice. They own all the infra and in many cases this is literally moving from building to building.
sillysaurusx: There's cross-region, and cross-zone. If both boxes are located within the same zone, the bandwidth is free, since it's intrazone traffic. Cross-region egress (e.g. us-east1 to us-central1) is billed at a certain rate, and intercontinental egress (e.g. us-east1 to europe-west8) is billed at a significantly higher rate. Amusingly enough, ingress traffic seems to always be free. So you can upload as much data as you want into their cloud, but good luck if you need to get it out.
genxy: I am referring to cross-zone within in the same region, so like us-central1-a to us-central1-b. These are building to building and often never cross public land.
sillysaurusx: Oh, yes! I forgot entirely about that case. You're right, egress traffic is charged there too.Are the datacenters really located so close together? I assumed they weren't within walking distance of each other.
coredog64: Correct, they're close in the sense of country-scale geography but physically spaced to avoid specific issues like location on a flood plain.
intended: As the other user said - this would be an anti-feature and user hostile.This is a sign that somehow there isn’t sufficient incentive to work on these features.
TrackerFF: It's like a fire alarm that goes off 30 mins after it senses a fire. Good stuff.
juancn: These are all poorly designed systems from a CX perspective (the billing systems). Billing is usually event driven: each spending instance (e.g. an API call) generates an event. Events go to queues/logs, and aggregation is delayed. You get alerts when aggregation happens, which, if the aggregation service has a hiccup, can be many hours later (the service SLA and the billing-aggregator SLA are different). Even if you have hard limits, the limits trigger on the last known good aggregate, so a spike can make you overshoot the limit. All of this protects the company, but not the customer. If they really cared about customer experience, once a hard limit hits, that limit would set how much the customer pays until it is reset, period, regardless of any lags in billing event processing. That pushes the incentive to build a good billing system: any delay in aggregation potentially costs the provider money, so they will make it good (it's in their own best interest).
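The last-known-good-aggregate problem can be put into a toy model (illustrative arithmetic only; the ~€70/min rate is an assumption that roughly matches €54k over 13 hours from the headline, and the 6-hour lag matches delays reported elsewhere in the thread):

```python
def overshoot_past_cap(rate_per_min: float, cap: float, lag_min: float) -> float:
    """How far past a spend cap an account drifts when the cap is only
    enforced against billing aggregates that are lag_min minutes stale."""
    minutes_until_cap_hit = cap / rate_per_min
    # The aggregator only sees the breach lag_min minutes later; spend
    # keeps accruing at full rate the whole time.
    total_when_enforcement_fires = (minutes_until_cap_hit + lag_min) * rate_per_min
    return total_when_enforcement_fires - cap


# An €80 budget leaking at ~€70/min with a 6-hour aggregation lag:
print(round(overshoot_past_cap(rate_per_min=70.0, cap=80.0, lag_min=360.0)))
# prints 25200
```

Which is exactly the shape of the €80-alert, ~€28k-by-reaction numbers in the post: the cap value barely matters once the lag dominates.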
bartread: Sure, but 80 -> 28,000 -> 54,000 is a hell of a lot of slippage.Trading platforms can guarantee a maximum slippage on stops, and often even offer guaranteed stops (with an attached premium), so I don’t see why Google and Firebase can’t do similar.The way it works at present is ridiculous.
nurettin: > Trading platforms can guarantee a maximum slippage on stops. Yeah no, physically impossible. If nobody is selling at that price, there is no guarantee your sell stop will execute near it. They can sweep the market, find the best seller price and execute. There might be a costly way to do it with microservices, as I indicated, but your example easily falls apart.
bartread: Not impossible to do: they can hedge and/or absorb the cost, hence the premium. They usually also specify a (fairly large) minimum distance for such stops.
p_stuart82: having to glue pub/sub to a cloud function just to approximate a hard cap is the whole indictment. that's not a safety feature. that's you building your own brakes.
weird-eye-issue: Thanks LLM.
Barbing: Demand on-call phone numbers, autodial the entire company when it looks like they’re about to lose their first bojillion.No, you don't really have to give Google a bunch of phone numbers. The input box will also accept entry of the following text:“I'm a big stupid idiot, and when my API stops working, which it will, it will be all my fault and not Google's.”
Bridged7756: No. I believe all major cloud providers are Pay As You Go. I think only Azure has a tier where you can run on free credits for a while.
hypercube33: The only thing I've seen is in MECM (SCCM): the Azure extension will hard shut down when you hit a limit, if you want.
throw_m239339: > These are all poorly designed systems from a CX perspective (the billing systems).These aren't poorly designed systems, they work exactly as intended.
ChromaticPanic: You mean we can implement rate limiting on APIs for security purposes no problem, but suddenly having it track costs as well is technically impossible?
imafish: In my experience this is the same on AWS and Azure. I would love a kill switch for when usage goes above a critical threshold. 5 hours of downtime will not kill my app, but a huge cloud bill might.
coredog64: It's been a year since I last looked at this, but when I did you could get near-real-time cost metrics for AWS Bedrock via CloudWatch (you get input and output token counts and have to compute the actual price yourself).
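The do-it-yourself price math is trivial; the only catch is keeping the per-token rates current. A sketch (the rates below are made-up placeholders, not real Bedrock pricing, and the alarm wiring is left to CloudWatch):

```python
# Hypothetical per-1K-token rates; substitute your model's real price sheet.
PRICE_PER_1K_INPUT = 0.003
PRICE_PER_1K_OUTPUT = 0.015


def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Convert CloudWatch-style token-count metrics into dollars."""
    return (input_tokens / 1000.0) * PRICE_PER_1K_INPUT + (
        output_tokens / 1000.0
    ) * PRICE_PER_1K_OUTPUT


def breached(input_tokens: int, output_tokens: int, budget_usd: float) -> bool:
    """The alarm condition you'd evaluate against the computed cost."""
    return cost_usd(input_tokens, output_tokens) > budget_usd
```

The point being: the provider already has every number needed for a near-real-time cap; the customer just has to glue it together.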
arcticfox: I get furious every time this comes up and somehow there are bootlickers ready to defend big tech on it. My ~2-person small business was almost put out of business by a runaway job. I had instrumented everything exactly according to the GCP instructions: the budget notification was hooked up to a kill switch that fired the instant it arrived. GCP sent the notification they recommend as best practice 6 HOURS late. They did everything they could to not credit my account until they realized I had the receipts. They said an investigation revealed their pipeline was overwhelmed by the number of line items and that was the reason for the lag... the exact scenario it is supposed to function in. JFC.
Barbing: Almost wish the people defending it were paid. Almost more intelligent to rush to the defense if there were a direct financial benefit.Part of it is possibly the curse of knowledge. Someone in the 99th percentile of cloud configuration experts simply can't recall their junior dev days.
charcircuit: In my junior dev days I always paid for the resources I used. Just because you consume a lot of resources by accident that doesn't mean you shouldn't have to pay for it. Accidents do not absolve you from liability.
Barbing: Interesting!I know software is special. That's why software defects are acceptable while a crumbling bridge is not.With that said, should this apply to other industries? If I clip a warehouse shelf on my first day driving a forklift, should my wages be garnished for life to cover the inventory? Or is the inherent nature of the logistics industry such that an accident does not always imply liability? (Or other)
sofixa: > So much for the folks defending these three companies that refused to provide hard spending cap ("but you can set the budget", "you are doing it wrong if you worry about billing", "hard cap it's technically impossible" etc.)Yes, it's technically+business impossible. To implement a hard cap, a bill never to go over, they'd have to cut your service, but also delete all your data in databases, object storage, data lake, etc. This is simply not an option, so they take the different option of authorising support to wave surprise surcharges / billing DDoSes.
benterix: This argument simply doesn't hold water - their (smaller) competition solved this problem over a decade ago.
theanonymousone: But isn't OpenRouter anyway prepaid, meaning you lose at most your current credit?
plorkyeran: You can have a hard cap on compute spend while letting storage go over. Surprise huge bills are approximately never due to storage.
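The split plorkyeran describes is easy to state precisely. A minimal toy sketch (all names hypothetical, not any provider's actual billing code): new compute spend is rejected once a cap is hit, while storage keeps accruing charges so no data is ever deleted.

```python
from dataclasses import dataclass

@dataclass
class SpendCap:
    """Hard cap on compute spend; storage is billed but never blocked."""
    compute_cap: float
    compute_spent: float = 0.0
    storage_spent: float = 0.0

    def try_charge_compute(self, amount: float) -> bool:
        # Reject the job if it would push compute spend past the cap.
        if self.compute_spent + amount > self.compute_cap:
            return False
        self.compute_spent += amount
        return True

    def charge_storage(self, amount: float) -> None:
        # Storage always accrues; the total bill can exceed the cap,
        # but the data survives.
        self.storage_spent += amount

cap = SpendCap(compute_cap=100.0)
assert cap.try_charge_compute(80.0)      # accepted
assert not cap.try_charge_compute(30.0)  # would exceed the cap: rejected
cap.charge_storage(5.0)                  # still billed, never blocked
```

The point of the sketch is that the "we'd have to delete your data" objection only applies to storage, which is exactly the product that almost never causes surprise bills.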
croes: > Insanity

You mean cash machine
827a: Billing control is security, to be clear, but beyond that: The key permissions that enable anyone to generate text also grant access to all GCP Generative AI endpoints in the project they were provisioned in. That includes things like Files that your system might have uploaded to Gemini for processing, and querying the Gemini context caches for recent Gemini completions your system did. Both of these are highly likely to contain customer-facing data, if your organization & systems use them.
nhuser2221: Happened to me too, luckily it was only $40; I restricted the API the next day. They were using Gemini 3 Flash, which I am not using.
riteshkew1001: This story is almost quaint. The version we're about to see is a coding agent running in CI with an API key, hitting a transient 429, retrying in a tight loop because the prompt told it to "be persistent." Firebase had at least a human typing the query. Caps aren't a nice-to-have once the caller is autonomous.
not_your_vase: I just find it extraordinary that the biggest tech company in the world can do cutting-edge real-time AI for millions of people, run Youtube and of course all the other Google services, with literally the smartest people in the world and unlimited resources on board, but still can't keep real-time track of users' current billing and their spending limits - it's all best effort still. Somehow it doesn't add up. (Pun not intended, but I'm happy to have it)
strangattractor: If spending caps made them more money they'd find a way;)
sdevonoes: It’s not fair. Google, Amazon, Microsoft… they have never played fairly. They never will.
saidnooneever: you cannot earn billions a year and not be cheating your users out of their money. it's that simple. they don't care for people, otherwise they wouldn't be putting so much effort into making them poor.
_DeadFred_: What, would you call this the behavior of a company that doesn't care for people?https://nypost.com/2026/04/15/business/amazon-warehouse-empl...
ai_slop_hater: You can try implementing rate limiting and not exposing your API keys to the public.
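Rate limiting on your own side, as suggested above, is a cheap first line of defense even when the provider offers none. A minimal token-bucket sketch (a standard technique, not any specific library's API): calls are allowed at a steady rate with a bounded burst, so a retry loop or a leaked key can't fan out into thousands of requests per second.

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: at most `rate` calls per second
    on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

bucket = TokenBucket(rate=5.0, capacity=10.0)
# A tight loop of 100 back-to-back calls: only roughly the
# 10-token burst gets through; the rest are rejected.
allowed = sum(bucket.allow() for _ in range(100))
```

Wrapping every outbound Gemini call in `bucket.allow()` bounds the worst-case burn rate to something you chose, independent of what the billing pipeline does hours later.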
subscribed: You're supposed to drive slow and careful, and not rely on seatbelts and airbags.
sachinag: Hey folks, I just wanted to drop a quick note here that there’s a way to stop billing in an emergency that’s officially documented on the Google Cloud documentation site: https://docs.cloud.google.com/billing/docs/how-to/disable-bi... . You can see the big red warning that this could destroy resources that you can’t get back even if you reconnect a billing account, but this is a way to stop things before they get out of control.

This billing account disconnect goes all the way to implementing a full-on “emergency hand brake” that “unplugs the thing from the wall” (or whatever analogy you prefer) without you having to affirmatively do it yourself.

https://docs.cloud.google.com/billing/docs/how-to/modify-pro... and https://docs.cloud.google.com/billing/docs/how-to/budgets-pr... are other documented alternatives to receive billing alerts without the billing account disconnect.

The billing account disconnect obviously shouldn’t be used for any production apps or workloads you’re using to serve your own customers or users, since it could interrupt them without warning, but it’s a great option for internal workloads or test apps or proof of concept explorations.

Hope this helps!
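The pages linked above describe wiring a budget notification (delivered over Pub/Sub) to a function that detaches the project's billing account. A rough stdlib-only sketch of the decision step, under the assumptions that the notification carries `costAmount` and `budgetAmount` fields as the budget-notification docs describe, and that detaching is done via the Cloud Billing API's `projects.updateBillingInfo` method with an empty `billingAccountName` (the authenticated HTTP call itself is omitted here):

```python
import json

def billing_disconnect_request(project_id: str, notification: dict):
    """Return None while under budget, or the REST call that would
    detach the billing account (the 'unplug from the wall' kill switch).

    This only builds the request; sending it (auth, HTTP client,
    Cloud Function wiring) is left out of the sketch.
    """
    if notification["costAmount"] <= notification["budgetAmount"]:
        return None  # under budget: do nothing
    return {
        "method": "PUT",
        "url": (f"https://cloudbilling.googleapis.com/v1/"
                f"projects/{project_id}/billingInfo"),
        # An empty billingAccountName detaches billing entirely.
        "body": json.dumps({"billingAccountName": ""}),
    }

# Over budget: the kill-switch request is produced.
req = billing_disconnect_request(
    "demo-project", {"costAmount": 120.0, "budgetAmount": 100.0})
assert req is not None
```

Note the asymmetry this thread keeps circling: the customer has to build and operate this hand brake themselves, and (per arcticfox above) it still depends on the notification arriving on time.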
giancarlostoro: It shouldn't mean shutting down all your services; it should mean not letting you provision new ones and limiting the scope of what you can continue doing.
michaelt: If I budget enough to store 1TB of data for 1 month, then on the first day of the month I store 2TB of data - what should the behaviour be after 15 days?
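Working michaelt's numbers through makes the ambiguity concrete: a budget sized for 1 TB over one month buys 2 TB of storage for only half a month, so the money runs out on day 15, and the provider then has to pick between deleting data and overrunning the budget.

```python
budget_tb_months = 1.0   # budget: enough to store 1 TB for one month
stored_tb = 2.0          # actually stored from day 1
days_in_month = 30

# Storage cost accrues at stored_tb TB-months per month, so the budget
# is exhausted when stored_tb * (t / days_in_month) == budget_tb_months.
days_until_exhausted = budget_tb_months / stored_tb * days_in_month
assert days_until_exhausted == 15.0
```

This is the storage carve-out point from earlier in the thread: a cap that blocks new compute jobs has an obvious safe behavior, while a cap on data already at rest does not.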
ch0wn: This should be illegal. If a contractor you hired to swap out a tile on your bathroom floor billed you for remodelling your back garden, you would obviously have the legal right to refuse that.
Rekindle8090: "We can either charge per tile, per job, or on demand. Or you can have us on call for a year and get any of the former at a discounted rate."

"Per tile. Lay tiles until I say stop."

> you fall asleep

"Wtf why are you still laying tile"

"You said per tile and lay until you say stop. That'll be 50k please."

How is this the contractor's fault?
arjie: Surprised they don’t have usage limits. E.g. you can’t get many IPs from AWS for your region until you request a limit increase. The UX for these kinds of things seems like it should default to low and allow easy increasing.
glenpierce: Nuke the data. It’s gone forever if you didn’t back it up elsewhere. This should be a meaningful risk mitigation that I can employ to avoid having a catastrophic financial disaster.

This isn’t a limit I’m setting at some percentage above expected costs, it’s: “I don’t want to take out a HELOC if something goes wrong”
PufPufPuf: For personal projects, is there a cloud service that has actual working spend caps? I would perhaps try using a cloud service if I wasn't exposing myself to a risk of losing my yearly income by a small mistake. Or is renting a VPS the only sensible option?
boredpudding: Google API keys have been used for ages on the frontend. For example on Google Maps embeds. Those are not possible without exposing a key to the frontend. They weren't secret, until Gemini arrived.

https://trufflesecurity.com/blog/google-api-keys-werent-secr...

https://medium.com/@ahhyesic/your-google-maps-api-key-now-ha...

https://www.malwarebytes.com/blog/news/2026/02/public-google...
someothherguyy: If one ignores 70% of the documentation, it makes for a demonizing blog post about it, sure.

"API keys for Firebase services are not secret

API keys for Firebase services only identify your Firebase project and app to those services. Authorization is handled through Google Cloud IAM permissions, Firebase Security Rules, and Firebase App Check.

All Firebase-provisioned API keys are automatically restricted to Firebase-related APIs. If your app's setup follows the guidelines in this page, then API keys restricted to Firebase services do not need to be treated as secrets, and it's safe to include them in your code or configuration files.

Set up API key restrictions

If you use API keys for other Google services, make sure that you apply API key restrictions to scope your API keys to your app clients and the APIs you use.

Use your Firebase-provisioned API keys only for Firebase-related APIs. If your app uses any other APIs (for example, the Places API for Maps or the Gemini Developer API), use a separate API key and restrict it to the applicable API."

https://firebase.google.com/support/guides/security-checklis...
three14: The only reasonable design is to have two kinds of API keys that cannot be used interchangeably: public API keys, that cannot be configured to use private APIs, and private API keys, that cannot be configured to use public APIs. There's no one who must use a single API key for both purposes, and almost all cases in which someone does configure an API key like that will be a mistake. It would be even better if the API keys started with a different prefix or had some other easy way to distinguish between the two types so that I can stop getting warnings about my Firebase keys being "public".
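A toy sketch of the scheme three14 suggests (the `pk_`/`sk_` prefixes are illustrative, echoing a convention some API providers already use, and nothing here is Google's actual key format): the prefix makes the key's role obvious at a glance and greppable in leaked code, and validation rejects a key used on the wrong kind of endpoint.

```python
import secrets

def mint_key(public: bool) -> str:
    # Distinct prefixes distinguish browser-safe keys from server-only
    # keys, so a leaked sk_ key is immediately recognizable as a leak.
    prefix = "pk_" if public else "sk_"
    return prefix + secrets.token_urlsafe(24)

def check_key(key: str, endpoint_is_public: bool) -> bool:
    # A key is only valid on endpoints matching its type: pk_ keys
    # never reach private APIs, sk_ keys never belong in a frontend.
    if endpoint_is_public:
        return key.startswith("pk_")
    return key.startswith("sk_")

pub = mint_key(public=True)
assert check_key(pub, endpoint_is_public=True)       # fine in a browser
assert not check_key(pub, endpoint_is_public=False)  # rejected server-side
```

The real enforcement would of course live server-side against a key database, but the prefix alone already solves the "stop warning me my Firebase key is public" complaint.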
SAI_Peregrinus: It'd be much better to call them something like "API usernames" or "API Client IDs". Though I also dislike the naming of "public keys" in asymmetric cryptography, for the same reasons, and I'm definitely not winning that fight!
william0353: Hi, I am just curious about the reason behind it, as I have a Firebase app with the Firebase AI Logic service as well. Is the apiKey below the one that was used for web SDK init?

const firebaseConfig = { apiKey: XXXX, authDomain: XXX, ... };

Did Zanbezi enable App Check? This is kind of worrying...
bombcar: Is there a cloud provider that does have hard unbreakable billing caps? Everything I've seen has always been notifications or soft caps.

Not talking about fixed-access things like a Hetzner box.
pwdisswordfishs: Bunny.net purports to have a pay-as-you-go prepaid credit system that sounds like it works the way people want, and with their description of the way it works probably being sufficient to be legally enforceable if it turns out that it actually works differently and you were to end up with a surprise bill. And evidently it really does work that way; see this post from a couple weeks ago: <https://news.ycombinator.com/item?id=47676416>

The only other provider known to work that way is NearlyFreeSpeech.NET, which serves a completely different market segment (so much so that it might as well not even be considered the same kind of product/service).
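The prepaid model several commenters are asking for is simple to state as a toy ledger (names hypothetical, not Bunny.net's or anyone's actual implementation): requests draw down a balance and fail closed at zero, so liability is bounded by whatever was topped up.

```python
class PrepaidLedger:
    """Fail-closed prepaid billing: no credit, no service."""

    def __init__(self):
        self.balance = 0.0

    def top_up(self, amount: float) -> None:
        self.balance += amount

    def charge(self, cost: float) -> bool:
        # Worst case you lose exactly what you prepaid,
        # never a surprise bill.
        if cost > self.balance:
            return False
        self.balance -= cost
        return True

ledger = PrepaidLedger()
ledger.top_up(100.0)
assert ledger.charge(60.0)
assert not ledger.charge(50.0)   # only 40.0 left: rejected
assert ledger.balance == 40.0
```

The hard part, as 100ms notes upthread, isn't this logic; it's checking the balance synchronously at the scale and latency of every API call.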
tveita: "Can you lay tiles until I say stop, or until it's about $250 worth, whichever comes first""No, as one of the top tile layers in the country I can't do that, for your own protection. What if fifty elephants came and wanted to use your bathroom all at once? You'd feel pretty dumb having to reject them instead of me simply automatically adding $1 million to your bill"
sofixa: If that happens, you create a support ticket and AWS/GCP/Azure wave it, especially the first time. They're aware that billing per usage can have surprise effects, but at the same time they don't want to kill their customers' workloads and delete their data, so it is what it is.
gettingoverit: It's quite easy to check responses to other customers in other threads there, and somehow I see quite a lot of "oh, go to that other support" and ghosting.If you create support ticket on hacker news, then yes, you will probably get it waved. It's somewhat sad that HN is their support forum now.
ButlerianJihad: Send me a PDF of your bill, and I will happily print out 10 copies so I can wave them all above my head
johnmaguire: Why aren't we talking about an EC2? Because this is a thread about the Gemini inference API. Loss of service would be restored on payment, not permanent. But that's beside the point: I as the customer set a limit, and you as the service provider did not adhere to it.

I've worked on a number of systems, and while it is sometimes impossible to stop at an exact limit, I am confident that it is feasible to stop with less slippage than occurred in this scenario. And at the companies I've worked at, the slippage stayed within a margin of error we were able to absorb ourselves, as these losses are made up elsewhere and are worth the customer goodwill. If we can do it, I'm sure Google can.
ItsClo688: agree. the real problem isn't that hard caps are "technically impossible" — it's that the incentive to build them is backwards. a hard cap that stops a runaway process costs the cloud provider money. a "budget alert" that fires after the fact costs the customer money. the 10-minute delay in billing processing is doing a lot of work in that logankilpatrick comment. at $4k/minute burn rates, that's still a $40k exposure window
jeroenhd: From the linked thread: https://ai.google.dev/gemini-api/docs/billing#tier-spend-cap...

The warnings firing off hours later is obviously awful design, but the warnings are just warnings. The spend caps are something different, and Gemini has them at the very least.

For most use cases where businesses use the cloud, hard spending caps are an awful idea anyway. Killing your servers the moment you start picking up loads of new customers is a surefire way to kill your big growth opportunity at exactly the wrong time.

Of course, if you're not planning for sudden massive growth, you'd be crazy to host your stuff with the big three cloud providers.
bwpw: The scarier version of this problem is coming. Imagine AI agents making API calls across multiple vendors. Each vendor tracks usage in isolation, but no one knows what the agent is spending across all vendors. Nobody is building the cross-vendor spending cap today.
monkpit: But in this example, the last line of your story is the customer going “yeah, sounds good, let’s do it and hope that doesn’t happen” and signing the agreement.
dpkirchner: I've yet to receive an accurate response from Gemini about GCP services, beyond completely trivial topics. The most recent, I think, was Gemini advising me that I could attach an existing pd SSD PVC to a n4 or c4 VM. For whatever unknowable reason, Google doesn't allow this and doesn't offer a migration path, and Gemini doesn't "know" anything about it either. It's wild.
franga2000: This is not about paying or not paying. It's about cloud providers not having working tools that let you limit your spending.If I don't set up a budget and run up a huge bill, fine, sure, I should probably pay for it. But if I follow best practices and set up a rule like: "if usage > X €, then stop accepting jobs", and I do it correctly according to the vendor's instructions, yet it still lets me blow past the budget, that's entirely on the vendor.
ButlerianJihad: Back in 2020-21, we were teaching our students how to stand up and configure cloud services, and I decided that a good extension of my homelab would be in AWS, so that I could learn basic cloud administration.

I was using the Free Tier for starters, of course. I managed to start up a working MediaWiki server on a Linux machine in EC2. I began to explore some of the more esoteric options and always checked out the extra-secure methods, and IAM and so forth.

My MediaWiki had a lot of spammers registering accounts. They weren't actually able to make edits, but I couldn't seem to stop them from creating user accounts. And I always felt rather... naked in terms of securing the Linux system itself. It seemed like the entire Internet had an Eye of Sauron focused on my open TCP ports and they were port-scanning it and running pentests 24/7. I honestly couldn't keep up!

Ultimately, I did realize that I could never constrain the budget to an affordable $20 or $30. Signs seemed to indicate that any malicious traffic could crank up my network egress costs alone! There were some rudimentary controls but they would never permit a full-scale shutdown of all services that could actually cost money.

So I shut down the cloud services and abandoned my Amazon AWS account. Migration to the cloud might seem like a good value proposition for any business that can't handle its own machine rooms or its own I.T. team to manage physical infrastructure. But it's an unconstrained cost nightmare waiting to happen for basically anyone of any scale. I would never recommend it, for personal or business use, until that aspect is somehow brought under control.
charcircuit: The employer is held liable in such a scenario.
Barbing: Sounds right. Not sure if this is the position:

If you’re coding, you should pay for your mistakes; if you’re driving a forklift (sober/responsibly), your employer should pay?
charcircuit: If you are coding your employer pays for it too. If I take the site down and we lose $5 million I am not personally liable for that.
Barbing: Is that mutually exclusive with our originating comment, you think?

"...I always paid for the resources I used. Just because you consume a lot of resources by accident that doesn't mean you shouldn't have to pay for it."
arcticfox: It's not about not paying for the resources you use. It's about not having any mechanism to limit those resources, despite that being an entirely reasonable thing for the cloud providers to provide.

Using these platforms is like giving everyone in your business a credit card with an infinite limit. If someone steals it, or anyone makes a mistake, your liability is literally unlimited for no reason at all other than complete laziness by the counterparty.

These are completely normal and expected concepts in commercial contracts that the cloud providers simply don't bother to provide. I would even wager that their bigger customers have this in their contracts and only SMBs get screwed like this.