firebase-functions: Error: The request was aborted because there was no available instance.
This is happening in a production Firebase environment with a Blaze subscription. Since 22nd August, 10pm GMT+8, I’ve been seeing the error "The request was aborted because there was no available instance."
This error happens across all functions when I make 100+ invocations.
When the error appears it affects all other functions as well (see screenshot).
It can happen with any function, with or without the maxInstances parameter set.
All functions are deployed in us-central1.
Quotas don’t seem to be reaching their limits.
[REQUIRED] Version info
node: v12.22.3
firebase-functions: 3.14.1
firebase-tools: 9.16.0
firebase-admin: 9.11.0
[REQUIRED] Test case
Firebase pubsub listener
[REQUIRED] Steps to reproduce
Send 100+ messages to the firebase pubsub
[REQUIRED] Expected behavior
Functions execute.
[REQUIRED] Actual behavior
Functions fail with the message: The request was aborted because there was no available instance.
Were you able to successfully deploy your functions?
successfully deployed
About this issue
- State: closed
- Created 3 years ago
- Reactions: 61
- Comments: 93 (10 by maintainers)
Our cloud functions have been able to scale with no issue for over a year. Nothing has changed within our infrastructure but as of late last night 2021-09-07, our functions have begun to fail. This doesn’t appear to be related to traffic, cold starts or long running executions. A request will be made to a function and it will fail on every request for several minutes. It will then begin to work and another function will begin to fail.
There definitely seems to be something larger going on here than just revealing logs to Stackdriver.
I don’t really understand how it’s only a silent vs. logging problem. Our app has been running without issue for months, and only NOW are we getting these issues that are actually preventing some functions from running. It feels to me like something actually is going on or has changed, because we have not changed our backend for a while and it’s been running smoothly until now.
We are on the Blaze subscription. We’ve also been experiencing this since August 22nd, running on Node 12 instances in asia-south1.
Same error across almost all functions. Both onCall and onRequest functions are failing. We tried upgrading to Node 14 and setting minInstances=1 and maxInstances=5. The error still persists.
We’ve also been experiencing this since August 23rd, 6 AM CEST, running on Node 14 instances. Same error, across multiple functions in europe-west1.
@tolypash not exactly the solution you’d like, but consider moving out of Google’s ecosystem. This is a lesson learned the hard way. Until now I had only heard about poor customer support, but now I have witnessed it with this issue.
Hot off the press - Google Cloud Functions in us-central1 did report some problem ~2021-09-08:
https://status.cloud.google.com/incidents/16SSwVXrYSLjy8fEMvyZ
The status report claims that the issue only affected functions deployed in us-central1 and that it is now resolved. If you are still seeing issues, please contact Google Cloud Support.
We’ve had this issue for a while now. It seems to be happening more often lately.
@taeold You are lying. This issue is not about an invisible warning. Our game service has been making 100,000 HTTPS calls every day without error for a month. We have been experiencing the same issue reported here since yesterday (asia-northeast1). We had to handle 6+ purchase failure cases in the past 24 hours because of this issue. Be honest, Google. Tell us what you guys are doing. US -> EU -> South-east Asia (Singapore) -> North-east Asia (Japan). Something is happening.
I’m having the same issue on asia-northeast1.
My theory is that they reduced the tolerance for cold starts. For example: earlier they were OK waiting 2s for a cold start, but now they throw the said error after just 1s. If you find (or create) a function with little to no dependencies, basically a helloWorld function, you will notice it is not affected by this.
This issue also coincides with the announcement of min instances for Cloud Functions. Support recommended using this new beta feature but did not have an answer for the cause of this issue.
So they likely changed some configuration in the backend and this is an effect of that. A lot more people are reporting production impact because of this here: https://issuetracker.google.com/issues/194948300 Hope they find and fix this soon 🤞
This error is still going on.
I am getting the same issue on asia-northeast1.
All functions failed between 15:00 and 16:00 today.
My Cloud Function max_instances is set to no limit.
It only started happening a few hours ago on Cloud Functions. I redeployed, but that doesn’t seem to have helped.
Hi everyone.
Google Cloud Functions (GCF) users as a whole are reporting the same issue described here, and https://issuetracker.google.com/issues/153207649#comment3 is the official response from the GCF team.
tl;dr GCF nodejs runtime used to silently drop requests when instance couldn’t be scaled fast enough to respond to demand. Now it’s logging the failed request on your project’s log, hence the sudden appearance of the issue (release note). For pubsub-triggered functions, this error is usually handled gracefully by automatic retry mechanism in the GCF infrastructure. The same can’t be said of HTTP-triggered functions, and the request would have been dropped by the client unless a retry mechanism was already implemented.
To reduce the occurrence of the once invisible but now visible “aborted because there was no available instance” errors, the recommendations in https://cloud.google.com/functions/docs/troubleshooting#scalability apply.
I hope this clears up the confusion a bit. I’ll leave this ticket open to answer any follow up questions, but since this problem is directly related to Google Cloud Functions and not specific to Firebase Functions, please consider reaching out to GCP support with your project-specific questions.
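As an illustration of the client-side retry mechanism mentioned above for HTTP-triggered functions, here is a minimal sketch, assuming the failed call surfaces as a rejected promise. `retryWithBackoff`, `maxAttempts`, `baseDelayMs`, and `sleep` are hypothetical names chosen for this example and are not part of any Firebase or Google Cloud SDK.

```javascript
// Hypothetical helper, not part of any Firebase or Google Cloud SDK.
// Retries an async operation with exponential backoff and full jitter.
function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function retryWithBackoff(operation, options = {}) {
  const { maxAttempts = 5, baseDelayMs = 500, sleep = defaultSleep } = options;
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // Pass the attempt number so callers can log it if they want.
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      if (attempt === maxAttempts) break;
      // The delay ceiling doubles each attempt: base, 2*base, 4*base, ...
      const ceilingMs = baseDelayMs * 2 ** (attempt - 1);
      // Full jitter: wait a random time in [0, ceilingMs).
      await sleep(Math.random() * ceilingMs);
    }
  }
  // Every attempt failed; surface the last error to the caller.
  throw lastError;
}
```

A caller would wrap its own request, for example `retryWithBackoff(() => callMyFunction(payload))`, where `callMyFunction` is a placeholder for whatever HTTP or callable invocation the client already makes. Retrying is only safe when the request is idempotent, and a handful of attempts will not cover an outage that lasts several minutes.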
@larssn I get what you’re saying. OP is saying support is not being helpful, and that was also the case for me when I was helping someone navigate this issue. I think the suggestion of moving away from this platform is pragmatic, more so when you are seeing real customer impact that is making you lose money. You do not want to bet your company and its revenue on a cloud company that is having a hard time determining whether there is a problem at all.
Anyway, that is my personal take on this. You are welcome to disagree with it 😃
Same issue since ~2021-08-26 20:00 BST, on Node 14, firebase-functions 3.14.1, firebase-admin 9.10.0.
It is happening with very minor spikes of requests < 100 across all function deployments.
I’d also like to understand why this very impactful issue being reported by many people isn’t reflected on https://status.cloud.google.com/ as being investigated.
Edit: Looks like this is already being tracked here https://issuetracker.google.com/issues/194948300
@amitrao17 I think you should definitely await all the promises before returning. I’m guessing part of what is happening is that whatever Google changed is killing functions much more quickly after they finish running, which per the spec is correct or at least not unexpected. Maybe before, Google let functions hang around longer, so for example your unresolved promises had time to finish even though there was no guarantee of that.
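To make the "await all the promises before returning" advice concrete, here is a small self-contained sketch. `fakeWrite` is a hypothetical stand-in for real asynchronous work such as a database write, and both handlers are illustrative rather than actual Cloud Functions signatures.

```javascript
// Hypothetical stand-in for asynchronous work (e.g. a database write).
// It records its label in `log` once the simulated work completes.
function fakeWrite(log, label, ms) {
  return new Promise((resolve) =>
    setTimeout(() => {
      log.push(label);
      resolve(label);
    }, ms)
  );
}

// Risky pattern: the writes are started but never awaited, so the handler
// returns while they are still pending. If the instance is shut down right
// after the handler returns, the writes may never finish.
async function handlerFireAndForget(log) {
  fakeWrite(log, "a", 20);
  fakeWrite(log, "b", 10);
  return "done";
}

// Safer pattern: the handler only returns once every write has settled.
async function handlerAwaitAll(log) {
  const results = await Promise.all([
    fakeWrite(log, "a", 20),
    fakeWrite(log, "b", 10),
  ]);
  return results;
}
```

With the first handler, `log` is still empty at the moment the handler returns; with the second, both entries are present. `Promise.all` rejects as soon as any write fails, so `Promise.allSettled` may be a better fit when partial failure has to be handled individually.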
I would also like to note that “retrying” HTTP requests from the client side is not possible either, because this issue seems to affect all functions for a certain period of time (ranging from a few seconds to sometimes minutes).
So even when I retry on the client, another error will be thrown unless the retries are spaced minutes apart, which is not feasible for the client.
Luckily, it seems that scheduled functions and triggers are guaranteed at-least-once delivery.
I’m getting dozens of these errors again for HTTP functions, it seems even worse than before.
On Wed, Sep 15, 2021 at 12:14 AM, Wtrapp wrote:
Apologies, my last post is not true. I have just filtered my log and can see that other users of my cloud functions are encountering this error, but not as often as when I first posted.
We used to have a similar problem using Firebase for HTTP serving - not errors but cold starts causing HTTP requests to take 10+ seconds meaning our app would often hang loading looking like it crashed. Seems a change has turned what were cold starts into errors.
The problem with Firebase is that there is no way to control cold starts, unlike with AWS Lambda. Lambda is much smarter about scaling up and down and sending requests to existing instances, whereas with Firebase it is more random. E.g. having a pinger keep an instance alive doesn’t really do anything useful.
The solution is to stop using Firebase for HTTP… it really is very bad for it. Switch to App Engine and you can control the scaling a lot more and avoid these problems.
Happening to me too. No issues with scaling since we started our project in March 2020 until start of last week, when this issue happens every few minutes
Same issue :(
@larssn At-least once guarantee applies to all event-driven functions. Are you seeing events from Firebase/Firestore being dropped on your project?
@taeold But could you clarify what happens to event-driven functions, such as Firebase/Firestore triggers? The docs here guarantee at-least-once execution.
I’m hoping that is still the case. Only few of our functions are idempotent enough to warrant enabling the retry policy.
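Since enabling the retry policy is only safe for idempotent functions, one common way to get there is to deduplicate on the event ID. The sketch below is a simplified illustration under stated assumptions: `makeIdempotent` is a hypothetical wrapper, and the in-memory `Set` stands in for a durable store (for example a database document keyed by event ID, updated in a transaction). It relies only on the `eventId` field of the event context that firebase-functions passes to event handlers.

```javascript
// In-memory stand-in for a durable "already processed" store.
// Assumption for illustration only: real instances do not share memory,
// so production code would need a database or similar shared store.
const processedEvents = new Set();

// Hypothetical wrapper: skips a handler when its eventId was already handled.
function makeIdempotent(handler, seen = processedEvents) {
  return async function (data, context) {
    if (seen.has(context.eventId)) {
      // Duplicate delivery (e.g. a retry after a success): do nothing.
      return { skipped: true };
    }
    const result = await handler(data, context);
    // Mark as processed only after success, so a failed run can be retried.
    seen.add(context.eventId);
    return { skipped: false, result };
  };
}
```

Because the event is marked only after the handler succeeds, two concurrent deliveries of the same event could both run. A transactional check-and-set in the durable store would close that gap.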