nodejs-firestore: RST_STREAM error keeps showing up
Environment details
- OS: macOS Catalina 10.15.5 Beta (19F53f)
- Node.js version: 13.7.0
- npm version: 6.13.6
- @google-cloud/firestore version: 3.7.4
Steps to reproduce
This error keeps appearing over and over in my logs (not regularly reproducible):
Error: 13 INTERNAL: Received RST_STREAM with code 2
at Object.callErrorFromStatus (/api/node_modules/@grpc/grpc-js/src/call.ts:81)
at Object.onReceiveStatus (/api/node_modules/@grpc/grpc-js/src/client.ts:324)
at Object.onReceiveStatus (/api/node_modules/@grpc/grpc-js/src/client-interceptors.ts:439)
at Object.onReceiveStatus (/api/node_modules/@grpc/grpc-js/src/client-interceptors.ts:402)
at Http2CallStream.outputStatus (/api/node_modules/@grpc/grpc-js/src/call-stream.ts:228)
at Http2CallStream.maybeOutputStatus (/api/node_modules/@grpc/grpc-js/src/call-stream.ts:278)
at Http2CallStream.endCall (/api/node_modules/@grpc/grpc-js/src/call-stream.ts:262)
at ClientHttp2Stream.<anonymous> (/api/node_modules/@grpc/grpc-js/src/call-stream.ts:532)
at ClientHttp2Stream.emit (events.js:315)
at ClientHttp2Stream.EventEmitter.emit (domain.js:485)
at emitErrorCloseNT (internal/streams/destroy.js:76)
at processTicksAndRejections (internal/process/task_queues.js:84)
About this issue
- State: open
- Created 4 years ago
- Reactions: 80
- Comments: 152 (30 by maintainers)
I don’t mean to be pushy on this, but I don’t understand how something this severe has not gotten more attention. @josegpulido and @bitcoinbullbullbull having to basically re-implement core Firestore functionality seems ridiculous to me.
looks like the issue has something to do with concurrency. when i limit my concurrency to 1, the writes complete successfully. but even with a write concurrency of 2, i start getting these errors.
i can only limit my concurrency in my reproduction case. in my real app, we must have parallel writes. this ticket has been open for 2 YEARS and causes CUSTOMER OUTAGES for us. can someone at google responsible for firestore give us an update?
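For anyone who wants to experiment with the concurrency level described above, here is a minimal sketch of a concurrency cap for async writes. The function name `runWithConcurrency` and the task shape are assumptions for illustration, not part of the Firestore SDK; each task would wrap a real call such as `() => docRef.set(data)`. This only lets you dial concurrency up or down; it is not a fix for the underlying error.

```javascript
// Run async task factories with at most `limit` in flight at once.
// `tasks` is an array of zero-argument functions that each return a promise,
// e.g. () => docRef.set(data). Results are returned in input order.
// If any task rejects, the returned promise rejects with that error.
async function runWithConcurrency(tasks, limit) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError('limit must be a positive integer');
  }
  const results = new Array(tasks.length);
  let next = 0; // index of the next task to start

  async function worker() {
    while (next < tasks.length) {
      const i = next++; // claim an index; no race because JS runs one callback at a time
      results[i] = await tasks[i]();
    }
  }

  // Start `limit` workers (or fewer when there are not enough tasks).
  const workers = Array.from({ length: Math.min(limit, tasks.length) }, () => worker());
  await Promise.all(workers);
  return results;
}
```

Usage would look like `await runWithConcurrency(docs.map((d) => () => d.ref.set(d.data)), 2)`, where `docs` is a hypothetical array of reference/data pairs.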
Hey - sorry for the radio silence. I just wanted to assure everyone that we are still treating this with the priority it deserves.
Hello @CollinsVizion35,
We haven’t found a solution to this issue yet.
We’ve attempted several methods, but none have resolved the problem.
Interestingly, everything operates flawlessly in our development project. The only difference is that the development project has a smaller User collection.
I’m starting to suspect that this might be related to some undocumented limitation in Firestore…
I will stay in touch about updates!
We are also affected by this error. A month ago it was sporadic, maybe once a week, and now we are seeing it many times per day.
Update: I ended up moving off Firestore to a Postgres DB, and ultimately moved everything from GCP to AWS because of this issue.
an update from my end - we had to rearchitect our data layout to avoid more than one write to a document per second (This is described in the best practices for firestore) https://firebase.google.com/docs/firestore/best-practices#updates_to_a_single_document
i’ve never heard of a data store that limits you to 1 write per second… but whatever.
the thing is, you can get away with a few writes to a document per second, but if you start doing it at scale and highly parallel, you are 1) more likely to get the issue and 2) less likely to be able to just retry and have the write succeed.
i put in a retry of 3, and was still unable to get a successful write when dealing with hundreds of concurrent writes.
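A retry like the one mentioned above could be sketched as follows, with exponential backoff added so that hundreds of concurrent writers do not all retry at the same moment. Everything here is an assumption for illustration: the helper names `isRstStreamError` and `retryOnRstStream` are hypothetical, and the check relies on the error carrying a numeric `code` of 13 and a message containing `RST_STREAM`, as in the errors quoted in this thread. Retrying a write blindly is only reasonable when the write is idempotent, for example a `set()` that overwrites a document with the same data.

```javascript
// gRPC status code 13 is INTERNAL, the code shown in the errors in this thread.
const GRPC_INTERNAL = 13;

// Heuristic check based on the error text reported in this thread.
function isRstStreamError(err) {
  return Boolean(err) &&
    err.code === GRPC_INTERNAL &&
    typeof err.message === 'string' &&
    err.message.includes('RST_STREAM');
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call `operation` (a zero-argument async function) and retry it only when it
// fails with an RST_STREAM error. Any other error is rethrown immediately.
async function retryOnRstStream(operation, { maxAttempts = 5, baseDelayMs = 200 } = {}) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (!isRstStreamError(err) || attempt >= maxAttempts) throw err;
      // Exponential backoff: the cap doubles each attempt, and random jitter
      // picks a wait between half of the cap and the full cap.
      const cap = baseDelayMs * 2 ** (attempt - 1);
      await sleep(cap * (0.5 + Math.random() / 2));
    }
  }
}
```

Usage would look like `await retryOnRstStream(() => reference.set(data))`.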
so, although i do think that firestore should really work on these issues and not have weird best practices like 1 write a second - we were able to rewrite our app to get things to succeed (with minimal retrying)
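One way to stay under the one-write-per-second-per-document guidance is to merge updates in memory and flush each document on a timer. The sketch below is only an illustration of that idea, not the rewrite this commenter actually did: the `CoalescingWriter` class and its `writeFn(path, data)` callback are hypothetical, and the approach only helps within a single process (several function instances would still write independently).

```javascript
// Merges updates per document path and flushes each path on a timer, so
// timer-driven writes to one path start at least `intervalMs` apart.
// `writeFn(path, data)` is a hypothetical callback that performs the real
// write, e.g. (path, data) => db.doc(path).set(data, { merge: true }).
class CoalescingWriter {
  constructor(writeFn, { intervalMs = 1000, onError = console.error } = {}) {
    this.writeFn = writeFn;
    this.intervalMs = intervalMs;
    this.onError = onError;
    this.pending = new Map(); // path -> merged fields waiting to be written
    this.timers = new Map();  // path -> timer handle for the scheduled flush
  }

  // Queue a shallow merge of `data` into the pending write for `path`.
  update(path, data) {
    this.pending.set(path, { ...this.pending.get(path), ...data });
    if (!this.timers.has(path)) {
      const timer = setTimeout(() => {
        this.flush(path).catch(this.onError);
      }, this.intervalMs);
      this.timers.set(path, timer);
    }
  }

  // Write the merged fields for `path` now and cancel its scheduled flush.
  async flush(path) {
    clearTimeout(this.timers.get(path));
    this.timers.delete(path);
    const data = this.pending.get(path);
    this.pending.delete(path);
    if (data !== undefined) await this.writeFn(path, data);
  }
}
```

The trade-off is latency and durability: an update can sit in memory for up to `intervalMs`, and pending data is lost if the process exits before the flush.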
I’m getting this error as well. Any updates, please?
I’m using: "firebase-admin": "^9.12.0"
Runtime: Node 16
@bitcoinbullbullbull Probably… This and other issues have been open for several months. In my case, I had to proxy all of these firebase operations to cloud functions…
FYI https://github.com/googleapis/nodejs-firestore/pull/1373 does not fix this issue. It addresses the fact that once the SDK receives RST_STREAM, the next couple of operations will also see this error. With #1373, this will be less likely, as we establish a new GRPC connection to the backend once we see the first RST_STREAM.
We also have a suspicion that there is a lingering problem in @grpc/grpc-js and hope to address it soon.
👋 @schmidt-sebastian,
We’ve been facing this error in our production environment for the past 3 days and it’s occurred roughly 10,600 times:
Error: 13 INTERNAL: Received RST_STREAM with code 1
The error is triggered when executing the following code:
Do we have any updates or workarounds for this? It’s affecting our users and we’d appreciate your guidance.
Note: Our Users collection has a significantly large number of documents. Could the volume of documents be a contributing factor to this issue?
I just finished my backend local development and everything worked well. Then I deployed and none of my functions work. I’m getting the same error.
My code is pretty simple.
Using Node.js 16
I don’t get it once or twice, it literally happens every time. My endpoint doesn’t work at all. This is a brand new project on Firestore and my first deployment.
Package.json dependencies
Is there anything I can try to get past this? We need to launch our product and I’d hate to need another week to rewrite all these endpoints on a different platform.
Hey @maylorsan, i think i have found a solution from @edmilsonss
I think it works with these changes
former code:
admin.initializeApp({
  credential: admin.credential.cert(serviceAccount),
  databaseURL: "https://project name.firestore.googleapis.com",
});
// Create a Firestore instance
const db = admin.firestore();
new code:
admin.initializeApp({
  credential: admin.credential.cert(serviceAccount),
  databaseURL: "https://project name.firestore.googleapis.com",
});
// Create a Firestore instance
const db = admin.firestore();
const settings = { preferRest: true, timestampsInSnapshots: true };
db.settings(settings);
I promise it’s not a belated April Fool’s, but we actually have a fix: https://github.com/grpc/grpc-node/pull/2084 This will be part of the next @grpc/grpc-js release.
FWIW, our next big project will be to transition most API calls in this client to HTTP REST. The primary goal is to reduce startup time, but I personally think that it will help with this issue too. If we get lucky, we can have something by end of the quarter.
I basically cannot launch my business because of this. Should I just rewrite everything in MongoDB?
Resolved in my case, caused by presence
Causing invalid firestore connection
I found the issue for my problem. I had the wrong port (8081 instead of 8080) set for FIRESTORE_EMULATOR_HOST.
Hi @maylorsan
Since your case is different from what has been reported in this issue, could you please open a new ticket and describe your problem in detail?
@maylorsan Setting preferRest: true fixes it for one of our simpler services, but not for others. We are not using firestore listeners in any of them, so I’m surprised it’s switching to http streaming from REST at all. Could you give a list of situations in which preferRest will fall back to http streaming so that we can try to avoid them?
I’ve got a workaround/solution in my situation
See here: https://github.com/firebase/firebase-admin-node/issues/2345#issuecomment-1776090309
@maylorsan
Okay, thank you. I tried using batch commit and it still didn’t work.
@maylorsan
Have you found a solution to this? because i am facing the same exact error.
Trying to write to collection. Protocol error. No info anywhere. What should be done?
"firebase-admin": "^11.5.0"
I’m getting this error when using firebase from within cypress and connecting it to the emulator for e2e testing. I can’t make any requests to populate the database because of this error.
That change has been released, so if you update your dependencies, you should get it. Please try it out to see if it helps.
I would like to note that @michaelAtCoalesce was able to share a consistent reproduction of the error, and that was a great help with tracking down the bug we found. So if anyone else encounters a similar error, a similar reproduction would probably be helpful for that too.
I want to temper expectations there. That change fixes “RST_STREAM with error code 2” errors in some cases, but we don’t know if there are other causes of the same error.
we are experiencing this error intermittently (every couple of weeks).
due to error: Error: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error)
interestingly, when it does occur, it happens at the beginning of one of our most commonly-invoked cloud functions. seems suspiciously like a coldstart-related issue?
The goal is to reduce latency.
We are still trying to come up with a faster way to force these errors, which would allow us to trace the TCP packets through our systems. Unfortunately, this is proving to be difficult. These errors are essentially “Internal Server Errors” and could originate in many different parts of our system, all of which handle such a significant amount of traffic that we need a very slim time interval to make sure that we can look at all available data.
Sorry for this non-update.
@schmidt-sebastian Do you have some updates on the issue? On my side, just regular outages, nothing new 😂
Might be a problem of grpc/grpc-js as I don’t use nodejs-firestore and also face this error once in a while. It happens with stale connections
This is happening to a project I’m working on as well, using nodejs 10 runtime in Google Cloud functions (via firebase). I have four scripts that fail at least once a day - they make await reference.set() calls to firestore as the script executes. At the moment I don’t have any retry logic. The scripts run every two minutes. They make at most 20 set calls.
@merlinnot We continue to receive pushback on retrying RST_STREAM/INTERNAL for Commit RPCs. I hope your comment is enough to convince the respective folks that we should retry.
@schmidt-sebastian I’ve been running on v3.8.5 for the entire day, and I still see RST_STREAM.
I checked my code to see which usage patterns it occurs with:
await reference.set()
await batch.commit()
await reference.update()
Can we please reopen this issue for visibility?
All document modifications (e.g. delete(), create()) use the same underlying request type. This request type is not safe for automatic retry, so unfortunately we cannot simply retry these errors.