grpc-node: `@grpc/grpc-js` throws 'Received RST_STREAM with code 0' with retries enabled
Problem description
`@grpc/grpc-js` throws 'Received RST_STREAM with code 0' when retries are enabled.
Reproduction steps
- Start the example HelloWorld Go gRPC server in Kubernetes and enable the Istio sidecar.
- To reduce connection resets (reset reason: connection termination), we set `max_concurrent_streams: 256`.
- Make many calls at the same time from Node.js with large responses (for example, a ~6k response per request and 1000 requests); see the sketch after this list.
- The client fails with `Error: 13 INTERNAL: Received RST_STREAM with code 0`.
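A minimal sketch of the third step above, assuming the standard helloworld `Greeter` service; the proto path, in-cluster address, and concurrency level are placeholders for illustration, not the exact reproduction script:

```js
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');

// Load the standard helloworld proto (path is an assumption).
const packageDefinition = protoLoader.loadSync('helloworld.proto');
const helloProto = grpc.loadPackageDefinition(packageDefinition).helloworld;

// Placeholder in-cluster address; traffic goes through the Envoy sidecar(s).
const client = new helloProto.Greeter(
  'greeter.example.svc.cluster.local:50051',
  grpc.credentials.createInsecure()
);

function sayHello(name) {
  return new Promise((resolve, reject) => {
    client.sayHello({ name }, (err, response) => {
      if (err) return reject(err);
      resolve(response);
    });
  });
}

async function main() {
  // Fire 1000 unary calls concurrently; in the original setup the server
  // returns a ~6k response per request, so many streams are in flight at once.
  const calls = Array.from({ length: 1000 }, (_, i) => sayHello(`req-${i}`));
  const results = await Promise.allSettled(calls);
  const failed = results.filter((r) => r.status === 'rejected');
  console.log(`failed: ${failed.length} / ${results.length}`);
  failed.slice(0, 3).forEach((r) => console.error(r.reason.message));
}

main();
```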
Environment
- Kubernetes with the Istio Envoy sidecar enabled (thanks to the OrbStack Kubernetes environment, I can debug this error locally in VSCode). Tested topologies:
  - NodeJS -> client sidecar (Envoy) -> server
  - NodeJS -> server sidecar (Envoy) -> server
  - NodeJS -> client sidecar (Envoy) -> server sidecar (Envoy) -> server
- Node version: v16.18.1
- Node installation method: Docker image
- If applicable, compiler version: N/A
- Package name and version: `@grpc/grpc-js` version > 1.7.3
Additional context
Logs with `GRPC_TRACE=all GRPC_VERBOSITY=DEBUG`:
D 2023-09-09T23:58:35.160Z | load_balancing_call | [1513] Received metadata
D 2023-09-09T23:58:35.160Z | retrying_call | [1512] Received metadata from child [1513]
D 2023-09-09T23:58:35.160Z | retrying_call | [1512] Committing call [1513] at index 0
D 2023-09-09T23:58:35.160Z | resolving_call | [256] Received metadata
D 2023-09-09T23:58:35.160Z | subchannel_call | [3256] HTTP/2 stream closed with code 0
D 2023-09-09T23:58:35.160Z | subchannel_call | [3256] ended with status: code=13 details="Received RST_STREAM with code 0"
D 2023-09-09T23:58:35.160Z | load_balancing_call | [1513] Received status
D 2023-09-09T23:58:35.160Z | load_balancing_call | [1513] ended with status: code=13 details="Received RST_STREAM with code 0"
D 2023-09-09T23:58:35.160Z | retrying_call | [1512] Received status from child [1513]
D 2023-09-09T23:58:35.160Z | retrying_call | [1512] state=COMMITTED handling status with progress PROCESSED from child [1513] in state ACTIVE
D 2023-09-09T23:58:35.160Z | retrying_call | [1512] ended with status: code=13 details="Received RST_STREAM with code 0"
D 2023-09-09T23:58:35.160Z | resolving_call | [256] Received status
D 2023-09-09T23:58:35.160Z | resolving_call | [256] ended with status: code=13 details="Received RST_STREAM with code 0"
Error: 13 INTERNAL: Received RST_STREAM with code 0
at callErrorFromStatus (/[redacted]/grpc-node/examples/node_modules/@grpc/grpc-js/build/src/call.js:31:19)
at Object.onReceiveStatus (/[redacted]/grpc-node/examples/node_modules/@grpc/grpc-js/build/src/client.js:192:76)
at Object.onReceiveStatus (/[redacted]/grpc-node/examples/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:360:141)
at Object.onReceiveStatus (/[redacted]/grpc-node/examples/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:323:181)
at /[redacted]/grpc-node/examples/node_modules/@grpc/grpc-js/build/src/resolving-call.js:94:78
at processTicksAndRejections (node:internal/process/task_queues:78:11)
for call at
at ServiceClientImpl.makeUnaryRequest (/[redacted]/grpc-node/examples/node_modules/@grpc/grpc-js/build/src/client.js:160:32)
at ServiceClientImpl.<anonymous> (/[redacted]/grpc-node/examples/node_modules/@grpc/grpc-js/build/src/make-client.js:105:19)
at /[redacted]/grpc-node/examples/helloworld/dynamic_codegen/client.js:73:14
at new Promise (<anonymous>)
at main (/[redacted]/grpc-node/examples/helloworld/dynamic_codegen/client.js:72:21)
at Object.<anonymous> (/[redacted]/grpc-node/examples/helloworld/dynamic_codegen/client.js:102:1)
at Module._compile (node:internal/modules/cjs/loader:1155:14)
at Object.Module._extensions..js (node:internal/modules/cjs/loader:1209:10)
at Module.load (node:internal/modules/cjs/loader:1033:32)
at Function.Module._load (node:internal/modules/cjs/loader:868:12) {
code: 13,
details: 'Received RST_STREAM with code 0',
metadata: Metadata { internalRepr: Map(0) {}, options: {} }
}
Methods I can apply
- Disable retries by setting the channel option `grpc.enable_retries = 0` (see the sketch after this list).
- Downgrade to version 1.7.3.
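A minimal sketch of the first workaround, assuming the same helloworld client as above; `grpc.enable_retries` is an existing channel option, while the proto path and address remain placeholders:

```js
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');

const packageDefinition = protoLoader.loadSync('helloworld.proto');
const helloProto = grpc.loadPackageDefinition(packageDefinition).helloworld;

// Passing `grpc.enable_retries: 0` as a channel option disables the retrying
// call path in @grpc/grpc-js, which is the first workaround listed above.
const client = new helloProto.Greeter(
  'greeter.example.svc.cluster.local:50051',
  grpc.credentials.createInsecure(),
  { 'grpc.enable_retries': 0 }
);
```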
It seems we can just ignore this error, and it doesn't affect the results. `RST_STREAM` with code `NO_ERROR` is listed in the HTTP/2 RFC under Error Codes.
There are some similar cases in other projects:
- Issue 650438: Cronet treats `RST_STREAM` with code `NO_ERROR` as a SPDY failure (error -337)
- Issue 603182: `RST_STREAM` with `NO_ERROR` isn't handled properly
- HTTP/2: send WINDOW_UPDATE instead of `RST_STREAM` with `NO_ERROR`.
- HTTP/2: switched back to `RST_STREAM` with `NO_ERROR`.
So can we just ignore this `NO_ERROR` in the Node.js SDK? Thanks in advance.
About this issue
- State: open
- Created 10 months ago
- Reactions: 1
- Comments: 27 (13 by maintainers)
We’ve been facing this error in our production environment for the past 3 days and it’s occurred roughly 15k times:
Error: 13 INTERNAL: Received RST_STREAM with code 1
The error is triggered when executing the following code:
We’ve attempted several methods, but none have resolved the problem:
Interestingly, everything operates flawlessly in our development project. The only difference is that the development project has a smaller User collection.
@murgatroid99 Are there any updates on this issue? We are using Node.js v16.20.0 and the firebase-admin library.
The error this bug is about means that the client did not receive trailers and that it received an RST_STREAM with code 0 (no error).
I presented a theory in a previous comment https://github.com/grpc/grpc-node/issues/2569#issuecomment-1883544814. The problem could also be related to https://github.com/envoyproxy/envoy/issues/30149 as mentioned in https://github.com/grpc/grpc-node/issues/2569#issuecomment-1899554032. But in general, I don’t know why Envoy does what it does; the right people to ask about that are the Envoy maintainers.
@maylorsan Code 1 is different from code 0. Please file a separate issue.
I am facing the same problem here. Is there any update?
OK, I see what happened: having retries enabled changed the sequencing of some low-level operations, which caused a different outcome with your specific test setup. With retries disabled, each HTTP/2 stream starts and ends in a single synchronous operation, while with retries enabled, starting and ending the stream are separated into two asynchronous operations. The result is that with retries disabled, each stream starts and ends one at a time, while with retries enabled, all of the streams start, and then all of the streams end. This means that in the “retries disabled” case, the remote end can finish processing the first stream before the last stream even starts, while in the “retries enabled” case, the remote end is guaranteed to see all of the streams open at once. So, the primary difference between the observable outcomes in these two tests is that the “retries enabled” test triggers the “max concurrent streams” limit, and the “retries disabled” test does not. If you instead forced the client to reach the “max concurrent streams” limit by opening streaming requests without closing them, you would see the same error in both cases.
The primary problem here, still, is that when the “max concurrent streams” limit is reached, Istio closes the stream in a way that is not valid within the gRPC protocol. Istio is what needs to be changed here. There may be a configuration option to change how it responds in this situation, but if not, this is a bug in Istio.
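To make the earlier point concrete, here is a hedged sketch of the experiment the maintainer describes: forcing the "max concurrent streams" limit by opening streaming calls and never ending them. `sayHelloStream` is a hypothetical client-streaming RPC (the helloworld example only defines the unary `SayHello`), and the proto path and address are placeholders:

```js
const grpc = require('@grpc/grpc-js');
const protoLoader = require('@grpc/proto-loader');

// Hypothetical proto that adds a client-streaming method to Greeter.
const packageDefinition = protoLoader.loadSync('helloworld_streaming.proto');
const helloProto = grpc.loadPackageDefinition(packageDefinition).helloworld;

const client = new helloProto.Greeter(
  'greeter.example.svc.cluster.local:50051',
  grpc.credentials.createInsecure()
);

const openCalls = [];
for (let i = 0; i < 1000; i++) {
  // Start each client-streaming call but never call .end(), so every HTTP/2
  // stream stays open and the proxy's max_concurrent_streams limit is reached
  // regardless of whether retries are enabled.
  const call = client.sayHelloStream((err) => {
    if (err) console.error(`call ${i}: ${err.code} ${err.message}`);
  });
  call.write({ name: `req-${i}` });
  openCalls.push(call);
}
```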