grpc-node: Please enable better error handling or auto handling for the case of normal message callback followed by "RST_STREAM with code 0" error status callback
Is your feature request related to a problem? Please describe.
I wrote a simple unary RPC client for an internal RPC server that I don’t control. It failed with “Error: 13 INTERNAL: Received RST_STREAM with code 0”. I have identified the root cause. The server sent the expected message, which was received in the onReceiveMessage callback in the grpc-js client.ts. However, the grpc-js client then received an unexpected error, “RST_STREAM with code 0”, instead of an OK status. It appears that the server, or some layer on the server side, unexpectedly closed the stream after the correct response message was sent to the grpc-js client. The responseMessage variable still holds the previously received message when the onReceiveStatus callback is called with the error status in the grpc-js layer.
From my client code, there appears to be no way to access the received message when the follow-up status callback receives a status other than status.OK. (See the current v1.4.5 implementation of “onReceiveStatus” in makeUnaryRequest and makeClientStreamRequest in @grpc/grpc-js/src/client.ts.)
Describe the solution you’d like
When an error happens, could the callback be invoked with not just the error but also the response message?
FROM:
    if (status.code === constants_1.Status.OK) {
        callProperties.callback(null, responseMessage);
    }
    else {
        callProperties.callback(call_1.callErrorFromStatus(status));
    }
TO:
    if (status.code === constants_1.Status.OK) {
        callProperties.callback(null, responseMessage);
    }
    else {
        callProperties.callback(call_1.callErrorFromStatus(status), responseMessage);
    }
This enables users to write better error handling:
    // from my client promise callback
    ...
    (err, resp) => {
        if (err) {
            // Accept the response if the only failure was the trailing RST_STREAM with code 0
            if (err.code === 13 && err.message.includes('RST_STREAM with code 0') && resp) {
                resolve(resp)
            } else {
                reject(err)
            }
        } else {
            resolve(resp)
        }
    }
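The same idea can be sketched as a small typed helper. This is only an illustration under the assumption that the proposed change is in place, i.e. that the callback receives the response message together with the error; current grpc-js does not do that. `RpcError`, `Outcome`, `settleUnary`, and `callAsPromise` are hypothetical names introduced here, not part of grpc-js.

```typescript
// Hypothetical stand-in for the error object passed to a unary callback.
interface RpcError {
  code: number;
  message: string;
}

type UnaryCallback<T> = (err: RpcError | null, resp?: T) => void;

type Outcome<T> = { ok: true; value: T } | { ok: false; error: RpcError };

const INTERNAL = 13; // gRPC status code INTERNAL

// Decide whether a finished unary call should count as a success.
// Assumption: with the proposed change, `resp` may be set even when `err` is.
function settleUnary<T>(err: RpcError | null, resp?: T): Outcome<T> {
  if (err === null) {
    return resp !== undefined
      ? { ok: true, value: resp }
      : { ok: false, error: { code: INTERNAL, message: 'No response received' } };
  }
  const isCleanReset =
    err.code === INTERNAL && err.message.includes('RST_STREAM with code 0');
  if (isCleanReset && resp !== undefined) {
    // The message arrived before the stream was reset; keep it.
    return { ok: true, value: resp };
  }
  return { ok: false, error: err };
}

// Wrap any callback-style unary call in a Promise using the rule above.
function callAsPromise<T>(invoke: (cb: UnaryCallback<T>) => void): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    invoke((err, resp) => {
      const outcome = settleUnary<T>(err, resp);
      if (outcome.ok) {
        resolve(outcome.value);
      } else {
        reject(outcome.error);
      }
    });
  });
}
```

Keeping the decision in a pure function (`settleUnary`) makes the special case easy to unit test without a live server.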
Describe alternatives you’ve considered
For the same server, I have written a Scala client with sbt-akka-grpc. It worked out of the box, and I didn’t have to deal with the case of a correct message followed by the RST_STREAM with code 0 error.
    // scala grpc client
      call onComplete {
        case Success(msg) => // <-- I just received the message on the success case for the same scenario
          myUnmarshalling(msg)
          exit(0)
        case Failure(e) =>
          e.printStackTrace(System.err)
          exit(1)
      }
    }
Hence, another option would be for grpc-js to treat this special case as a normal OK case and deliver the message to the caller instead of raising the error.
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Reactions: 1
- Comments: 18 (4 by maintainers)
@murgatroid99 I have the same problem. Sometimes it works as expected: I get an INTERNAL 13 error. Sometimes not: I get RST_STREAM with code 0.
Tested on: @grpc/grpc-js 1.5.9, 1.6.12, 1.7.3, and 1.8.11 (tested all versions); Node 16.18.0 and 16.19.1 (nvm); macOS Monterey 12.5 on M1.
The server runs on Node 16.18.0 and uses grpc-js 1.8.11.
Server error log: (I am not seeing any grpc error here)
Server code (nestjs): main.ts
exceptionsFilter.ts
Client code:
Client logs:
RST_STREAM with code 0
What I want all the time.
I noticed that when I get “Received RST_STREAM with code 0” there are no “Received server trailers” in the logs.
Enabling “HTTP/2 end-to-end” in my deployment settings on Cloud Run fixed the issue for me.
@i7N3 Have you checked whether this happens when running the client and server on the same machine? If it does, a complete reproduction would really help me track down the cause of the bug.
If it does not happen, then the most likely cause is some intervening proxy, so I would recommend opening a support ticket for whatever system you are running the server on, with a reference to your comment.
That error occurs when a client receives an RST_STREAM to end the stream without receiving trailers from the server. That is a protocol error if it occurs, which is why it causes an error to be surfaced. I don’t see any way that the change you linked would impact the handling of the relevant events that result in that error.
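To make that rule concrete, here is a deliberately simplified model of how a client could derive the final status of a call. It is a sketch written for this discussion, not the grpc-js implementation: `Trailers`, `CallStatus`, and `finalStatus` are hypothetical names, the fallback message is invented for the model, and the real library may map some reset codes to other statuses.

```typescript
// Simplified model only; not the actual grpc-js code.
interface Trailers {
  grpcStatus: number;  // value of the grpc-status trailer
  grpcMessage: string; // value of the grpc-message trailer
}

interface CallStatus {
  code: number;
  details: string;
}

const STATUS_INTERNAL = 13; // gRPC status code INTERNAL

// trailers: trailers received from the server, or null if none arrived.
// rstCode: error code of a received RST_STREAM frame, or null if none arrived.
function finalStatus(trailers: Trailers | null, rstCode: number | null): CallStatus {
  if (trailers !== null) {
    // Normal case: the server reported the status in its trailers.
    return { code: trailers.grpcStatus, details: trailers.grpcMessage };
  }
  if (rstCode !== null) {
    // The stream was reset before any trailers arrived. Even code 0 (NO_ERROR)
    // is treated as an error here, because the call status is unknown.
    return { code: STATUS_INTERNAL, details: `Received RST_STREAM with code ${rstCode}` };
  }
  // Model-specific fallback for a stream that ended with neither signal.
  return { code: STATUS_INTERNAL, details: 'Stream ended without trailers' };
}
```

In this model, a proxy that drops the trailers and resets the stream turns an otherwise successful call into an INTERNAL error, which is consistent with the missing “Received server trailers” log line reported above.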
As I asked the previous person, can you run your client with the environment variables GRPC_TRACE=all and GRPC_VERBOSITY=DEBUG, and share the output from a run when this error occurs?