grpc-java: gRPC keeps retrying (until the RPC timeout) when a user supplies a bad service account key

What version of gRPC-Java are you using?

1.27.2

What is your environment?

Linux, JDK8

What do you see?

When a user uses an invalid service account key (for example, one that was deleted), the failure is reported as an UNAVAILABLE error. Client libraries interpret UNAVAILABLE as a retryable error; consequently the application keeps retrying the operation with the invalid credentials until the RPC timeout is reached (usually 10 minutes).
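To illustrate why the status code matters, here is a minimal sketch of a retry loop that only retries on UNAVAILABLE. It is not the client libraries' actual retry logic: `RpcCode`, `RetrySketch`, and the attempt budget (standing in for the RPC timeout) are hypothetical names invented for this example.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the gRPC status codes relevant to this issue.
enum RpcCode { OK, UNAVAILABLE, UNAUTHENTICATED }

class RetrySketch {
    // Calls `call` until it returns something other than UNAVAILABLE or the
    // attempt budget is exhausted, and returns the number of attempts made.
    // The budget is an assumption that stands in for the RPC timeout.
    static int attemptsUntilDone(Supplier<RpcCode> call, int maxAttempts) {
        int attempts = 0;
        while (attempts < maxAttempts) {
            attempts++;
            RpcCode code = call.get();
            if (code != RpcCode.UNAVAILABLE) {
                break; // success or a non-retryable failure: stop immediately
            }
        }
        return attempts;
    }
}
```

With a deleted key, a call that always yields UNAVAILABLE consumes the whole budget, whereas one that yields UNAUTHENTICATED stops after the first attempt.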

What do you expect to see instead?

When a user uses an invalid service account key to authenticate with gRPC, it should yield an UNAUTHENTICATED error, which indicates that the operation should not be retried.

Steps to reproduce the bug

This is most easily reproduced through experimenting with the client libraries.

  1. Create a service account and service account key, download it.
  2. Delete the key entry in Cloud Console (so the downloaded key is no longer valid).
  3. Authenticate with some Google service using that key (for example, Pub/Sub).

Additional details:

Suggested fix:

The code that controls this behavior is in GoogleAuthLibraryCallCredentials. I don’t think all IOExceptions should be retried though.

In this case, the exception's .getCause() is an HttpResponseException whose .getStatusCode() == 400, which indicates a bad request. This is the error thrown when the user provides an invalid service account key.

Would it be possible to modify it so that if it is an IOException, it will examine the getCause() of the exception and throw UNAUTHENTICATED if the cause is HttpResponseException with status code 400?
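A rough sketch of the classification I have in mind is below. It is not the actual GoogleAuthLibraryCallCredentials code: `FakeHttpResponseException` is a hypothetical stand-in for com.google.api.client.http.HttpResponseException, and `AuthFailureCode` stands in for the io.grpc.Status codes, so that the example compiles on its own. Treating non-IOException failures as UNAUTHENTICATED is also an assumption about the current behavior.

```java
import java.io.IOException;

// Hypothetical stand-in for com.google.api.client.http.HttpResponseException,
// so the sketch compiles without google-http-client on the classpath.
class FakeHttpResponseException extends IOException {
    private final int statusCode;

    FakeHttpResponseException(int statusCode, String message) {
        super(message);
        this.statusCode = statusCode;
    }

    int getStatusCode() {
        return statusCode;
    }
}

// Hypothetical stand-in for the two io.grpc.Status codes discussed here.
enum AuthFailureCode { UNAVAILABLE, UNAUTHENTICATED }

class CredentialFailureClassifier {
    // Maps a credential-refresh failure to a status code. An HTTP 400 anywhere
    // in the cause chain means the credentials themselves were rejected, so
    // retrying cannot help.
    static AuthFailureCode classify(Throwable failure) {
        if (!(failure instanceof IOException)) {
            // Assumption: non-I/O failures are already treated as non-retryable.
            return AuthFailureCode.UNAUTHENTICATED;
        }
        // Bounded walk guards against cyclic cause chains.
        Throwable t = failure;
        for (int depth = 0; t != null && depth < 16; depth++) {
            if (t instanceof FakeHttpResponseException
                    && ((FakeHttpResponseException) t).getStatusCode() == 400) {
                return AuthFailureCode.UNAUTHENTICATED;
            }
            t = t.getCause();
        }
        // Any other I/O failure (network error, 5xx, ...) stays retryable.
        return AuthFailureCode.UNAVAILABLE;
    }
}
```

Only the 400 case changes here; every other IOException is still reported as UNAVAILABLE, so transient network failures remain retryable.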

Example of exception that you see:

th [] threw exception [Request processing failed; nested exception is com.google.api.gax.rpc.UnavailableException: io.grpc.StatusRuntimeException: UNAVAILABLE: Credentials failed to obtain metadata] with root cause

com.google.api.client.http.HttpResponseException: 400 Bad Request
{"error":"invalid_grant","error_description":"Invalid JWT Signature."}
	at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1113) ~[google-http-client-1.34.2.jar:1.34.2]
	at com.google.auth.oauth2.ServiceAccountCredentials.refreshAccessToken(ServiceAccountCredentials.java:441) ~[google-auth-library-oauth2-http-0.20.0.jar:na]
	at com.google.auth.oauth2.OAuth2Credentials.refresh(OAuth2Credentials.java:157) ~[google-auth-library-oauth2-http-0.20.0.jar:na]
	at com.google.auth.oauth2.OAuth2Credentials.getRequestMetadata(OAuth2Credentials.java:145) ~[google-auth-library-oauth2-http-0.20.0.jar:na]
	at com.google.auth.oauth2.ServiceAccountCredentials.getRequestMetadata(ServiceAccountCredentials.java:603) ~[google-auth-library-oauth2-http-0.20.0.jar:na]
	at com.google.auth.Credentials.blockingGetToCallback(Credentials.java:112) ~[google-auth-library-credentials-0.20.0.jar:na]
	at com.google.auth.Credentials$1.run(Credentials.java:98) ~[google-auth-library-credentials-0.20.0.jar:na]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_181-google-v7]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_181-google-v7]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_181-google-v7]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:295) ~[na:1.8.0_181-google-v7]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_181-google-v7]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_181-google-v7]
	at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181-google-v7]

About this issue

  • State: open
  • Created 4 years ago
  • Reactions: 2
  • Comments: 29 (24 by maintainers)

Most upvoted comments

A design is in progress in google-auth-library-java to give gRPC the information it needs to choose the Status code appropriately (retriable vs non-retriable). The grpc-java changes are expected to be small. I’m very glad there’s been recent movement here.