google-cloud-go: spanner: transaction fails with `Session not found` error

Client

Spanner: v0.37.4 (as far as I know, it also occurs with the latest version)

Describe Your Environment

Alpine Docker on GKE

Expected Behavior

If possible, this client library should retry a transaction when it fails with a Session not found error.

Actual Behavior

We sometimes get Session not found errors. This client library retries transactions only when an Aborted error occurs, so when the session it took from the pool is no longer active, it just returns the NotFound error to the caller without retrying.

Since this library maintains a pool of sessions, I think it should retry transactions that fail with Session not found by taking another session from the pool. Are there any problems with taking another session from the pool?

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 22 (18 by maintainers)

Most upvoted comments

@110y Yes (and additional transaction types).

@110y and @kazegusuri

Thanks for updating your CL and sorry for taking so long to merge this. I’ll have a look at this this morning and try to get it in ASAP.

@olavloite

As a trial, I sent a CL that makes ReadWriteTransaction retry on a Session not found error (case 2 you mentioned above).

@110y To my understanding, it is possible (although not very common) that Cloud Spanner deletes sessions that have been idle for less than 1 hour, which could cause this problem. The reason I asked whether you had a specific use case that would always (or often) cause this problem was to check whether you were running into some unknown bug in the session pool. Considering your error rate of 1-2 errors per week at 1 QPS, I don’t think that this is a specific bug in the Go session pool.

The Java client library for Cloud Spanner added a protection against this problem a couple of months ago along the lines that I mentioned above. I’ll have a look to see if it is feasible to add this protection for the Go client library as well.