etcd: client: user may freeze on KeepAlive

etcd client version: 3.1.0-rc.0 (git hash: 83347907)

The issue can be reproduced as follows:

  1. Create an etcdv3 client.
  2. Make lessor.recvKeepAliveLoop() fail with any server-side error (we got "lease: unknown error(not a primary lessor )").
  3. Create a new lease.
  4. Try to keep it alive with KeepAlive.

Expected behavior: We expected the client to recover from the recvKeepAliveLoop error after some time and continue to process leases. Alternatively, we expected KeepAlive to return an error immediately because recvKeepAliveLoop is broken, so that we could at least restart the client manually.
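To make the second expectation concrete, here is a minimal, self-contained sketch. It does not use the etcd client at all: toyLessor, fail, and errLoopBroken are hypothetical names invented for illustration, and chan struct{} stands in for the real response channel. It only shows the contract we had in mind: once the receive loop has failed, every channel already handed out is closed and new KeepAlive calls return an error right away.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errLoopBroken is a hypothetical sentinel error for this sketch.
var errLoopBroken = errors.New("toy lessor: receive loop has failed")

// toyLessor is a stand-in for a lessor; it is NOT the etcd implementation.
type toyLessor struct {
	mu    sync.Mutex
	err   error           // set once the (simulated) receive loop fails
	chans []chan struct{} // channels handed out to KeepAlive callers
}

// KeepAlive returns a response channel, or an error immediately
// if the receive loop is already known to be broken.
func (l *toyLessor) KeepAlive() (<-chan struct{}, error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.err != nil {
		return nil, l.err
	}
	ch := make(chan struct{}, 1)
	l.chans = append(l.chans, ch)
	return ch, nil
}

// fail simulates the receive loop exiting with err: it records the
// error and closes every outstanding channel so that callers unblock.
func (l *toyLessor) fail(err error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.err != nil {
		return // already failed; channels are closed only once
	}
	l.err = err
	for _, ch := range l.chans {
		close(ch)
	}
	l.chans = nil
}

func main() {
	l := &toyLessor{}
	ch, _ := l.KeepAlive()
	l.fail(errLoopBroken)

	_, open := <-ch // returns immediately because the channel was closed
	fmt.Println("channel still open:", open)

	_, err := l.KeepAlive()
	fmt.Println("KeepAlive after failure:", err)
}
```

With either of these two signals (a closed channel or an immediate error), user code would have a point at which it can decide to recreate the client.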

Observed behavior: The caller of KeepAlive freezes because the returned channel is never closed: keepAliveCtxCloser has already exited on the <-l.donec case of its select, which recvKeepAliveLoop signals when it fails. User code assumes that keep-alive still works and waits for either a response or the channel being closed, but neither ever happens.
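Until this is fixed in the client, a caller can protect itself with a watchdog on the consuming side. The following is only a sketch under assumptions: waitKeepAlive, errKeepAliveStalled, and demoTimeout are hypothetical names, chan struct{} stands in for the real response channel, and a sensible timeout would have to be derived from the lease TTL. The idea is to treat "no response and no close within the timeout" as a failure instead of blocking forever.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errKeepAliveStalled is a hypothetical sentinel error for this sketch.
var errKeepAliveStalled = errors.New("keep-alive stalled: no response and channel not closed")

// demoTimeout is an arbitrary short timeout used only by the demo in main.
const demoTimeout = 50 * time.Millisecond

// waitKeepAlive drains ch until it is closed (returns nil) or until
// no value arrives within timeout (returns errKeepAliveStalled).
func waitKeepAlive(ch <-chan struct{}, timeout time.Duration) error {
	for {
		select {
		case _, ok := <-ch:
			if !ok {
				return nil // channel closed normally
			}
			// Got a response; loop again with a fresh timeout.
		case <-time.After(timeout):
			return errKeepAliveStalled
		}
	}
}

func main() {
	// Mimics the reported bug: the channel is never written to and never closed.
	stuck := make(chan struct{})
	fmt.Println("stuck channel:", waitKeepAlive(stuck, demoTimeout))

	// Mimics the well-behaved case: the channel gets closed.
	closed := make(chan struct{})
	close(closed)
	fmt.Println("closed channel:", waitKeepAlive(closed, demoTimeout))
}
```

On errKeepAliveStalled the caller could tear down and recreate the client. This works around the symptom only; it does not address the channel that is left open inside the lessor.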

Side note: We used concurrency.Session to manage the lease; in that case the bug leads to a goroutine leak and a freeze in session.Close (or, more precisely, session.Orphan).

About this issue

  • State: closed
  • Created 8 years ago
  • Comments: 15 (15 by maintainers)

Most upvoted comments

OK, I can repro with that test. I’ll see if I can cook up a fix. Thanks!