etcd: Watch endpoint should have a timeout option

Currently, the ?wait=true option to GET (to create a watch) will wait indefinitely for the key to change. This interacts poorly with socket timeouts.

If you set any finite socket timeout, a socket timeout exception can fire simply because no change happened before the timeout. You can’t distinguish this case from the server disappearing and being unable to reply.

If you have an infinite socket timeout and the remote server crashes, you may wait forever on a connection that is long dead.

Thoughts on how to improve this:

  • Add a “timeout” to watches which, on expiry, returns a unique HTTP code, say 204 No Content
  • Periodically write a small amount of (ignored) data out to the HTTP stream as a keepalive (whitespace, so it’s still equivalent JSON?)

About this issue

  • State: closed
  • Created 9 years ago
  • Comments: 28 (21 by maintainers)

Most upvoted comments

@xiang90 so what’s the solution? Could you provide some examples of using v3’s watcher to distinguish between no event having occurred and the server having crashed and being unable to reply? I have run into this problem with v3’s watcher.

...
wch := etcdClient.Watch(context.Background(), "/EtcdWatcherTest/TestWatchWithFailover/", opts...)

go func() {
   // shut down the etcd server
   // sleep 120s
   // restart the etcd server
   // KV.Put(xxxx)
}()

// wch receives nothing and no error; it just waits for a very long time
...