google-cloud-python: Subscribe experimental doesn't appear to be retrying on retryable exceptions, also doesn't surface the error.

Full context at https://github.com/GoogleCloudPlatform/google-cloud-python/issues/4841#issuecomment-386988007 downwards.

gist with logs: https://gist.github.com/psalaberria002/e91956d94e82d706d65da7111a40ca24

Possible culprits:

  1. send() and recv() expectedly error on next(self.call), but if that was a retryable error that somehow re-bubbled, it would not trigger any callbacks.
  2. bidi uses rpc._wrapped because api_core doesn’t appropriately expose add_done_callback. This means that bidi unwraps all of the retry logic.

This doesn’t explain why retrying a perfectly normal looking error somehow still leads us to a broken stream, unless the newly opened stream also has an error. But it’s unclear why that wouldn’t happen when calling the initial RPC.

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Comments: 43 (20 by maintainers)

Most upvoted comments

hey thanks!

here is some new logs, I could reproduce the error three times, it’s much harder in DEBUG mode: 1.

DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.bidi | waiting for recv.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 3 batched requests
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.leaser | The current p99 value is 10 seconds.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.leaser | Renewing lease for 3 ack IDs.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 3 batched requests
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.leaser | Snoozing lease management for 3.072772 seconds.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.bidi | recved response.
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Scheduling callbacks for 4 messages.
INFO  | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Observed recoverable stream error 504 Deadline Exceeded
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
ERROR | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.bidi | The bidirectional RPC unexpectedly exited. <google.api_core.grpc_helpers._StreamingResponseIterator object at 0x7fa07d4248d0>
DEBUG | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 4 batched requests
INFO  | 12:47:17 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Thread-ConsumeBidirectionalStream exiting
DEBUG | 12:47:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Empty queue and inactive call, exiting request generator.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 5 batched requests
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 5 batched requests
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | recved response.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Scheduling callbacks for 5 messages.
INFO  | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Observed recoverable stream error 504 Deadline Exceeded
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
ERROR | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | The bidirectional RPC unexpectedly exited. <google.api_core.grpc_helpers._StreamingResponseIterator object at 0x7f40a40322e8>
INFO  | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | RPC termination has signaled streaming pull manager shutdown.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping consumer.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 5 batched requests
INFO  | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Thread-ConsumeBidirectionalStream exiting
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping scheduler.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Cleanly exiting request generator.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping leaser.
INFO  | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.leaser | Thread-LeaseMaintainer exiting.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping dispatcher.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.helper_threads | Exiting the QueueCallbackWorker.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping heartbeater.
INFO  | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.heartbeater | Thread-Heartbeater exiting.
DEBUG | 16:35:18 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Finished stopping manager.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.bidi | waiting for recv.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 6 batched requests
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 6 batched requests
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.bidi | recved response.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Scheduling callbacks for 8 messages.
INFO    | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Observed recoverable stream error 504 Deadline Exceeded
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
ERROR   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.bidi | The bidirectional RPC unexpectedly exited. <google.api_core.grpc_helpers._StreamingResponseIterator object at 0x7f829405a6a0>
INFO    | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | RPC termination has signaled streaming pull manager shutdown.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.dispatcher | Handling 9 batched requests
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping consumer.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Cleanly exiting request generator.
INFO    | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.bidi | Thread-ConsumeBidirectionalStream exiting
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping scheduler.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping leaser.
INFO    | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.leaser | Thread-LeaseMaintainer exiting.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping dispatcher.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Sent request(s) over unary RPC.
ERROR   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.helper_threads | Error in queue callback worker: 'NoneType' object has no attribute 'remove'
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/helper_threads.py", line 112, in __call__
    self._callback(items)
  File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/dispatcher.py", line 101, in dispatch_callback
    self.ack(batched_commands.pop(requests.AckRequest))
  File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/dispatcher.py", line 124, in ack
    self.drop(items)
  File "/usr/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_protocol/dispatcher.py", line 132, in drop
    self._manager.leaser.remove(items)
AttributeError: 'NoneType' object has no attribute 'remove'
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.helper_threads | Exiting the QueueCallbackWorker.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Stopping heartbeater.
INFO    | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.heartbeater | Thread-Heartbeater exiting.
DEBUG   | 16:02:23 | google.cloud.pubsub_v1.subscriber._protocol.streaming_pull_manager | Finished stopping manager.

Thanks, @jam182. let’s continue this over at #5444.

#5402 should address this. Standby for a release.