google-cloud-ruby: Spontaneous 16:Deadline Exceeded after long sleeps
I have a wrapper. When it faces an exception on calling Google::Cloud::Logging::Logger
’s debug
, info
, warn
, error
, methods it just retries:
rescue Google::Cloud::DeadlineExceededError, Google::Cloud::UnauthenticatedError => e
ruby_logger.error "'#{e}' of #{message.inspect}"
ruby_logger.info "sleep #{timeout += 1}"
sleep timeout
retry
It worked fine since spring but nearly a month ago (or a bit earlier) it started to hang – I mean, sometimes retrying does not help anymore and so it gets stuck in a loop, because I don’t want to skip any lines to be logged. So before sleep timeout
I put a raise e if timeout > 100
and here is backtrace:
E 171203 222609 : '16:Deadline Exceeded'
I 171203 222609 : sleep 98
E 171203 222807 : '16:Deadline Exceeded'
I 171203 222807 : sleep 99
E 171203 223007 : '16:Deadline Exceeded'
I 171203 223007 : sleep 100
E 171203 223222 : '16:Deadline Exceeded'
I 171203 223222 : sleep 101
/home/nakilon/.gem/gems/google-cloud-logging-1.3.1/lib/google/cloud/logging/service.rb:308:in `rescue in execute': 16:Deadline Exceeded (Google::Cloud::UnauthenticatedError)
from /home/nakilon/.gem/gems/google-cloud-logging-1.3.1/lib/google/cloud/logging/service.rb:305:in `execute'
from /home/nakilon/.gem/gems/google-cloud-logging-1.3.1/lib/google/cloud/logging/service.rb:104:in `write_entries'
from /home/nakilon/.gem/gems/google-cloud-logging-1.3.1/lib/google/cloud/logging/project.rb:310:in `write_entries'
from /home/nakilon/.gem/gems/google-cloud-logging-1.3.1/lib/google/cloud/logging/logger.rb:551:in `write_entry'
from /home/nakilon/.gem/gems/google-cloud-logging-1.3.1/lib/google/cloud/logging/logger.rb:317:in `add'
from /home/nakilon/.gem/gems/google-cloud-logging-1.3.1/lib/google/cloud/logging/logger.rb:231:in `warn'
from /home/nakilon/.gem/bundler/gems/gcplogger-1bb83b0335e7/lib/gcplogger.rb:31:in `call'
from /home/nakilon/.gem/bundler/gems/gcplogger-1bb83b0335e7/lib/gcplogger.rb:31:in `block (3 levels) in <module:GCPLogger>'
from main.rb:33:in `block in <main>'
from main.rb:32:in `loop'
from main.rb:32:in `<main>'
Weird detail is that my program logs a lot of stuff with different log levels but it gets stuck always on the same line:
sleep 300
logger.warn "LOOP #{Time.now}"
This payload is pretty simple: "LOOP 2017-12-03 20:30:48 +0000"
$ gem env
RubyGems Environment:
- RUBYGEMS VERSION: 2.0.14
- RUBY VERSION: 2.0.0 (2014-11-13 patchlevel 598) [x86_64-linux]
- INSTALLATION DIRECTORY: /home/nakilon/.gem
- RUBY EXECUTABLE: /usr/bin/ruby
- EXECUTABLE DIRECTORY: /home/nakilon/.gem/bin
- RUBYGEMS PLATFORMS:
- ruby
- x86_64-linux
- GEM PATHS:
- /home/nakilon/.gem
- /home/nakilon/.gem/ruby
- /usr/share/gems
- /usr/local/share/gems
- GEM CONFIGURATION:
- :update_sources => true
- :verbose => true
- :backtrace => false
- :bulk_threshold => 1000
- REMOTE SOURCES:
- https://rubygems.org/
$ cat Gemfile.lock
GEM
remote: https://rubygems.org/
specs:
addressable (2.5.2)
public_suffix (>= 2.0.2, < 4.0)
faraday (0.13.1)
multipart-post (>= 1.2, < 3)
fastimage (1.7.0)
addressable (~> 2.3, >= 2.3.5)
google-cloud-core (1.1.0)
google-cloud-env (~> 1.0)
google-cloud-env (1.0.1)
faraday (~> 0.11)
google-cloud-logging (1.3.1)
google-cloud-core (~> 1.1)
google-gax (~> 0.10.1)
stackdriver-core (~> 1.2)
google-gax (0.10.1)
google-protobuf (~> 3.2)
googleapis-common-protos (>= 1.3.5, < 2.0)
googleauth (~> 0.6.2)
grpc (>= 1.7.2, < 2.0)
rly (~> 0.2.3)
google-protobuf (3.5.0)
googleapis-common-protos (1.3.7)
google-protobuf (~> 3.0)
googleapis-common-protos-types (~> 1.0)
grpc (~> 1.0)
googleapis-common-protos-types (1.0.1)
google-protobuf (~> 3.0)
googleauth (0.6.2)
faraday (~> 0.12)
jwt (>= 1.4, < 3.0)
logging (~> 2.0)
memoist (~> 0.12)
multi_json (~> 1.11)
os (~> 0.9)
signet (~> 0.7)
grpc (1.7.2)
google-protobuf (~> 3.1)
googleapis-common-protos-types (~> 1.0.0)
googleauth (>= 0.5.1, < 0.7)
json (2.1.0)
jwt (1.5.6)
little-plugger (1.1.4)
logging (2.2.2)
little-plugger (~> 1.1)
multi_json (~> 1.10)
memoist (0.16.0)
mini_portile2 (2.1.0)
multi_json (1.12.2)
multipart-post (2.0.0)
nokogiri (1.6.8.1)
mini_portile2 (~> 2.1.0)
os (0.9.6)
public_suffix (2.0.5)
reddit_bot (1.4.0)
json
rly (0.2.3)
signet (0.8.1)
addressable (~> 2.3)
faraday (~> 0.9)
jwt (>= 1.5, < 3.0)
multi_json (~> 1.10)
stackdriver-core (1.2.0)
$ hostnamectl
Static hostname: localhost
Transient hostname: instance-1
Icon name: computer-vm
Chassis: vm
Machine ID: a29d1507b8f8596c6e7284b11dc84f8c
Boot ID: 17531d1bfee14443b222687801a6eea9
Virtualization: kvm
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 3.10.0-327.36.1.el7.x86_64
Architecture: x86-64
About this issue
- Original URL
- State: closed
- Created 7 years ago
- Comments: 15 (15 by maintainers)
Hi @blowmage, we are working with the backend team to fix this. My hunch is that no work is needed at the client library side. I’ll open an issue if instead there is anything needed from this repo.