google-cloud-go: pubsub: Intermittent processing slowdown

Client

PubSub

Environment

Ubuntu Linux Bionic VM on GCE

Go Environment

wstorey_maxmind_com@vagrant-wstorey:~/mm_website (main)$ go version
go version go1.16.3 linux/amd64
wstorey_maxmind_com@vagrant-wstorey:~/mm_website (main)$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/wstorey_maxmind_com/.cache/go-build"
GOENV="/home/wstorey_maxmind_com/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/wstorey_maxmind_com/go/pkg/mod"
GONOPROXY=""
GONOSUMDB="github.maxmind.com/*"
GOOS="linux"
GOPATH="/home/wstorey_maxmind_com/go"
GOPRIVATE=""
GOPROXY="https://go-proxy.maxmind.com/"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.16.4"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build2229490606=/tmp/go-build -gno-record-gcc-switches"

Code

I don’t have a small reproducer currently. I actually don’t know how to reproduce it even with our full service. It is happening about once a day since Wednesday.

Expected behavior

Message processing (receiving and then acking) continues at a consistent rate.

Actual behavior

Recently we have had a handful of situations where a daemon processing messages from a Pub/Sub subscription apparently greatly slows down such that we end up with a large backlog of messages. Restarting the daemon resolves the issue. They began on Wednesday after things being stable.

  • 2021-05-05 at approximately 13:51 Pacific
  • 2021-05-06 at approximately 07:02 Pacific
  • 2021-05-06 at approximately 07:51 Pacific
  • 2021-05-06 at around 10:12 Pacific
  • 2021-05-06 at around 10:22 Pacific
  • Once 2021-05-07 at approximately 19:40 UTC

I’m not sure there’s enough to really diagnose what is going on, but it appears to be stuck inside the pubsub package somewhere.

Currently we are theorizing that there is a connection problem to Pub/Sub or perhaps an issue with the Pub/Sub server we’re connected to. We’re hoping you might be able to point us in the direction of figuring out what is going on.

Screenshots

These are from the most recent event.

Opencensus ACKs up to the time we restarted it:

Screenshot 2021-05-07 15 10 36

Opencensus modified ACKs, likewise:

Screenshot 2021-05-07 15 12 47

Opencensus stream responses, likewise:

Screenshot 2021-05-07 15 13 41

Opencensus pulls, likewise:

Screenshot 2021-05-07 15 14 21

Opencensus outstanding messages, likewise:

Screenshot 2021-05-07 15 14 49

Successful ACKs (from Google’s side, note the timestamps do not exactly line up with the Opencensus ones AFAIK):

Screenshot 2021-05-07 15 15 34

Expired ACKs (likewise):

Screenshot 2021-05-07 15 16 22

Unacked messages (likewise):

Screenshot 2021-05-07 15 16 46

I have a few more graphs (including more opencensus data). I included ones that looked interesting.

Additional context

We are on cloud.google.com/go/pubsub v1.10.1. We had updated to v1.10.3 but downgraded to see if the version update was related since we deployed v1.10.3 shortly before the first time we saw this.

The most recent time this happened I captured traces from a SIGQUIT: https://gist.github.com/horgh/76e9093c2cf48a04cf1a4ab0c1f82200

This shows me that we have no messages in flight (in our code that is) being processed. We are sitting in a Receive() call however.

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 18 (7 by maintainers)

Most upvoted comments

Thanks! I’ll give that a shot. Is there any value you think would make sense? I’m thinking maybe 50, arbitrarily 😛.

We’ve downgraded to 1.8.0. I’ll let you know if it happens again.

Note that we were last on 1.8.x in early January. On January 5 we updated from 1.8.3 to 1.9.1.

We experienced this issue once more Friday afternoon and again on Saturday.