distribution: crash in s3-aws doWalk() since #2455 was merged

Since PR #2455 was merged, the registry is crashing with the following backtrace on the first hit it gets:

time="2018-03-20T08:23:52.237673588Z" level=info msg="debug server listening :5001"
time="2018-03-20T08:23:52.241709386Z" level=info msg="redis not configured" go.version=go1.10 instance.id=85be40ec-6e71-409a-bb3d-e67d3ceab702 service=registry version=v2.6.0-rc.1-273-g607ae5d1
time="2018-03-20T08:23:52.241842948Z" level=info msg="Starting upload purge in 11m0s" go.version=go1.10 instance.id=85be40ec-6e71-409a-bb3d-e67d3ceab702 service=registry version=v2.6.0-rc.1-273-g607ae5d1
time="2018-03-20T08:23:52.277934092Z" level=info msg="backend redirection disabled" go.version=go1.10 instance.id=85be40ec-6e71-409a-bb3d-e67d3ceab702 service=registry version=v2.6.0-rc.1-273-g607ae5d1
time="2018-03-20T08:23:52.277986941Z" level=info msg="using inmemory blob descriptor cache" go.version=go1.10 instance.id=85be40ec-6e71-409a-bb3d-e67d3ceab702 service=registry version=v2.6.0-rc.1-273-g607ae5d1
time="2018-03-20T08:23:52.278086413Z" level=info msg="listening on [::]:5000" go.version=go1.10 instance.id=85be40ec-6e71-409a-bb3d-e67d3ceab702 service=registry version=v2.6.0-rc.1-273-g607ae5d1
time="2018-03-20T08:31:53.603909559Z" level=panic msg="runtime error: invalid memory address or nil pointer dereference"
2018/03/20 08:31:53 http: panic serving 172.17.0.3:33060: &{0xc42016a0a0 map[] 2018-03-20 08:31:53.603909559 +0000 UTC m=+481.379379642 panic runtime error: invalid memory address or nil pointer dereference <nil>}
goroutine 31 [running]:
net/http.(*conn).serve.func1(0xc42036cbe0)
	/usr/local/opt/go/libexec/src/net/http/server.go:1726 +0xd0
panic(0xd678c0, 0xc420160780)
	/usr/local/opt/go/libexec/src/runtime/panic.go:505 +0x229
github.com/docker/distribution/vendor/github.com/sirupsen/logrus.Entry.log(0xc42016a0a0, 0xc4205e4090, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/vendor/github.com/sirupsen/logrus/entry.go:124 +0x568
github.com/docker/distribution/vendor/github.com/sirupsen/logrus.(*Entry).Panic(0xc420160730, 0xc420642960, 0x1, 0x1)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/vendor/github.com/sirupsen/logrus/entry.go:169 +0xaa
github.com/docker/distribution/vendor/github.com/sirupsen/logrus.(*Logger).Panic(0xc42016a0a0, 0xc420642960, 0x1, 0x1)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/vendor/github.com/sirupsen/logrus/logger.go:236 +0x6d
github.com/docker/distribution/vendor/github.com/sirupsen/logrus.Panic(0xc420642960, 0x1, 0x1)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/vendor/github.com/sirupsen/logrus/exported.go:107 +0x4b
github.com/docker/distribution/registry.panicHandler.func1.1()
	/Users/hrakers/dev/go/src/github.com/docker/distribution/registry/registry.go:311 +0xf9
panic(0xc70780, 0x12f5e70)
	/usr/local/opt/go/libexec/src/runtime/panic.go:505 +0x229
github.com/docker/distribution/registry/storage/driver/s3-aws.(*driver).doWalk.func1(0xc420168c80, 0xc42022c001, 0x7fed1a013400)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/registry/storage/driver/s3-aws/s3.go:946 +0xa0
github.com/docker/distribution/vendor/github.com/aws/aws-sdk-go/service/s3.(*S3).ListObjectsV2PagesWithContext(0xc4201660a0, 0x7fed1a013400, 0xc4204d7030, 0xc4205aec30, 0xc420642f38, 0x0, 0x0, 0x0, 0x1, 0x2)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/vendor/github.com/aws/aws-sdk-go/service/s3/api.go:4190 +0x10c
github.com/docker/distribution/registry/storage/driver/s3-aws.(*driver).doWalk(0xc420470d80, 0xe5c740, 0xc4204d6fc0, 0xc420642ff8, 0xc4200f8ae1, 0x20, 0xd7d645, 0x1, 0xc4205aebe0, 0x0, ...)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/registry/storage/driver/s3-aws/s3.go:944 +0x3b6
github.com/docker/distribution/registry/storage/driver/s3-aws.(*driver).Walk(0xc420470d80, 0xe5c740, 0xc4204d6fc0, 0xc420105260, 0x20, 0xc4205aebe0, 0x2, 0x0)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/registry/storage/driver/s3-aws/s3.go:892 +0x16b
github.com/docker/distribution/registry/storage/driver/base.(*Base).Walk(0xc4201647b0, 0xe5c740, 0xc4204d6fc0, 0xc420105260, 0x20, 0xc4205aebe0, 0x0, 0x0)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/registry/storage/driver/base/base.go:239 +0x23d
github.com/docker/distribution/registry/storage.(*registry).Repositories(0xc4203886c0, 0xe5c8c0, 0xc4205ae5a0, 0xc4202f17c0, 0x14, 0x14, 0xc4200f8646, 0x0, 0xc4206431e8, 0x8204cf, ...)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/registry/storage/catalog.go:29 +0x215
github.com/docker/distribution/registry/handlers.(*catalogHandler).GetCatalog(0xc420166390, 0xe5a540, 0xc4205ae410, 0xc42015b800)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/registry/handlers/catalog.go:48 +0x12a
github.com/docker/distribution/registry/handlers.(*catalogHandler).GetCatalog-fm(0xe5a540, 0xc4205ae410, 0xc42015b800)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/registry/handlers/catalog.go:24 +0x48
net/http.HandlerFunc.ServeHTTP(0xc4203abaa0, 0xe5a540, 0xc4205ae410, 0xc42015b800)
	/usr/local/opt/go/libexec/src/net/http/server.go:1947 +0x44
github.com/docker/distribution/vendor/github.com/gorilla/handlers.MethodHandler.ServeHTTP(0xc4202438c0, 0xe5a540, 0xc4205ae410, 0xc42015b800)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/vendor/github.com/gorilla/handlers/handlers.go:35 +0x34d
github.com/docker/distribution/registry/handlers.(*App).dispatcher.func1(0xe5a540, 0xc4205ae410, 0xc42015b800)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/registry/handlers/app.go:715 +0x4a9
net/http.HandlerFunc.ServeHTTP(0xc42016c640, 0xe5a540, 0xc4205ae410, 0xc42015b200)
	/usr/local/opt/go/libexec/src/net/http/server.go:1947 +0x44
github.com/docker/distribution/vendor/github.com/gorilla/mux.(*Router).ServeHTTP(0xc4200aed20, 0xe5a540, 0xc4205ae410, 0xc42015b200)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/vendor/github.com/gorilla/mux/mux.go:114 +0xdc
github.com/docker/distribution/registry/handlers.(*App).ServeHTTP(0xc4200e6360, 0x7fed1a013210, 0xc4202429f0, 0xc42015b000)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/registry/handlers/app.go:630 +0x2b4
github.com/docker/distribution/registry.alive.func1(0x7fed1a013210, 0xc4202429f0, 0xc42015af00)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/registry/registry.go:331 +0x6a
net/http.HandlerFunc.ServeHTTP(0xc420162ea0, 0x7fed1a013210, 0xc4202429f0, 0xc42015af00)
	/usr/local/opt/go/libexec/src/net/http/server.go:1947 +0x44
github.com/docker/distribution/health.Handler.func1(0x7fed1a013210, 0xc4202429f0, 0xc42015af00)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/health/health.go:271 +0x123
net/http.HandlerFunc.ServeHTTP(0xc42016d120, 0x7fed1a013210, 0xc4202429f0, 0xc42015af00)
	/usr/local/opt/go/libexec/src/net/http/server.go:1947 +0x44
github.com/docker/distribution/registry.panicHandler.func1(0x7fed1a013210, 0xc4202429f0, 0xc42015af00)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/registry/registry.go:314 +0x81
net/http.HandlerFunc.ServeHTTP(0xc42016d140, 0x7fed1a013210, 0xc4202429f0, 0xc42015af00)
	/usr/local/opt/go/libexec/src/net/http/server.go:1947 +0x44
github.com/docker/distribution/vendor/github.com/gorilla/handlers.combinedLoggingHandler.ServeHTTP(0xe54200, 0xc4200ac008, 0xe55460, 0xc42016d140, 0xe5b840, 0xc4204328c0, 0xc42015af00)
	/Users/hrakers/dev/go/src/github.com/docker/distribution/vendor/github.com/gorilla/handlers/handlers.go:75 +0x123
net/http.serverHandler.ServeHTTP(0xc42009c820, 0xe5b840, 0xc4204328c0, 0xc42015af00)
	/usr/local/opt/go/libexec/src/net/http/server.go:2694 +0xbc
net/http.(*conn).serve(0xc42036cbe0, 0xe5c380, 0xc420073540)
	/usr/local/opt/go/libexec/src/net/http/server.go:1830 +0x651
created by net/http.(*Server).Serve
	/usr/local/opt/go/libexec/src/net/http/server.go:2795 +0x27b

This is using the current master 607ae5d128a82f280e8c7f453d5fb30c535bda17 on Alpine 3.7 x86_64.

CC @sargun

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 29 (3 by maintainers)

Most upvoted comments

As nobody else seems to be interested in creating a PR, I have rebased @leoh0's work and created a PR.

I agree with @tbe: this bug should have a higher priority.

This bug also affects the GitLab registry 2.7.1 with DigitalOcean S3 backend storage. Regular operations (push/pull) work, but the garbage collector fails whenever s3aws.ListObjectsV2Pages is used, even in the PurgeUploads routine:

DEBU[0000] s3aws.ListObjectsV2Pages(docker/registry/v2/repositories/)  environment=production go.version=go1.10.3 instance.id=e80a6ae8-9d66-4196-b6d1-318084e90eab service=registry trace.duration=663.166259ms trace.file="/var/cache/omnibus/src/registry/src/github.com/docker/distribution/registry/storage/driver/s3-aws/s3.go" trace.func="github.com/docker/distribution/registry/storage/driver/s3-aws.(*driver).doWalk" trace.id=2937cfcc-76f4-4f35-95d8-e3c6ec0bf278 trace.line=969 trace.parent.id=8492a930-cb79-4d82-ab1b-b084d6555e5c
DEBU[0000] s3aws.Walk("/docker/registry/v2/repositories")  environment=production go.version=go1.10.3 instance.id=e80a6ae8-9d66-4196-b6d1-318084e90eab service=registry trace.duration=663.29677ms trace.file="/var/cache/omnibus/src/registry/src/github.com/docker/distribution/registry/storage/driver/base/base.go" trace.func="github.com/docker/distribution/registry/storage/driver/base.(*Base).Walk" trace.id=8492a930-cb79-4d82-ab1b-b084d6555e5c trace.line=232
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xc1af60]

goroutine 1 [running]:
github.com/docker/distribution/registry/storage/driver/s3-aws.(*driver).doWalk.func1(0xc4201aaf00, 0xc420428901, 0x7f2f200df120)
        /var/cache/omnibus/src/registry/src/github.com/docker/distribution/registry/storage/driver/s3-aws/s3.go:973 +0xa0
github.com/docker/distribution/vendor/github.com/aws/aws-sdk-go/service/s3.(*S3).ListObjectsV2PagesWithContext(0xc420422150, 0x7f2f200df120, 0xc4201fd6c0, 0xc42042a460, 0xc42039f7b8, 0x0, 0x0, 0x0, 0x1, 0x2)
        /var/cache/omnibus/src/registry/src/github.com/docker/distribution/vendor/github.com/aws/aws-sdk-go/service/s3/api.go:4198 +0x124
github.com/docker/distribution/registry/storage/driver/s3-aws.(*driver).doWalk(0xc4201aae80, 0xf7a8a0, 0xc4201fd650, 0xc4204d9878, 0xc420418931, 0x20, 0xe7ce75, 0x1, 0xc420167a60, 0x0, ...)
        /var/cache/omnibus/src/registry/src/github.com/docker/distribution/registry/storage/driver/s3-aws/s3.go:971 +0x3b6
github.com/docker/distribution/registry/storage/driver/s3-aws.(*driver).Walk(0xc4201aae80, 0xf7a8a0, 0xc4201fd650, 0xc420427540, 0x20, 0xc420167a60, 0x2, 0x0)
        /var/cache/omnibus/src/registry/src/github.com/docker/distribution/registry/storage/driver/s3-aws/s3.go:919 +0x16b
github.com/docker/distribution/registry/storage/driver/base.(*Base).Walk(0xc420363720, 0xf7a8a0, 0xc4201fd650, 0xc420427540, 0x20, 0xc420167a60, 0x0, 0x0)
        /var/cache/omnibus/src/registry/src/github.com/docker/distribution/registry/storage/driver/base/base.go:239 +0x23d
github.com/docker/distribution/registry/storage.(*registry).Enumerate(0xc4201fd5e0, 0xf7a5e0, 0xc42041b590, 0xc420428900, 0x0, 0x0)
        /var/cache/omnibus/src/registry/src/github.com/docker/distribution/registry/storage/catalog.go:67 +0xec
github.com/docker/distribution/registry/storage.MarkAndSweep(0xf7a5e0, 0xc42041b590, 0xf865a0, 0xc420363720, 0xf7c7e0, 0xc4201fd5e0, 0x0, 0xf7c7e0, 0xc4201fd5e0)
        /var/cache/omnibus/src/registry/src/github.com/docker/distribution/registry/storage/garbagecollect.go:40 +0x202
github.com/docker/distribution/registry.glob..func3(0x14a2080, 0xc4203625c0, 0x1, 0x1)
        /var/cache/omnibus/src/registry/src/github.com/docker/distribution/registry/root.go:80 +0x48a
github.com/docker/distribution/vendor/github.com/spf13/cobra.(*Command).execute(0x14a2080, 0xc420362570, 0x1, 0x1, 0x14a2080, 0xc420362570)
        /var/cache/omnibus/src/registry/src/github.com/docker/distribution/vendor/github.com/spf13/cobra/command.go:495 +0x197
github.com/docker/distribution/vendor/github.com/spf13/cobra.(*Command).Execute(0x14a23c0, 0xc4204d9f78, 0xc42009a058)
        /var/cache/omnibus/src/registry/src/github.com/docker/distribution/vendor/github.com/spf13/cobra/command.go:560 +0x2f5
main.main()
        /var/cache/omnibus/src/registry/src/github.com/docker/distribution/cmd/registry/main.go:23 +0x2d

Failed to run garbage-collect command, starting registry service.

With previous versions of the registry, the garbage collector works fine. There is no way to switch back to a previous registry version, or even to use a patched registry/storage/driver/s3-aws/s3.go: it seems GitLab fetches the files directly from the repo https://github.com/docker/distribution each time the garbage collector runs.

So the only proper option for now is to wait (impatiently) for this PR to be approved and merged: https://github.com/docker/distribution/pull/2879

Any progress on this? Currently it also fails on DigitalOcean Spaces with the same error. I had to make a fork with PR #2455 reverted to get it working again.

IMHO the problem is that Ceph does not provide the S3 v2 bucket API (https://docs.aws.amazon.com/AmazonS3/latest/API/v2-RESTBucketGET.html). Even if the registry calls the v2 API, Ceph responds with a v1-style response, which does not have KeyCount.

I’m no expert in this code, but I think it is only necessary to provide a value corresponding to KeyCount. I have been running the code below in our production cluster for about 2-3 weeks, and there seem to be no problems.

https://github.com/leoh0/distribution/commit/abcacf2d0e35278c6a42cf762893ce62f1db217e
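
For readers who want the idea without opening the commit, here is a minimal sketch of a nil-safe key count. It is not the code from the linked commit. The `listPage` type and the `pageKeyCount` helper are hypothetical stand-ins for the SDK's list output (where `KeyCount` is a pointer), and falling back to the number of `Contents` plus `CommonPrefixes` entries is an assumption about what `KeyCount` would have reported.

```go
package main

import "fmt"

// listPage is a hypothetical stand-in for one page of a ListObjectsV2
// response, reduced to the fields that matter for this sketch.
type listPage struct {
	KeyCount       *int64   // nil when the backend omits KeyCount (v1-style response)
	Contents       []string // object keys returned in this page
	CommonPrefixes []string // rolled-up "directory" prefixes in this page
}

// pageKeyCount returns the number of keys in a page without ever
// dereferencing a nil KeyCount. When KeyCount is missing it falls back to
// counting the entries actually present (an assumption, see the lead-in).
func pageKeyCount(page *listPage) int64 {
	if page.KeyCount != nil {
		return *page.KeyCount
	}
	return int64(len(page.Contents) + len(page.CommonPrefixes))
}

func main() {
	n := int64(2)
	withCount := &listPage{KeyCount: &n, Contents: []string{"a", "b"}}
	withoutCount := &listPage{Contents: []string{"a", "b"}, CommonPrefixes: []string{"dir/"}}

	// Prints "2 3": the first page uses KeyCount, the second uses the fallback.
	fmt.Println(pageKeyCount(withCount), pageKeyCount(withoutCount))
}
```

The point of the sketch is only that the callback passed to the paged list call should check the pointer before using it, so that backends which omit KeyCount do not trigger the nil pointer dereference shown in the backtraces above.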

I can’t see an open PR relating to https://github.com/leoh0/distribution/commit/abcacf2d0e35278c6a42cf762893ce62f1db217e. I’ll see if I can find time to rebase on 2.7.1 and test.