confluent-kafka-go: Build error with golang:1.20-alpine3.17 platform=linux/arm64 using confluent-kafka-go v2.1.0
Description
The ARM64 build using golang:1.20-alpine3.17 with confluent-kafka-go v2.1.0 fails at link time with undefined references to Cyrus SASL symbols (sasl_*). The AMD64 build with v2.1.0 succeeds, and both ARM64 and AMD64 builds succeed with v2.0.2.
go mod tidy && go mod vendor
docker buildx build --build-arg TARGETARCH=arm64 .
[+] Building 164.6s (11/11) FINISHED
=> [internal] load build definition from Dockerfile 0.1s
=> => transferring dockerfile: 352B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/golang:1.20-alpine3.17 0.9s
=> [auth] library/golang:pull token for registry-1.docker.io 0.0s
=> [1/6] FROM docker.io/library/golang:1.20-alpine3.17@sha256:08e9c086194875334d606765bd60aa064abd3c215abfbcf5737619110d48d114 0.0s
=> [internal] load build context 0.4s
=> => transferring context: 104.94MB 0.3s
=> CACHED [2/6] RUN echo arm64 0.0s
=> [3/6] RUN apk add alpine-sdk ca-certificates 27.5s
=> [4/6] WORKDIR /code 0.1s
=> [5/6] ADD . /code 0.3s
=> ERROR [6/6] RUN CGO_ENABLED=1 GO111MODULE=on GOOS=linux GOARCH=arm64 go build -mod=vendor -o consumer_example -tags musl -ldflags "-w -s" . 135.7s
------
> [6/6] RUN CGO_ENABLED=1 GO111MODULE=on GOOS=linux GOARCH=arm64 go build -mod=vendor -o consumer_example -tags musl -ldflags "-w -s" .:
#0 135.6 # main
#0 135.6 /usr/local/go/pkg/tool/linux_arm64/link: running gcc failed: exit status 1
#0 135.6 /usr/lib/gcc/aarch64-alpine-linux-musl/12.2.1/../../../../aarch64-alpine-linux-musl/bin/ld: /code/vendor/github.com/confluentinc/confluent-kafka-go/v2/kafka/librdkafka_vendor/librdkafka_musl_linux_arm64.a(rdkafka_sasl_cyrus.o): in function `rd_kafka_sasl_cyrus_close':
#0 135.6 (.text+0xb4): undefined reference to `sasl_dispose'
#0 135.6 /usr/lib/gcc/aarch64-alpine-linux-musl/12.2.1/../../../../aarch64-alpine-linux-musl/bin/ld: /code/vendor/github.com/confluentinc/confluent-kafka-go/v2/kafka/librdkafka_vendor/librdkafka_musl_linux_arm64.a(rdkafka_sasl_cyrus.o): in function `rd_kafka_sasl_cyrus_recv':
#0 135.6 (.text+0x1a0): undefined reference to `sasl_client_step'
#0 135.6 /usr/lib/gcc/aarch64-alpine-linux-musl/12.2.1/../../../../aarch64-alpine-linux-musl/bin/ld: (.text+0x1c8): undefined reference to `sasl_errdetail'
#0 135.6 /usr/lib/gcc/aarch64-alpine-linux-musl/12.2.1/../../../../aarch64-alpine-linux-musl/bin/ld: (.text+0x35c): undefined reference to `sasl_getprop'
#0 135.6 /usr/lib/gcc/aarch64-alpine-linux-musl/12.2.1/../../../../aarch64-alpine-linux-musl/bin/ld: (.text+0x38c): undefined reference to `sasl_getprop'
#0 135.6 /usr/lib/gcc/aarch64-alpine-linux-musl/12.2.1/../../../../aarch64-alpine-linux-musl/bin/ld: (.text+0x3ac): undefined reference to `sasl_getprop'
#0 135.6 /usr/lib/gcc/aarch64-alpine-linux-musl/12.2.1/../../../../aarch64-alpine-linux-musl/bin/ld: /code/vendor/github.com/confluentinc/confluent-kafka-go/v2/kafka/librdkafka_vendor/librdkafka_musl_linux_arm64.a(rdkafka_sasl_cyrus.o): in function `rd_kafka_sasl_cyrus_client_new':
#0 135.6 (.text+0xf74): undefined reference to `sasl_client_new'
#0 135.6 /usr/lib/gcc/aarch64-alpine-linux-musl/12.2.1/../../../../aarch64-alpine-linux-musl/bin/ld: (.text+0xfd4): undefined reference to `sasl_client_start'
#0 135.6 /usr/lib/gcc/aarch64-alpine-linux-musl/12.2.1/../../../../aarch64-alpine-linux-musl/bin/ld: (.text+0xff4): undefined reference to `sasl_errdetail'
#0 135.6 /usr/lib/gcc/aarch64-alpine-linux-musl/12.2.1/../../../../aarch64-alpine-linux-musl/bin/ld: (.text+0x110c): undefined reference to `sasl_listmech'
#0 135.6 /usr/lib/gcc/aarch64-alpine-linux-musl/12.2.1/../../../../aarch64-alpine-linux-musl/bin/ld: (.text+0x1180): undefined reference to `sasl_errstring'
#0 135.6 /usr/lib/gcc/aarch64-alpine-linux-musl/12.2.1/../../../../aarch64-alpine-linux-musl/bin/ld: /code/vendor/github.com/confluentinc/confluent-kafka-go/v2/kafka/librdkafka_vendor/librdkafka_musl_linux_arm64.a(rdkafka_sasl_cyrus.o): in function `rd_kafka_sasl_cyrus_global_init':
#0 135.6 (.text+0x16dc): undefined reference to `sasl_client_init'
#0 135.6 /usr/lib/gcc/aarch64-alpine-linux-musl/12.2.1/../../../../aarch64-alpine-linux-musl/bin/ld: (.text+0x170c): undefined reference to `sasl_errstring'
#0 135.6 collect2: error: ld returned 1 exit status
#0 135.6
------
Dockerfile:12
--------------------
10 | ADD . "/code"
11 |
12 | >>> RUN CGO_ENABLED=1 GO111MODULE=on GOOS=linux GOARCH=$TARGETARCH go build -mod=vendor -o consumer_example -tags musl -ldflags "-w -s" .
13 |
--------------------
How to reproduce
- Use consumer example https://github.com/confluentinc/confluent-kafka-go/tree/master/examples/consumer_example
- go.mod
module main
go 1.20
require github.com/confluentinc/confluent-kafka-go/v2 v2.1.0
- Dockerfile
FROM --platform=linux/$TARGETARCH golang:1.20-alpine3.17 as builder
ARG TARGETARCH
RUN echo $TARGETARCH
RUN apk add alpine-sdk ca-certificates
WORKDIR "/code"
ADD . "/code"
RUN CGO_ENABLED=1 GO111MODULE=on GOOS=linux GOARCH=$TARGETARCH go build -mod=vendor -o consumer_example -tags musl -ldflags "-w -s" .
- Failed build
go mod tidy && go mod vendor
docker buildx build --build-arg TARGETARCH=arm64 .
- Successful build
go mod tidy && go mod vendor
docker buildx build --build-arg TARGETARCH=amd64 .
- arm64 and amd64 are successful after go.mod dependency is downgraded
require github.com/confluentinc/confluent-kafka-go/v2 v2.0.2
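The two builds above can be compared in one go with a small wrapper. This is only a convenience sketch: the function name try_arch is made up for this example, and it assumes the Dockerfile and vendored sources from the steps above are in the current directory.

```shell
# Hypothetical helper (not part of the original report): build the Dockerfile
# for one architecture and print whether the build succeeded.
# Run "go mod tidy && go mod vendor" first, as in the steps above.
try_arch() {
    if docker buildx build --build-arg TARGETARCH="$1" . >/dev/null 2>&1; then
        echo "$1: ok"
    else
        echo "$1: FAILED"
    fi
}

# Usage:
#   for arch in arm64 amd64; do try_arch "$arch"; done
```

The build output is discarded here to keep the summary short; drop the redirection to see the linker errors.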
Checklist
Please provide the following information:
- confluent-kafka-go and librdkafka version (LibraryVersion()): confluent-kafka-go v2.1.0
About this issue
- State: open
- Created a year ago
- Reactions: 5
- Comments: 16 (5 by maintainers)
Commits related to this issue
- Downgrade confluent-kafka-go to v2.0.2 due to https://github.com/confluentinc/confluent-kafka-go/issues/981 — committed to grepplabs/mqtt-proxy by everesio a year ago
The root cause appears to be that librdkafka now requires Cyrus SASL, but the confluent-kafka-go wrappers don’t spell out a link dependency on it.
All the workarounds above seem to avoid solving this problem by instead installing a system librdkafka-dev, which requires -tags dynamic per https://github.com/confluentinc/confluent-kafka-go/#librdkafka (not sure why the earlier posted workaround examples work without it; we still saw linker errors).
To fix what I understand to be the root cause, we can:
- make sure cyrus-sasl-dev (for Alpine; see the librdkafka SASL docs for other platforms) is installed in the build and run environment
- link against libsasl2.so
I adapted the repro case from the original report for go1.21 + alpine3.18 with the requisite flags:
This works on my arm64/M1 Mac for TARGETARCH of both arm64 and amd64.
Raised this PR. And confirmed that the produced binaries don’t include rdkafka_sasl_cyrus.o, except for darwin where it’s expected to have it.
Just try making the following changes
Thank you all for raising awareness on this issue.
That didn’t happen because we configure and build these static binaries in a Semaphore pipeline, not on our laptops. Then we import those binaries locally to push them to confluent-kafka-go.
I believe the issue is here in the release pipeline:
As it should be
because it’s excluding the files that have the attribute extra=gssapi. Given it’s not excluding them, depending on the order, the version with libsasl2 or the one without it could be taken.
That explains why the issue is present in 2.1.0 and 2.3.0 but not in 2.2.0 and 2.0.2. Going to create a PR to fix it before our upcoming 2.4.0 release.
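The order dependence described in this comment can be illustrated with a small filter. This is only a sketch: the artifact names below are invented for the example and the real release pipeline is not shown in this thread. The only detail taken from the comment is the extra=gssapi attribute.

```shell
# Hypothetical sketch: read candidate artifact names on stdin (one per line),
# drop every name carrying the extra=gssapi attribute, and print the first
# one that is left.
pick_artifact() {
    grep -vF 'extra=gssapi' | head -n 1
}

# Without the grep, "head -n 1" alone would return whichever variant happens
# to be listed first, which is the order dependence described above.
printf '%s\n' \
    'librdkafka;arch=arm64;extra=gssapi' \
    'librdkafka;arch=arm64' | pick_artifact
# prints: librdkafka;arch=arm64
```

With the filter in place the result no longer depends on which variant is listed first.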
Followup: we actually ran into a problem with the proposed workaround: CGO_LDFLAGS are injected before the cgo LDFLAGS, and gcc -l switches are sensitive to order (beautifully described here: https://eli.thegreenplace.net/2013/07/09/library-order-in-static-linking).
There’s a supremely hacky way to work around this too, using a dangling -Wl,--start-group before -lsasl2;
GCC complains with
but essentially fixes the unclosed group for you.
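A minimal sketch of that hack as a shell function, assuming cyrus-sasl-dev is already installed in the build image and the rest of the build matches the Dockerfile from the report. The function name build_with_sasl is made up for this example. The flag values come from this comment, and the behaviour has not been verified beyond what the thread reports.

```shell
# Hypothetical helper (not from the thread): wraps "go build" with the extra
# linker flags discussed above. Assumes cyrus-sasl-dev is installed.
build_with_sasl() {
    # CGO_LDFLAGS ends up before the cgo LDFLAGS on the link line, so a plain
    # -lsasl2 would be seen too early. Per the comment above, opening a group
    # and leaving it unclosed makes the linker close it at the end of the
    # command line, which lets the sasl_* references resolve.
    CGO_ENABLED=1 CGO_LDFLAGS="-Wl,--start-group -lsasl2" \
        go build -tags musl -ldflags "-w -s" "$@"
}

# Usage, from the module root inside the Alpine build container:
#   build_with_sasl -mod=vendor -o consumer_example .
```

Passing the remaining arguments through "$@" keeps the helper usable with or without -mod=vendor.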
As far as fixing the root cause bug: I’m not sure why there’s now a hard link dependency on libsasl2.so. But I see that the Darwin cgo LDFLAGS have -lsasl2 as part of the distribution: https://github.com/confluentinc/confluent-kafka-go/blob/master/kafka/build_darwin_arm64.go#L9. There are probably reasons why this can’t work on Linux in general, but it might be a thread to start pulling on.
this needs a bit more attention. wasted too much time on this. 🥲