postgres-operator: PgBouncer Unable to connect to postgres master

  • Which image of the operator are you using? v1.6.3
  • Where do you run it - cloud or metal? Kubernetes or OpenShift? AWS K8s(With Istio side car enabled)
  • Are you running Postgres Operator in production? no
  • Type of issue? question

I am trying to install the operator with the connection pooler enabled, Initially I had issues with Postgres pods also, the secondary node was unable to talk to the master, that issue is fixed with the help of this ticket #https://github.com/zalando/postgres-operator/issues/772. Now I am having issues with pgbouncer which failing to connect with Postgres cluster service. PgBoncer log

exec /bin/pgbouncer /etc/pgbouncer/pgbouncer.ini
2021-06-01 15:00:25.696 UTC [1] LOG kernel file descriptor limit: 65536 (hard: 65536); max_client_conn: 10000, max expected fd use: 10012
2021-06-01 15:00:25.696 UTC [1] LOG listening on 0.0.0.0:5432
2021-06-01 15:00:25.696 UTC [1] LOG listening on [::]:5432
2021-06-01 15:00:25.696 UTC [1] LOG listening on unix:/tmp/.s.PGSQL.5432
2021-06-01 15:00:25.697 UTC [1] LOG process up: PgBouncer 1.12.0, libevent 2.1.11-stable (epoll), adns: c-ares 1.16.1, tls: OpenSSL 1.1.1g  21 Apr 2020
2021-06-01 15:01:25.698 UTC [1] LOG stats: 0 xacts/s, 0 queries/s, in 0 B/s, out 0 B/s, xact 0 us, query 0 us, wait 0 us
2021-06-01 15:02:11.714 UTC [1] LOG C-0x56242d327a70: (nodb)/(nouser)@127.0.0.1:48248 registered new auto-database: db=testdb
2021-06-01 15:02:11.834 UTC [1] WARNING lookup failed: my-test-pg-cluster: result=0
2021-06-01 15:02:25.702 UTC [1] LOG stats: 0 xacts/s, 0 queries/s, in 0 B/s, out 0 B/s, xact 0 us, query 0 us, wait 0 us
2021-06-01 15:02:27.085 UTC [1] WARNING lookup failed: my-test-pg-cluster: result=0
2021-06-01 15:02:31.418 UTC [1] WARNING C-0x56242d327e90: testdb/(nouser)@127.0.0.1:49168 pooler error: pgbouncer cannot connect to server
2021-06-01 15:02:31.422 UTC [1] WARNING C-0x56242d327c80: testdb/(nouser)@127.0.0.1:49166 pooler error: pgbouncer cannot connect to server
2021-06-01 15:02:33.500 UTC [1] WARNING C-0x56242d327c80: testdb/(nouser)@127.0.0.1:49298 pooler error: pgbouncer cannot connect to server
2021-06-01 15:02:33.502 UTC [1] WARNING C-0x56242d327e90: testdb/(nouser)@127.0.0.1:49300 pooler error: pgbouncer cannot connect to server
2021-06-01 15:02:34.531 UTC [1] WARNING C-0x56242d327e90: testdb/(nouser)@127.0.0.1:49334 pooler error: pgbouncer cannot connect to server
2021-06-01 15:02:36.595 UTC [1] WARNING C-0x56242d327e90: testdb/(nouser)@127.0.0.1:49420 pooler error: pgbouncer cannot connect to server
2021-06-01 15:02:36.598 UTC [1] WARNING C-0x56242d327c80: testdb/(nouser)@127.0.0.1:49422 pooler error: pgbouncer cannot connect to server
2021-06-01 15:02:38.641 UTC [1] WARNING C-0x56242d327c80: testdb/(nouser)@127.0.0.1:49520 pooler error: pgbouncer cannot connect to server
2021-06-01 15:02:38.657 UTC [1] WARNING C-0x56242d327e90: testdb/(nouser)@127.0.0.1:49522 pooler error: pgbouncer cannot connect to server
2021-06-01 15:02:40.683 UTC [1] WARNING C-0x56242d327e90: testdb/(nouser)@127.0.0.1:49594 pooler error: pgbouncer cannot connect to server
2021-06-01 15:02:40.698 UTC [1] WARNING C-0x56242d327c80: testdb/(nouser)@127.0.0.1:49596 pooler error: pgbouncer cannot connect to server
2021-06-01 15:02:42.209 UTC [1] WARNING lookup failed: my-test-pg-cluster: result=0

I have created a service entry like this

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: my-test-cluster-service-entry
spec:
  hosts:
    - my-test-pg-cluster-0.dev.svc.cluster.local
    - my-test-pg-cluster-1.dev.svc.cluster.local
  location: MESH_INTERNAL
  ports:
  - number: 5432
    name: postgres
    protocol: TCP
  - number: 8008
    name: tcp-patroni
    protocol: TCP
  resolution: NONE

Is there anything I need to add? Any help would be appreciated! Thanks in Advance.

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 19 (2 by maintainers)

Most upvoted comments

Could someone advise the solution for minikube v1.25.1 with image acid/pgbouncer:master-21 This does not help

    kubectl patch deployment -n postgres echo-postgres-pooler -p \
        '{"spec": {"template": {"spec": {"dnsConfig": {"options": [{"name": "ndots", "value": "5"}, {"name": "use-vc"}, {"name": "single-request-reopen"}]}}}}}'

@azharullah I prepared docker images with libc dns resolver. Seems it works now. https://hub.docker.com/repository/docker/polymetr/zalando-pgbouncer

I think we need to ask Zolando rebuild pgbouncer’s image to alpine 3.15.4 version. But I don’t know how to do it.

I’ve encountered the same issue with DNS resolution in the pgbouncer pod. I can confirm that upgrading the c-ares library to 1.18 resolved the issue. Luckily the pgbouncer binary is dynamically linked so building a base image with a newer c-ares works.

Using alpine:3.15 as the base seems to do the trick, see below Dockerfile (YMMV):

FROM registry.opensource.zalan.do/acid/pgbouncer:master-22 as acidpgbouncer

FROM alpine:3.15

RUN apk --update add libevent openssl c-ares gettext ca-certificates postgresql-client

RUN addgroup -S pgbouncer && adduser -S pgbouncer

COPY --from=acidpgbouncer /var/log/pgbouncer/. /var/log/pgbouncer/
COPY --from=acidpgbouncer /var/run/pgbouncer/. /var/run/pgbouncer/
COPY --from=acidpgbouncer /etc/pgbouncer/. /etc/pgbouncer/
COPY --from=acidpgbouncer /bin/pgbouncer /bin/
COPY --from=acidpgbouncer /etc/ssl/certs/. /etc/ssl/certs/
COPY --from=acidpgbouncer /entrypoint.sh /

RUN chown -R pgbouncer:pgbouncer /var/log/pgbouncer /var/run/pgbouncer /etc/pgbouncer /etc/ssl/certs

USER pgbouncer:pgbouncer

ENTRYPOINT ["/bin/sh", "/entrypoint.sh"]

@FactorT as I said, pgbouncer is dynamically linked to c-ares. This means that at runtime it will use the c-ares library actually available in the container.

The versions listed in the log from pgbouncer is created at compile time and hence isn’t aware of the versions actually used:

2022-06-29 13:50:27.586 UTC [1] LOG process up: PgBouncer 1.17.0, libevent 2.1.12-stable (epoll), adns: c-ares 1.17.2, tls: OpenSSL 1.1.1n  15 Mar 2022

On debugging this issue further, these are my findings: My DB server is test-db-server in the postgresql namespace.

I exec’ed into the pgbouncer pod: The env has PGHOST=test-db-server set.

NSlookup on just the DB server name does not work since it does not resolve do the right domain:

$ nslookup test-db-server
Server:         x.x.x.x
Address:        x.x.x.x

** server can't find test-db-server.svc.cluster.local: NXDOMAIN

** server can't find test-db-server.svc.cluster.local: NXDOMAIN

** server can't find test-db-server.cluster.local: NXDOMAIN

** server can't find test-db-server.cluster.local: NXDOMAIN

Name:   test-db-server.postgresql.svc.cluster.local
Address: x.x.x.x

But providing the complete DNS name works fine:

$ nslookup test-db-server.postgresql.svc.cluster.local
Server:         x.x.x.x
Address:        x.x.x.x

Name:   test-db-server.postgresql.svc.cluster.local
Address: x.x.x.x

I think modifying the PGHOST passed to PGBouncer with the complete DNS name would work. Does anybody have any ideas on passing custom env variable to the PGBouncer pod from the postgres cluster manifest? I had no luck finding a way for this in the docs.