postgres-operator: rolling update of the postgres cluster fails after upgrading spilo

Version 1.5.0

I tried to upgrade a cluster running spilo-11:1.5-p9 to spilo-12:1.6-p3. Unfortunately, the rolling update did not finish successfully: Postgres on the new pod is not able to start.

/var/run/postgresql:5432 - rejecting connections
2020-05-20 09:01:04,316 INFO: Lock owner: pg-0; I am pg-1
2020-05-20 09:01:04,316 INFO: Still starting up as a standby.
2020-05-20 09:01:04,317 INFO: Lock owner: pg-0; I am pg-1
2020-05-20 09:01:04,317 INFO: does not have lock
2020-05-20 09:01:04,317 INFO: establishing a new patroni connection to the postgres cluster
2020-05-20 09:01:04,446 INFO: establishing a new patroni connection to the postgres cluster
2020-05-20 09:01:04,450 WARNING: Retry got exception: 'connection problems'
2020-05-20 09:01:04,451 INFO: Error communicating with PostgreSQL. Will try again later

It looks like Postgres is stuck in recovery mode:

root@pg-1:/home/postgres# psql -d mydb -U postgres
psql: error: could not connect to server: FATAL:  the database system is starting up

What is the recommended way to upgrade Spilo?

About this issue

  • State: open
  • Created 4 years ago
  • Comments: 16 (7 by maintainers)

Most upvoted comments

@CyberDem0n I managed to reproduce this bug with an empty database. The log output looks the same: the only fatal event is the one saying that the database system is starting up.

Here is the minimal reproduction config:

apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: dudko-alexey-postgresql
spec:
  teamId: "dudko-alexey"
  volume:
    size: 4Gi
  numberOfInstances: 2

  dockerImage: <docker image>

  postgresql:
    parameters:
      default_transaction_isolation: repeatable read
      log_statement: all
      max_connections: "16"
      shared_buffers: 128MB
    version: "11"

  resources:
    limits:
      cpu: 2000m
      memory: 512Mi
    requests:
      cpu: 200m
      memory: 128Mi

First, apply it with dockerImage: registry.opensource.zalan.do/acid/spilo-12:1.6-p2 and wait until the master and the standby have both started. Then apply it again with dockerImage: registry.opensource.zalan.do/acid/spilo-12:1.6-p3. The standby pod is rolled out, but its database stays stuck in the "starting up" state indefinitely.
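
To make the two steps concrete: the only field that differs between the first and the second apply is spec.dockerImage, and the rest of the manifest stays as shown in the reproduction config. A sketch of the two values, taken directly from the steps described here (these are fragments to substitute for the `<docker image>` placeholder, not complete manifests):

```yaml
# Step 1: initial apply; wait until both pods (master and standby) are running.
spec:
  dockerImage: registry.opensource.zalan.do/acid/spilo-12:1.6-p2
---
# Step 2: re-apply with only the image tag changed; this triggers the rolling update
# after which the standby stays in the "starting up" state.
spec:
  dockerImage: registry.opensource.zalan.do/acid/spilo-12:1.6-p3
```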