cloudnative-pg: Backups stuck in walArchivingFailing phase

I have the following configuration:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgresql
  namespace: mynamespace
spec:
  instances: 2
  maxSyncReplicas: 1
  minSyncReplicas: 1
  primaryUpdateStrategy: unsupervised
  backup:
    barmanObjectStore:
      destinationPath: s3://mybucket/
      wal:
        compression: bzip2
        encryption: AES256
      data:
        compression: bzip2
        encryption: AES256
        immediateCheckpoint: false
        jobs: 2
      s3Credentials:
        accessKeyId:
          name: aws-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: aws-creds
          key: ACCESS_SECRET_KEY
    retentionPolicy: "30d"
  postgresql:
    parameters:
      # my parameters are listed here
  storage:
    size: 1000Gi
  affinity:
    enablePodAntiAffinity: true
    nodeSelector:
      nodetype: db
    tolerations:
    - key: db
      operator: Equal
      value: "true"
      effect: NoSchedule
---
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: postgresql
  namespace: mynamespace
spec:
  schedule: "0 0 * * *"
  suspend: false
  immediate: true
  backupOwnerReference: self 
  cluster:
    name: postgresql
  

My backup starts after applying the YAML files, but after that the backups get stuck in the walArchivingFailing phase. I waited through multiple scheduled backups, and I also started one manually, but I always got the same result. Even when I removed the

wal:
  compression: bzip2
  encryption: AES256

section, the issue persisted.
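
For reference, my understanding is that the Backup sits in walArchivingFailing because continuous WAL archiving on the Cluster itself is not healthy, and the underlying barman-cloud-wal-archive error shows up in the Cluster's ContinuousArchiving status condition. A rough sketch of the shape I mean (the values below are illustrative assumptions, not copied from my cluster):

status:
  conditions:
  - type: ContinuousArchiving
    status: "False"
    reason: ContinuousArchivingFailing
    # the message usually carries the actual barman-cloud-wal-archive error
    message: "unexpected failure invoking barman-cloud-wal-archive: exit status 1"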

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Reactions: 4
  • Comments: 33 (16 by maintainers)

Most upvoted comments

@Richard87 shouldn't endpointURL be only https://<cloudflare-account-id>.r2.cloudflarestorage.com?
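
If so, a minimal sketch of where it would go, reusing the bucket and secret names from the Cluster above (the account id stays a placeholder):

backup:
  barmanObjectStore:
    destinationPath: s3://mybucket/
    # endpoint only: no bucket name in the host or path
    endpointURL: https://<cloudflare-account-id>.r2.cloudflarestorage.com
    s3Credentials:
      accessKeyId:
        name: aws-creds
        key: ACCESS_KEY_ID
      secretAccessKey:
        name: aws-creds
        key: ACCESS_SECRET_KEY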

@phisco Interestingly, when I try to add the backup job, I also get the following error:

{"level":"info","ts":"2023-07-28T13:38:48Z","msg":"backup credentials don't yet have access permissions. Will retry reconciliation loop","logging_pod":"main-cluster-4"}
{"level":"error","ts":"2023-07-28T13:38:49Z","msg":"while getting recover credentials","logging_pod":"main-cluster-4","error":"while getting secret aws-creds: secrets \"aws-creds\" is forbidden: User \"system:serviceaccount:databases:main-cluster\" cannot get resource \"secrets\" in API group \"\" in the namespace \"databases\"","stacktrace":"stacktrace..."}

Here’s the secret:

apiVersion: v1
kind: Secret
metadata:
  name: aws-creds
  namespace: databases
type: Opaque
stringData:
  MINIO_ACCESS_KEY: "backup-user1"
  MINIO_SECRET_KEY: "password1"
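
For completeness, my understanding is that the Cluster's s3Credentials has to reference exactly those data keys. A sketch against this secret (key names only; everything else as in the Cluster spec above):

s3Credentials:
  accessKeyId:
    name: aws-creds
    key: MINIO_ACCESS_KEY      # must match a key under the Secret's data
  secretAccessKey:
    name: aws-creds
    key: MINIO_SECRET_KEY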

The user can list from the bucket (screenshot).

I have a Role (created by CNPG) (screenshot).

Then the RoleBinding (that CNPG made through the Helm Release) (screenshot).

Even though kubectl says it should be able to fetch the secret (screenshot):

Do I need to provide something else to the CNPG Cluster to allow it to fetch secrets?
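
For context, this is the kind of Role/RoleBinding I would expect to be enough for the cluster's ServiceAccount to read the secret; just my understanding of plain Kubernetes RBAC, not a confirmed fix:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: main-cluster
  namespace: databases
rules:
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["aws-creds"]
  verbs: ["get", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: main-cluster
  namespace: databases
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: main-cluster
subjects:
- kind: ServiceAccount
  name: main-cluster
  namespace: databases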

@zozidalom my next step would be to check all env vars inside the db pod, to see if any are empty or contain the literal ‘NUL’ string.