postgres-operator: PgBackrest fails with unable to create directory '/home/postgres/.ssh'.

Hi! I have som trouble with backups, the job fails with this:

time="2021-11-27T17:08:59Z" level=info msg="crunchy-pgbackrest starts"
time="2021-11-27T17:08:59Z" level=info msg="debug flag set to false"
time="2021-11-27T17:08:59Z" level=info msg="backrest backup command requested"
time="2021-11-27T17:08:59Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --repo=1]"
time="2021-11-27T18:09:41Z" level=info msg="output=[]"
time="2021-11-27T18:09:41Z" level=info msg="stderr=[
WARN: no prior backup exists, incr backup has been changed to full
WARN: resumable backup 20211127-144613F of same type exists -- remove invalid files and resume
ERROR: [082]: WAL segment 0000000400000000000000B7 was not archived before the 3600000ms timeout
       HINT: check the archive_command to ensure that all options are correct (especially --stanza).
       HINT: check the PostgreSQL server log for errors.
       HINT: run the 'start' command if the stanza was previously stopped.
WARN: remote-0 process on 'eportaldb-main-49jt-0.eportaldb-pods.eportal.svc.cluster.local.' terminated unexpectedly [42]: Could not create directory '/home/postgres/.ssh'.\r
      Could not chdir to home directory /var/lib/pgsql: No such file or directory
      ERROR: [042]: timeout after 1830000ms waiting for read from 'remote-0'
WARN: remote-0 process on 'eportaldb-main-tpzh-0.eportaldb-pods.eportal.svc.cluster.local.' terminated unexpectedly [42]: Could not create directory '/home/postgres/.ssh'.\r
      Could not chdir to home directory /var/lib/pgsql: No such file or directory
      ERROR: [042]: timeout after 1830000ms waiting for read from 'remote-0'
]"
time="2021-11-27T18:09:41Z" level=fatal msg="command terminated with exit code 82"

I’m not sure where to go from here…

PGO creates som files in the backup repos, but mostly “information”, and no data

The last part of the logs, seems to be repeated at lot:

2021-11-27 18:57:10.654 P00   INFO: archive-push command begin 2.35: [pg_wal/00000003.history] --archive-timeout=7200 --exec-id=222838-5e19fb18 --log-level-console=debug --log-path=/tmp --pg1-path=/pgdata/pg14 --repo2-gcs-bucket=postgres-eportal2 --repo2-gcs-key=<redacted> --repo1-host=eportaldb-repo-host-0.eportaldb-pods.eportal.svc.cluster.local. --repo1-host-user=postgres --repo1-path=/pgbackrest/repo1 --repo2-path=/pgbackrest/postgres-operator/eportaldb/repo2 --repo3-path=/pgbackrest/postgres-operator/eportaldb/repo3 --repo3-s3-bucket=postgres-eportal --repo3-s3-endpoint=fra1.digitaloceanspaces.com --repo3-s3-key=<redacted> --repo3-s3-key-secret=<redacted> --repo3-s3-region=fra1 --repo2-type=gcs --repo3-type=s3 --stanza=db
2021-11-27 18:57:10.654 P00  DEBUG:     config/load::cfgLoad: => void
2021-11-27 18:57:10.654 P00  DEBUG:     command/archive/push/push::cmdArchivePush: (void)
2021-11-27 18:57:10.654 P00  DEBUG:     protocol/helper::pgIsLocal: (pgIdx: 0)
2021-11-27 18:57:10.654 P00  DEBUG:     protocol/helper::pgIsLocal: => true
2021-11-27 18:57:10.654 P00  DEBUG:     command/control/common::lockStopTest: (void)
2021-11-27 18:57:10.654 P00  DEBUG:     storage/storage::storageExists: (this: {type: posix, path: {"/"}, write: false}, pathExp: {"/tmp/pgbackrest/db.stop"}, param.timeout: 0)
2021-11-27 18:57:10.654 P00  DEBUG:     storage/storage::storageInfo: (this: {type: posix, path: {"/"}, write: false}, fileExp: {"/tmp/pgbackrest/db.stop"}, param.level: 3, param.ignoreMissing: true, param.followLink: true, param.noPathEnforce: false)
2021-11-27 18:57:10.654 P00  DEBUG:     storage/storage::storageInfo: => {StorageInfo}
2021-11-27 18:57:10.654 P00  DEBUG:     storage/storage::storageExists: => false
2021-11-27 18:57:10.654 P00  DEBUG:     storage/storage::storageExists: (this: {type: posix, path: {"/"}, write: false}, pathExp: {"/tmp/pgbackrest/all.stop"}, param.timeout: 0)
2021-11-27 18:57:10.654 P00  DEBUG:     storage/storage::storageInfo: (this: {type: posix, path: {"/"}, write: false}, fileExp: {"/tmp/pgbackrest/all.stop"}, param.level: 3, param.ignoreMissing: true, param.followLink: true, param.noPathEnforce: false)
2021-11-27 18:57:10.654 P00  DEBUG:     storage/storage::storageInfo: => {StorageInfo}
2021-11-27 18:57:10.654 P00  DEBUG:     storage/storage::storageExists: => false
2021-11-27 18:57:10.654 P00  DEBUG:     command/control/common::lockStopTest: => void
2021-11-27 18:57:10.654 P00  DEBUG:     command/archive/common::walPath: (walFile: {"pg_wal/00000003.history"}, pgPath: {"/pgdata/pg14"}, command: {"archive-push"})
2021-11-27 18:57:10.654 P00  DEBUG:     command/archive/common::walPath: => {"/pgdata/pg14/pg_wal/00000003.history"}
2021-11-27 18:57:10.654 P00  DEBUG:     command/archive/push/push::archivePushCheck: (pgPathSet: true)
2021-11-27 18:57:10.654 P00  DEBUG:     protocol/helper::pgIsLocal: (pgIdx: 0)
2021-11-27 18:57:10.654 P00  DEBUG:     protocol/helper::pgIsLocal: => true
2021-11-27 18:57:10.654 P00  DEBUG:     storage/posix/storage::storagePosixNew: (path: {"/pgdata/pg14"}, param.modeFile: 0000, param.modePath: 0000, param.write: false, param.pathExpressionFunction: null)
2021-11-27 18:57:10.654 P00  DEBUG:     storage/posix/storage::storagePosixNewInternal: (type: posix, path: {"/pgdata/pg14"}, modeFile: 0640, modePath: 0750, write: false, pathExpressionFunction: null, pathSync: true)
2021-11-27 18:57:10.654 P00  DEBUG:     storage/posix/storage::storagePosixNewInternal: => {type: posix, path: {"/pgdata/pg14"}, write: false}
2021-11-27 18:57:10.654 P00  DEBUG:     storage/posix/storage::storagePosixNew: => {type: posix, path: {"/pgdata/pg14"}, write: false}
2021-11-27 18:57:10.654 P00  DEBUG:     postgres/interface::pgControlFromFile: (storage: {type: posix, path: {"/pgdata/pg14"}, write: false})
2021-11-27 18:57:10.654 P00  DEBUG:     storage/storage::storageNewRead: (this: {type: posix, path: {"/pgdata/pg14"}, write: false}, fileExp: {"global/pg_control"}, param.ignoreMissing: false, param.compressible: false, param.limit: null)
2021-11-27 18:57:10.655 P00  DEBUG:     storage/storage::storageNewRead: => {type: posix, name: {"/pgdata/pg14/global/pg_control"}, ignoreMissing: false}
2021-11-27 18:57:10.655 P00  DEBUG:     storage/storage::storageGet: (file: {type: posix, name: {"/pgdata/pg14/global/pg_control"}, ignoreMissing: false}, param.exactSize: 512)
2021-11-27 18:57:10.655 P00  DEBUG:     storage/storage::storageGet: => {used: 512, size: 512}
2021-11-27 18:57:10.655 P00  DEBUG:     postgres/interface::pgControlFromFile: => {version: 140000, systemId: 7034087418294739021, walSegmentSize: 16777216, pageChecksum: true}
2021-11-27 18:57:10.655 P00  DEBUG:     protocol/helper::repoIsLocal: (repoIdx: 0)
2021-11-27 18:57:10.655 P00  DEBUG:     protocol/helper::repoIsLocal: => false
2021-11-27 18:57:10.655 P00  DEBUG:     protocol/helper::protocolRemoteGet: (protocolStorageType: repo, hostIdx: 0)
2021-11-27 18:57:10.655 P00  DEBUG:     protocol/helper::protocolRemoteParamSsh: (protocolStorageType: repo, hostIdx: 0)
2021-11-27 18:57:10.655 P00  DEBUG:     protocol/helper::protocolRemoteParam: (protocolStorageType: repo, hostIdx: 0)
2021-11-27 18:57:10.656 P00  DEBUG:     protocol/helper::protocolRemoteParam: => {["--exec-id=222838-5e19fb18", "--log-level-console=off", "--log-level-file=off", "--log-level-stderr=error", "--process=0", "--remote-type=repo", "--repo=1", "--repo1-path=/pgbackrest/repo1", "--stanza=db", "archive-push:remote"]}
2021-11-27 18:57:10.656 P00  DEBUG:     protocol/helper::protocolRemoteParamSsh: => {["-o", "LogLevel=error", "-o", "Compression=no", "-o", "PasswordAuthentication=no", "postgres@eportaldb-repo-host-0.eportaldb-pods.eportal.svc.cluster.local.", "pgbackrest --exec-id=222838-5e19fb18 --log-level-console=off --log-level-file=off --log-level-stderr=error --process=0 --remote-type=repo --repo=1 --repo1-path=/pgbackrest/repo1 --stanza=db archive-push:remote"]}
2021-11-27 18:57:10.656 P00  DEBUG:     common/exec::execNew: (command: {"ssh"}, param: {["-o", "LogLevel=error", "-o", "Compression=no", "-o", "PasswordAuthentication=no", "postgres@eportaldb-repo-host-0.eportaldb-pods.eportal.svc.cluster.local.", "pgbackrest --exec-id=222838-5e19fb18 --log-level-console=off --log-level-file=off --log-level-stderr=error --process=0 --remote-type=repo --repo=1 --repo1-path=/pgbackrest/repo1 --stanza=db archive-push:remote"]}, name: {"remote-0 process on 'eportaldb-repo-host-0.eportaldb-pods.eportal.svc.cluster.local.'"}, timeout: 1830000)
2021-11-27 18:57:10.656 P00  DEBUG:     common/exec::execNew: => {Exec}
2021-11-27 18:57:10.656 P00  DEBUG:     common/exec::execOpen: (this: {Exec})
2021-11-27 18:57:10.661 P00  DEBUG:     common/exec::execOpen: => void
2021-11-27 18:57:11.655 P00  DEBUG:     protocol/helper::repoIsLocal: (repoIdx: 1)
2021-11-27 18:57:11.655 P00  DEBUG:     protocol/helper::repoIsLocal: => true
2021-11-27 18:57:11.655 P00  DEBUG:     storage/gcs/storage::storageGcsNew: (path: {"/pgbackrest/postgres-operator/eportaldb/repo2"}, write: false, pathExpressionFunction: (function *), bucket: {"postgres-eportal2"}, keyType: service, chunkSize: 4194304, endpoint: {"storage.googleapis.com"}, timeout: 60000, verifyPeer: true, caFile: null, caPath: null)
2021-11-27 18:57:11.655 P00  DEBUG:     storage/posix/storage::storagePosixNew: (path: {"/"}, param.modeFile: 0000, param.modePath: 0000, param.write: false, param.pathExpressionFunction: null)
2021-11-27 18:57:11.655 P00  DEBUG:     storage/posix/storage::storagePosixNewInternal: (type: posix, path: {"/"}, modeFile: 0640, modePath: 0750, write: false, pathExpressionFunction: null, pathSync: true)
2021-11-27 18:57:11.656 P00  DEBUG:     storage/posix/storage::storagePosixNewInternal: => {type: posix, path: {"/"}, write: false}
2021-11-27 18:57:11.656 P00  DEBUG:     storage/posix/storage::storagePosixNew: => {type: posix, path: {"/"}, write: false}
2021-11-27 18:57:11.656 P00  DEBUG:     storage/storage::storageNewRead: (this: {type: posix, path: {"/"}, write: false}, fileExp: {"/etc/pgbackrest/conf.d/gcs-key.json"}, param.ignoreMissing: false, param.compressible: false, param.limit: null)
2021-11-27 18:57:11.656 P00  DEBUG:     storage/storage::storageNewRead: => {type: posix, name: {"/etc/pgbackrest/conf.d/gcs-key.json"}, ignoreMissing: false}
2021-11-27 18:57:11.656 P00  DEBUG:     storage/storage::storageGet: (file: {type: posix, name: {"/etc/pgbackrest/conf.d/gcs-key.json"}, ignoreMissing: false}, param.exactSize: 0)
2021-11-27 18:57:11.656 P00  DEBUG:     storage/storage::storageGet: => {used: 2318, size: 2318}
2021-11-27 18:57:11.656 P00  DEBUG:     common/io/socket/client::sckClientNew: (host: {"oauth2.googleapis.com"}, port: 443, timeout: 60000)
2021-11-27 18:57:11.656 P00  DEBUG:     common/io/socket/client::sckClientNew: => {type: socket, driver: {host: oauth2.googleapis.com, port: 443, timeout: 60000}}
2021-11-27 18:57:11.656 P00  DEBUG:     common/io/tls/client::tlsClientNew: (ioClient: {type: socket, driver: {host: oauth2.googleapis.com, port: 443, timeout: 60000}}, host: {"oauth2.googleapis.com"}, timeout: 60000, verifyPeer: true, caFile: null, caPath: null)
2021-11-27 18:57:11.734 P00  DEBUG:     common/io/tls/client::tlsClientNew: => {type: tls, driver: {ioClient: {type: socket, driver: {host: oauth2.googleapis.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}
2021-11-27 18:57:11.734 P00  DEBUG:     common/io/http/client::httpClientNew: (ioClient: {type: tls, driver: {ioClient: {type: socket, driver: {host: oauth2.googleapis.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}, timeout: 60000)
2021-11-27 18:57:11.734 P00  DEBUG:     common/io/http/client::httpClientNew: => {ioClient: {type: tls, driver: {ioClient: {type: socket, driver: {host: oauth2.googleapis.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}, reusable: 0, timeout: 60000}
2021-11-27 18:57:11.735 P00  DEBUG:     common/io/socket/client::sckClientNew: (host: {"storage.googleapis.com"}, port: 443, timeout: 60000)
2021-11-27 18:57:11.735 P00  DEBUG:     common/io/socket/client::sckClientNew: => {type: socket, driver: {host: storage.googleapis.com, port: 443, timeout: 60000}}
2021-11-27 18:57:11.735 P00  DEBUG:     common/io/tls/client::tlsClientNew: (ioClient: {type: socket, driver: {host: storage.googleapis.com, port: 443, timeout: 60000}}, host: {"storage.googleapis.com"}, timeout: 60000, verifyPeer: true, caFile: null, caPath: null)
2021-11-27 18:57:11.746 P00  DEBUG:     common/io/tls/client::tlsClientNew: => {type: tls, driver: {ioClient: {type: socket, driver: {host: storage.googleapis.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}
2021-11-27 18:57:11.746 P00  DEBUG:     common/io/http/client::httpClientNew: (ioClient: {type: tls, driver: {ioClient: {type: socket, driver: {host: storage.googleapis.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}, timeout: 60000)
2021-11-27 18:57:11.746 P00  DEBUG:     common/io/http/client::httpClientNew: => {ioClient: {type: tls, driver: {ioClient: {type: socket, driver: {host: storage.googleapis.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}, reusable: 0, timeout: 60000}
2021-11-27 18:57:11.746 P00  DEBUG:     storage/gcs/storage::storageGcsNew: => {type: gcs, path: {"/pgbackrest/postgres-operator/eportaldb/repo2"}, write: false}
2021-11-27 18:57:11.747 P00  DEBUG:     info/infoArchive::infoArchiveLoadFile: (storage: {type: gcs, path: {"/pgbackrest/postgres-operator/eportaldb/repo2"}, write: false}, fileName: {"<REPO:ARCHIVE>/archive.info"}, cipherType: none)
2021-11-27 18:57:11.930 P00  DEBUG:     info/infoArchive::infoArchiveLoadFile: => {InfoArchive}
2021-11-27 18:57:11.930 P00  DEBUG:     info/infoPg::infoPgDataCurrentId: (this: {InfoPg})
2021-11-27 18:57:11.930 P00  DEBUG:     info/infoPg::infoPgDataCurrentId: => 0
2021-11-27 18:57:11.930 P00  DEBUG:     info/infoPg::infoPgDataCurrentId: (this: {InfoPg})
2021-11-27 18:57:11.930 P00  DEBUG:     info/infoPg::infoPgDataCurrentId: => 0
2021-11-27 18:57:11.930 P00  DEBUG:     protocol/helper::repoIsLocal: (repoIdx: 2)
2021-11-27 18:57:11.930 P00  DEBUG:     protocol/helper::repoIsLocal: => true
2021-11-27 18:57:11.930 P00  DEBUG:     storage/s3/storage::storageS3New: (path: {"/pgbackrest/postgres-operator/eportaldb/repo3"}, write: false, pathExpressionFunction: (function *), bucket: {"postgres-eportal"}, endPoint: {"fra1.digitaloceanspaces.com"}, uriStyle: host, region: {"fra1"}, keyType: shared, partSize: 5242880, host: null, port: 443, timeout: 60000, verifyPeer: true, caFile: null, caPath: null)
2021-11-27 18:57:11.930 P00  DEBUG:     common/io/socket/client::sckClientNew: (host: {"postgres-eportal.fra1.digitaloceanspaces.com"}, port: 443, timeout: 60000)
2021-11-27 18:57:11.930 P00  DEBUG:     common/io/socket/client::sckClientNew: => {type: socket, driver: {host: postgres-eportal.fra1.digitaloceanspaces.com, port: 443, timeout: 60000}}
2021-11-27 18:57:11.930 P00  DEBUG:     common/io/tls/client::tlsClientNew: (ioClient: {type: socket, driver: {host: postgres-eportal.fra1.digitaloceanspaces.com, port: 443, timeout: 60000}}, host: {"postgres-eportal.fra1.digitaloceanspaces.com"}, timeout: 60000, verifyPeer: true, caFile: null, caPath: null)
2021-11-27 18:57:11.942 P00  DEBUG:     common/io/tls/client::tlsClientNew: => {type: tls, driver: {ioClient: {type: socket, driver: {host: postgres-eportal.fra1.digitaloceanspaces.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}
2021-11-27 18:57:11.943 P00  DEBUG:     common/io/http/client::httpClientNew: (ioClient: {type: tls, driver: {ioClient: {type: socket, driver: {host: postgres-eportal.fra1.digitaloceanspaces.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}, timeout: 60000)
2021-11-27 18:57:11.943 P00  DEBUG:     common/io/http/client::httpClientNew: => {ioClient: {type: tls, driver: {ioClient: {type: socket, driver: {host: postgres-eportal.fra1.digitaloceanspaces.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}, reusable: 0, timeout: 60000}
2021-11-27 18:57:11.943 P00  DEBUG:     storage/s3/storage::storageS3New: => {type: s3, path: {"/pgbackrest/postgres-operator/eportaldb/repo3"}, write: false}
2021-11-27 18:57:11.943 P00  DEBUG:     info/infoArchive::infoArchiveLoadFile: (storage: {type: s3, path: {"/pgbackrest/postgres-operator/eportaldb/repo3"}, write: false}, fileName: {"<REPO:ARCHIVE>/archive.info"}, cipherType: none)
2021-11-27 18:57:12.130 P00  DEBUG:     info/infoArchive::infoArchiveLoadFile: => {InfoArchive}
2021-11-27 18:57:12.130 P00  DEBUG:     info/infoPg::infoPgDataCurrentId: (this: {InfoPg})
2021-11-27 18:57:12.130 P00  DEBUG:     info/infoPg::infoPgDataCurrentId: => 0
2021-11-27 18:57:12.130 P00  DEBUG:     info/infoPg::infoPgDataCurrentId: (this: {InfoPg})
2021-11-27 18:57:12.130 P00  DEBUG:     info/infoPg::infoPgDataCurrentId: => 0
2021-11-27 18:57:12.130 P00  DEBUG:     command/archive/push/push::archivePushCheck: => struct
2021-11-27 18:57:12.130 P00  DEBUG:     command/archive/push/file::archivePushFile: (walSource: {"/pgdata/pg14/pg_wal/00000003.history"}, headerCheck: true, pgVersion: 140000, pgSystemId: 7034087418294739021, archiveFile: {"00000003.history"}, compressType: 2, compressLevel: 6, repoList: *void, priorErrorList: {["repo1: [LockAcquireError] raised from remote-0 protocol on 'eportaldb-repo-host-0.eportaldb-pods.eportal.svc.cluster.local.': unable to acquire lock on file '/tmp/pgbackrest/db-archive.lock': Resource temporarily unavailable
                                        HINT: is another pgBackRest process running?
                                        common/lock.c:lockAcquireFile:140:(trace log level required for parameters)
                                        common/lock.c:lockAcquire:(debug log level required for parameters)
                                        command/remote/remote.c:cmdRemote:(debug log level required for parameters)
                                        main.c:main:(debug log level required for parameters)"]})
2021-11-27 18:57:12.130 P00  DEBUG:     storage/storage::storageNewRead: (this: {type: posix, path: {"/"}, write: false}, fileExp: {"/pgdata/pg14/pg_wal/00000003.history"}, param.ignoreMissing: false, param.compressible: false, param.limit: null)
2021-11-27 18:57:12.130 P00  DEBUG:     storage/storage::storageNewRead: => {type: posix, name: {"/pgdata/pg14/pg_wal/00000003.history"}, ignoreMissing: false}
2021-11-27 18:57:12.130 P00  DEBUG:     protocol/helper::repoIsLocal: (repoIdx: 1)
2021-11-27 18:57:12.130 P00  DEBUG:     protocol/helper::repoIsLocal: => true
2021-11-27 18:57:12.130 P00  DEBUG:     storage/gcs/storage::storageGcsNew: (path: {"/pgbackrest/postgres-operator/eportaldb/repo2"}, write: true, pathExpressionFunction: (function *), bucket: {"postgres-eportal2"}, keyType: service, chunkSize: 4194304, endpoint: {"storage.googleapis.com"}, timeout: 60000, verifyPeer: true, caFile: null, caPath: null)
2021-11-27 18:57:12.130 P00  DEBUG:     storage/posix/storage::storagePosixNew: (path: {"/"}, param.modeFile: 0000, param.modePath: 0000, param.write: false, param.pathExpressionFunction: null)
2021-11-27 18:57:12.130 P00  DEBUG:     storage/posix/storage::storagePosixNewInternal: (type: posix, path: {"/"}, modeFile: 0640, modePath: 0750, write: false, pathExpressionFunction: null, pathSync: true)
2021-11-27 18:57:12.130 P00  DEBUG:     storage/posix/storage::storagePosixNewInternal: => {type: posix, path: {"/"}, write: false}
2021-11-27 18:57:12.130 P00  DEBUG:     storage/posix/storage::storagePosixNew: => {type: posix, path: {"/"}, write: false}
2021-11-27 18:57:12.130 P00  DEBUG:     storage/storage::storageNewRead: (this: {type: posix, path: {"/"}, write: false}, fileExp: {"/etc/pgbackrest/conf.d/gcs-key.json"}, param.ignoreMissing: false, param.compressible: false, param.limit: null)
2021-11-27 18:57:12.130 P00  DEBUG:     storage/storage::storageNewRead: => {type: posix, name: {"/etc/pgbackrest/conf.d/gcs-key.json"}, ignoreMissing: false}
2021-11-27 18:57:12.130 P00  DEBUG:     storage/storage::storageGet: (file: {type: posix, name: {"/etc/pgbackrest/conf.d/gcs-key.json"}, ignoreMissing: false}, param.exactSize: 0)
2021-11-27 18:57:12.130 P00  DEBUG:     storage/storage::storageGet: => {used: 2318, size: 2318}
2021-11-27 18:57:12.130 P00  DEBUG:     common/io/socket/client::sckClientNew: (host: {"oauth2.googleapis.com"}, port: 443, timeout: 60000)
2021-11-27 18:57:12.131 P00  DEBUG:     common/io/socket/client::sckClientNew: => {type: socket, driver: {host: oauth2.googleapis.com, port: 443, timeout: 60000}}
2021-11-27 18:57:12.131 P00  DEBUG:     common/io/tls/client::tlsClientNew: (ioClient: {type: socket, driver: {host: oauth2.googleapis.com, port: 443, timeout: 60000}}, host: {"oauth2.googleapis.com"}, timeout: 60000, verifyPeer: true, caFile: null, caPath: null)
2021-11-27 18:57:12.142 P00  DEBUG:     common/io/tls/client::tlsClientNew: => {type: tls, driver: {ioClient: {type: socket, driver: {host: oauth2.googleapis.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}
2021-11-27 18:57:12.143 P00  DEBUG:     common/io/http/client::httpClientNew: (ioClient: {type: tls, driver: {ioClient: {type: socket, driver: {host: oauth2.googleapis.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}, timeout: 60000)
2021-11-27 18:57:12.143 P00  DEBUG:     common/io/http/client::httpClientNew: => {ioClient: {type: tls, driver: {ioClient: {type: socket, driver: {host: oauth2.googleapis.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}, reusable: 0, timeout: 60000}
2021-11-27 18:57:12.143 P00  DEBUG:     common/io/socket/client::sckClientNew: (host: {"storage.googleapis.com"}, port: 443, timeout: 60000)
2021-11-27 18:57:12.143 P00  DEBUG:     common/io/socket/client::sckClientNew: => {type: socket, driver: {host: storage.googleapis.com, port: 443, timeout: 60000}}
2021-11-27 18:57:12.143 P00  DEBUG:     common/io/tls/client::tlsClientNew: (ioClient: {type: socket, driver: {host: storage.googleapis.com, port: 443, timeout: 60000}}, host: {"storage.googleapis.com"}, timeout: 60000, verifyPeer: true, caFile: null, caPath: null)
2021-11-27 18:57:12.231 P00  DEBUG:     common/io/tls/client::tlsClientNew: => {type: tls, driver: {ioClient: {type: socket, driver: {host: storage.googleapis.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}
2021-11-27 18:57:12.231 P00  DEBUG:     common/io/http/client::httpClientNew: (ioClient: {type: tls, driver: {ioClient: {type: socket, driver: {host: storage.googleapis.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}, timeout: 60000)
2021-11-27 18:57:12.231 P00  DEBUG:     common/io/http/client::httpClientNew: => {ioClient: {type: tls, driver: {ioClient: {type: socket, driver: {host: storage.googleapis.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}, reusable: 0, timeout: 60000}
2021-11-27 18:57:12.231 P00  DEBUG:     storage/gcs/storage::storageGcsNew: => {type: gcs, path: {"/pgbackrest/postgres-operator/eportaldb/repo2"}, write: true}
2021-11-27 18:57:12.231 P00  DEBUG:     storage/storage::storageNewWrite: (this: {type: gcs, path: {"/pgbackrest/postgres-operator/eportaldb/repo2"}, write: true}, fileExp: {"<REPO:ARCHIVE>/14-1/00000003.history"}, param.modeFile: 0000, param.modePath: 0000, param.user: null, param.group: null, param.timeModified: 0, param.noCreatePath: false, param.noSyncFile: false, param.noSyncPath: false, param.noAtomic: false, param.compressible: true)
2021-11-27 18:57:12.231 P00  DEBUG:     storage/gcs/storage::storageGcsNewWrite: (this: {StorageGcs}, file: {"/pgbackrest/postgres-operator/eportaldb/repo2/archive/db/14-1/00000003.history"})
2021-11-27 18:57:12.231 P00  DEBUG:     storage/gcs/storage::storageGcsNewWrite: => {type: gcs, name: {"/pgbackrest/postgres-operator/eportaldb/repo2/archive/db/14-1/00000003.history"}, modeFile: 0000, modePath: 0000, createPath: true, syncFile: true, syncPath: true, atomic: true}
2021-11-27 18:57:12.231 P00  DEBUG:     storage/storage::storageNewWrite: => {type: gcs, name: {"/pgbackrest/postgres-operator/eportaldb/repo2/archive/db/14-1/00000003.history"}, modeFile: 0000, modePath: 0000, createPath: true, syncFile: true, syncPath: true, atomic: true}
2021-11-27 18:57:12.231 P00  DEBUG:     protocol/helper::repoIsLocal: (repoIdx: 2)
2021-11-27 18:57:12.231 P00  DEBUG:     protocol/helper::repoIsLocal: => true
2021-11-27 18:57:12.231 P00  DEBUG:     storage/s3/storage::storageS3New: (path: {"/pgbackrest/postgres-operator/eportaldb/repo3"}, write: true, pathExpressionFunction: (function *), bucket: {"postgres-eportal"}, endPoint: {"fra1.digitaloceanspaces.com"}, uriStyle: host, region: {"fra1"}, keyType: shared, partSize: 5242880, host: null, port: 443, timeout: 60000, verifyPeer: true, caFile: null, caPath: null)
2021-11-27 18:57:12.231 P00  DEBUG:     common/io/socket/client::sckClientNew: (host: {"postgres-eportal.fra1.digitaloceanspaces.com"}, port: 443, timeout: 60000)
2021-11-27 18:57:12.231 P00  DEBUG:     common/io/socket/client::sckClientNew: => {type: socket, driver: {host: postgres-eportal.fra1.digitaloceanspaces.com, port: 443, timeout: 60000}}
2021-11-27 18:57:12.231 P00  DEBUG:     common/io/tls/client::tlsClientNew: (ioClient: {type: socket, driver: {host: postgres-eportal.fra1.digitaloceanspaces.com, port: 443, timeout: 60000}}, host: {"postgres-eportal.fra1.digitaloceanspaces.com"}, timeout: 60000, verifyPeer: true, caFile: null, caPath: null)
2021-11-27 18:57:12.251 P00  DEBUG:     common/io/tls/client::tlsClientNew: => {type: tls, driver: {ioClient: {type: socket, driver: {host: postgres-eportal.fra1.digitaloceanspaces.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}
2021-11-27 18:57:12.252 P00  DEBUG:     common/io/http/client::httpClientNew: (ioClient: {type: tls, driver: {ioClient: {type: socket, driver: {host: postgres-eportal.fra1.digitaloceanspaces.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}, timeout: 60000)
2021-11-27 18:57:12.252 P00  DEBUG:     common/io/http/client::httpClientNew: => {ioClient: {type: tls, driver: {ioClient: {type: socket, driver: {host: postgres-eportal.fra1.digitaloceanspaces.com, port: 443, timeout: 60000}}, timeout: 60000, verifyPeer: true}}, reusable: 0, timeout: 60000}
2021-11-27 18:57:12.253 P00  DEBUG:     storage/s3/storage::storageS3New: => {type: s3, path: {"/pgbackrest/postgres-operator/eportaldb/repo3"}, write: true}
2021-11-27 18:57:12.253 P00  DEBUG:     storage/storage::storageNewWrite: (this: {type: s3, path: {"/pgbackrest/postgres-operator/eportaldb/repo3"}, write: true}, fileExp: {"<REPO:ARCHIVE>/14-1/00000003.history"}, param.modeFile: 0000, param.modePath: 0000, param.user: null, param.group: null, param.timeModified: 0, param.noCreatePath: false, param.noSyncFile: false, param.noSyncPath: false, param.noAtomic: false, param.compressible: true)
2021-11-27 18:57:12.253 P00  DEBUG:     storage/s3/storage::storageS3NewWrite: (this: {StorageS3}, file: {"/pgbackrest/postgres-operator/eportaldb/repo3/archive/db/14-1/00000003.history"})
2021-11-27 18:57:12.253 P00  DEBUG:     storage/s3/storage::storageS3NewWrite: => {type: s3, name: {"/pgbackrest/postgres-operator/eportaldb/repo3/archive/db/14-1/00000003.history"}, modeFile: 0000, modePath: 0000, createPath: true, syncFile: true, syncPath: true, atomic: true}
2021-11-27 18:57:12.253 P00  DEBUG:     storage/storage::storageNewWrite: => {type: s3, name: {"/pgbackrest/postgres-operator/eportaldb/repo3/archive/db/14-1/00000003.history"}, modeFile: 0000, modePath: 0000, createPath: true, syncFile: true, syncPath: true, atomic: true}
2021-11-27 18:57:12.740 P00  DEBUG:     common/exit::exitSafe: (result: 0, error: true, signalType: 0)
ERROR: [104]: archive-push command encountered error(s):
       repo1: [LockAcquireError] raised from remote-0 protocol on 'eportaldb-repo-host-0.eportaldb-pods.eportal.svc.cluster.local.': unable to acquire lock on file '/tmp/pgbackrest/db-archive.lock': Resource temporarily unavailable
       HINT: is another pgBackRest process running?
       common/lock.c:lockAcquireFile:140:(trace log level required for parameters)
       common/lock.c:lockAcquire:(debug log level required for parameters)
       command/remote/remote.c:cmdRemote:(debug log level required for parameters)
       main.c:main:(debug log level required for parameters)
WARN: remote-0 process on 'eportaldb-repo-host-0.eportaldb-pods.eportal.svc.cluster.local.' terminated unexpectedly [0]: Could not create directory '/var/lib/pgsql/.ssh'.
2021-11-27 18:57:12.741 P00 DETAIL: statistics: {"http.client":{"total":6},"http.request":{"total":6},"http.session":{"total":6},"socket.client":{"total":6},"socket.session":{"total":6},"tls.client":{"total":6},"tls.session":{"total":6}}
2021-11-27 18:57:12.741 P00   INFO: archive-push command end: aborted with exception [104]
2021-11-27 18:57:12.741 P00  DEBUG:     common/lock::lockRelease: (failOnNoLock: false)
2021-11-27 18:57:12.741 P00  DEBUG:     common/lock::lockRelease: => false
2021-11-27 18:57:12.741 P00  DEBUG:     common/exit::exitSafe: => 104
2021-11-27 18:57:12.741 P00  DEBUG:     main::main: => 104
2021-11-27 18:57:12.743 UTC [118] LOG:  archive command failed with exit code 104
2021-11-27 18:57:12.743 UTC [118] DETAIL:  The failed archive command was: pgbackrest --stanza=db archive-push "pg_wal/00000003.history"
2021-11-27 18:57:12.743 UTC [118] WARNING:  archiving write-ahead log file "00000003.history" failed too many times, will try again later

pg14.log

Environment

  • Platform: GKE
  • Platform Version: v1.21.5-gke.1302
  • PGO Image Tag: ubi8-5.0.4-0
  • Postgres Version: 14
  • Storage: GKE Provider

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 18 (5 by maintainers)

Most upvoted comments

After mounting a public read-write filesystem on /home, and manually creating /home/postgres inside of the repo-host pod, the error “could not create directory /home/postgres/.ssh” dissapeared…

bash-4.4$ ssh localhost -p 2022
Could not create directory '/home/postgres/.ssh'.
No ECDSA host key is known for [localhost]:2022 and you have requested strict checking.
Host key verification failed.
bash-4.4$ cd /home
bash-4.4$ ll
bash: ll: command not found
bash-4.4$ ls -la
total 8
drwxrwsrwx 2 root postgres 4096 Nov 28 10:33 .
drwxr-xr-x 1 root root     4096 Nov 28 10:33 ..
bash-4.4$ mkdir postgres
bash-4.4$ ls
postgres
bash-4.4$ ls -la
total 12
drwxrwsrwx 3 root     postgres 4096 Nov 28 10:34 .
drwxr-xr-x 1 root     root     4096 Nov 28 10:33 ..
drwxr-sr-x 2 postgres postgres 4096 Nov 28 10:34 postgres
bash-4.4$ ssh localhost -p 2022
No ECDSA host key is known for [localhost]:2022 and you have requested strict checking.
Host key verification failed.
bash-4.4$ 

And the backup process seems to work perfectly!

2021-11-28 10:33:40.337 P00   INFO: archive-push command begin 2.35: [pg_wal/00000003.history] --archive-timeout=300 --exec-id=273345-51f27512 --log-level-console=info --log-path=/tmp --pg1-path=/pgdata/pg14 --repo2-gcs-bucket=postgres-eportal2 --repo2-gcs-key=<redacted> --repo1-host=eportaldb-repo-host-0.eportaldb-pods.eportal.svc.cluster.local. --repo1-host-user=postgres --repo1-path=/pgbackrest/repo1 --repo2-path=/pgbackrest/postgres-operator/eportaldb/repo2 --repo3-path=/pgbackrest/postgres-operator/eportaldb/repo3 --repo3-s3-bucket=postgres-eportal --repo3-s3-endpoint=fra1.digitaloceanspaces.com --repo3-s3-key=<redacted> --repo3-s3-key-secret=<redacted> --repo3-s3-region=fra1 --repo2-type=gcs --repo3-type=s3 --stanza=db
ERROR: [104]: archive-push command encountered error(s):
       repo1: [UnknownError] remote-0 process on 'eportaldb-repo-host-0.eportaldb-pods.eportal.svc.cluster.local.' terminated unexpectedly [255]: ssh: Could not resolve hostname eportaldb-repo-host-0.eportaldb-pods.eportal.svc.cluster.local.: Name or service not known
2021-11-28 10:33:42.135 P00   INFO: archive-push command end: aborted with exception [104]
2021-11-28 10:33:42.138 UTC [118] LOG:  archive command failed with exit code 104
2021-11-28 10:33:42.138 UTC [118] DETAIL:  The failed archive command was: pgbackrest --stanza=db archive-push "pg_wal/00000003.history"
2021-11-28 10:33:42.138 UTC [118] WARNING:  archiving write-ahead log file "00000003.history" failed too many times, will try again later

--- Created the home directory here ---

2021-11-28 10:34:42.206 P00   INFO: archive-push command begin 2.35: [pg_wal/00000003.history] --archive-timeout=300 --exec-id=273401-18185d27 --log-level-console=info --log-path=/tmp --pg1-path=/pgdata/pg14 --repo2-gcs-bucket=postgres-eportal2 --repo2-gcs-key=<redacted> --repo1-host=eportaldb-repo-host-0.eportaldb-pods.eportal.svc.cluster.local. --repo1-host-user=postgres --repo1-path=/pgbackrest/repo1 --repo2-path=/pgbackrest/postgres-operator/eportaldb/repo2 --repo3-path=/pgbackrest/postgres-operator/eportaldb/repo3 --repo3-s3-bucket=postgres-eportal --repo3-s3-endpoint=fra1.digitaloceanspaces.com --repo3-s3-key=<redacted> --repo3-s3-key-secret=<redacted> --repo3-s3-region=fra1 --repo2-type=gcs --repo3-type=s3 --stanza=db
2021-11-28 10:34:43.309 P00   INFO: pushed WAL file '00000003.history' to the archive
2021-11-28 10:34:43.410 P00   INFO: archive-push command end: completed successfully (1206ms)
2021-11-28 10:34:43.423 P00   INFO: archive-push command begin 2.35: [pg_wal/00000004.history] --archive-timeout=300 --exec-id=273403-40d2f4da --log-level-console=info --log-path=/tmp --pg1-path=/pgdata/pg14 --repo2-gcs-bucket=postgres-eportal2 --repo2-gcs-key=<redacted> --repo1-host=eportaldb-repo-host-0.eportaldb-pods.eportal.svc.cluster.local. --repo1-host-user=postgres --repo1-path=/pgbackrest/repo1 --repo2-path=/pgbackrest/postgres-operator/eportaldb/repo2 --repo3-path=/pgbackrest/postgres-operator/eportaldb/repo3 --repo3-s3-bucket=postgres-eportal --repo3-s3-endpoint=fra1.digitaloceanspaces.com --repo3-s3-key=<redacted> --repo3-s3-key-secret=<redacted> --repo3-s3-region=fra1 --repo2-type=gcs --repo3-type=s3 --stanza=db
2021-11-28 10:34:44.105 P00   INFO: pushed WAL file '00000004.history' to the archive
2021-11-28 10:34:44.206 P00   INFO: archive-push command end: completed successfully (784ms)
2021-11-28 10:34:44.269 P00   INFO: archive-push command begin 2.35: [pg_wal/000000020000000000000066] --archive-timeout=300 --exec-id=273405-01f859e0 --log-level-console=info --log-path=/tmp --pg1-path=/pgdata/pg14 --repo2-gcs-bucket=postgres-eportal2 --repo2-gcs-key=<redacted> --repo1-host=eportaldb-repo-host-0.eportaldb-pods.eportal.svc.cluster.local. --repo1-host-user=postgres --repo1-path=/pgbackrest/repo1 --repo2-path=/pgbackrest/postgres-operator/eportaldb/repo2 --repo3-path=/pgbackrest/postgres-operator/eportaldb/repo3 --repo3-s3-bucket=postgres-eportal --repo3-s3-endpoint=fra1.digitaloceanspaces.com --repo3-s3-key=<redacted> --repo3-s3-key-secret=<redacted> --repo3-s3-region=fra1 --repo2-type=gcs --repo3-type=s3 --stanza=db
2021-11-28 10:34:45.540 P00   INFO: pushed WAL file '000000020000000000000066' to the archive
2021-11-28 10:34:45.641 P00   INFO: archive-push command end: completed successfully (1373ms)
2021-11-28 10:34:45.656 P00   INFO: archive-push command begin 2.35: [pg_wal/000000020000000000000068.partial] --archive-timeout=300 --exec-id=273410-0e86d3be --log-level-console=info --log-path=/tmp --pg1-path=/pgdata/pg14 --repo2-gcs-bucket=postgres-eportal2 --repo2-gcs-key=<redacted> --repo1-host=eportaldb-repo-host-0.eportaldb-pods.eportal.svc.cluster.local. --repo1-host-user=postgres --repo1-path=/pgbackrest/repo1 --repo2-path=/pgbackrest/postgres-operator/eportaldb/repo2 --repo3-path=/pgbackrest/postgres-operator/eportaldb/repo3 --repo3-s3-bucket=postgres-eportal --repo3-s3-endpoint=fra1.digitaloceanspaces.com --repo3-s3-key=<redacted> --repo3-s3-key-secret=<redacted> --repo3-s3-region=fra1 --repo2-type=gcs --repo3-type=s3 --stanza=db
2021-11-28 10:34:46.937 P00   INFO: pushed WAL file '000000020000000000000068.partial' to the archive
2021-11-28 10:34:47.037 P00   INFO: archive-push command end: completed successfully (1383ms)

So, three possible solutions:

  • Fix the requirement for a home folder (I don’t know if it is required by the backup process or not)
  • Change the home directyor for the user postgress to /tmp/home/postgres or similar
  • Create the /home/postgres/ folder in the Docker image and give it access to the user