postgres-operator: password authentication failed for user "postgres"
Hi,
I have been stuck on a DB connection problem for a while. I have noticed a few issues with the same symptom as mine, but none of their solutions works in my case. Can someone please give me some direction?
I have successfully installed the postgres-operator through Helm, created my cluster, and the cluster is running. However, I cannot connect to the DB even from within the Kubernetes cluster, so I opened a shell in the DB pod with kubectl exec -it pod/test-testdb-cluster-0 -- bash and connected with psql -U postgres. But I don't see the role and DB I specified in my YAML (testdbname and testdbowner):
List of roles
Role name | Attributes | Member of
------------+------------------------------------------------------------+-----------
admin | Create DB, Cannot login | {}
postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
robot_zmon | Cannot login | {}
standby | Replication | {}
zalandos | Create DB, Cannot login | {}
List of relations
Schema | Name | Type | Owner
--------+-------------------------+---------------+----------
public | failed_authentication_0 | view | postgres
public | failed_authentication_1 | view | postgres
public | failed_authentication_2 | view | postgres
public | failed_authentication_3 | view | postgres
public | failed_authentication_4 | view | postgres
public | failed_authentication_5 | view | postgres
public | failed_authentication_6 | view | postgres
public | failed_authentication_7 | view | postgres
public | pg_auth_mon | view | postgres
public | pg_stat_kcache | view | postgres
public | pg_stat_kcache_detail | view | postgres
public | pg_stat_statements | view | postgres
public | postgres_log | table | postgres
public | postgres_log_0 | foreign table | postgres
public | postgres_log_1 | foreign table | postgres
public | postgres_log_2 | foreign table | postgres
public | postgres_log_3 | foreign table | postgres
public | postgres_log_4 | foreign table | postgres
public | postgres_log_5 | foreign table | postgres
public | postgres_log_6 | foreign table | postgres
public | postgres_log_7 | foreign table | postgres
(21 rows)
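Note for readers hitting the same symptom: connecting as postgres via kubectl exec uses local trust authentication, so it bypasses the password check entirely. To test the credentials the operator actually generated, read them from the Kubernetes secret. A hedged sketch, assuming the operator's usual secret-naming convention (<username>.<clustername>.credentials.postgresql.acid.zalan.do) and the cluster/namespace names from this issue:

```shell
# Read the operator-generated password for the "postgres" user.
# Secret name follows the operator convention; adjust cluster and
# namespace if yours differ.
SECRET=postgres.test-testdb-cluster.credentials.postgresql.acid.zalan.do
PGPASSWORD=$(kubectl get secret "$SECRET" -n test \
  -o jsonpath='{.data.password}' | base64 -d)

# Attempt a real password-authenticated connection through the service:
kubectl run -n test psql-check --rm -it --image=postgres:12 --restart=Never -- \
  env PGPASSWORD="$PGPASSWORD" psql -h test-testdb-cluster -U postgres -c 'SELECT 1'
```

If this fails with "password authentication failed" while psql inside the pod works, the secret and pg_shadow have diverged.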
kubectl get all
NAME READY STATUS RESTARTS AGE
pod/nginx-584d4f8b45-jwczj 1/1 Running 0 5d13h
pod/test-testdb-cluster-0 1/1 Running 0 5d13h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/nginx LoadBalancer 172.17.154.178 192.168.1.247 80:30193/TCP 5d13h
service/test-testdb-cluster ClusterIP 172.16.104.168 <none> 5432/TCP 5d13h
service/test-testdb-cluster-config ClusterIP None <none> <none> 5d13h
service/test-testdb-cluster-repl ClusterIP 172.17.204.69 <none> 5432/TCP 5d13h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx 1/1 1 1 5d13h
NAME DESIRED CURRENT READY AGE
replicaset.apps/nginx-584d4f8b45 1 1 1 5d13h
NAME READY AGE
statefulset.apps/test-testdb-cluster 1/1 5d13h
Log from the cluster:
kubectl logs pod/test-testdb-cluster-0
2020-08-17 09:54:30,947 - bootstrapping - INFO - Figuring out my environment (Google? AWS? Openstack? Local?)
2020-08-17 09:54:32,951 - bootstrapping - INFO - Could not connect to 169.254.169.254, assuming local Docker setup
2020-08-17 09:54:32,952 - bootstrapping - INFO - No meta-data available for this provider
2020-08-17 09:54:32,952 - bootstrapping - INFO - Looks like your running local
2020-08-17 09:54:32,981 - bootstrapping - INFO - Configuring bootstrap
2020-08-17 09:54:32,981 - bootstrapping - INFO - Configuring crontab
2020-08-17 09:54:32,982 - bootstrapping - INFO - Skipping creation of renice cron job due to lack of SYS_NICE capability
2020-08-17 09:54:32,982 - bootstrapping - INFO - Configuring pgbouncer
2020-08-17 09:54:32,982 - bootstrapping - INFO - No PGBOUNCER_CONFIGURATION was specified, skipping
2020-08-17 09:54:32,982 - bootstrapping - INFO - Configuring wal-e
2020-08-17 09:54:32,982 - bootstrapping - INFO - Configuring pam-oauth2
2020-08-17 09:54:32,982 - bootstrapping - INFO - No PAM_OAUTH2 configuration was specified, skipping
2020-08-17 09:54:32,982 - bootstrapping - INFO - Configuring pgqd
2020-08-17 09:54:32,983 - bootstrapping - INFO - Configuring certificate
2020-08-17 09:54:32,983 - bootstrapping - INFO - Generating ssl certificate
2020-08-17 09:54:33,073 - bootstrapping - INFO - Configuring patroni
2020-08-17 09:54:33,081 - bootstrapping - INFO - Writing to file /home/postgres/postgres.yml
2020-08-17 09:54:33,081 - bootstrapping - INFO - Configuring log
2020-08-17 09:54:33,081 - bootstrapping - INFO - Configuring standby-cluster
2020-08-17 09:54:34,367 INFO: No PostgreSQL configuration items changed, nothing to reload.
2020-08-17 09:54:34,390 WARNING: Postgresql is not running.
2020-08-17 09:54:34,390 INFO: Lock owner: None; I am test-testdb-cluster-0
2020-08-17 09:54:34,394 INFO: pg_controldata:
pg_control version number: 1201
Catalog version number: 201909212
Database system identifier: 6861875284178866242
Database cluster state: in production
pg_control last modified: Mon Aug 17 09:45:25 2020
Latest checkpoint location: 0/30C2A38
Latest checkpoint's REDO location: 0/30C2A00
Latest checkpoint's REDO WAL file: 000000020000000000000003
Latest checkpoint's TimeLineID: 2
Latest checkpoint's PrevTimeLineID: 2
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:882
Latest checkpoint's NextOID: 32768
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 480
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 882
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid: 0
Latest checkpoint's newestCommitTsXid: 0
Time of latest checkpoint: Mon Aug 17 09:45:11 2020
Fake LSN counter for unlogged rels: 0/3E8
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: replica
wal_log_hints setting: on
max_connections setting: 100
max_worker_processes setting: 8
max_wal_senders setting: 10
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: off
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 1
Mock authentication nonce: 70a4642487623f0d3db3448c6cdccb40f243ae5656da8a85aa9f58c7992db5bc
2020-08-17 09:54:34,403 INFO: doing crash recovery in a single user mode
2020-08-17 09:54:37,807 WARNING: Postgresql is not running.
2020-08-17 09:54:37,807 INFO: Lock owner: None; I am test-testdb-cluster-0
2020-08-17 09:54:37,811 INFO: pg_controldata:
pg_control version number: 1201
Catalog version number: 201909212
Database system identifier: 6861875284178866242
Database cluster state: shut down
pg_control last modified: Mon Aug 17 09:54:37 2020
Latest checkpoint location: 0/4000028
Latest checkpoint's REDO location: 0/4000028
Latest checkpoint's REDO WAL file: 000000020000000000000004
Latest checkpoint's TimeLineID: 2
Latest checkpoint's PrevTimeLineID: 2
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:882
Latest checkpoint's NextOID: 32768
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 480
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 0
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid: 0
Latest checkpoint's newestCommitTsXid: 0
Time of latest checkpoint: Mon Aug 17 09:54:37 2020
Fake LSN counter for unlogged rels: 0/3E8
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: replica
wal_log_hints setting: on
max_connections setting: 100
max_worker_processes setting: 8
max_wal_senders setting: 10
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: off
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 1
Mock authentication nonce: 70a4642487623f0d3db3448c6cdccb40f243ae5656da8a85aa9f58c7992db5bc
2020-08-17 09:54:37,811 INFO: Lock owner: None; I am test-testdb-cluster-0
2020-08-17 09:54:37,823 INFO: Lock owner: None; I am test-testdb-cluster-0
2020-08-17 09:54:37,824 INFO: starting as a secondary
2020-08-17 09:54:38,050 INFO: postmaster pid=61
/var/run/postgresql:5432 - no response
2020-08-17 09:54:38 UTC [61]: [1-1] 5f3a53de.3d 0 LOG: Auto detecting pg_stat_kcache.linux_hz parameter...
2020-08-17 09:54:38 UTC [61]: [2-1] 5f3a53de.3d 0 LOG: pg_stat_kcache.linux_hz is set to 1000000
2020-08-17 09:54:38 UTC [61]: [3-1] 5f3a53de.3d 0 LOG: starting PostgreSQL 12.2 (Ubuntu 12.2-2.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0, 64-bit
2020-08-17 09:54:38 UTC [61]: [4-1] 5f3a53de.3d 0 LOG: listening on IPv4 address "0.0.0.0", port 5432
2020-08-17 09:54:38 UTC [61]: [5-1] 5f3a53de.3d 0 LOG: listening on IPv6 address "::", port 5432
2020-08-17 09:54:38 UTC [61]: [6-1] 5f3a53de.3d 0 LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2020-08-17 09:54:38 UTC [61]: [7-1] 5f3a53de.3d 0 LOG: redirecting log output to logging collector process
2020-08-17 09:54:38 UTC [61]: [8-1] 5f3a53de.3d 0 HINT: Future log output will appear in directory "../pg_log".
/var/run/postgresql:5432 - accepting connections
/var/run/postgresql:5432 - accepting connections
2020-08-17 09:54:39,091 INFO: establishing a new patroni connection to the postgres cluster
2020-08-17 09:54:39,142 WARNING: Could not activate Linux watchdog device: "Can't open watchdog device: [Errno 2] No such file or directory: '/dev/watchdog'"
2020-08-17 09:54:39,152 ERROR: Can not fetch local timeline and lsn from replication connection
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/patroni/postgresql/__init__.py", line 685, in get_local_timeline_lsn_from_replication_connection
with self.get_replication_connection_cursor(**self.config.local_replication_address) as cur:
File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
return next(self.gen)
File "/usr/local/lib/python3.6/dist-packages/patroni/postgresql/__init__.py", line 679, in get_replication_connection_cursor
with get_connection_cursor(**conn_kwargs) as cur:
File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
return next(self.gen)
File "/usr/local/lib/python3.6/dist-packages/patroni/postgresql/connection.py", line 43, in get_connection_cursor
with psycopg2.connect(**kwargs) as conn:
File "/usr/lib/python3/dist-packages/psycopg2/__init__.py", line 127, in connect
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: FATAL: password authentication failed for user "standby"
FATAL: no pg_hba.conf entry for replication connection from host "127.0.0.1", user "standby", SSL off
2020-08-17 09:54:39,164 INFO: promoted self to leader by acquiring session lock
server promoting
2020-08-17 09:54:39,169 INFO: cleared rewind state after becoming the leader
2020-08-17 09:54:40,179 INFO: Lock owner: test-testdb-cluster-0; I am test-testdb-cluster-0
2020-08-17 09:54:40,185 INFO: Lock owner: test-testdb-cluster-0; I am test-testdb-cluster-0
2020-08-17 09:54:40,247 INFO: no action. i am the leader with the lock
SET
DO
DO
DO
NOTICE: extension "pg_auth_mon" already exists, skipping
CREATE EXTENSION
NOTICE: version "1.0" of extension "pg_auth_mon" is already installed
ALTER EXTENSION
GRANT
NOTICE: extension "pg_cron" already exists, skipping
CREATE EXTENSION
NOTICE: version "1.2" of extension "pg_cron" is already installed
ALTER EXTENSION
ALTER POLICY
REVOKE
GRANT
GRANT
CREATE FUNCTION
REVOKE
GRANT
REVOKE
GRANT
REVOKE
GRANT
GRANT
NOTICE: extension "file_fdw" already exists, skipping
CREATE EXTENSION
DO
NOTICE: relation "postgres_log" already exists, skipping
CREATE TABLE
GRANT
NOTICE: relation "postgres_log_0" already exists, skipping
CREATE FOREIGN TABLE
GRANT
CREATE VIEW
ALTER VIEW
GRANT
NOTICE: relation "postgres_log_1" already exists, skipping
CREATE FOREIGN TABLE
GRANT
CREATE VIEW
ALTER VIEW
GRANT
NOTICE: relation "postgres_log_2" already exists, skipping
CREATE FOREIGN TABLE
GRANT
CREATE VIEW
ALTER VIEW
GRANT
NOTICE: relation "postgres_log_3" already exists, skipping
CREATE FOREIGN TABLE
GRANT
CREATE VIEW
ALTER VIEW
GRANT
NOTICE: relation "postgres_log_4" already exists, skipping
CREATE FOREIGN TABLE
GRANT
CREATE VIEW
ALTER VIEW
GRANT
NOTICE: relation "postgres_log_5" already exists, skipping
CREATE FOREIGN TABLE
GRANT
CREATE VIEW
ALTER VIEW
GRANT
NOTICE: relation "postgres_log_6" already exists, skipping
CREATE FOREIGN TABLE
GRANT
CREATE VIEW
ALTER VIEW
GRANT
NOTICE: relation "postgres_log_7" already exists, skipping
CREATE FOREIGN TABLE
GRANT
CREATE VIEW
ALTER VIEW
GRANT
RESET
SET
NOTICE: drop cascades to 3 other objects
DETAIL: drop cascades to type zmon_utils.system_information
drop cascades to function zmon_utils.get_database_cluster_information()
drop cascades to function zmon_utils.get_database_cluster_system_information()
DROP SCHEMA
NOTICE: extension "plpython3u" already exists, skipping
DO
NOTICE: language "plpythonu" does not exist, skipping
DROP LANGUAGE
NOTICE: function plpython_call_handler() does not exist, skipping
DROP FUNCTION
NOTICE: function plpython_inline_handler(internal) does not exist, skipping
DROP FUNCTION
NOTICE: function plpython_validator(oid) does not exist, skipping
DROP FUNCTION
CREATE SCHEMA
GRANT
SET
CREATE TYPE
CREATE FUNCTION
CREATE FUNCTION
GRANT
You are now connected to database "postgres" as user "postgres".
NOTICE: schema "user_management" already exists, skipping
CREATE SCHEMA
GRANT
SET
CREATE FUNCTION
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
GRANT
RESET
NOTICE: extension "pg_stat_statements" already exists, skipping
CREATE EXTENSION
NOTICE: extension "pg_stat_kcache" already exists, skipping
CREATE EXTENSION
NOTICE: extension "set_user" already exists, skipping
CREATE EXTENSION
NOTICE: version "1.6" of extension "set_user" is already installed
ALTER EXTENSION
GRANT
GRANT
NOTICE: schema "metric_helpers" already exists, skipping
CREATE SCHEMA
GRANT
GRANT
SET
CREATE FUNCTION
REVOKE
GRANT
GRANT
CREATE VIEW
REVOKE
GRANT
GRANT
CREATE FUNCTION
REVOKE
GRANT
GRANT
CREATE VIEW
REVOKE
GRANT
GRANT
CREATE FUNCTION
REVOKE
GRANT
GRANT
CREATE VIEW
REVOKE
GRANT
GRANT
RESET
You are now connected to database "template1" as user "postgres".
NOTICE: schema "user_management" already exists, skipping
CREATE SCHEMA
GRANT
SET
CREATE FUNCTION
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
CREATE FUNCTION
REVOKE
GRANT
COMMENT
GRANT
RESET
NOTICE: extension "pg_stat_statements" already exists, skipping
CREATE EXTENSION
NOTICE: extension "pg_stat_kcache" already exists, skipping
CREATE EXTENSION
NOTICE: extension "set_user" already exists, skipping
CREATE EXTENSION
NOTICE: version "1.6" of extension "set_user" is already installed
ALTER EXTENSION
GRANT
GRANT
NOTICE: schema "metric_helpers" already exists, skipping
CREATE SCHEMA
GRANT
GRANT
SET
CREATE FUNCTION
REVOKE
GRANT
GRANT
CREATE VIEW
REVOKE
GRANT
GRANT
CREATE FUNCTION
REVOKE
GRANT
GRANT
CREATE VIEW
REVOKE
GRANT
GRANT
CREATE FUNCTION
REVOKE
GRANT
GRANT
CREATE VIEW
REVOKE
GRANT
GRANT
RESET
2020-08-17 09:54:50,179 INFO: Lock owner: test-testdb-cluster-0; I am test-testdb-cluster-0
2020-08-17 09:54:50,234 INFO: no action. i am the leader with the lock
2020-08-17 09:55:00,179 INFO: Lock owner: test-testdb-cluster-0; I am test-testdb-cluster-0
2020-08-17 09:55:00,233 INFO: no action. i am the leader with the lock
2020-08-17 09:55:10,179 INFO: Lock owner: test-testdb-cluster-0; I am test-testdb-cluster-0
2020-08-17 09:55:10,192 INFO: no action. i am the leader with the lock
2020-08-17 09:55:20,179 INFO: Lock owner: test-testdb-cluster-0; I am test-testdb-cluster-0
2020-08-17 09:55:20,184 INFO: no action. i am the leader with the lock
2020-08-17 09:55:30,179 INFO: Lock owner: test-testdb-cluster-0; I am test-testdb-cluster-0
2020-08-17 09:55:30,226 INFO: no action. i am the leader with the lock
My yaml:
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: test-testdb-cluster
  namespace: test
spec:
  teamId: "test"
  volume:
    size: 1Gi
  numberOfInstances: 1
  enableMasterLoadBalancer: false
  enableReplicaLoadBalancer: false
  users:
    # database owner
    testdbowner:
    - superuser
    - createdb
    testdb2owner:
    - createdb
    # role for application foo
    # foo_user: # or 'foo_user: []'
  # databases: name->owner
  databases:
    testdbname: testdbowner
    testdb2name: testdb2owner
  preparedDatabases:
    testdbname:
      extensions:
        postgis: data
  postgresql:
    version: "12"
Helm 3 installation command:
helm install postgres --namespace postgres charts/postgres-operator --set configKubernetes.cluster_domain=MY_DOMAIN --set configLoadBalancer.db_hosted_zone=db.MY_DOMAIN -f ./charts/postgres-operator/values-crd.yaml
I have tried 1.4.0, 1.5.0, and the master branch/tag. I have also tried installing via Helm without values-crd.yaml, but none of these works. Any ideas?
Thanks a lot.
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 2
- Comments: 16 (3 by maintainers)
Ran into this issue as well. Recreated the postgresql cluster with an existing volume. The new cluster started up, but the postgres-operator is failing with the following error log: pq: password authentication failed for user "postgres"
Recovery procedure
Any updates on the issue not requiring manual intervention?
Well, we have the same problem, and I have probably figured out why, at least for our case.
We use Longhorn for storage provisioning and want to keep the data whatever it takes and reuse volumes, so we can shut down the whole cluster and set it up again. The users get created the first time in our database, and the operator provides a secret, postgres.project-postgres.credentials.postgresql.acid.zalan.do, whose password matches the entry in pg_shadow. When we restart the whole cluster/namespace, the "old" password of postgres (and of course of standby and all other roles) still persists on the volume, but a new password is generated and provided as a new secret. What is missing here, I think, is that init should alter the user's password when the user already exists. Does this make sense? https://github.com/zalando/postgres-operator/blob/692c721854e4667102778e8e69e3dd12e47984b5/pkg/util/users/users.go#L51 should handle this (it should also be called with EventSync at cluster startup).
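If that diagnosis is right, one manual workaround is to reset the in-database passwords to match the newly generated secrets. A hedged sketch, not an official procedure; role, cluster, and namespace names assume the cluster from this issue, and it runs against the current master pod:

```shell
# For each role whose secret no longer matches pg_shadow, read the
# secret's password and apply it inside the database. Names are
# illustrative; adjust cluster/namespace to your setup.
for ROLE in postgres standby; do
  SECRET="$ROLE.test-testdb-cluster.credentials.postgresql.acid.zalan.do"
  PW=$(kubectl get secret "$SECRET" -n test \
    -o jsonpath='{.data.password}' | base64 -d)
  kubectl exec -n test test-testdb-cluster-0 -- \
    psql -U postgres -c "ALTER ROLE $ROLE WITH PASSWORD '$PW';"
done
```

After this, the secrets and pg_shadow agree again, so clients using the secrets can authenticate.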
Can you give me some advice, @Jan-M?
Or, the other way around:
env | grep PASS
echo -n "grepped-password" | base64
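Completing that thought: the reverse direction is to keep the old in-database password and overwrite the secret with it. A hedged sketch, assuming the cluster from this issue; the exact environment variable name inside the pod may differ, which is what the env | grep PASS step above is for:

```shell
# Inside the database pod, grep the current superuser password from
# the environment (variable name is an assumption; verify with
# `env | grep PASS` first):
OLD_PW=$(kubectl exec -n test test-testdb-cluster-0 -- \
  sh -c 'env' | grep '^PGPASSWORD_SUPERUSER=' | cut -d= -f2)

# Base64-encode it and patch it into the operator's secret:
ENCODED=$(printf '%s' "$OLD_PW" | base64)
kubectl patch secret -n test \
  postgres.test-testdb-cluster.credentials.postgresql.acid.zalan.do \
  -p "{\"data\":{\"password\":\"$ENCODED\"}}"
```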