postgres-operator: Permission issue with tls cert after upgrading to 1.5
After upgrading from 1.4 to 1.5, my cluster couldn't initialize.
I've checked that the certs are mounted into the container, and I'm able to read them as the root user. Not sure where to look next.
Logs from cluster pods:
...
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Etc/UTC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
Success. You can now start the database server using:
/usr/lib/postgresql/12/bin/pg_ctl -D /home/postgres/pgdata/pgroot/data -l logfile start
2020-05-20 14:40:25 UTC [301]: [1-1] 5ec54159.12d 0 FATAL: could not load server certificate file "/tls/tls.crt": Permission denied
2020-05-20 14:40:25 UTC [301]: [2-1] 5ec54159.12d 0 LOG: database system is shut down
2020-05-20 14:40:25,108 INFO: postmaster pid=301
/var/run/postgresql:5432 - no response
2020-05-20 14:40:25,121 INFO: removing initialize key after failed attempt to bootstrap the cluster
2020-05-20 14:40:25,137 INFO: renaming data directory to /home/postgres/pgdata/pgroot/data_2020-05-20-14-40-25
2020-05-20 14:40:25,587 INFO: Lock owner: None; I am grafana-cluster-0
Traceback (most recent call last):
File "/usr/local/bin/patroni", line 11, in <module>
load_entry_point('patroni==1.6.5', 'console_scripts', 'patroni')()
File "/usr/local/lib/python3.6/dist-packages/patroni/__init__.py", line 235, in main
return patroni_main()
File "/usr/local/lib/python3.6/dist-packages/patroni/__init__.py", line 199, in patroni_main
patroni.run()
File "/usr/local/lib/python3.6/dist-packages/patroni/__init__.py", line 135, in run
logger.info(self.ha.run_cycle())
File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1370, in run_cycle
info = self._run_cycle()
File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1277, in _run_cycle
return self.post_bootstrap()
File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1173, in post_bootstrap
self.cancel_initialization()
File "/usr/local/lib/python3.6/dist-packages/patroni/ha.py", line 1168, in cancel_initialization
raise PatroniException('Failed to bootstrap cluster')
patroni.exceptions.PatroniException: 'Failed to bootstrap cluster'
/run/service/patroni: finished with code=1 signal=0
/run/service/patroni: exceeded maximum number of restarts 5
stopping /run/service/patroni
timeout: finish: .: (pid 303) 10s, want down
Permissions in container:
root@grafana-cluster-0:/home/postgres# ls -la /tls
total 4
drwxrwxrwt 3 root root 120 May 20 15:16 .
drwxr-xr-x 1 root root 4096 May 20 15:17 ..
drwxr-xr-x 2 root root 80 May 20 15:16 ..2020_05_20_15_16_36.928412376
lrwxrwxrwx 1 root root 31 May 20 15:16 ..data -> ..2020_05_20_15_16_36.928412376
lrwxrwxrwx 1 root root 14 May 20 15:16 tls.crt -> ..data/tls.crt
lrwxrwxrwx 1 root root 14 May 20 15:16 tls.key -> ..data/tls.key
About this issue
- State: open
- Created 4 years ago
- Comments: 23 (6 by maintainers)
Configure the postgresql resource to use the cert secret, add spec.spiloFSGroup: 103 to the postgresql resource, and magically things work!
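For anyone landing here, a minimal sketch of what such a postgresql resource might look like. The cluster name is taken from the pod name in the logs above; the team ID, volume size, instance count, and the secret name pg-tls are placeholders that are not from this thread. The value 103 is the one reported in the comment above and is assumed to match the postgres group inside the Spilo image, so verify it against your own image before relying on it.

```yaml
# Sketch only: teamId, sizes, and the secret name are hypothetical placeholders.
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: grafana-cluster
spec:
  teamId: "grafana"
  numberOfInstances: 1
  volume:
    size: 1Gi
  postgresql:
    version: "12"
  # Group ID applied to the pod's mounted volumes, so that a non-root
  # postgres process can read the files under /tls (value from the comment above).
  spiloFSGroup: 103
  tls:
    # Name of an existing kubernetes.io/tls secret in the same namespace (assumed name).
    secretName: "pg-tls"
```

Note the last comment in this thread: changing the FS group on an existing cluster reportedly does not trigger a rolling update, so the pods may need to be restarted by hand for the change to take effect.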
This took a lot of running around.
I think the documentation could be expanded in this area. https://postgres-operator.readthedocs.io/en/latest/user/#custom-tls-certificates is a start, but if applications are going to be fussy because of forced TLS with an invalid certificate, there should be more setup instructions to facilitate good communication between clients and the new database.
This is happening for me even on a fresh installation of 1.5.0.
Regarding toggling securityContext (containing the FS group) not triggering a rolling update: that sounds like a bug to be resolved.