openshift-ansible: x509: certificate signed by unknown authority

Hi Team,

Trying to deploy OpenShift with 3 masters, 2 nodes, 3 etcd hosts, and 1 load balancer.

The Ansible inventory looks like this:

[OSEv3:children]
masters
nodes
etcd
lb

[OSEv3:vars]
ansible_ssh_user=root
deployment_type=origin
openshift_master_cluster_method=native
openshift_master_cluster_hostname=oc-master.domain.com
openshift_master_cluster_public_hostname=oc-master.domain.com
openshift_master_default_subdomain=apps.oc-master.domain.com

openshift_hosted_registry_storage_kind=nfs
openshift_hosted_registry_storage_access_modes=['ReadWriteMany']
openshift_hosted_registry_storage_host=isilon01.domain.com
openshift_hosted_registry_storage_nfs_directory=/ifs/data/production/openshift
openshift_hosted_registry_storage_volume_name=registry
openshift_hosted_registry_storage_volume_size=20Gi

openshift_hosted_metrics_deploy=true
openshift_hosted_metrics_storage_kind=nfs
openshift_hosted_metrics_storage_access_modes=['ReadWriteOnce']
openshift_hosted_metrics_storage_host=isilon01.domain.com
openshift_hosted_metrics_storage_nfs_directory=/ifs/data/production/openshift
openshift_hosted_metrics_storage_volume_name=metrics

openshift_hosted_logging_deploy=true
openshift_hosted_logging_storage_kind=nfs
openshift_hosted_logging_storage_access_modes=['ReadWriteOnce']
openshift_hosted_logging_storage_host=isilon01.domain.com
openshift_hosted_logging_storage_nfs_directory=/ifs/data/production/openshift
openshift_hosted_logging_storage_volume_name=logging

openshift_master_api_port=8443
openshift_master_console_port=8443



openshift_node_iptables_sync_period=5s


logrotate_scripts=[{"name": "syslog", "path": "/var/log/cron\n/var/log/maillog\n/var/log/messages\n/var/log/secure\n/var/log/spooler\n", "options": ["daily", "rotate 7", "compress", "sharedscripts", "missingok"], "scripts": {"postrotate": "/bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true"}}]

openshift_clock_enabled=true

[masters]
oc-master[1:3].domain.com

[etcd]
oc-etcd[1:3].domain.com

[lb]
oc-master.domain.com #containerized=false

[nodes]
oc-master[1:3].domain.com
oc-node[1:2].domain.com ##openshift_node_labels="{'region': 'primary', 'zone': 'default'}"
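
As a sanity check, the host ranges above can be expanded and listed without connecting to any machine, so a hostname typo (for example a truncated domain) shows up immediately. A minimal sketch, assuming the inventory is saved to a file and passed with -i:

# Expand host patterns/ranges from the inventory without contacting hosts
ansible all -i <inventory> --list-hosts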


ansible --version
ansible 2.2.1.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = Default w/o overrides

Error while running ansible:

TASK [openshift_examples : Import Centos Image streams] ************************
fatal: [oc-master1.domain.com]: FAILED! => {"changed": false, "cmd": ["oc", "create", "-n", "openshift", "-f", "/usr/share/openshift/examples/image-streams/image-streams-centos7.json"], "delta": "0:00:00.357048", "end": "2017-03-28 12:55:13.441164", "failed": true, "failed_when_result": true, "rc": 1, "start": "2017-03-28 12:55:13.084116", "stderr": "Error from server: Get https://oc-master.domain.com:8443/api/v1/namespaces/openshift/resourcequotas: x509: certificate signed by unknown authority\nError from server: Get https://oc-master.domain.com:8443/api/v1/namespaces/openshift/resourcequotas: x509: certificate signed by unknown authority\nError from server: Get https://oc-master.domain.com:8443/api/v1/namespaces/openshift/resourcequotas: x509: certificate signed by unknown authority\nError from server: 
...
...
x509: certificate signed by unknown authority", "stdout": "", "stdout_lines": [], "warnings": []}
	to retry, use: --limit @/root/openshift-ansible/playbooks/byo/config.retry

PLAY RECAP *********************************************************************
localhost                  : ok=10   changed=0    unreachable=0    failed=0   
oc-etcd1.domain.com      : ok=90   changed=1    unreachable=0    failed=0   
oc-etcd2.domain.com      : ok=82   changed=1    unreachable=0    failed=0   
oc-etcd3.domain.com      : ok=82   changed=1    unreachable=0    failed=0   
oc-master.domain.com     : ok=70   changed=0    unreachable=0    failed=0   
oc-master1.domain.com    : ok=299  changed=13   unreachable=0    failed=1   
oc-master2.domain.com    : ok=245  changed=9    unreachable=0    failed=0   
oc-master3.domain.com    : ok=245  changed=9    unreachable=0    failed=0   
oc-node1.domain.com      : ok=112  changed=2    unreachable=0    failed=1   
oc-node2.domain.com      : ok=112  changed=2    unreachable=0    failed=1

Operating system and version ($ cat /etc/redhat-release): RHEL 7.3, running on VMware.
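
For anyone hitting the same error, a quick way to compare the CA the client trusts against the certificate actually served on the API port (a sketch; the hostname and CA path match the inventory and the debugging later in this thread):

# Issuer of the CA bundle on the master:
openssl x509 -noout -issuer -in /etc/origin/master/ca.crt
# Issuer of the certificate presented on the API endpoint:
echo | openssl s_client -connect oc-master.domain.com:8443 2>/dev/null | openssl x509 -noout -issuer

If the two issuers differ, the client is validating against a CA that did not sign the serving certificate, which produces exactly the "x509: certificate signed by unknown authority" error above.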

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 3
  • Comments: 34 (9 by maintainers)

Most upvoted comments

Some of the best OpenShift SSL debugging information I’ve seen yet. Thanks for posting, @abutcher; this helped me debug issues where internal certificates had expired and the renew playbook failed.
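
For the expired-certificate case mentioned above, a minimal expiry check (a sketch, assuming the default origin master serving-certificate path):

# Print the expiry date of the master's serving certificate:
openssl x509 -noout -enddate -in /etc/origin/master/master.server.crt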

Hey @rahul334481, based on the CNs of those CA certificates it looks like there is a different CA certificate on each master. If this is the case then each master’s serving certificate may have been signed by each individual master’s CA certificate (rather than all certificates being signed by a single, common CA certificate), meaning that none of the masters can talk to one another. I’m really curious how this could have occurred.

The CA certificate on each master is identical when I configure an HA cluster using the master branch.

[root@master1 ~]# openssl x509 -noout -issuer -in /etc/origin/master/ca.crt
issuer= /CN=openshift-signer@1490734436
[root@master1 ~]# md5sum /etc/origin/master/ca.crt 
b9692b2d6f948547d70e7913c1e166aa  /etc/origin/master/ca.crt

[root@master2 ~]# openssl x509 -noout -issuer -in /etc/origin/master/ca.crt
issuer= /CN=openshift-signer@1490734436
[root@master2 ~]# md5sum /etc/origin/master/ca.crt
b9692b2d6f948547d70e7913c1e166aa  /etc/origin/master/ca.crt

[root@master3 ~]# openssl x509 -noout -issuer -in /etc/origin/master/ca.crt
issuer= /CN=openshift-signer@1490734436
[root@master3 ~]# md5sum /etc/origin/master/ca.crt 
b9692b2d6f948547d70e7913c1e166aa  /etc/origin/master/ca.crt
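
The same comparison can be run across every master in one shot with an ad-hoc command against the [masters] group from the inventory (a sketch; on a healthy cluster all checksums should be identical):

ansible masters -i <inventory> -m command -a "md5sum /etc/origin/master/ca.crt"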

Are you using the master branch of openshift-ansible? If not, which version or branch?

What procedure was used to install this cluster? I’m curious whether a single run of playbooks/byo/config.yml against this inventory resulted in this state, or whether the cluster was configured in pieces.

If this is a new cluster, I would recommend running the uninstall playbook to clean up the mismatched certificates and configuration, and then reinstalling:

ansible-playbook -i <inventory> playbooks/adhoc/uninstall.yml
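
After the uninstall completes, reinstalling is a single run of the config playbook mentioned above:

ansible-playbook -i <inventory> playbooks/byo/config.yml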

I’m on freenode IRC as abutcher if you want to reach out there.