openshift-ansible: Openshift ansible failed at openshift_node : Wait for master API on centos 7

Description

5 CentOS 7 VMs:

  • 3 masters
  • 5 nodes (the 3 masters, unschedulable, plus 2 infra nodes)
  • 1 DNS server resolving thunder.io

Using openshift-ansible v3.7.0.

This is the inventory:

[OSEv3:children]
masters
nodes
etcd
glusterfs

[masters]
172.16.1.31 openshift_ip=172.16.1.31
172.16.1.32 openshift_ip=172.16.1.32
172.16.1.33 openshift_ip=172.16.1.33

[etcd]
172.16.1.31 openshift_ip=172.16.1.31
172.16.1.32 openshift_ip=172.16.1.32
172.16.1.33 openshift_ip=172.16.1.33

[nodes]
172.16.1.31 openshift_ip=172.16.1.31 openshift_schedulable=false
172.16.1.32 openshift_ip=172.16.1.32 openshift_schedulable=false
172.16.1.33 openshift_ip=172.16.1.33 openshift_schedulable=false
172.16.1.34 openshift_ip=172.16.1.34 openshift_node_labels="{'region': 'infra', 'zone': 'default'}"
172.16.1.35 openshift_ip=172.16.1.35 openshift_node_labels="{'region': 'infra', 'zone': 'default'}"

[glusterfs]
172.16.1.31 glusterfs_ip=172.16.1.31 glusterfs_devices='[ "/dev/gfssda", "/dev/gfssdb" ]'
172.16.1.32 glusterfs_ip=172.16.1.32 glusterfs_devices='[ "/dev/gfssda", "/dev/gfssdb" ]'
172.16.1.33 glusterfs_ip=172.16.1.33 glusterfs_devices='[ "/dev/gfssda", "/dev/gfssdb" ]'

[OSEv3:vars]
ansible_ssh_user=root
enable_excluders=False
enable_docker_excluder=False
ansible_service_broker_install=False

containerized=True
os_sdn_network_plugin_name='redhat/openshift-ovs-multitenant'
openshift_disable_check=disk_availability,docker_storage,memory_availability,docker_image_availability

openshift_node_kubelet_args={'pods-per-core': ['10']}

deployment_type=origin
openshift_deployment_type=origin

openshift_master_cluster_method=native
openshift_master_cluster_hostname=console.openshift.io
openshift_master_cluster_public_hostname=console.openshift.io

openshift_release=v3.7.0
openshift_pkg_version=v3.7.0
openshift_image_tag=v3.7.0
openshift_service_catalog_image_version=v3.7.0
template_service_broker_image_version=v3.7.0
openshift_metrics_image_version=v3.7.0
#openshift_logging_image_version=v3.7.0

osm_use_cockpit=true
openshift_metrics_install_metrics=True
openshift_logging_install_logging=True

openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/master/htpasswd'}]

openshift_public_hostname=console.openshift.io
openshift_master_default_subdomain=apps.openshift.io
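Not part of the original report, but relevant to the failure below: the "Wait for master API" task can only pass if openshift_master_cluster_hostname (console.openshift.io in this inventory) resolves from every node. A hedged pre-flight sketch, assuming the hostname above (the helper name is made up, not an openshift-ansible command):

```shell
#!/bin/sh
# Hypothetical pre-flight check: verify the master API hostname resolves
# from this node. console.openshift.io comes from the inventory above.
check_master_dns() {
    host="$1"
    if getent hosts "$host" >/dev/null 2>&1; then
        echo "DNS OK for $host"
    else
        echo "DNS lookup FAILED for $host"
        return 1
    fi
}

# On a cluster node you would run:
#   check_master_dns console.openshift.io
check_master_dns localhost
```

If this fails on a node, the installer's curl-based wait loop will fail there too, regardless of certificates.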
Version

ansible --version
ansible 2.4.2.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Aug  4 2017, 00:39:18) [GCC 4.8.5 20150623 (Red Hat 4.8.5-16)]
Steps To Reproduce

  1. Minimal install of CentOS 7.
  2. Run ansible-playbook -i ./inventory-glusterfs.ini openshift-ansible/playbooks/byo/config.yml
  3. Wait for TASK [openshift_node : Wait for master API to become available before proceeding] ****
Expected Results

OpenShift installs correctly.

Observed Results

FAILED - RETRYING: Wait for master API to become available before proceeding (1 retries left).
fatal: [172.16.1.34]: FAILED! => {"attempts": 120, "changed": false, "cmd": ["curl", "--silent", "--t": "0:00:00.099533", "end": "2018-01-24 17:10:02.637832", "msg": "non-zero return code", "rc": 60, "s
fatal: [172.16.1.35]: FAILED! => {"attempts": 120, "changed": false, "cmd": ["curl", "--silent", "--t": "0:00:00.099456", "end": "2018-01-24 17:10:30.403782", "msg": "non-zero return code", "rc": 60, "s
        to retry, use: --limit @/root/openshift-ansible/playbooks/byo/config.retry

PLAY RECAP ******************************************************************************************
172.16.1.31                : ok=442  changed=37   unreachable=0    failed=0
172.16.1.32                : ok=373  changed=29   unreachable=0    failed=0
172.16.1.33                : ok=373  changed=29   unreachable=0    failed=0
172.16.1.34                : ok=146  changed=5    unreachable=0    failed=1
172.16.1.35                : ok=145  changed=5    unreachable=0    failed=1
localhost                  : ok=14   changed=0    unreachable=0    failed=0

INSTALLER STATUS ************************************************************************************
Initialization             : Complete
Health Check               : Complete
etcd Install               : Complete
Master Install             : Complete
Master Additional Install  : Complete
Node Install               : In Progress
        This phase can be restarted by running: playbooks/byo/openshift-node/config.yml

Failure summary:

  1. Hosts:    172.16.1.34, 172.16.1.35
     Play:     Configure nodes
     Task:     Wait for master API to become available before proceeding
     Message:  non-zero return code
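The truncated fatal lines above end with rc 60. curl documents its exit codes in `man curl`; a tiny illustrative helper (not part of openshift-ansible) maps the two codes that appear in this issue:

```shell
#!/bin/sh
# Illustrative helper mapping the curl exit codes seen in this issue to
# their documented meanings (see the EXIT CODES section of `man curl`).
curl_exit_meaning() {
    case "$1" in
        60) echo "peer certificate cannot be authenticated with known CA certificates" ;;
        77) echo "problem reading the SSL CA cert (path? access rights?)" ;;
        *)  echo "see 'man curl' for exit code $1" ;;
    esac
}

curl_exit_meaning 60
```

So the installer's wait loop is reaching the API endpoint but cannot verify its certificate, which points at the CA bundle rather than at the API itself.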

For long output or logs, consider using a gist

Additional Information

If I run this from my failed nodes:

curl --tlsv1.2 --cacert origin/master/ca-bundle.crt https://console.thunder.io:8443/healthz/ready

I get this error:

curl: (77) Problem with the SSL CA cert (path? access rights?)

Here is a listing of /etc/origin/master/ on the master (172.16.1.31):

ls /etc/origin/master/
admin.crt                          ca.crt                frontproxy-ca.crt         master.etcd-ca.crt         master.proxy-client.key               openshift-master.crt         service-signer.crt
admin.key                          ca.key                front-proxy-ca.key        master.etcd-client.crt     master.server.crt                     openshift-master.key         service-signer.key
admin.kubeconfig                   ca.serial.txt         frontproxy-ca.key         master.etcd-client.csr     master.server.key                     openshift-master.kubeconfig  session-secrets.yaml
aggregator-front-proxy.crt         client-ca-bundle.crt  frontproxy-ca.serial.txt  master.etcd-client.key     named_certificates                    policy.json
aggregator-front-proxy.key         etcd.server.crt       htpasswd                  master.kubelet-client.crt  openshift-aggregator.crt              scheduler.json
aggregator-front-proxy.kubeconfig  etcd.server.key       master-config.yaml        master.kubelet-client.key  openshift-aggregator.key              serviceaccounts.private.key
ca-bundle.crt                      front-proxy-ca.crt    master-config.yaml.orig   master.proxy-client.crt    openshift-ansible-catalog-console.js  serviceaccounts.public.key
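curl exit 77 usually means the CA file could not be read at all, which fits the relative path (origin/master/ca-bundle.crt) used in the command above. A hedged way to confirm the bundle is readable and parses as a certificate (the function name is made up; the path assumes the master layout shown above):

```shell
#!/bin/sh
# Illustrative check: is the CA bundle readable, and does it parse?
inspect_ca() {
    f="$1"
    if [ ! -r "$f" ]; then
        echo "not readable: $f"
        return 77   # the same code curl reports for an unreadable CA file
    fi
    openssl x509 -in "$f" -noout -subject -enddate
}

# Usage on a node (absolute path; run with sudo if needed):
#   inspect_ca /etc/origin/master/ca-bundle.crt
```

If the "not readable" branch fires, fixing the path (or permissions) should turn exit 77 into either success or a more informative verification error.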

Can you help me? Thanks a lot!

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Comments: 41

Most upvoted comments

Hi @maximegy, sorry for the late response. The path to the crt file is usually /etc/origin/master/ca-bundle.crt, so run: sudo curl --verbose --tlsv1.2 --cacert /etc/origin/master/ca-bundle.crt https://console.thunder.io:8443/healthz/ready

Run curl --tlsv1.2 --cacert origin/master/ca-bundle.crt https://console.thunder.io:8443/healthz/ready with sudo and share the output.