openshift-ansible: FAILED - RETRYING: Wait for control plane pods to appear
Description
I’m trying a fresh installation from the MASTER branch with v3.10.0; it fails while installing the master with:
Control plane install failed.

Version
Ansible: ansible 2.6.2
openshift_release=v3.10.0
openshift_image_tag=v3.10.0
openshift_pkg_version=-3.10.0-1.el7.git.0.0c4577e
RPM: package openshift-ansible is not installed

Steps To Reproduce
Follow all prerequisites, then:
git clone https://github.com/openshift/openshift-ansible
cd openshift-ansible
ansible-playbook playbooks/prerequisites.yml
ansible-playbook playbooks/deploy_cluster.yml
TASK [openshift_control_plane : Wait for control plane pods to appear] ************************************************************************************************************************************************************
Tuesday 14 August 2018 16:39:24 +0800 (0:00:00.086) 0:22:42.301 ********
FAILED - RETRYING: Wait for control plane pods to appear (60 retries left).
FAILED - RETRYING: Wait for control plane pods to appear (59 retries left).
...............
FAILED - RETRYING: Wait for control plane pods to appear (1 retries left).
failed: [10.10.244.212] (item=__omit_place_holder__5e245b7f796113e2f9ba55e6c4a882ef0471a251) => {"attempts": 60, "changed": false, "item": "__omit_place_holder__5e245b7f796113e2f9ba55e6c4a882ef0471a251", "msg": {"cmd": "/bin/oc get pod master-__omit_place_holder__5e245b7f796113e2f9ba55e6c4a882ef0471a251-10.10.244.212 -o json -n kube-system", "results": [{}], "returncode": 1, "stderr": "The connection to the server 10.10.244.212:8443 was refused - did you specify the right host or port?\n", "stdout": ""}}
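The "connection refused" message above means nothing was listening on port 8443 yet. A minimal triage sketch on the master host (assuming docker as the container runtime, as in this report):

```shell
# Check whether anything is listening on the API port the playbook polls.
ss -tlnp | grep 8443 || echo "no listener on 8443"

# Check whether the API container was created at all (the name filter is a
# guess; adjust to whatever `docker ps -a` actually shows on your host).
docker ps -a | grep -i api || echo "no api container found"
```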
journalctl -flu docker.service
Aug 14 16:46:15 10-10-244-212 dockerd-current[26428]: F0814 08:46:15.849128 1 start_api.go:68] could not load config file "/etc/origin/master/master-config.yaml" due to an error: error reading config: only encoded map or array can be decoded into a struct
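The start_api.go error above typically means master-config.yaml is empty or is not a YAML mapping, so the API server cannot decode it into a struct. A quick sanity check (a sketch; the path comes from the journal line above):

```shell
# If the file is zero bytes or missing, the "only encoded map or array can be
# decoded into a struct" error is expected; otherwise inspect the top of it.
CONFIG=/etc/origin/master/master-config.yaml
if [ ! -s "$CONFIG" ]; then
  echo "config is missing or empty"
else
  head -n 5 "$CONFIG"   # a valid config starts with top-level YAML keys
fi
```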
Aug 14 16:46:15 10-10-244-212 dockerd-current[26428]: time="2018-08-14T16:46:15.911550511+08:00" level=error msg="containerd: deleting container" error="exit status 1: \"container 30483e504b05f46127fb81b73dab375fb5429096535b0611a07bcdae7505a25c does not exist\\none or more of the container deletions failed\\n\""
Aug 14 16:46:15 10-10-244-212 dockerd-current[26428]: time="2018-08-14T16:46:15.924794132+08:00" level=warning msg="30483e504b05f46127fb81b73dab375fb5429096535b0611a07bcdae7505a25c cleanup: failed to unmount secrets: invalid argument"
Aug 14 16:46:18 10-10-244-212 dockerd-current[26428]: time="2018-08-14T16:46:18.576959765+08:00" level=warning msg="Unknown healthcheck type 'NONE' (expected 'CMD') in container 75cfd311e6f33a696b4935b380294e3f6158723a9352357f8aaff3b9da14d31f"
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Reactions: 7
- Comments: 22 (7 by maintainers)
Folks, ‘Wait for control plane pods to appear’ failing means the API server failed to start. There can be a billion reasons for that: an unreachable pod image pullspec, a wrong API server configuration, something wrong with docker, etc.
Let’s not post ‘I have this issue too’ comments, because the same symptom doesn’t mean the same issue is causing it, or that the ansible playbooks are incorrect.
Set the hostname first:
hostnamectl set-hostname <FQDN hostname>
Then add the properties below in /etc/sysconfig/network-scripts/ifcfg-<interface_name>. Also, to fix upstream DNS quickly you can follow the steps below, supposing your DNS addresses are 10.99.1.2 and 10.99.1.3.
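The comment doesn't spell out the properties themselves, so here is a sketch based on standard RHEL network-scripts syntax, using the DNS addresses mentioned above (an assumption, not the original poster's exact settings):

```shell
# /etc/sysconfig/network-scripts/ifcfg-<interface_name> (assumed fragment)
DNS1=10.99.1.2
DNS2=10.99.1.3
PEERDNS=yes    # let the interface push these resolvers into /etc/resolv.conf
```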
@aland-zhang The detailed logs are in the failed pods. Check the logs with
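In 3.10 the control plane runs as static pods, so their logs live in the docker containers. A sketch of pulling them (the container-name filter is an assumption; adjust it to what `docker ps -a` actually shows on your master):

```shell
# List control-plane containers (running or exited) and dump recent logs.
docker ps -a --format '{{.ID}} {{.Names}}' \
  | grep -E 'k8s_(api|controllers|etcd)' \
  | while read -r id name; do
      echo "=== $name ==="
      docker logs --tail 50 "$id" 2>&1
    done
```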