openshift-ansible: could not start DNS, unable to read config file: open /etc/origin/node/resolv.conf: no such file or directory
Description
Uninstalled OpenShift, then reinstalled it using the following playbooks:
ansible-playbook /data/openshift-ansible/playbooks/adhoc/uninstall.yml
ansible-playbook /data/openshift-ansible/playbooks/byo/config.yml
Version
ansible 2.3.2.0
openshift-ansible master as of 2017-10-19, commit ca6581dbd5bf06152ad8a321e1fb45911a91cce4
ansible log
TASK [openshift_manage_node : Wait for Node Registration] **************************************************************************************************************************************************************************************
Thursday 19 October 2017 21:32:38 +0800 (0:00:00.078) 0:03:00.870 ******
FAILED - RETRYING: Wait for Node Registration (50 retries left).
ok: [master -> master]
FAILED - RETRYING: Wait for Node Registration (49 retries left).
FAILED - RETRYING: Wait for Node Registration (48 retries left).
FAILED - RETRYING: Wait for Node Registration (47 retries left).
FAILED - RETRYING: Wait for Node Registration (46 retries left).
FAILED - RETRYING: Wait for Node Registration (45 retries left).
FAILED - RETRYING: Wait for Node Registration (44 retries left).
FAILED - RETRYING: Wait for Node Registration (43 retries left).
FAILED - RETRYING: Wait for Node Registration (42 retries left).
FAILED - RETRYING: Wait for Node Registration (41 retries left).
FAILED - RETRYING: Wait for Node Registration (40 retries left).
message log
Oct 19 21:24:19 node1 systemd: origin-node.service holdoff time over, scheduling restart.
Oct 19 21:24:19 node1 systemd: Starting OpenShift Node...
Oct 19 21:24:19 node1 dnsmasq[4965]: setting upstream servers from DBus
Oct 19 21:24:19 node1 dnsmasq[4965]: using nameserver 127.0.0.1#53 for domain in-addr.arpa
Oct 19 21:24:19 node1 dnsmasq[4965]: using nameserver 127.0.0.1#53 for domain cluster.local
Oct 19 21:24:20 node1 origin-node: I1019 21:24:20.297680 17564 start_node.go:251] Reading node configuration from /etc/origin/node/node-config.yaml
Oct 19 21:24:20 node1 origin-node: I1019 21:24:20.406336 17564 node.go:123] Initializing SDN node of type "redhat/openshift-ovs-subnet" with configured hostname "node1" (IP ""), iptables sync period "30s"
Oct 19 21:24:20 node1 origin-node: I1019 21:24:20.416313 17564 docker.go:364] Connecting to docker on unix:///var/run/docker.sock
Oct 19 21:24:20 node1 origin-node: I1019 21:24:20.416379 17564 docker.go:384] Start docker client with request timeout=2m0s
Oct 19 21:24:20 node1 origin-node: W1019 21:24:20.418569 17564 cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d
Oct 19 21:24:20 node1 origin-node: F1019 21:24:20.438965 17564 start_node.go:140] could not start DNS, unable to read config file: open /etc/origin/node/resolv.conf: no such file or directory
Oct 19 21:24:20 node1 systemd: origin-node.service: main process exited, code=exited, status=255/n/a
Oct 19 21:24:20 node1 dnsmasq[4965]: setting upstream servers from DBus
Oct 19 21:24:20 node1 systemd: Failed to start OpenShift Node.
Oct 19 21:24:20 node1 systemd: Unit origin-node.service entered failed state.
Oct 19 21:24:20 node1 systemd: origin-node.service failed.
Temporary solution
- Delete the 99-origin-dns content from /etc/resolv.conf
- Manually create /etc/origin/node/resolv.conf
echo 'nameserver 192.168.1.142' > /etc/origin/node/resolv.conf
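A slightly more defensive variant of the manual step above, as a rough sketch only. The function name ensure_node_resolv_conf is hypothetical, and 192.168.1.142 is the nameserver from this particular environment, so substitute your own. It writes the file only when it is missing or empty, so a copy generated later by the dispatcher script is not overwritten.

```shell
# Hypothetical helper (not part of openshift-ansible): create a minimal
# resolv.conf at the given path unless a non-empty one already exists.
ensure_node_resolv_conf() {
    target="$1"       # e.g. /etc/origin/node/resolv.conf
    nameserver="$2"   # e.g. 192.168.1.142 (environment-specific assumption)

    if [ -s "$target" ]; then
        echo "$target already exists, leaving it untouched"
        return 0
    fi

    mkdir -p "$(dirname "$target")"
    printf 'nameserver %s\n' "$nameserver" > "$target"
}
```

It could be called as `ensure_node_resolv_conf /etc/origin/node/resolv.conf 192.168.1.142` before restarting origin-node.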
Normal ansible log
TASK [openshift_manage_node : Wait for Node Registration] **************************************************************************************************************************************************************************************
Thursday 19 October 2017 21:32:38 +0800 (0:00:00.078) 0:03:00.870 ******
FAILED - RETRYING: Wait for Node Registration (50 retries left).
ok: [master -> master]
FAILED - RETRYING: Wait for Node Registration (49 retries left).
FAILED - RETRYING: Wait for Node Registration (48 retries left).
FAILED - RETRYING: Wait for Node Registration (47 retries left).
FAILED - RETRYING: Wait for Node Registration (46 retries left).
FAILED - RETRYING: Wait for Node Registration (45 retries left).
FAILED - RETRYING: Wait for Node Registration (44 retries left).
FAILED - RETRYING: Wait for Node Registration (43 retries left).
FAILED - RETRYING: Wait for Node Registration (42 retries left).
FAILED - RETRYING: Wait for Node Registration (41 retries left).
FAILED - RETRYING: Wait for Node Registration (40 retries left).
ok: [node1 -> master]
The origin-node starts normally
About this issue
- State: closed
- Created 7 years ago
- Comments: 15 (13 by maintainers)
Commits related to this issue
- Default to /etc/resolv.conf instead of /etc/origin/node/resolv.conf There is no task that currently sets up /etc/origin/node/resolv.conf but the service is configured to load that file when it starts... — committed to dmsimard/openshift-ansible by deleted user 7 years ago
I found the problem with @jfchevrette (Thanks JF!).
The issue is that our environment configures eth0 in /etc/sysconfig/network-scripts/ifcfg-eth0 to explicitly not use NetworkManager for that interface, by setting NM_CONTROLLED=NO.
This means that the interface is not controlled by NetworkManager, so restarting NetworkManager does not bring that interface up and the dispatcher script does not run for it. Simply commenting out NM_CONTROLLED=NO in the ifcfg-eth0 file and restarting NetworkManager created /etc/origin/node/resolv.conf properly.
I think a proper "fix" in openshift-ansible would be to add a check that verifies that the interface is in the output of "nmcli con"; if it's not, fail with a friendly message. I'll send a PR for that.
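For illustration, a check along those lines might look roughly like the sketch below. This is not the code from the eventual PR. The function names iface_in_nmcli and check_iface_managed are made up here, and the sketch assumes an nmcli version that supports terse output (-t), field selection (-f DEVICE) and "connection show --active".

```shell
# Hypothetical helper: reads a list of device names (one per line) on stdin
# and succeeds only if the interface given as $1 is among them.
iface_in_nmcli() {
    grep -Fxq -- "$1"
}

# Hypothetical wrapper: fail with a friendly message when the interface
# has no active NetworkManager connection.
check_iface_managed() {
    iface="$1"
    if nmcli -t -f DEVICE connection show --active | iface_in_nmcli "$iface"; then
        echo "$iface is managed by NetworkManager"
    else
        echo "ERROR: $iface is not listed by nmcli; check NM_CONTROLLED in its ifcfg file" >&2
        return 1
    fi
}
```

Running `check_iface_managed eth0` on a node before the install would surface the problem early instead of leaving origin-node to fail on the missing resolv.conf.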
@dmsimard my preference is the check.