openshift-ansible: OKD 3.9 + Cisco ACI Install Error

Description

Infrastructure:

  • 1 bastion
  • 1 master
  • 1 infra
  • 1 node

Install method: Ansible (Advanced Install)

The install never completes when the Cisco ACI integration from PR#8221 is enabled.

Version
ansible --version:
ansible 2.6.2
  (...)
  python version = 2.7.5

git describe:
openshift-ansible-3.9.40-1-35-g337e91b

Steps To Reproduce

  1. Merge PR#8221 into release-3.9 (locally)
  2. Run ansible-playbook -i /root/ansible/inventory.yaml /root/openshift-ansible/playbooks/deploy_cluster.yml -v
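
The two steps above can be spelled out as concrete commands. The following is only a sketch in Python 3 that builds and prints them rather than running anything. It assumes the git remote is named "origin" and points at the GitHub repository, and that the git commands are run from inside /root/openshift-ansible. The local branch name "pr-8221" is arbitrary.

```python
import shlex

# Paths taken from the command in step 2.
REPO_DIR = "/root/openshift-ansible"
INVENTORY = "/root/ansible/inventory.yaml"


def reproduction_commands(pr_number=8221, base_branch="release-3.9", remote="origin"):
    """Return argv lists for merging the PR locally and starting the install.

    The git commands are meant to be run from inside REPO_DIR.
    """
    # Hypothetical local branch name, chosen only for this example.
    local_branch = "pr-{}".format(pr_number)
    return [
        # GitHub exposes each pull request under refs/pull/<number>/head.
        ["git", "fetch", remote, "pull/{}/head:{}".format(pr_number, local_branch)],
        ["git", "checkout", base_branch],
        ["git", "merge", "--no-edit", local_branch],
        ["ansible-playbook", "-i", INVENTORY,
         REPO_DIR + "/playbooks/deploy_cluster.yml", "-v"],
    ]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for argv in reproduction_commands():
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Printing instead of executing keeps the sketch safe to run on any machine; the output can be pasted into a shell on the bastion once it looks right.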

Expected Results

OpenShift Origin installs and integrates with Cisco ACI without errors.

Observed Results

Some of the ACI pods keep crash-looping, router-1-deploy ends in Error, and docker-registry-1-deploy is stuck in ContainerCreating.

On the master node, aci-containers-host-xxx runs fine without restarts; on the node and infra hosts, it always crashes.
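
To confirm which hosts the crashing containers land on, the pod list can be grouped by node. The following is a minimal sketch in Python 3, not part of the installer. It assumes oc is on the PATH and already logged in as a cluster admin, and the function names are made up for this example. It reads only standard pod fields (spec.nodeName, status.containerStatuses).

```python
import json
import subprocess
from collections import defaultdict


def fetch_pods():
    """Return the parsed output of `oc get pods --all-namespaces -o json`."""
    out = subprocess.check_output(["oc", "get", "pods", "--all-namespaces", "-o", "json"])
    return json.loads(out.decode())


def crashlooping_by_node(pod_list):
    """Map node name -> [(namespace/pod, container, restarts)] for CrashLoopBackOff containers."""
    result = defaultdict(list)
    for pod in pod_list.get("items", []):
        node = pod.get("spec", {}).get("nodeName", "<unscheduled>")
        meta = pod.get("metadata", {})
        pod_name = "{}/{}".format(meta.get("namespace"), meta.get("name"))
        for status in pod.get("status", {}).get("containerStatuses", []):
            # A crash-looping container sits in the "waiting" state with this reason.
            waiting = status.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                result[node].append(
                    (pod_name, status.get("name"), status.get("restartCount", 0))
                )
    return dict(result)


def print_report(report):
    """Print one block per node, listing its crash-looping containers."""
    for node, entries in sorted(report.items()):
        print(node)
        for pod_name, container, restarts in entries:
            print("  {} [{}] restarts={}".format(pod_name, container, restarts))
```

On the master, print_report(crashlooping_by_node(fetch_pods())) would list the affected containers per host. If the observation above holds, only the node and infra hosts should appear.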

The install fails at TASK [openshift_hosted : Poll for OpenShift pod deployment success]:

TASK [openshift_hosted : Poll for OpenShift pod deployment success] *************************************************************************************************************
FAILED - RETRYING: Poll for OpenShift pod deployment success (60 retries left).
...
FAILED - RETRYING: Poll for OpenShift pod deployment success (1 retries left).

failed: [ptstporiginmaster.stp.pt] (item=[{u'name': u'router', u'certificate': {u'keyfile': u'/etc/origin/master/openshift-router.key', u'certfile': u'/etc/origin/master/openshift-router.crt', u'cafile': u'/etc/origin/master/ca.crt'}, u'replicas': u'1', u'namespace': u'default', u'serviceaccount': u'router', u'stats_port': 1936, u'edits': [{u'action': u'put', u'key': u'spec.strategy.rollingParams.intervalSeconds', u'value': 1}, {u'action': u'put', u'key': u'spec.strategy.rollingParams.updatePeriodSeconds', u'value': 1}, {u'action': u'put', u'key': u'spec.strategy.activeDeadlineSeconds', u'value': 21600}], u'images': u'openshift/origin-${component}:${version}', u'selector': u'region=infra', u'ports': [u'80:80', u'443:443']}, {'_ansible_parsed': True, 'stderr_lines': [], u'cmd': [u'oc', u'get', u'deploymentconfig', u'router', u'--namespace', u'default', u'--config', u'/etc/origin/master/admin.kubeconfig', u'-o', u'jsonpath={ .status.latestVersion }'], u'end': u'2018-08-13 16:08:58.371020', '_ansible_no_log': False, u'stdout': u'1', 'failed': False, '_ansible_item_result': True, u'changed': True, 'item': {u'name': u'router', u'certificate': {u'certfile': u'/etc/origin/master/openshift-router.crt', u'keyfile': u'/etc/origin/master/openshift-router.key', u'cafile': u'/etc/origin/master/ca.crt'}, u'replicas': u'1', u'namespace': u'default', u'serviceaccount': u'router', u'selector': u'region=infra', u'edits': [{u'action': u'put', u'value': 1, u'key': u'spec.strategy.rollingParams.intervalSeconds'}, {u'action': u'put', u'value': 1, u'key': u'spec.strategy.rollingParams.updatePeriodSeconds'}, {u'action': u'put', u'value': 21600, u'key': u'spec.strategy.activeDeadlineSeconds'}], u'images': u'openshift/origin-${component}:${version}', u'stats_port': 1936, u'ports': [u'80:80', u'443:443']}, u'delta': u'0:00:00.237233', u'stderr': u'', u'rc': 0, u'invocation': {u'module_args': {u'creates': None, u'executable': None, u'_uses_shell': False, u'_raw_params': u"oc get 
deploymentconfig router --namespace default --config /etc/origin/master/admin.kubeconfig -o jsonpath='{ .status.latestVersion }'", u'removes': None, u'argv': None, u'warn': True, u'chdir': None, u'stdin': None}}, 'stdout_lines': [u'1'], u'start': u'2018-08-13 16:08:58.133787', '_ansible_ignore_errors': None, '_ansible_item_label': {u'name': u'router', u'certificate': {u'certfile': u'/etc/origin/master/openshift-router.crt', u'keyfile': u'/etc/origin/master/openshift-router.key', u'cafile': u'/etc/origin/master/ca.crt'}, u'replicas': u'1', u'namespace': u'default', u'serviceaccount': u'router', u'stats_port': 1936, u'edits': [{u'action': u'put', u'value': 1, u'key': u'spec.strategy.rollingParams.intervalSeconds'}, {u'action': u'put', u'value': 1, u'key': u'spec.strategy.rollingParams.updatePeriodSeconds'}, {u'action': u'put', u'value': 21600, u'key': u'spec.strategy.activeDeadlineSeconds'}], u'images': u'openshift/origin-${component}:${version}', u'selector': u'region=infra', u'ports': [u'80:80', u'443:443']}}]) => {"attempts": 60, "changed": true, "cmd": ["oc", "get", "replicationcontroller", "router-1", "--namespace", "default", "--config", "/etc/origin/master/admin.kubeconfig", "-o", "jsonpath={ .metadata.annotations.openshift\\.io/deployment\\.phase }"], "delta": "0:00:00.226742", "end": "2018-08-13 16:19:29.493914", "failed_when_result": true, "item": [{"certificate": {"cafile": "/etc/origin/master/ca.crt", "certfile": "/etc/origin/master/openshift-router.crt", "keyfile": "/etc/origin/master/openshift-router.key"}, "edits": [{"action": "put", "key": "spec.strategy.rollingParams.intervalSeconds", "value": 1}, {"action": "put", "key": "spec.strategy.rollingParams.updatePeriodSeconds", "value": 1}, {"action": "put", "key": "spec.strategy.activeDeadlineSeconds", "value": 21600}], "images": "openshift/origin-${component}:${version}", "name": "router", "namespace": "default", "ports": ["80:80", "443:443"], "replicas": "1", "selector": "region=infra", 
"serviceaccount": "router", "stats_port": 1936}, {"_ansible_ignore_errors": null, "_ansible_item_label": {"certificate": {"cafile": "/etc/origin/master/ca.crt", "certfile": "/etc/origin/master/openshift-router.crt", "keyfile": "/etc/origin/master/openshift-router.key"}, "edits": [{"action": "put", "key": "spec.strategy.rollingParams.intervalSeconds", "value": 1}, {"action": "put", "key": "spec.strategy.rollingParams.updatePeriodSeconds", "value": 1}, {"action": "put", "key": "spec.strategy.activeDeadlineSeconds", "value": 21600}], "images": "openshift/origin-${component}:${version}", "name": "router", "namespace": "default", "ports": ["80:80", "443:443"], "replicas": "1", "selector": "region=infra", "serviceaccount": "router", "stats_port": 1936}, "_ansible_item_result": true, "_ansible_no_log": false, "_ansible_parsed": true, "changed": true, "cmd": ["oc", "get", "deploymentconfig", "router", "--namespace", "default", "--config", "/etc/origin/master/admin.kubeconfig", "-o", "jsonpath={ .status.latestVersion }"], "delta": "0:00:00.237233", "end": "2018-08-13 16:08:58.371020", "failed": false, "invocation": {"module_args": {"_raw_params": "oc get deploymentconfig router --namespace default --config /etc/origin/master/admin.kubeconfig -o jsonpath='{ .status.latestVersion }'", "_uses_shell": false, "argv": null, "chdir": null, "creates": null, "executable": null, "removes": null, "stdin": null, "warn": true}}, "item": {"certificate": {"cafile": "/etc/origin/master/ca.crt", "certfile": "/etc/origin/master/openshift-router.crt", "keyfile": "/etc/origin/master/openshift-router.key"}, "edits": [{"action": "put", "key": "spec.strategy.rollingParams.intervalSeconds", "value": 1}, {"action": "put", "key": "spec.strategy.rollingParams.updatePeriodSeconds", "value": 1}, {"action": "put", "key": "spec.strategy.activeDeadlineSeconds", "value": 21600}], "images": "openshift/origin-${component}:${version}", "name": "router", "namespace": "default", "ports": ["80:80", "443:443"], 
"replicas": "1", "selector": "region=infra", "serviceaccount": "router", "stats_port": 1936}, "rc": 0, "start": "2018-08-13 16:08:58.133787", "stderr": "", "stderr_lines": [], "stdout": "1", "stdout_lines": ["1"]}], "rc": 0, "start": "2018-08-13 16:19:29.267172", "stderr": "", "stderr_lines": [], "stdout": "Failed", "stdout_lines": ["Failed"]}
        to retry, use: --limit @/root/openshift-ansible/playbooks/deploy_cluster.retry

PLAY RECAP **********************************************************************************************************************************************************************
localhost                  : ok=12   changed=0    unreachable=0    failed=0
ptstporigininfra.stp.pt    : ok=154  changed=59   unreachable=0    failed=0
ptstporiginmaster.stp.pt   : ok=507  changed=190  unreachable=0    failed=1
ptstporiginnode.stp.pt     : ok=154  changed=59   unreachable=0    failed=0


INSTALLER STATUS ****************************************************************************************************************************************************************
Initialization             : Complete (0:00:49)
Health Check               : Complete (0:00:38)
etcd Install               : Complete (0:02:30)
Master Install             : Complete (0:06:08)
Master Additional Install  : Complete (0:00:23)
Node Install               : Complete (0:09:40)
Hosted Install             : In Progress (0:14:31)
        This phase can be restarted by running: playbooks/openshift-hosted/config.yml



Failure summary:


  1. Hosts:    ptstporiginmaster.stp.pt
     Play:     Poll for hosted pod deployments
     Task:     Poll for OpenShift pod deployment success
     Message:  All items completed
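
For anyone reading the log above, the failing task amounts to polling the deployment phase annotation on the router-1 replication controller. The following sketch approximates that loop in Python 3; it is not the role's actual implementation. The command and the 60 retries are taken from the task output. The 10-second delay and the rule of stopping on either "Complete" or "Failed" are assumptions, and the function names are invented for this example.

```python
import subprocess
import time

# Copied from the "cmd" field of the failed task result.
POLL_CMD = [
    "oc", "get", "replicationcontroller", "router-1",
    "--namespace", "default",
    "--config", "/etc/origin/master/admin.kubeconfig",
    "-o", r"jsonpath={ .metadata.annotations.openshift\.io/deployment\.phase }",
]


def run_oc(cmd):
    """Run the command and return its stdout as stripped text."""
    return subprocess.check_output(cmd).decode().strip()


def poll_deployment(cmd=POLL_CMD, retries=60, delay=10, runner=run_oc, sleep=time.sleep):
    """Poll the deployment phase until it is terminal or the retries run out.

    Returns (succeeded, last_phase, attempts).
    """
    phase = ""
    for attempt in range(1, retries + 1):
        phase = runner(cmd)
        # Assumed stop condition: either terminal phase ends the polling.
        if phase in ("Complete", "Failed"):
            return phase == "Complete", phase, attempt
        if attempt < retries:
            sleep(delay)
    # Retries exhausted without reaching "Complete".
    return False, phase, retries
```

Running poll_deployment() on the master after the failed install should return immediately with the phase "Failed", matching the stdout in the task result. The runner and sleep parameters exist only so the loop can be exercised without a cluster.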

Additional Information
cat /etc/redhat-release:
Red Hat Enterprise Linux Server release 7.5 (Maipo)

The inventory file and the output of several oc describe commands are available in a gist: https://gist.github.com/CelsoSantos/8fc68092901e1fad7d3a9cb5e0059a00

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 20

Most upvoted comments

Hi @CelsoSantos,

Did you manage to get your cluster up and running? If you need help with the ACI CNI part, I can probably help you out.

Let me know, Cam