openshift-ansible: GlusterFS config.yml deploy fails at "Wait for heketi pod" with "Failed without returning a message"
Hi everyone, I've been trying to get past this issue in my deployment for about a week. I have run the playbooks prerequisites.yml, deploy_cluster.yml, and uninstall.yml again and again, and I still hit the same failure.
* ansible 2.9.2
config file = /etc/ansible/ansible.cfg
configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python2.7/site-packages/ansible
executable location = /usr/bin/ansible
python version = 2.7.5 (default, Aug 7 2019, 00:51:29) [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]
* The output of `git describe`
`git version 1.8.3.1`
* If you're running from playbooks installed via RPM, the output of `rpm -q openshift-ansible`:
package openshift-ansible is not installed (but I'm sure it is!)
[root@os-master openshift-ansible]# ls
ansible.cfg CONTRIBUTING.md examples images meta playbooks README_CONTAINERIZED_INSTALLATION.md roles test
BUILD.md DEPLOYMENT_TYPES.md hack inventory openshift-ansible.spec pytest.ini README.md setup.cfg test-requirements.txt
conftest.py docs HOOKS.md LICENSE OWNERS README_CONTAINER_IMAGE.md requirements.txt setup.py tox.ini
Steps To Reproduce
- Run ansible-playbook /root/openshift-ansible/playbooks/prerequisites.yml
- Deploy a new single-master cluster using ansible-playbook /root/openshift-ansible/playbooks/openshift-glusterfs/config.yml
- The same thing happens with ansible-playbook /root/openshift-ansible/playbooks/deploy_cluster.yml. It is 100% reproducible, even with a minimal inventory file and fresh VMs.
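Since the same hosts are reused between attempts, leftover Heketi/LVM state on the brick devices can make a fresh GlusterFS deploy hang even after uninstall.yml. A rough cleanup sketch to run on each GlusterFS node between attempts (it assumes /dev/sdb from the inventory below and is destructive, so only use it on bricks that hold no data):

```
# Run on each GlusterFS node (os-infra, os-node, os-storage) after uninstall.yml
# and before the next deploy attempt. This destroys any existing GlusterFS state.

# Remove leftover Heketi and GlusterFS state/configuration directories
rm -rf /var/lib/heketi /var/lib/glusterd /etc/glusterfs /var/log/glusterfs

# Remove any LVM volume group heketi created on the brick device, then its PV
vg=$(pvs --noheadings -o vg_name /dev/sdb 2>/dev/null | tr -d ' ')
[ -n "$vg" ] && vgremove -ff "$vg"
pvremove -ff /dev/sdb 2>/dev/null || true

# Wipe remaining filesystem/LVM signatures so heketi sees a bare device again
wipefs -a /dev/sdb
```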
Expected Results
GlusterFS is deployed and ready for PV and PVC
Observed Results
PLAY RECAP ****************************************************************************************************************************************************************************************
localhost               : ok=12   changed=0    unreachable=0    failed=0    skipped=4    rescued=0    ignored=0
os-infra.mydomain.com   : ok=144  changed=36   unreachable=0    failed=0    skipped=163  rescued=0    ignored=0
os-master.mydomain.com  : ok=471  changed=193  unreachable=0    failed=1    skipped=589  rescued=0    ignored=0
os-node.mydomain.com    : ok=129  changed=36   unreachable=0    failed=0    skipped=159  rescued=0    ignored=0
os-storage.mydomain.com : ok=129  changed=36   unreachable=0    failed=0    skipped=159  rescued=0    ignored=0
INSTALLER STATUS **********************************************************************************************************************************************************************************
Initialization               : Complete (0:00:26)
Health Check                 : Complete (0:00:07)
Node Bootstrap Preparation   : Complete (0:03:14)
etcd Install                 : Complete (0:00:41)
Master Install               : Complete (0:04:26)
Master Additional Install    : Complete (0:00:40)
Node Join                    : Complete (0:00:43)
GlusterFS Install            : In Progress (0:13:46)
        This phase can be restarted by running: playbooks/openshift-glusterfs/new_install.yml
Failure summary:
Hosts:   os-master.mydomain.com
Play:    Configure GlusterFS
Task:    Wait for heketi pod
Message: Failed without returning a message.
Every time GlusterFS is deployed, the "Wait for heketi pod" task retries until "FAILED - RETRYING: Wait for heketi pod (1 retries left)." and then fails with "Failed without returning a message":
fatal: [os-master.example.comthedawn.com]: FAILED! => {"attempts": 30, "changed": false, "module_results": {"cmd": "/usr/bin/oc get pod --selector=glusterfs=heketi-storage-pod -o json -n glusterfs", "results": [{"apiVersion": "v1", "items": [{"apiVersion": "v1", "kind": "Pod", "metadata": {"annotations": {"openshift.io/deployment-config.latest-version": "1", "openshift.io/deployment-config.name": "heketi-storage", "openshift.io/deployment.name": "heketi-storage-1", "openshift.io/scc": "privileged"}, "creationTimestamp": "2020-01-24T11:26:08Z", "generateName": "heketi-storage-1-", "labels": {"deployment": "heketi-storage-1", "deploymentconfig": "heketi-storage", "glusterfs": "heketi-storage-pod", "heketi": "storage-pod"}, "name": "heketi-storage-1-bvn52", "namespace": "glusterfs", "ownerReferences": [{"apiVersion": "v1", "blockOwnerDeletion": true, "controller": true, "kind": "ReplicationController", "name": "heketi-storage-1", "uid": "4fa4ac45-3e9c-11ea-a06a-000c29fad897"}], "resourceVersion": "8520", "selfLink": "/api/v1/namespaces/glusterfs/pods/heketi-storage-1-bvn52", "uid": "51645ba4-3e9c-11ea-a06a-000c29fad897"}, "spec": {"containers": [{"env": [{"name": "HEKETI_USER_KEY", "value": "kumZhoQMqSxUcCGAeIXiiZfYSnxOIQrQkGp2T1ev6AM="}, {"name": "HEKETI_ADMIN_KEY", "value": "Wcct2uvr6AI8bXUtFIJ9IdgJxyqdW+P1qKWSntk9MCg="}, {"name": "HEKETI_CLI_USER", "value": "admin"}, {"name": "HEKETI_CLI_KEY", "value": "Wcct2uvr6AI8bXUtFIJ9IdgJxyqdW+P1qKWSntk9MCg="}, {"name": "HEKETI_EXECUTOR", "value": "kubernetes"}, {"name": "HEKETI_FSTAB", "value": "/var/lib/heketi/fstab"}, {"name": "HEKETI_SNAPSHOT_LIMIT", "value": "14"}, {"name": "HEKETI_KUBE_GLUSTER_DAEMONSET", "value": "1"}, {"name": "HEKETI_IGNORE_STALE_OPERATIONS", "value": "true"}, {"name": "HEKETI_DEBUG_UMOUNT_FAILURES", "value": "true"}], "image": "docker.io/heketi/heketi:latest", "imagePullPolicy": "IfNotPresent", "livenessProbe": {"failureThreshold": 3, "httpGet": {"path": "/hello", "port": 8080, "scheme": "HTTP"}, "initialDelaySeconds": 30, "periodSeconds": 10, "successThreshold": 1, "timeoutSeconds": 3}, "name": "heketi", "ports": [{"containerPort": 8080, "protocol": "TCP"}], "readinessProbe": {"failureThreshold": 3, "httpGet": {"path": "/hello", "port": 8080, "scheme": "HTTP"}, "initialDelaySeconds": 3, "periodSeconds": 10, "successThreshold": 1, "timeoutSeconds": 3}, "resources": {}, "terminationMessagePath": "/dev/termination-log", "terminationMessagePolicy": "File", "volumeMounts": [{"mountPath": "/var/lib/heketi", "name": "db"}, {"mountPath": "/etc/heketi", "name": "config"}, {"mountPath": "/var/run/secrets/kubernetes.io/serviceaccount", "name": "heketi-storage-service-account-token-kmvc8", "readOnly": true}]}], "dnsPolicy": "ClusterFirst", "imagePullSecrets": [{"name": "heketi-storage-service-account-dockercfg-xf57n"}], "nodeName": "os-master.example.comthedawn.com", "priority": 0, "restartPolicy": "Always", "schedulerName": "default-scheduler", "securityContext": {}, "serviceAccount": "heketi-storage-service-account", "serviceAccountName": "heketi-storage-service-account", "terminationGracePeriodSeconds": 30, "volumes": [{"glusterfs": {"endpoints": "heketi-db-storage-endpoints", "path": "heketidbstorage"}, "name": "db"}, {"name": "config", "secret": {"defaultMode": 420, "secretName": "heketi-storage-config-secret"}}, {"name": "heketi-storage-service-account-token-kmvc8", "secret": {"defaultMode": 420, "secretName": "heketi-storage-service-account-token-kmvc8"}}]}, "status": {"conditions": [{"lastProbeTime": null, 
"lastTransitionTime": "2020-01-24T11:26:08Z", "status": "True", "type": "Initialized"}, {"lastProbeTime": null, "lastTransitionTime": "2020-01-24T11:26:08Z", "message": "containers with unready status: [heketi]", "reason": "ContainersNotReady", "status": "False", "type": "Ready"}, {"lastProbeTime": null, "lastTransitionTime": null, "message": "containers with unready status: [heketi]", "reason": "ContainersNotReady", "status": "False", "type": "ContainersReady"}, {"lastProbeTime": null, "lastTransitionTime": "2020-01-24T11:26:08Z", "status": "True", "type": "PodScheduled"}], "containerStatuses": [{"image": "docker.io/heketi/heketi:latest", "imageID": "", "lastState": {}, "name": "heketi", "ready": false, "restartCount": 0, "state": {"waiting": {"reason": "ContainerCreating"}}}], "hostIP": "192.168.1.212", "phase": "Pending", "qosClass": "BestEffort", "startTime": "2020-01-24T11:26:08Z"}}], "kind": "List", "metadata": {"resourceVersion": "", "selfLink": ""}}], "returncode": 0}, "state": "list"}
Additional Information
I believe it's related to this bug, but maybe I'm missing the workaround?
I'm using standalone VMware ESXi as the hypervisor, and an RPM install of Origin.
ansible 2.9.2
Origin 3.11
CentOS 7 as the OS for the nodes
```
[root@os-master ~]# docker version
Client:
Version: 1.13.1
API version: 1.26
Package version: docker-1.13.1-103.git7f2769b.el7.centos.x86_64
Go version: go1.10.3
Git commit: 7f2769b/1.13.1
Built: Sun Sep 15 14:06:47 2019
OS/Arch: linux/amd64
Server:
Version: 1.13.1
API version: 1.26 (minimum version 1.12)
Package version: docker-1.13.1-103.git7f2769b.el7.centos.x86_64
Go version: go1.10.3
Git commit: 7f2769b/1.13.1
Built: Sun Sep 15 14:06:47 2019
OS/Arch: linux/amd64
Experimental: false
```
Here is my inventory:
```
[OSEv3:children]
masters
etcd
nodes
glusterfs
[OSEv3:vars]
ansible_ssh_user=root
openshift_deployment_type=origin
openshift_release="3.11"
openshift_image_tag="v3.11"
openshift_master_default_subdomain=apps.mydomain.com
openshift_docker_selinux_enabled=True
openshift_check_min_host_memory_gb=16
openshift_check_min_host_disk_gb=50
openshift_disable_check=docker_image_availability
openshift_master_dynamic_provisioning_enabled=true
openshift_registry_selector="role=infra"
openshift_hosted_registry_storage_kind=glusterfs
openshift_metrics_install_metrics=true
openshift_metrics_cassandra_storage_type=pv
openshift_metrics_hawkular_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_cassandra_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_heapster_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_metrics_storage_volume_size=20Gi
openshift_metrics_cassandra_pvc_storage_class_name="glusterfs-registry-block"
openshift_logging_install_logging=true
openshift_logging_es_pvc_dynamic=true
openshift_logging_storage_kind=dynamic
openshift_logging_kibana_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_curator_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_es_nodeselector={"node-role.kubernetes.io/infra": "true"}
openshift_logging_es_pvc_size=20Gi
openshift_logging_es_pvc_storage_class_name="glusterfs-registry-block"
openshift_storage_glusterfs_registry_namespace=infra-storage
openshift_storage_glusterfs_registry_storageclass=false
openshift_storage_glusterfs_registry_storageclass_default=false
openshift_storage_glusterfs_registry_block_deploy=true
openshift_storage_glusterfs_registry_block_host_vol_create=true
openshift_storage_glusterfs_registry_block_host_vol_size=100
openshift_storage_glusterfs_registry_block_storageclass=true
openshift_storage_glusterfs_registry_block_storageclass_default=false
[masters]
os-master.mydomain.com
[etcd]
os-master.mydomain.com
[nodes]
os-master.mydomain.com openshift_node_group_name="node-config-master"
os-infra.mydomain.com openshift_node_group_name="node-config-infra"
os-storage.mydomain.com openshift_node_group_name="node-config-compute"
os-node.mydomain.com openshift_node_group_name="node-config-compute"
[glusterfs]
os-infra.mydomain.com glusterfs_ip='192.168.1.213' glusterfs_devices='["/dev/sdb"]'
os-node.mydomain.com glusterfs_ip='192.168.1.214' glusterfs_devices='["/dev/sdb"]'
os-storage.mydomain.com glusterfs_ip='192.168.1.215' glusterfs_devices='["/dev/sdb"]'
```
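Heketi only accepts the disks listed in glusterfs_devices if they are bare block devices with no partitions, filesystems or LVM signatures. A quick read-only pre-flight check on each GlusterFS host (assuming /dev/sdb as in the inventory above):

```
# /dev/sdb should show no child partitions and an empty FSTYPE column
lsblk -f /dev/sdb

# Both should print nothing if the device carries no existing signatures
wipefs /dev/sdb
blkid /dev/sdb

# /dev/sdb should not appear as an already-used LVM physical volume
pvs
```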
Can someone please advise what I should do to be able to deploy successfully?
Many thanks in advance.
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 35
Before installing the glusterfs 6 or 7 libs, you want to uninstall the glusterfs 3 packages first:
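The exact commands are not included in the comment; a sketch of what that package swap typically looks like on CentOS 7 (the repo and package names below are assumptions; adjust to the GlusterFS version you actually want):

```
# Remove the GlusterFS 3.x client packages that ship with CentOS 7 base
yum remove -y glusterfs glusterfs-fuse glusterfs-libs glusterfs-client-xlators

# Enable the Storage SIG repo for GlusterFS 6 and install the newer client libs
yum install -y centos-release-gluster6
yum install -y glusterfs glusterfs-fuse glusterfs-libs
```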
You need GlusterFS installed manually on ALL hosts, I believe.
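For an RPM/Origin 3.11 install that usually means having the glusterfs-fuse client package on every node that may run a pod with a GlusterFS volume (the heketi pod above mounts one), because the kubelet calls mount.glusterfs on the host. A minimal sketch, assuming yum-based CentOS 7 nodes:

```
# On every OpenShift node (master, infra, compute):
yum install -y glusterfs-fuse

# kubelet needs mount.glusterfs on the host to mount gluster-backed volumes
which mount.glusterfs
```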
Retry after adding the line below to your inventory:
openshift_storage_glusterfs_is_native=false
I have the version below and it's working fine.