openshift-ansible: cluster install fails on GlusterFS stage (3.7)

Description

On a fresh install of a multi-master cluster with GlusterFS storage, the install fails at the "Wait for heketi Pod" task.

Sometimes the heketi pod gets stuck in the image pull phase, and sometimes it is stuck in a crash loop.
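To tell the two failure modes apart, the waiting reason reported for the heketi container can be inspected. Below is a minimal Python sketch, assuming the pod JSON was obtained with something like `oc get pod <pod-name> -n glusterfs -o json`. The function name `classify_pod` and the sample data are hypothetical; the `status.containerStatuses[].state.waiting.reason` field and the reason strings are the ones Kubernetes reports.

```python
import json

# Waiting reasons Kubernetes reports for the two failure modes described above.
IMAGE_PULL_REASONS = {"ErrImagePull", "ImagePullBackOff"}
CRASH_LOOP_REASONS = {"CrashLoopBackOff"}


def classify_pod(pod):
    """Return 'image-pull', 'crash-loop', or 'other' for a pod dict."""
    statuses = pod.get("status", {}).get("containerStatuses", [])
    for cs in statuses:
        reason = cs.get("state", {}).get("waiting", {}).get("reason")
        if reason in IMAGE_PULL_REASONS:
            return "image-pull"
        if reason in CRASH_LOOP_REASONS:
            return "crash-loop"
    return "other"


# Hypothetical sample, shaped like the JSON output of `oc get pod ... -o json`.
sample = json.loads("""
{"status": {"containerStatuses": [
  {"name": "heketi", "state": {"waiting": {"reason": "CrashLoopBackOff"}}}
]}}
""")
print(classify_pod(sample))
```

A crash loop points at the container's own logs, while an image pull failure points at registry access from the node.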

Version

Your ansible version per ansible --version

ansible 2.5.0
  config file = /home/ansibleuser/openshift-ansible/ansible.cfg
  configured module search path = [u'/home/ansibleuser/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Aug 4 2017, 00:39:18) [GCC 4.8.5 20150623 (Red Hat 4.8.5-16)]

If you’re operating from a git clone:

The output of git describe:

openshift-ansible-3.7.42-1-26-g388d11b

Steps To Reproduce

Launch the 3.7 playbook:

ansible-playbook -i ./hosts/cluster-installation playbooks/byo/openshift-glusterfs/config.yml

Expected Results

The cluster is up and running and GlusterFS is configured.

Observed Results

Here are the logs of the heketi-storage container:

Setting up heketi database
No database file found
Database volume found: 10.39.57.31:heketidbstorage on /var/lib/heketi type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
Database file is expected, waiting...
Database file did not appear, exiting.
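Read in order, the log says that the GlusterFS volume heketidbstorage was mounted on /var/lib/heketi, that no database file was present on it, and that the container exited after waiting for one. The following is a minimal sketch that pulls those facts out of the log text. The function name `summarize` and the shape of its return value are hypothetical, and the message strings are taken only from the log above, not from heketi's source.

```python
import re

# Log text copied from the heketi-storage container output above.
LOG = """\
Setting up heketi database
No database file found
Database volume found: 10.39.57.31:heketidbstorage on /var/lib/heketi type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
Database file is expected, waiting...
Database file did not appear, exiting.
"""


def summarize(log):
    """Extract the mounted volume and whether the database file was missing."""
    m = re.search(r"Database volume found: (\S+) on (\S+) type (\S+)", log)
    return {
        "volume": m.group(1) if m else None,
        "mount_point": m.group(2) if m else None,
        "fs_type": m.group(3) if m else None,
        "db_file_missing": "Database file did not appear" in log,
    }


print(summarize(LOG))
```

A mounted volume with a missing database file suggests looking at whatever was supposed to write the database to that volume, rather than at the mount itself.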

Additional Information

OS: CentOS Linux release 7.4.1708

My config file

#Global cluster configuration
[OSEv3:children]
masters
etcd
nodes
glusterfs
glusterfs_registry

#GLOBAL CLUSTER VARIABLES
[OSEv3:vars]
#etcd
openshift_use_etcd_system_container=True

#ansible
ansible_ssh_user=ansibleuser
ansible_become=true
ansible_service_broker_image_prefix=openshift/
ansible_service_broker_registry_url="registry.access.redhat.com"

#disk checks
openshift_check_min_host_disk_gb=13
#firewall
os_firewall_use_firewalld=True

#deployment configuration
openshift_deployment_type=origin
#openshift_version=3.9.0
#openshift_pkg_version=3.7.1
#containerized=true

#glusterfs configuration
openshift_storage_glusterfs_namespace=glusterfs
openshift_storage_glusterfs_name=storage

#internal registry configuration
openshift_hosted_registry_storage_kind=glusterfs
openshift_registry_selector="region=infranodes"
openshift_hosted_registry_replicas=3
openshift_hosted_registry_storage_volume_size=190Gi

#router configuration
openshift_router_selector="region=routingnodes"

#standard node configuration
osm_default_node_selector="region=standardnodes"

#master and API access point configuration
openshift_master_cluster_hostname=master-lb.mycompany.internal
openshift_master_cluster_public_hostname=console.mycompany.com
openshift_master_default_subdomain=mycompany.com
openshift_master_api_port=8443
openshift_master_console_port=8443
openshift_master_session_name=ssn
openshift_public_ip="xx.xx.xx.xx"

#router certificate configuration
openshift_hosted_router_certificate={"certfile": "/home/ansibleuser/openshift-ansible/customCertificates/STAR_mycompany.crt", "keyfile": "/home/ansibleuser/openshift-ansible/customCertificates/mycompany.key", "cafile": "/home/ansibleuser/openshift-ansible/customCertificates/COMODORSADomainValidationSecureServerCA.crt"}

#LDAP configuration
openshift_master_identity_providers=[{'name': 'picv4_ldap', 'challenge': 'true', 'login': 'true', 'kind': 'LDAPPasswordIdentityProvider', 'attributes': {'id': ['dn'], 'email': ['mail'], 'name': ['cn'], 'preferredUsername': ['uid']}, 'bindDN': 'uid=ldapbind,cn=users,cn=accounts,dc=ggd,dc=mycompany', 'bindPassword': 'tetetetetetge', 'ca': '', 'insecure': 'true', 'url': 'ldap://ldap.picv4.mycompany:389/cn=users,cn=accounts,dc=picv4,dc=mycompany?uid'}]

#audit policy configuration
openshift_master_audit_config={"enabled": true, "auditFilePath": "/var/log/openpaas-oscp-audit/openpaas-oscp-audit.log", "maximumFileRetentionDays": 14, "maximumFileSizeMegabytes": 500, "maximumRetainedFiles": 5}

#cluster logging configuration
openshift_logging_install_logging="true"
openshift_logging_es_pvc_dynamic="true"
openshift_logging_es_pvc_size="100G"
openshift_logging_curator_default_days="2"
openshift_logging_curator_run_hour="24"
openshift_master_logging_public_url="https://logs.mycompany.com"

openshift_logging_es_nodeselector="region=infranodes"
openshift_logging_kibana_ops_nodeselector="region=infranodes"
openshift_logging_curator_ops_nodeselector="region=infranodes"

#metrics configuration
openshift_metrics_install_metrics="true"
openshift_metrics_cassandra_storage_type="dynamic"
openshift_metrics_duration=7
openshift_metrics_cassandra_pvc_size="20G"
openshift_metrics_cassandra_replicas=1
openshift_metrics_cassandra_limits_memory="2Gi"
openshift_metrics_cassandra_limits_cpu="2000m"
openshift_metrics_cassandra_nodeselector="region=infranodes"
openshift_master_metrics_public_url="https://metrics.mycompany.com"

#GLUSTERFS NODES
[glusterfs]
storage01.mycompany.internal glusterfs_devices='[ "/dev/sdc"]' glusterfs_ip=10.39.57.31
storage02.mycompany.internal glusterfs_devices='[ "/dev/sdc"]' glusterfs_ip=10.39.57.32
storage03.mycompany.internal glusterfs_devices='[ "/dev/sdc"]' glusterfs_ip=10.39.57.33
storage04.mycompany.internal glusterfs_devices='[ "/dev/sdc"]' glusterfs_ip=10.39.57.34
#config glusterfs
[glusterfs:vars]
openshift_storage_glusterfs_nodeselector="glusterfs=standardstorage"
openshift_storage_glusterfs_wipe="true"

#GLUSTERFS NODES DEDICATED TO THE INTERNAL REGISTRY
[glusterfs_registry]
storage-registry01.mycompany.internal glusterfs_devices='[ "/dev/sdc"]' glusterfs_ip=10.39.57.41
storage-registry02.mycompany.internal glusterfs_devices='[ "/dev/sdc"]' glusterfs_ip=10.39.57.42
storage-registry03.mycompany.internal glusterfs_devices='[ "/dev/sdc"]' glusterfs_ip=10.39.57.43

#CLUSTER NODES

#Master VM group
[masters]
master0[1:2].mycompany.internal

#etcd nodes
[etcd]
etcd01.mycompany.internal
etcd02.mycompany.internal
etcd03.mycompany.internal

#OpenShift nodes
[nodes]

#Infra Nodes
infranode0[1:2].mycompany.internal openshift_node_labels="{'region' : 'infranodes'}" openshift_schedulable=true

#Pic nodes
picnode0[1:2].mycompany.internal openshift_node_labels="{'region' : 'picnodes'}" openshift_schedulable=true

#Compilation nodes
compilnode0[1:2].mycompany.internal openshift_node_labels="{'region' : 'compilnodes'}" openshift_schedulable=true

#routing nodes
routeur0[1:2].mycompany.internal openshift_node_labels="{'region' : 'routingnodes'}"

#standard nodes
node0[1:2].mycompany.internal openshift_node_labels="{'region' : 'standardnodes'}" openshift_schedulable=true

#masters
master0[1:2].mycompany.internal openshift_node_labels="{'region' : 'masters'}" openshift_schedulable=true

#glusterfs nodes
storage0[1:4].mycompany.internal openshift_node_labels="{'region' : 'standardstorage'}"

#glusterfs registry nodes
storage-registry0[1:3].mycompany.internal openshift_node_labels="{'region' : 'registrystorage'}"

#OpenShift node-specific variables
[nodes:vars]
openshift_docker_options=--log-driver json-file --log-opt max-size=1M --log-opt max-file=3 --selinux-enabled
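The inventory lists the GlusterFS hosts one by one in [glusterfs] but uses Ansible's inclusive numeric range syntax (for example storage0[1:4]) in [nodes]. As a quick sanity check that both spellings cover the same hosts, here is a minimal Python sketch of the range expansion. The helper `expand_hosts` is hypothetical and only handles a single numeric range per pattern; Ansible's own parser also supports steps and alphabetic ranges.

```python
import re


def expand_hosts(pattern):
    """Expand one inclusive numeric range, e.g. 'storage0[1:4].example'."""
    m = re.search(r"\[(\d+):(\d+)\]", pattern)
    if not m:
        return [pattern]  # no range present: the pattern is a single host
    start, end = m.group(1), m.group(2)
    # Keep zero padding when the range start is written with a leading zero.
    width = len(start) if start.startswith("0") else 0
    return [
        pattern[:m.start()] + str(i).zfill(width) + pattern[m.end():]
        for i in range(int(start), int(end) + 1)
    ]


# Hosts listed individually in the [glusterfs] group of the inventory above.
glusterfs_group = [
    "storage01.mycompany.internal",
    "storage02.mycompany.internal",
    "storage03.mycompany.internal",
    "storage04.mycompany.internal",
]
print(expand_hosts("storage0[1:4].mycompany.internal") == glusterfs_group)
```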

About this issue

State: closed
Created 6 years ago
Comments: 19 (11 by maintainers)

Most upvoted comments

OK, thank you all for your assistance.

It was a pleasure. I will open a new ticket concerning that issue if I don't manage to find an explanation by myself.

ahmadou on Apr 5, 2018