rancher: GlusterFS PV failed after kubelet restart
Hello,
we have created a cluster with Rancher (details below) and installed a GlusterFS cluster with Heketi as described here: https://github.com/heketi/heketi/wiki/Kubernetes-Integration/832a65e365b4644a1c64ac47601893a3fdb52daf
- Three GlusterFS servers inside the same Kubernetes cluster
- Heketi as provisioner (a rough sketch of the StorageClass is below)
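For context, the provisioning setup looks roughly like this (the StorageClass name and resturl are illustrative placeholders, not our exact values): a StorageClass backed by Heketi, so each PVC gets a dynamically provisioned GlusterFS volume that pods then FUSE-mount via the kubelet.
```yaml
# Illustrative only - name and resturl are placeholders for our actual values.
# Heketi's REST API provisions a GlusterFS volume per PVC; the kubelet then
# fuse-mounts that volume into the consuming pods.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: glusterfs-heketi
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://heketi.default.svc.cluster.local:8080"
  restauthenabled: "false"
```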
When upgrading Rancher from 2.1.3 to 2.1.8, all kubelet processes were restarted. After the restart, pods began to fail because they could no longer read from or write to their existing Gluster volume mounts.
It looks like a kubelet restart terminates all existing mounts. That is tolerable when a single kubelet restarts on one GlusterFS node, but when the kubelet is restarted on all three GlusterFS nodes within a short period of time (which happens on a Rancher update or cluster upgrade), all clients with a PVC fail.
We could reproduce the issue by adding an additional kubelet parameter to our cluster.yaml, which also forces a kubelet restart on all nodes (see the sketch below).
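For example, a change like the following in cluster.yaml is enough to trigger the rolling kubelet restart (the concrete flag is only illustrative, not the parameter we actually added):
```yaml
# Illustrative only: any change under services.kubelet (here an example
# extra_args entry) makes Rancher/RKE recreate the kubelet container on
# every node, which is what reproduces the lost GlusterFS mounts.
services:
  kubelet:
    fail_swap_on: false
    extra_args:
      max-pods: "150"   # example flag; the restart it causes is what matters
```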
The kubelet log shows errors like these:
32067 kubelet.go:1616] Unable to mount volumes for pod "XXXXXX(c3751cf0-51c9-11e9-9d82-005056897a94)": timeout expired waiting for volumes to attach or mount for pod "XXX"/"XXXX". list of unmounted volumes=[volume]. list of unattached volumes=[volume default-token-qlvbv]; skipping pod
E0402 06:08:09.838966 32067 nestedpendingoperations.go:267] Operation for "kubernetes.io/glusterfs/c3751cf0-51c9-11e9-9d82-005056897a94-pvc-e62d65b3-ed76-11e8-98e8-00508-005056897a94: transport endpoint is not connected"
What kind of request is this (question/bug/enhancement/feature request): question/bug
Environment information
- Rancher version (`rancher/rancher`/`rancher/server` image tag or shown bottom left in the UI): Rancher: v2.1.8, UI: v2.1.21
- Installation option (single install/HA): single
Cluster information
- Cluster type (Hosted/Infrastructure Provider/Custom/Imported): Provider Custom v1.11.3-rancher1-1
- Network Provider: Canal
- Machine type (cloud/VM/metal) and specifications (CPU/memory): VM (CentOS 7.5.1804 3.10.0-862.14.4.el7.x86_64)
- Kubernetes version (use `kubectl version`):
  Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.3", GitCommit:"a4529464e4629c21224b3d52edfe0ea91b072862", GitTreeState:"clean", BuildDate:"2018-09-09T17:53:03Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
- Docker version (use `docker version`):
  Client:
    Version:      17.12.1-ce
    API version:  1.35
    Go version:   go1.9.4
    Git commit:   7390fc6
    Built:        Tue Feb 27 22:15:20 2018
    OS/Arch:      linux/amd64
  Server:
    Engine:
      Version:       17.12.1-ce
      API version:   1.35 (minimum version 1.12)
      Go version:    go1.9.4
      Git commit:    7390fc6
      Built:         Tue Feb 27 22:17:54 2018
      OS/Arch:       linux/amd64
      Experimental:  false
Cluster Yaml:
addon_job_timeout: 30
authentication:
  strategy: "x509"
bastion_host:
  ssh_agent_auth: false
ignore_docker_version: true
#
# # Currently only nginx ingress provider is supported.
# # To disable ingress controller, set `provider: none`
# # To enable ingress on specific nodes, use the node_selector, eg:
#   provider: nginx
#   node_selector:
#     app: ingress
#
ingress:
  provider: "none"
kubernetes_version: "v1.11.3-rancher1-1"
monitoring:
  provider: "metrics-server"
#
# # If you are using calico on AWS
#
# network:
#   plugin: calico
#   calico_network_provider:
#     cloud_provider: aws
#
# # To specify flannel interface
#
# network:
#   plugin: flannel
#   flannel_network_provider:
#     iface: eth1
#
# # To specify flannel interface for canal plugin
#
# network:
#   plugin: canal
#   canal_network_provider:
#     iface: eth1
#
network:
  options:
    flannel_backend_type: "vxlan"
  plugin: "canal"
#
# services:
#   kube_api:
#     service_cluster_ip_range: 10.43.0.0/16
#   kube_controller:
#     cluster_cidr: 10.42.0.0/16
#     service_cluster_ip_range: 10.43.0.0/16
#   kubelet:
#     cluster_domain: cluster.local
#     cluster_dns_server: 10.43.0.10
#
services:
  etcd:
    creation: "12h"
    extra_args:
      election-timeout: "5000"
      heartbeat-interval: "500"
    retention: "72h"
    snapshot: false
  kube-api:
    pod_security_policy: false
    service_node_port_range: "30000-32767"
  kubelet:
    fail_swap_on: false
ssh_agent_auth: false
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 2
- Comments: 23 (3 by maintainers)
When the Kubelet is terminated/restarted, the FUSE processes sharing the same cgroup as the container are killed, causing havoc for the mounted volumes.
On systemd distros, Kubernetes works around this by forking the mount process into its own cgroup using `systemd-run`: https://github.com/kubernetes/kubernetes/blob/release-1.14/pkg/util/mount/mount_linux.go#L108-L135
But since that binary is not present in the containerised Kubelet's PATH, this mechanism gets skipped, which a corresponding log entry in the Kubelet container indicates.
Adding an extra bind mount to the Kubelet configuration in the cluster YAML should fix the problem, e.g. along the lines of the sketch below. See https://rancher.com/docs/rancher/v2.x/en/cluster-admin/editing-clusters/#editing-cluster-as-yaml for editing the cluster as YAML.
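A minimal sketch of such a bind mount, assuming what the Kubelet needs is the host's systemd-run binary plus the systemd runtime directory (the exact paths below are assumptions and may differ per distro/setup):
```yaml
# Sketch only - paths are assumptions, adjust for your distro. The idea is
# to expose the host's systemd-run and the systemd runtime directory inside
# the kubelet container, so FUSE mount helpers are started in their own
# transient systemd scope instead of inside the kubelet's cgroup.
services:
  kubelet:
    extra_binds:
      - "/usr/bin/systemd-run:/usr/bin/systemd-run"
      - "/run/systemd:/run/systemd"
```
Note that changing extra_binds recreates the kubelet containers one more time; after that, GlusterFS mounts created by the kubelet should survive further kubelet restarts.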
This has not been resolved. Why close it? @deniseschannon
@adampl I had to configure the extra bind. I removed the bind and upgraded to 1.15.5 but the issue resurfaced once I restarted a kubelet. I then added the bind again and no longer experienced the issue after a kubelet restart.
As far as I can see, the bind plus 1.15.5 fixed the issue. I’m still waiting for the go-ahead to upgrade our other clusters and see whether this sticks.