rancher: [BUG] - Abnormally High Load Average on Kubernetes 1.25.9
Rancher Server Setup
- Rancher version: 2.7.3
- Installation option (Docker install/Helm Chart): Helm Chart on RKE1
- Proxy/Cert Details:

```yaml
# helm get values rancher -n cattle-system
USER-SUPPLIED VALUES:
hostname: rancher.example.com.br
ingress:
  tls:
    source: secret
privateCA: true
```
Information about the Cluster
- Kubernetes version: v1.25.9 for Custom Clusters and v1.24.10 for the Imported cluster
- Cluster Type (Local/Downstream): Downstream
- If downstream, what type of cluster? (Custom/Imported or specify provider for Hosted/Infrastructure Provider): The problem happens in both Custom and Imported clusters. Hosts are created from templates in vSphere.
User Information
- What is the role of the user logged in? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom): Admin
Describe the bug
Our worker and master nodes freeze sporadically throughout the week, and the symptoms are the same across all our clusters. Whenever the hosts (workers and masters) come under demand for resources such as CPU, memory, or IOPS, the server freezes. It is still possible to log in, but the load average is so high that any command takes minutes to complete. Once a server freezes, the only way to recover it is a hard reset from vSphere; the guest OS stops responding.
To Reproduce
This issue occurs sporadically throughout the week, particularly when there is a high demand for resources.
Result
The servers freeze under high resource demand, causing a significant delay in command execution.
Expected Result
The servers should be able to handle high resource demand without freezing or causing significant delays in command execution.
Screenshots
Here are some screenshots that can help with understanding the issue. The screenshots include information from the top and iotop commands captured on one host where the problem occurred.
Additional context
This issue is happening in both our Custom and Imported clusters. The cluster configurations are provided below. Our KubeletConfiguration for all clusters is as follows:

```yaml
# File generated via Ansible.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# https://kubernetes.io/docs/reference/config-api/kubelet-config.v1beta1/
# address is the IP address for the kubelet to serve on (set to 0.0.0.0 for all interfaces). Default: "0.0.0.0"
address: "10.0.142.42"
# serializeImagePulls, when enabled, tells the kubelet to pull images one at a time.
serializeImagePulls: false
# runtimeRequestTimeout is the timeout for all runtime requests except long-running requests - pull, logs, exec and attach. Default: "2m"
runtimeRequestTimeout: "30m"
# evictionHard is a map of signal names to quantities that defines hard eviction thresholds.
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "3%"
  nodefs.inodesFree: "3%"
  imagefs.available: "8%"
# evictionMaxPodGracePeriod is the maximum allowed grace period (in seconds) to use when terminating pods in response to a soft eviction threshold being met.
evictionMaxPodGracePeriod: 60
# failSwapOn tells the kubelet to fail to start if swap is enabled on the node. Default: true
failSwapOn: true
# containerLogMaxSize is a quantity defining the maximum size of the container log file before it is rotated. Default: "10Mi"
```
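For scale: on these 8 GiB nodes the `memory.available: "100Mi"` hard-eviction threshold is a very small reserve, so the kubelet only begins hard-evicting pods when the node is already deep into memory reclaim. A quick back-of-the-envelope check (a sketch; the total is the 8008932 KiB reported by `top` on the affected master further down):

```shell
# Express the kubelet hard-eviction reserve as a fraction of node RAM.
total_kib=8008932              # "KiB Mem : 8008932 total" from the degraded master
reserve_kib=$((100 * 1024))    # memory.available: "100Mi" from the config above
awk -v r="$reserve_kib" -v t="$total_kib" \
    'BEGIN { printf "eviction reserve: %d KiB = %.2f%% of RAM\n", r, 100 * r / t }'
```

A reserve of roughly 1.3% of RAM leaves almost no headroom between "eviction starts" and "the node is unresponsive".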
OS Version and Docker Version

```
$ docker --version
Docker version 20.10.12, build e91ed57

$ uname -a
Linux paranagua 3.10.0-1160.76.1.el7.x86_64 #1 SMP Wed Aug 10 16:21:17 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

$ cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
```
Imported Cluster Configuration

```yaml
# RKE configuration file for Rancher-Ouze-PRD RKE1.
# Check the names at https://rancher-ouze.example.com.br/dashboard/c/local/explorer/node. The names must match the existing hosts.
nodes:
- address: anchieta
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - etcd
  hostname_override: ""
  user: rancher
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: aracruz
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - etcd
  hostname_override: ""
  user: rancher
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: piuma
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - etcd
  hostname_override: ""
  user: rancher
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: vilavelha
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: rancher
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: saomateus
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: rancher
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: marataizes
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: rancher
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: saovicente
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: rancher
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: praiadasdunas
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: rancher
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: praiadoespelho
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: rancher
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: praiadocalhau
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: rancher
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: praiadacosta
  port: "22"
  internal_address: ""
  role:
  - worker
  hostname_override: ""
  user: rancher
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
services:
  etcd:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    win_extra_args: {}
    win_extra_binds: []
    win_extra_env: []
    external_urls: []
    ca_cert: ""
    cert: ""
    key: ""
    path: ""
    uid: 0
    gid: 0
    retention: ""
    creation: ""
  kube-api:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    win_extra_args: {}
    win_extra_binds: []
    win_extra_env: []
    service_cluster_ip_range: 10.43.0.0/16
    service_node_port_range: ""
    pod_security_policy: false
    always_pull_images: false
  kube-controller:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    win_extra_args: {}
    win_extra_binds: []
    win_extra_env: []
    cluster_cidr: 10.42.0.0/16
    service_cluster_ip_range: 10.43.0.0/16
  scheduler:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    win_extra_args: {}
    win_extra_binds: []
    win_extra_env: []
  kubelet:
    extra_args:
      config: /var/lib/kubelet/kubelet-config.yml
      v: '1'
    extra_binds:
    - >-
      /var/lib/kubelet/kubelet-config.yml:/var/lib/kubelet/kubelet-config.yml
    fail_swap_on: false
    generate_serving_certificate: false
    extra_env: []
    win_extra_args: {}
    win_extra_binds: []
    win_extra_env: []
    cluster_domain: cluster.local
    infra_container_image: ""
    cluster_dns_server: 10.43.0.10
  kubeproxy:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    win_extra_args: {}
    win_extra_binds: []
    win_extra_env: []
network:
  plugin: canal
  options: {}
  mtu: 0
  node_selector: {}
  tolerations: []
authentication:
  strategy: x509
  sans: []
addons: ""
addons_include: []
ssh_key_path: ~/.ssh/id_rsa
ssh_cert_path: ""
ssh_agent_auth: false
authorization:
  mode: rbac
  options: {}
kubernetes_version: "v1.24.10-rancher4-1"
private_registries: []
ingress:
  provider: ""
  options: {}
  node_selector: {}
  extra_args: {}
  dns_policy: ""
  extra_envs: []
  extra_volumes: []
  extra_volume_mounts: []
  http_port: 0
  https_port: 0
  network_mode: ""
  tolerations: []
  default_http_backend_priority_class_name: ""
  nginx_ingress_controller_priority_class_name: ""
cluster_name: ""
cloud_provider:
  name: ""
prefix_path: ""
win_prefix_path: ""
addon_job_timeout: 0
bastion_host:
  address: ""
  port: ""
  user: ""
  ssh_key: ""
  ssh_key_path: ""
  ssh_cert: ""
  ssh_cert_path: ""
  ignore_proxy_env_vars: false
restore:
  restore: false
  snapshot_name: ""
rotate_encryption_key: false
```
Custom Cluster Configuration

```yaml
answers: {}
docker_root_dir: /var/lib/docker
enable_cluster_alerting: false
enable_cluster_monitoring: false
enable_network_policy: false
fleet_workspace_name: fleet-default
local_cluster_auth_endpoint:
  ca_certs: |-
    -----BEGIN CERTIFICATE-----
    -----END CERTIFICATE-----
    -----BEGIN CERTIFICATE-----
    -----END CERTIFICATE-----
  enabled: false
  fqdn: proxy-devops.example.com.br
name: devops
rancher_kubernetes_engine_config:
  addon_job_timeout: 45
  authentication:
    strategy: x509
  authorization: {}
  bastion_host:
    ignore_proxy_env_vars: false
    ssh_agent_auth: false
  cloud_provider: {}
  dns:
    linear_autoscaler_params:
      cores_per_replica: 128
      max: 0
      min: 1
      nodes_per_replica: 4
      prevent_single_point_failure: true
    node_selector: null
    nodelocal:
      node_selector: null
      update_strategy:
        rolling_update: {}
    options: null
    reversecidrs: null
    stubdomains: null
    tolerations: null
    update_strategy:
      rolling_update: {}
    upstreamnameservers: null
  enable_cri_dockerd: true
  ignore_docker_version: false
  ingress:
    default_backend: false
    default_ingress_class: true
    http_port: 0
    https_port: 0
    provider: nginx
  kubernetes_version: v1.25.9-rancher2-1
  monitoring:
    provider: metrics-server
    replicas: 1
  network:
    mtu: 0
    options:
      flannel_backend_type: vxlan
    plugin: canal
  restore:
    restore: false
  rotate_encryption_key: false
  services:
    etcd:
      backup_config:
        enabled: true
        interval_hours: 12
        retention: 6
        s3_backup_config:
          access_key: ACCESS_KEY
          bucket_name: backup
          endpoint: s3
          folder: cluster
          region: sa
          safe_timestamp: false
          timeout: 300
      creation: 12h
      extra_args:
        election-timeout: '5000'
        heartbeat-interval: '500'
      gid: 0
      retention: 72h
      snapshot: false
      uid: 0
    kube-api:
      always_pull_images: false
      pod_security_policy: false
      secrets_encryption_config:
        enabled: false
      service_node_port_range: 30000-32767
    kube-controller: {}
    kubelet:
      extra_args:
        config: /var/lib/kubelet/kubelet-config.yml
        v: '1'
      extra_binds:
      - >-
        /var/lib/kubelet/kubelet-config.yml:/var/lib/kubelet/kubelet-config.yml
      fail_swap_on: true
      generate_serving_certificate: false
    kubeproxy: {}
    scheduler: {}
  ssh_agent_auth: false
upgrade_strategy:
  drain: false
  max_unavailable_controlplane: '2'
  max_unavailable_worker: 10%
  node_drain_input:
    delete_local_data: false
    force: false
    grace_period: -1
    ignore_daemon_sets: true
    timeout: 1800
```
top output of a degraded Master server:

```
top - 14:26:32 up 4 days, 4 min, 1 user, load average: 71.90, 68.14, 56.26
Tasks: 266 total, 5 running, 261 sleeping, 0 stopped, 0 zombie
%Cpu(s): 4.5 us, 48.8 sy, 0.0 ni, 1.3 id, 40.7 wa, 0.0 hi, 4.7 si, 0.0 st
KiB Mem : 8008932 total, 126876 free, 7646316 used, 235740 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 91368 avail Mem

  PID USER     PR NI    VIRT    RES   SHR S %CPU %MEM    TIME+ COMMAND
   45 root     20  0       0      0     0 R 68.7  0.0  16:05.64 kswapd0
 2015 root     20  0   11.0g 237468     0 S 26.3  3.0 145:49.37 etcd
 2904 root     20  0 2234468  58024     0 S 14.6  0.7  92:46.55 kubelet
23396 root     20  0  824904  16996     0 D 13.4  0.2   3:44.04 kube-controller
 4186 root     20  0 2115420  28424     0 S 12.9  0.4  47:25.29 calico-node
 4528 root     20  0 6608308   5.5g     0 S 12.9 71.8 160:26.56 agent
23395 root     20  0  760160  23908     0 S 12.1  0.3   3:01.25 kube-scheduler
 1997 root     20  0  754228  17648     0 R 10.2  0.2   3:35.35 kube-proxy
 1299 root     20  0 1573760  49636     0 S  6.1  0.6  46:00.94 dockerd
 3389 root     20  0  821892  38044     0 S  5.2  0.5   2:07.18 agent
 4187 root     20  0 1303600  19288     0 S  4.4  0.2   0:25.61 calico-node
    6 root     20  0       0      0     0 S  2.3  0.0   1:15.08 ksoftirqd/0
 4679 prometh+ 20  0  713644  11760     0 S  2.3  0.1   0:31.56 pushprox-client
 1125 root     20  0 1118140  32900     0 S  2.1  0.4   6:43.15 containerd
 4120 root     20  0 1482884  18792     0 D  2.1  0.2   3:52.02 flanneld
 1106 prometh+ 20  0  727004  12928     0 D  1.7  0.2   5:01.89 node_exporter
 4384 root     20  0  712432  10564     0 S  1.7  0.1   0:27.45 containerd-shim
27606 postfix  20  0   92120   1256   184 D  1.7  0.0   0:00.50 local
   14 root     20  0       0      0     0 S  1.5  0.0   0:42.46 ksoftirqd/1
 2019 root     20  0 1677868 685256     0 S  1.5  8.6 137:46.18 kube-apiserver
27607 root     20  0   92020   1208   192 D  1.3  0.0   0:00.50 pickup
  823 root     20  0  425060   1616    84 D  1.0  0.0   4:24.60 vmtoolsd
 2526 root     20  0  742088  18136     0 S  1.0  0.2  40:30.95 cri-dockerd
 4564 prometh+ 20  0  713644  11780     0 S  1.0  0.1   0:29.65 pushprox-client
   19 root     20  0       0      0     0 S  0.8  0.0   0:42.79 ksoftirqd/2
 1119 root     20  0  574284  15424     0 S  0.8  0.2   0:30.85 tuned
 4824 prometh+ 20  0  713644  10768     0 S  0.8  0.1   0:28.58 pushprox-client
 4840 prometh+ 20  0  713644  10116     0 S  0.8  0.1   0:28.02 pushprox-client
25910 root     20  0  743384   6584     0 D  0.8  0.1   0:12.26 calico
27624 root     20  0   28268    340   108 D  0.8  0.0   0:00.04 iptables-legacy
27324 root     20  0  245160   1088    96 R  0.6  0.0   0:03.11 sssd_be
27621 root     20  0   28268    340   108 D  0.6  0.0   0:00.03 iptables-legacy
    1 root     20  0  191420   1860    84 D  0.4  0.0   3:06.07 systemd
 1123 root     20  0  251272  18084 16648 S  0.4  0.2   0:14.04 rsyslogd
 4806 root     20  0  712176  10768     0 S  0.4  0.1   0:29.73 containerd-shim
27539 root     20  0  172944    996   100 R  0.4  0.0   0:01.34 top
    9 root     20  0       0      0     0 S  0.2  0.0   1:51.52 rcu_sched
   24 root     20  0       0      0     0 S  0.2  0.0   0:19.55 ksoftirqd/3
  544 root     20  0       0      0     0 S  0.2  0.0   1:24.18 xfsaild/dm-0
  627 root     20  0   72976  32096 31740 D  0.2  0.4   0:06.22 systemd-journal
  818 dbus     20  0   60308    712     0 S  0.2  0.0   0:27.90 dbus-daemon
 1137 zabbix   20  0   31988    616   404 S  0.2  0.0   0:53.24 zabbix_agentd
 1832 root     20  0  712432   9212     0 S  0.2  0.1   0:27.42 containerd-shim
 3289 root     20  0    2500    108     0 S  0.2  0.0   0:07.62 tini
 3792 root     20  0    2500    108     0 S  0.2  0.0   0:07.00 tini
 4788 root     20  0  712432  10320     0 S  0.2  0.1   0:32.79 containerd-shim
27512 root     20  0       0      0     0 R  0.2  0.0   0:00.08 kworker/3:2
27622 root     20  0    1412    144    56 D  0.2  0.0   0:00.01 iptables
27623 root     20  0    1412    100    12 D  0.2  0.0   0:00.01 iptables
    2 root     20  0       0      0     0 S  0.0  0.0   0:00.03 kthreadd
    4 root      0 -20      0      0     0 S  0.0  0.0   0:00.00 kworker/0:0H
```
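Reading the listing above: `kswapd0` at 68.7% CPU, 40.7% iowait, zero swap, only ~91 MiB available, and many processes stuck in `D` state all point toward severe memory pressure and continuous page-cache reclaim rather than a single misbehaving process. A few quick checks that can confirm this on a live node (a diagnostic sketch; these only read `/proc` and are safe to run):

```shell
# Available memory in MiB - compare against the kubelet's 100Mi hard-eviction threshold.
awk '/^MemAvailable/ { printf "MemAvailable: %d MiB\n", $2 / 1024 }' /proc/meminfo

# Page-reclaim counters: steadily climbing pgscan/pgsteal values between runs
# mean kswapd0 is continuously scanning for reclaimable pages.
grep -E '^(pgscan_kswapd|pgsteal_kswapd|pgmajfault)' /proc/vmstat

# Number of processes currently blocked in uninterruptible (D-state) waits.
grep '^procs_blocked' /proc/stat
```

If these counters climb steadily while load average rises, the node is reclaim-bound and the fix is more memory headroom (or larger eviction reserves), not a Kubernetes-level change.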
About this issue
- State: open
- Created a year ago
- Comments: 24 (6 by maintainers)
@gmanera,
To set `CATTLE_REQUEST_CACHE_DISABLED` on `rancher`, set the env var on the `rancher` deployment; on the downstream clusters, set the same env var on `cattle-cluster-agent`. If you have the Rancher CLI, it can be done from there as well. (Based on this comment.)
Note that setting the env var will trigger an update on the deployment, i.e. new pods will be created, so there may be a brief service interruption if there is only one pod.
Also, this comment provides steps for evaluating whether the fix will work for you.
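The concrete commands were lost from the quoted comment above; the usual way to set such an env var is `kubectl set env` (a sketch, assuming the default `cattle-system` namespace and deployment names - verify them in your install before running):

```shell
# On the cluster where Rancher itself runs:
kubectl -n cattle-system set env deployment/rancher CATTLE_REQUEST_CACHE_DISABLED=true

# On each downstream cluster's agent:
kubectl -n cattle-system set env deployment/cattle-cluster-agent CATTLE_REQUEST_CACHE_DISABLED=true
```

Each command triggers a rolling update of the deployment, as noted above.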
Yes I just did it 😃 https://github.com/rancher/rancher/issues/41906
Hi @jiaqiluo ,
Thank you for your prompt response, my friend. You’re truly awesome for our community.
Here are our answers to your queries:
The problem is occurring across all four of our clusters. We have recently upgraded both the Rancher version and the K8S version. However, it is worth mentioning that we frequently perform upgrades. We initially started our clusters on Rancher 2.6.0, and we are currently on version 2.7.3. Whenever a new Rancher version is released, we upgrade it along with the K8S version. A few months ago, we began with K8S version 1.22, and now we are using version 1.24.10, with Longhorn 1.4.1 deployed across all clusters.
Yes, the problem is happening on both types of nodes, namely the Master and Workers. It’s a bit tricky to provide a precise answer, but we have observed frozen nodes with RabbitMQ running (alongside Longhorn managing the RabbitMQ volumes), resource-intensive pods consuming significant CPU/memory resources (without any associated volumes/IOPS), and masters experiencing issues when running Helm commands, such as upgrading the kube-prometheus-stack. This problem has occurred approximately 30 times across different nodes, clusters, and node types.
In accordance with your suggestion, I have updated all nodes across all four clusters. Yesterday, I performed the update to address the kernel version issue, so the nodes are now on the updated kernel. Prior to the upgrade, the kernel version was:

```
3.10.0-1160.76.1.el7.x86_64
```

```
echo 1 > /proc/sys/vm/drop_caches
```
Thank you again for your assistance and guidance. We truly appreciate your help.