amazon-eks-ami: EKS k8s 1.19 - AMI 1.19-v20210414 - (combined from similar events): System OOM encountered, victim process:

What happened: System OOM occurs on some of the nodes after upgrading to k8s 1.19 (using AMI v20210414). It seems to happen more often on bigger nodes, like r5.4xlarge and r5.8xlarge.

  Warning  SystemOOM  18m                   kubelet, ip-10-10-10-10.eu-central-1.compute.internal  System OOM encountered, victim process: iptables, pid: 13778
  Warning  SystemOOM  18m                   kubelet, ip-10-10-10-10.eu-central-1.compute.internal  System OOM encountered, victim process: iptables, pid: 13782
  Warning  SystemOOM  18m                   kubelet, ip-10-10-10-10.eu-central-1.compute.internal  System OOM encountered, victim process: iptables, pid: 13836
  Warning  SystemOOM  18m                   kubelet, ip-10-10-10-10.eu-central-1.compute.internal  System OOM encountered, victim process: iptables, pid: 13853
  Warning  SystemOOM  17m                   kubelet, ip-10-10-10-10.eu-central-1.compute.internal  System OOM encountered, victim process: iptables, pid: 18796
  Warning  SystemOOM  17m                   kubelet, ip-10-10-10-10.eu-central-1.compute.internal  System OOM encountered, victim process: iptables, pid: 18808
  Warning  SystemOOM  17m                   kubelet, ip-10-10-10-10.eu-central-1.compute.internal  System OOM encountered, victim process: iptables, pid: 18819
  Warning  SystemOOM  17m (x538 over 2d6h)  kubelet, ip-10-10-10-10.eu-central-1.compute.internal  (combined from similar events): System OOM encountered, victim process: iptables, pid: 18883
  Warning  SystemOOM  17m                   kubelet, ip-10-10-10-10.eu-central-1.compute.internal  System OOM encountered, victim process: iptables, pid: 18854
  Warning  SystemOOM  17m                   kubelet, ip-10-10-10-10.eu-central-1.compute.internal  System OOM encountered, victim process: iptables, pid: 18880
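For anyone debugging the same thing, a minimal diagnostic sketch (assuming shell access to an affected node via SSH or SSM; the kubelet config path is the one used by the EKS optimized AMI) to confirm the kernel-level OOM kills and see the kubelet memory reservations:

  # Kernel OOM-killer records, including which cgroup the killed iptables processes were charged to
  sudo dmesg -T | grep -i -B 1 -A 10 "out of memory"
  sudo journalctl -k | grep -i "killed process"
  # kubelet reservations and eviction thresholds on the EKS optimized AMI
  sudo grep -i -A 5 -E "reserved|eviction" /etc/kubernetes/kubelet/kubelet-config.json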

What you expected to happen: We did not have this System OOM issue on EKS k8s 1.18.

How to reproduce it (as minimally and precisely as possible):

  • EKS k8s 1.19
  • Node AMI: amazon-eks-node-1.19-v20210414
  • Region: Frankfurt (eu-central-1)
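The exact image ID for that release in eu-central-1 can be looked up before creating a nodegroup; a sketch using the public SSM parameter and an AMI name filter (the "recommended" parameter path is the documented one; 602401143452 is the EKS AMI owner account in most commercial regions, including eu-central-1):

  # Latest recommended EKS-optimized AMI for 1.19 in eu-central-1
  aws ssm get-parameter --region eu-central-1 \
    --name /aws/service/eks/optimized-ami/1.19/amazon-linux-2/recommended/image_id \
    --query Parameter.Value --output text

  # Image ID of the specific v20210414 build, by AMI name
  aws ec2 describe-images --region eu-central-1 --owners 602401143452 \
    --filters "Name=name,Values=amazon-eks-node-1.19-v20210414" \
    --query 'Images[0].ImageId' --output text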

Anything else we need to know?:

Environment:

  • AWS Region: Frankfurt
  • Instance Type(s): r5.4xlarge, r5.8xlarge
  • EKS Platform version (use aws eks describe-cluster --name <name> --query cluster.platformVersion): eks.4
  • Kubernetes version (use aws eks describe-cluster --name <name> --query cluster.version): 1.19
  • AMI Version: amazon-eks-node-1.19-v20210414
  • Kernel (e.g. uname -a): Linux ip-10-10-10-10.eu-central-1.compute.internal 5.4.105-48.177.amzn2.x86_64 #1 SMP Tue Mar 16 04:56:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
  • Release information (run cat /etc/eks/release on a node):
BASE_AMI_ID="ami-0849ada759754b5f5"
BUILD_TIME="Wed Apr 14 20:10:54 UTC 2021"
BUILD_KERNEL="5.4.105-48.177.amzn2.x86_64"
ARCH="x86_64"

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 3
  • Comments: 18 (4 by maintainers)

Most upvoted comments

@imriss we’re not there yet. We discovered that in order to downgrade to that AMI (or, in general, to start using a custom AMI instead of the default “latest” one) we need to rebuild the nodegroups from scratch, so it’s taking a bit more time than initially planned.
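For reference, a sketch of that nodegroup rebuild with a pinned AMI, using eksctl as one example (cluster name, nodegroup names, and the AMI ID are placeholders; an unmanaged nodegroup is shown since that is the simpler path for a custom AMI with eksctl):

  # Create a replacement nodegroup on the exact AMI you want, instead of the default latest one
  eksctl create nodegroup \
    --cluster my-cluster \
    --name ng-pinned-ami \
    --node-type r5.4xlarge \
    --nodes 3 \
    --managed=false \
    --node-ami ami-0123456789abcdef0
  # Then remove the old nodegroup; eksctl drains its nodes by default before deleting
  eksctl delete nodegroup --cluster my-cluster --name ng-old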