kubernetes: Nil pointer dereference in KCM after v1 HPA patch request
What happened?
If we create a v2beta2 HPA object with fields specified for the scaleUp or scaleDown behavior, these values can be modified through the v1 version of the same object by changing the autoscaling.alpha.kubernetes.io/behavior annotation. If we patch the scaleUp or scaleDown field in the annotation to null, the patch request is accepted without the default values for scaleUp or scaleDown being applied. This results in a panic in KCM when it tries to dereference a nil pointer:
E1207 02:43:43.082791 10 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 2238 [running]:
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime.logPanic(0x41c0960, 0x747b350)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x95
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x86
panic(0x41c0960, 0x747b350)
/usr/local/go/src/runtime/panic.go:965 +0x1b9
k8s.io/kubernetes/pkg/controller/podautoscaler.(*HorizontalController).stabilizeRecommendationWithBehaviors(0xc000a700b0, 0xc0004bd760, 0x1a, 0x0, 0xc001a4c570, 0x500000004, 0x100000004, 0x0, 0x7fe0a1a18988, 0x7fe0a1a26ef8, ...)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:885 +0x3c
...
This panic is occurring here: https://github.com/kubernetes/kubernetes/blob/v1.21.2/pkg/controller/podautoscaler/horizontal.go#L885
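For reference, the dereference at that line assumes the behavior rules for the active scaling direction were already defaulted. Below is a minimal, runnable sketch of the failure mode using simplified stand-in types (not the real autoscaling API types or the upstream controller code):

package main

import "fmt"

// Simplified stand-ins for the autoscaling behavior types.
type HPAScalingRules struct {
	StabilizationWindowSeconds *int32
}

type HorizontalPodAutoscalerBehavior struct {
	ScaleUp   *HPAScalingRules
	ScaleDown *HPAScalingRules
}

// stabilizationWindow mirrors the assumption made around horizontal.go:885:
// the rules for the chosen direction are dereferenced without a nil check.
func stabilizationWindow(b *HorizontalPodAutoscalerBehavior, scalingUp bool) int32 {
	if scalingUp {
		return *b.ScaleUp.StabilizationWindowSeconds // panics if ScaleUp is nil
	}
	return *b.ScaleDown.StabilizationWindowSeconds
}

func main() {
	window := int32(300)
	b := &HorizontalPodAutoscalerBehavior{
		ScaleUp:   nil, // what the annotation edit produces
		ScaleDown: &HPAScalingRules{StabilizationWindowSeconds: &window},
	}
	fmt.Println(stabilizationWindow(b, true))
}

Running this panics with the same "invalid memory address or nil pointer dereference" seen in the KCM trace above.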
What did you expect to happen?
The patch request to the v1 HPA object should have applied the default values for the scaleUp or scaleDown fields when the annotation edit set them to null. When the v2beta2 object is updated, omitting the scaleUp or scaleDown properties results in the default values being applied, so the controller assumes that no fields in this struct are nil: https://github.com/kubernetes/kubernetes/blob/6ac2d8edc8606ab387924b8b865b4a69630080e0/pkg/apis/autoscaling/v2/defaults.go#L104
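A minimal sketch of that defaulting assumption, with simplified stand-in types. defaultScaleUpRules here is a hypothetical stand-in for GenerateHPAScaleUpRules; the values are taken from the defaulted "ScaleUp" visible in the annotation in the repro below:

package main

import "fmt"

type HPAScalingPolicy struct {
	Type          string
	Value         int32
	PeriodSeconds int32
}

type HPAScalingRules struct {
	StabilizationWindowSeconds *int32
	SelectPolicy               *string
	Policies                   []HPAScalingPolicy
}

type HorizontalPodAutoscalerBehavior struct {
	ScaleUp   *HPAScalingRules
	ScaleDown *HPAScalingRules
}

// defaultScaleUpRules stands in for GenerateHPAScaleUpRules: defaults only
// need to be applied to nil sections so the controller never sees a nil pointer.
func defaultScaleUpRules() *HPAScalingRules {
	window, policy := int32(0), "Max"
	return &HPAScalingRules{
		StabilizationWindowSeconds: &window,
		SelectPolicy:               &policy,
		Policies: []HPAScalingPolicy{
			{Type: "Pods", Value: 4, PeriodSeconds: 15},
			{Type: "Percent", Value: 100, PeriodSeconds: 15},
		},
	}
}

func setBehaviorDefaults(b *HorizontalPodAutoscalerBehavior) {
	if b.ScaleUp == nil {
		b.ScaleUp = defaultScaleUpRules()
	}
	// A real implementation would default ScaleDown the same way.
}

func main() {
	b := &HorizontalPodAutoscalerBehavior{} // scaleUp/scaleDown omitted by the user
	setBehaviorDefaults(b)
	fmt.Printf("scaleUp after defaulting: %+v\n", *b.ScaleUp) // non-nil
}

The v1 annotation patch path skips this defaulting step, which is what leaves the nil pointer behind.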
How can we reproduce it (as minimally and precisely as possible)?
Using v1.21.2:
- Create an example HPA object using autoscaling/v2beta2
cat <<EOF | kubectl apply -n kube-system -f -
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: coredns-scaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: coredns
  minReplicas: 4
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  behavior:
    scaleDown:
      policies:
      - type: Pods
        value: 4
        periodSeconds: 60
      - type: Percent
        value: 10
        periodSeconds: 60
      selectPolicy: Min
      stabilizationWindowSeconds: 300
EOF
- Edit the v1 version of the HPA object to set scaleUp to null in the annotation:
kubectl edit hpa.v1.autoscaling coredns-scaler -n kube-system
Change this annotation:
autoscaling.alpha.kubernetes.io/behavior: '{"ScaleUp":{"StabilizationWindowSeconds":0,"SelectPolicy":"Max","Policies":[{"Type":"Pods","Value":4,"PeriodSeconds":15},{"Type":"Percent","Value":100,"PeriodSeconds":15}]},"ScaleDown":{"StabilizationWindowSeconds":300,"SelectPolicy":"Min","Policies":[{"Type":"Pods","Value":4,"PeriodSeconds":60},{"Type":"Percent","Value":10,"PeriodSeconds":60}]}}'
To this:
autoscaling.alpha.kubernetes.io/behavior: '{"ScaleUp":null,"ScaleDown":{"StabilizationWindowSeconds":300,"SelectPolicy":"Min","Policies":[{"Type":"Pods","Value":4,"PeriodSeconds":60},{"Type":"Percent","Value":10,"PeriodSeconds":60}]}}'
- Now when we fetch the v2beta2 object, the scaleUp spec is missing (see the sketch after these repro steps for why the null survives conversion as a nil pointer):
kubectl get hpa.v2beta2.autoscaling coredns-scaler -n kube-system -o yaml
...
spec:
  behavior:
    scaleDown:
      policies:
      - periodSeconds: 60
        type: Pods
        value: 4
      - periodSeconds: 60
        type: Percent
        value: 10
      selectPolicy: Min
      stabilizationWindowSeconds: 300
  maxReplicas: 5
  metrics:
...
- KCM intermittently becomes unhealthy:
sh-4.2$ kubectl get cs
NAME                 STATUS      MESSAGE                                                                                       ERROR
controller-manager   Unhealthy   Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused
scheduler            Healthy     ok
etcd-0               Healthy     {"health":"true"}
Checking KCM logs:
E1215 00:11:52.291834 9 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 2339 [running]:
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime.logPanic(0x41c0960, 0x747b350)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x95
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x86
panic(0x41c0960, 0x747b350)
/usr/local/go/src/runtime/panic.go:965 +0x1b9
k8s.io/kubernetes/pkg/controller/podautoscaler.(*HorizontalController).stabilizeRecommendationWithBehaviors(0xc000d911e0, 0xc0012a7440, 0x1a, 0x0, 0xc001605bf0, 0x500000004, 0x100000004, 0x0, 0x7f4d5e8af368, 0x7f4d5e8b1fc0, ...)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:885 +0x3c
k8s.io/kubernetes/pkg/controller/podautoscaler.(*HorizontalController).normalizeDesiredReplicasWithBehaviors(0xc000d911e0, 0xc0000fc700, 0xc0012a7440, 0x1a, 0x100000004, 0x4, 0x1)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:781 +0x127
k8s.io/kubernetes/pkg/controller/podautoscaler.(*HorizontalController).reconcileAutoscaler(0xc000d911e0, 0xc00096be00, 0xc0012a7440, 0x1a, 0x0, 0x0)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:679 +0x2070
k8s.io/kubernetes/pkg/controller/podautoscaler.(*HorizontalController).reconcileKey(0xc000d911e0, 0xc0012a7440, 0x1a, 0x3eca300, 0xc000d8ae48, 0x17113ec)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:371 +0x1b7
k8s.io/kubernetes/pkg/controller/podautoscaler.(*HorizontalController).processNextWorkItem(0xc000d911e0, 0x0)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:225 +0xd8
k8s.io/kubernetes/pkg/controller/podautoscaler.(*HorizontalController).worker(0xc000d911e0)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:213 +0x2f
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc001dd1100)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x5f
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc001dd1100, 0x512cde0, 0xc001dd90b0, 0x512df01, 0xc000e22d80)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0x9b
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc001dd1100, 0x3b9aca00, 0x0, 0xc001dd2701, 0xc000e22d80)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x98
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc001dd1100, 0x3b9aca00, 0xc000e22d80)
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90 +0x4d
created by k8s.io/kubernetes/pkg/controller/podautoscaler.(*HorizontalController).Run
/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/controller/podautoscaler/horizontal.go:177 +0x245
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x17b41bc]
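The repro comes down to the edited annotation value round-tripping without re-defaulting. A minimal sketch, assuming the annotation is unmarshalled into behavior types whose field names match the JSON keys above (simplified stand-ins, not the real conversion code):

package main

import (
	"encoding/json"
	"fmt"
)

type HPAScalingPolicy struct {
	Type          string
	Value         int32
	PeriodSeconds int32
}

type HPAScalingRules struct {
	StabilizationWindowSeconds *int32
	SelectPolicy               *string
	Policies                   []HPAScalingPolicy
}

type HorizontalPodAutoscalerBehavior struct {
	ScaleUp   *HPAScalingRules
	ScaleDown *HPAScalingRules
}

func main() {
	// The annotation value after the edit in the repro above.
	annotation := `{"ScaleUp":null,"ScaleDown":{"StabilizationWindowSeconds":300,` +
		`"SelectPolicy":"Min","Policies":[{"Type":"Pods","Value":4,"PeriodSeconds":60},` +
		`{"Type":"Percent","Value":10,"PeriodSeconds":60}]}}`

	var b HorizontalPodAutoscalerBehavior
	if err := json.Unmarshal([]byte(annotation), &b); err != nil {
		panic(err)
	}
	// "ScaleUp":null parses cleanly and simply leaves the pointer nil; nothing
	// re-applies the defaults before the controller dereferences it.
	fmt.Println("ScaleUp is nil:", b.ScaleUp == nil) // true
}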
Anything else we need to know?
No response
Kubernetes version
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.2-eks-06eac09", GitCommit:"5f6d83fe4cb7febb5f4f4e39b3b2b64ebbbe3e97", GitTreeState:"clean", BuildDate:"2021-09-13T14:23:18Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.2-eks-06eac09", GitCommit:"5f6d83fe4cb7febb5f4f4e39b3b2b64ebbbe3e97", GitTreeState:"clean", BuildDate:"2021-09-13T14:20:15Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
Cloud provider
OS version
# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here
# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, …) and versions (if applicable)
About this issue
- Original URL
- State: open
- Created 3 years ago
- Reactions: 1
- Comments: 15 (10 by maintainers)
@natherz97 @pacoxu
GenerateHPAScaleUpRules sets default values for nil values, so why is this error showing? Guidance needed. Could you elaborate more on the cause of this bug?