kubernetes: kube-scheduler panics on malformed label
What happened?
A deployment was inadvertently created with a malformed label in requiredAffinityTerms
which appears to have caused the kube-scheduler to panic and crash.
Error from kube-scheduler pods:
E0107 05:56:44.099432 1 scheduling_queue.go:547] Error getting label selectors for pod: REDACTED.
E0107 05:56:44.099573 1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 353 [running]:
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime.logPanic(0x1b0f760, 0x2d9c290)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x95
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/workspace/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x89
panic(0x1b0f760, 0x2d9c290)
/usr/local/go/src/runtime/panic.go:969 +0x1b9
Events from REDACTED pod:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 4m44s default-scheduler 0/40 nodes are available: 40 parsing pod: requiredAffinityTerms: invalid label value: "-api-gateway-mockservice": at key: "release": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue', or 'my_value', or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?').
Warning FailedScheduling 4m44s default-scheduler 0/40 nodes are available: 40 parsing pod: requiredAffinityTerms: invalid label value: "-api-gateway-mockservice": at key: "release": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue', or 'my_value', or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?').
Warning FailedScheduling 110s default-scheduler 0/40 nodes are available: 40 parsing pod: requiredAffinityTerms: invalid label value: "-api-gateway-mockservice": at key: "release": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue', or 'my_value', or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?').
Warning FailedScheduling 51s default-scheduler 0/40 nodes are available: 40 parsing pod: requiredAffinityTerms: invalid label value: "-api-gateway-mockservice": at key: "release": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue', or 'my_value', or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?').
Normal NotTriggerScaleUp 4m40s cluster-autoscaler pod didn't trigger scale-up: 51 parsing pod: requiredAffinityTerms: invalid label value: "-api-gateway-mockservice": at key: "release": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue', or 'my_value', or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?')
What did you expect to happen?
kube-scheduler should not panic.
How can we reproduce it (as minimally and precisely as possible)?
Submit a deployment with a malformed label value.
Anything else we need to know?
No response
Kubernetes version
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.1", GitCommit:"86ec240af8cbd1b60bcc4c03c20da9b98005b92e", GitTreeState:"clean", BuildDate:"2021-12-16T11:33:37Z", GoVersion:"go1.17.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.13", GitCommit:"2444b3347a2c45eb965b182fb836e1f51dc61b70", GitTreeState:"clean", BuildDate:"2021-11-17T13:00:29Z", GoVersion:"go1.15.15", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.23) and server (1.20) exceeds the supported minor version skew of +/-1
Cloud provider
AWS
OS version
$ cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
$ uname -a
Linux ip-10-117-7-107.eu-west-1.compute.internal 3.10.0-1160.45.1.el7.x86_64 #1 SMP Wed Oct 13 17:20:51 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Install tools
Container runtime (CRI) and and version (if applicable)
cri-o
Related plugins (CNI, CSI, …) and versions (if applicable)
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 26 (17 by maintainers)
I agree, the pod should’ve not been admitted in the first place.
I think the issue is here: https://github.com/kubernetes/kubernetes/blob/8c69e5d25bfdbd266b384db29493bc9bad60f092/staging/src/k8s.io/apimachinery/pkg/apis/meta/v1/validation/validation.go#L57
we validate the label key but not value for MatchExpressions; we should add the following to validate the values:
we do that already with MatchLabels, but not for MatchExpressions:
https://github.com/kubernetes/kubernetes/blob/8c69e5d25bfdbd266b384db29493bc9bad60f092/staging/src/k8s.io/apimachinery/pkg/apis/meta/v1/validation/validation.go#L75
It appears that error (event) was emitted by the kube-scheduler (see initial post above). If the kube-apiserver detected the error, then the manifest should have been rejected and the kube-scheduler would never have tried to schedule a pod in that deployment.
So it would seem that the apiserver needs an update to catch such malformed label values. It does that in
metadata.labels
, but not inlabelSelector
sections.With malformed
metadata.labels: { release: -api-gateway-mockservice }
:The Deployment "label-test" is invalid: metadata.labels: Invalid value: "-api-gateway-mockservice": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue', or 'my_value', or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?')
With the above malformed
affinity
, the apiserver accepted the Deployment.Yes, I suppose we still need a bug fix.
Yes you are right, I think this issue no longer exists since https://github.com/kubernetes/kubernetes/commit/c7fef196b60856420ce3f0470acd1093ab1d9b5f So I think we can close this issue