kubernetes: k8s liveness probe fails during pod termination
What happened?
I’ve tested with k8s v1.21 on Amazon EKS, and with k8s v1.22, v1.21, and v1.20 in Minikube.
I deployed a pod running a simple HTTP server that exposes an /alive endpoint, and in the pod spec I specified a liveness probe that uses this endpoint.
While the pod is running there are no liveness probe failures, as expected.
But after executing kubectl delete po, I see a Kubernetes event reporting that the liveness probe failed.
I can reliably reproduce this; I’ve tried servers written in Spring Boot, Go, and Python/Gunicorn.
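For reference, a minimal Go server of the kind described above might look like this (the /alive path matches the report; everything else is illustrative):

package main

import (
	"log"
	"net/http"
)

func main() {
	// Liveness endpoint referenced by the pod's livenessProbe.
	http.HandleFunc("/alive", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}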
Then I modified the server to add a second endpoint, /slow, that sleeps for 30 seconds.
I call this endpoint, then immediately run kubectl delete po followed by kubectl get po -o yaml.
I can see that deletionTimestamp is now non-null, but the liveness probe fails continually while the app spends 30 seconds shutting down.
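The modification amounts to registering one more handler in the sketch above (and adding "time" to its imports); a sketch:

	// Blocks for 30 seconds before replying, so the app is still busy
	// shutting down while the pod is being terminated.
	http.HandleFunc("/slow", func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(30 * time.Second)
		w.WriteHeader(http.StatusOK)
	})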
Starting Minikube with minikube start --extra-config=kubelet.v=4, I never see the log line Pod deletion requested, setting probe result to success from https://github.com/kubernetes/kubernetes/blob/v1.21.0/pkg/kubelet/prober/worker.go#L234.
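For context, the check that emits that log line looks roughly like this (abridged from the linked worker.go; see the source for the exact code):

// Abridged from pkg/kubelet/prober/worker.go, doProbe(), v1.21.0:
// once the pod has a deletion timestamp, liveness/startup probe results
// are supposed to be forced to success instead of being probed.
if w.pod.ObjectMeta.DeletionTimestamp != nil && (w.probeType == liveness || w.probeType == startup) {
	klog.V(3).InfoS("Pod deletion requested, setting probe result to success",
		"probeType", w.probeType, "pod", klog.KObj(w.pod), "containerName", w.container.Name)
	w.resultsManager.Set(w.containerID, results.Success, w.pod)
	return true
}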
What did you expect to happen?
Based on this logic, I would expect the liveness probe not to fail during shutdown.
How can we reproduce it (as minimally and precisely as possible)?
Deploy a pod containing an HTTP server. The server should expose an endpoint that is used by the livenessProbe in the pod spec.
Delete the pod.
Now execute kubectl get events and observe the liveness probe failure event(s).
For example:
apiVersion: v1
kind: Pod
metadata:
  name: test-server
spec:
  containers:
  - name: test-server
    image: waltersmatthew/goserver:latest
    ports:
    - name: http
      containerPort: 8080
      protocol: TCP
    livenessProbe:
      httpGet:
        path: /alive
        port: http
      periodSeconds: 1
Anything else we need to know?
No response
Kubernetes version
1.21.0, 1.21.2 among others
Cloud provider
AWS EKS, minikube
OS version
No response
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, …) and versions (if applicable)
About this issue
- State: open
- Created 2 years ago
- Reactions: 1
- Comments: 21 (14 by maintainers)
@matthyx Here the existing pod is updated, which by itself is not a problem, but the config source always sends a copy of the pod to the config source channel (see https://github.com/kubernetes/kubernetes/blob/a0dfd958d5c5e8c2bd5fd30e1d2e9031e9aba1b6/pkg/kubelet/config/config.go#L300), and that copy is the problem I mentioned.
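If I read that comment correctly, the prober worker keeps the pod pointer it was created with, so a deletion timestamp set on a fresh copy delivered through the config channel is never visible to the worker's DeletionTimestamp check. A standalone sketch of that failure mode (illustrative only; the Pod and worker types here are made up, not kubelet code):

package main

import (
	"fmt"
	"time"
)

// Stand-in for the pod object; only the field relevant here.
type Pod struct {
	Name              string
	DeletionTimestamp *time.Time
}

// Stand-in for a prober worker that caches the pod it was created with.
type worker struct {
	pod *Pod
}

func (w *worker) sawDeletion() bool {
	return w.pod.DeletionTimestamp != nil
}

func main() {
	orig := &Pod{Name: "test-server"}
	w := &worker{pod: orig}

	// The config source delivers a *copy* of the pod, not the original.
	update := *orig
	now := time.Now()
	update.DeletionTimestamp = &now // deletion lands on the copy only

	fmt.Println(w.sawDeletion()) // false: the worker's cached pointer is stale
}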