kubernetes: Pod does not go into unready state while terminating
What happened:
A Pod that needs some time to drain remains in the READY state while terminating, until just before it is fully terminated.
What you expected to happen:
Once a Pod is deleted, it should go into the unready state within a short time, as the documentation says:
If you want to be able to drain requests when the Pod is deleted, you do not necessarily need a readiness probe; on deletion, the Pod automatically puts itself into an unready state regardless of whether the readiness probe exists. The Pod remains in the unready state while it waits for the containers in the Pod to stop.
https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#when-should-you-use-a-readiness-probe (Emphasis is mine.)
How to reproduce it (as minimally and precisely as possible):
- Create a Pod resource
$ cat test.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test
  namespace: default
spec:
  containers:
  - args:
    - sleep
    - "1000000"
    image: busybox
    name: test
$ kubectl apply -f ./test.yaml
The sleep command is a good example here because it does not handle SIGTERM and keeps running until it is killed.
- Delete the Pod
$ kubectl delete pod test
- The Pod stays in the READY state while it waits for its container to stop, contrary to the documentation
$ kubectl get pods
NAME   READY   STATUS        RESTARTS   AGE
test   1/1     Terminating   0          44s
The values of the Pod conditions Ready and ContainersReady also remain true.
Anything else we need to know?:
Environment:
- Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T16:31:21Z", GoVersion:"go1.16.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.6-gke.1000", GitCommit:"3ae0998c5052f420a17cb96bacf860ec5d6822a3", GitTreeState:"clean", BuildDate:"2021-04-29T09:17:16Z", GoVersion:"go1.15.10b5", Compiler:"gc", Platform:"linux/amd64"}
I also confirmed this with v1.19.9-gke.1400.
- Cloud provider or hardware configuration: GKE
- OS (e.g. cat /etc/os-release):
NAME="Container-Optimized OS"
ID=cos
PRETTY_NAME="Container-Optimized OS from Google"
HOME_URL="https://cloud.google.com/container-optimized-os/docs"
BUG_REPORT_URL="https://cloud.google.com/container-optimized-os/docs/resources/support-policy#contact_us"
KERNEL_COMMIT_ID=c19d150c6bd658510ec786390aec80ad476c7578
GOOGLE_CRASH_ID=Lakitu
GOOGLE_METRICS_PRODUCT_ID=26
VERSION=89
VERSION_ID=89
BUILD_ID=16108.403.15
- Kernel (e.g. uname -a):
Linux cs-291166234657-default-boost-vqzbq 5.4.104+ #1 SMP Fri Apr 30 09:52:02 PDT 2021 x86_64 GNU/Linux
- Install tools: GKE
- Network plugin and version (if this is a network-related bug):
- Others:
About this issue
- State: closed
- Created 3 years ago
- Comments: 25 (12 by maintainers)
~Posted a fix for this https://github.com/kubernetes/kubernetes/pull/110191~
https://github.com/kubernetes/kubernetes/blob/80056f73a614b21c7d2165d65f3b74a2fbf2264e/pkg/controller/endpoint/endpoints_controller.go#L447-L450
https://github.com/kubernetes/kubernetes/blob/80056f73a614b21c7d2165d65f3b74a2fbf2264e/pkg/controller/endpointslice/utils.go#L44-L54
https://github.com/kubernetes/kubernetes/blob/ea0764452222146c47ec826977f49d7001b0ea8c/pkg/controller/util/endpoint/controller_utils.go#L126-L135
According to the code above, a Pod is treated as an unready endpoint while it is terminating. I guess that's what the documentation means. If so, the documentation may need a PR to clarify it.
I found the same using the test pod described above:
I dumped the Pod's YAML repeatedly via the CLI for about 10-20 seconds. The output shows that the Pod's status.conditions never goes to 'Ready: False' until the container is terminated.
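The check described in this comment can also be done programmatically. The following is a rough Go sketch, assuming the Pod has been dumped as JSON (for example with kubectl get pod test -o json) and using only the metadata.deletionTimestamp and status.conditions fields. The podView type, the inspect function, and the sample document are hypothetical and written by hand for illustration; the sample is not captured output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// podView is a hypothetical type that decodes only the fields needed here.
type podView struct {
	Metadata struct {
		DeletionTimestamp *string `json:"deletionTimestamp"`
	} `json:"metadata"`
	Status struct {
		Conditions []struct {
			Type   string `json:"type"`
			Status string `json:"status"`
		} `json:"conditions"`
	} `json:"status"`
}

// inspect reports whether the Pod has a deletion timestamp and returns the
// status string of its Ready condition ("" if the condition is absent).
func inspect(podJSON []byte) (terminating bool, ready string, err error) {
	var p podView
	if err = json.Unmarshal(podJSON, &p); err != nil {
		return false, "", err
	}
	for _, c := range p.Status.Conditions {
		if c.Type == "Ready" {
			ready = c.Status
		}
	}
	return p.Metadata.DeletionTimestamp != nil, ready, nil
}

func main() {
	// Hand-written sample resembling the state reported in this issue.
	sample := []byte(`{
	  "metadata": {"name": "test", "deletionTimestamp": "2021-05-20T00:00:00Z"},
	  "status": {"conditions": [
	    {"type": "Ready", "status": "True"},
	    {"type": "ContainersReady", "status": "True"}
	  ]}
	}`)

	terminating, ready, err := inspect(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("terminating=%v ready=%q\n", terminating, ready)
}
```

A Pod in the state reported here would show terminating as true while the Ready status is still "True".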
I don’t think the documentation is accurate. That phrasing was added relatively recently via: https://github.com/kubernetes/website/pull/22603