kubernetes: Some static pods fail to start
What happened?
If kubelet starts while kube-apiserver is not yet ready, and during this window someone removes the static pod manifest files, then after kube-apiserver recovers someone updates those manifests and moves them back into the manifest directory, kubelet does not restart these static pods.
What did you expect to happen?
All static pods should restart.
How can we reproduce it (as minimally and precisely as possible)?
- Add a static pod YAML to the kubelet manifest dir and wait for the pod to start.
- Make the kubelet API source not ready (kube-apiserver down or a network problem).
- Restart kubelet.
- Remove the static pod YAML and wait for kubelet to log:
skipping delete because sources aren't ready yet
- Recover kube-apiserver.
- Update the pod YAML (e.g. bump the image version) and move the updated YAML back into the kubelet manifest dir.
kubelet will not restart the new pod (see the sketch below for why the edited manifest is treated as a different pod).
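Why does editing the manifest matter? kubelet derives a static pod's UID from a hash of the pod it reads from the manifest (plus source information), so changing anything in the file, such as the image tag, yields a new UID while the pod's fullname stays the same. A minimal sketch of that effect, assuming a simplified content hash (staticPodUID is hypothetical, not the kubelet function):

package main

import (
    "crypto/md5"
    "encoding/hex"
    "fmt"
)

// staticPodUID is an illustration only: it hashes the raw manifest text plus the
// node name. The real kubelet hashes the decoded pod object together with its
// source information, but the effect is the same: any edit to the manifest
// produces a different UID while the fullname (name + namespace) stays the same.
func staticPodUID(manifest, nodeName string) string {
    sum := md5.Sum([]byte("host:" + nodeName + manifest))
    return hex.EncodeToString(sum[:])
}

func main() {
    before := "image: nginx:1.24\n"
    after := "image: nginx:1.25\n" // only the image tag changed

    fmt.Println(staticPodUID(before, "node1")) // old UID
    fmt.Println(staticPodUID(after, "node1"))  // new UID, same fullname
}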
Anything else we need to know?
We found that the pod worker's managePodLoop checks whether the pod is allowed to start:
// allowPodStart tries to start the pod and returns true if allowed, otherwise
// it requeues the pod and returns false. If the pod will never be able to start
// because data is missing, or the pod was terminated before start, canEverStart
// is false.
func (p *podWorkers) allowPodStart(pod *v1.Pod) (canStart bool, canEverStart bool) {
    if !kubetypes.IsStaticPod(pod) {
        // TODO: Do we want to allow non-static pods with the same full name?
        // Note that it may disable the force deletion of pods.
        return true, true
    }
    p.podLock.Lock()
    defer p.podLock.Unlock()
    status, ok := p.podSyncStatuses[pod.UID]
    if !ok {
        klog.ErrorS(nil, "Pod sync status does not exist, the worker should not be running", "pod", klog.KObj(pod), "podUID", pod.UID)
        return false, false
    }
    if status.IsTerminationRequested() {
        return false, false
    }
    if !p.allowStaticPodStart(status.fullname, pod.UID) {
        p.workQueue.Enqueue(pod.UID, wait.Jitter(p.backOffPeriod, workerBackOffPeriodJitterFactor))
        status.working = false
        return false, true
    }
    return true, true
}

// allowStaticPodStart tries to start the static pod and returns true if
// 1. there are no other started static pods with the same fullname
// 2. the uid matches that of the first valid static pod waiting to start
func (p *podWorkers) allowStaticPodStart(fullname string, uid types.UID) bool {
    startedUID, started := p.startedStaticPodsByFullname[fullname]
    if started {
        return startedUID == uid
    }

    waitingPods := p.waitingToStartStaticPodsByFullname[fullname]
    // TODO: This is O(N) with respect to the number of updates to static pods
    // with overlapping full names, and ideally would be O(1).
    for i, waitingUID := range waitingPods {
        // has pod already terminated or been deleted?
        status, ok := p.podSyncStatuses[waitingUID]
        if !ok || status.IsTerminationRequested() || status.IsTerminated() {
            continue
        }
        // another pod is next in line
        if waitingUID != uid {
            p.waitingToStartStaticPodsByFullname[fullname] = waitingPods[i:]
            return false
        }
        // we are up next, remove ourselves
        waitingPods = waitingPods[i+1:]
        break
    }
    if len(waitingPods) != 0 {
        p.waitingToStartStaticPodsByFullname[fullname] = waitingPods
    } else {
        delete(p.waitingToStartStaticPodsByFullname, fullname)
    }
    p.startedStaticPodsByFullname[fullname] = uid
    return true
}
kubelet skips handling the pod REMOVE event while the apiserver config source is not ready, so the pod worker never marks this pod as termination-requested and never removes its UID from startedStaticPodsByFullname.
When the apiserver recovers and the updated manifest is moved back in, the pod's UID has changed, so allowStaticPodStart returns false and the pod worker never performs the sync work.
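To make the failure mode concrete, here is a minimal sketch of the stuck state, assuming a stale entry keyed by the old UID (fakePodWorkers and the UIDs are made up; only the map semantics follow the code quoted above):

package main

import "fmt"

type UID string

// fakePodWorkers is an illustration of the state described above; the field name
// mirrors the kubelet's startedStaticPodsByFullname, but this is not kubelet code.
type fakePodWorkers struct {
    startedStaticPodsByFullname map[string]UID
}

// allowStaticPodStart mirrors the check quoted above: once some UID is recorded
// as started for a fullname, only that exact UID is allowed to start again.
func (p *fakePodWorkers) allowStaticPodStart(fullname string, uid UID) bool {
    if startedUID, started := p.startedStaticPodsByFullname[fullname]; started {
        return startedUID == uid
    }
    p.startedStaticPodsByFullname[fullname] = uid
    return true
}

func main() {
    p := &fakePodWorkers{startedStaticPodsByFullname: map[string]UID{}}

    // The original static pod starts and its UID is recorded.
    fmt.Println(p.allowStaticPodStart("nginx-node1", "uid-old")) // true

    // The REMOVE event was skipped while sources were not ready, so "uid-old"
    // is never cleaned up. The updated manifest arrives with a new UID and the
    // pod worker rejects it on every sync attempt.
    fmt.Println(p.allowStaticPodStart("nginx-node1", "uid-new")) // false
}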
/sig node @rphillips @gjkim42 @smarterclayton
Kubernetes version
Cloud provider
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, …) and versions (if applicable)
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 17 (17 by maintainers)
I tried the same steps on v1.26.2 and was able to repro (nginx was stuck and not terminated).
This is likely fixed by https://github.com/kubernetes/kubernetes/pull/113145, which was merged as part of the v1.27 cycle and properly handles terminating orphaned pods.
- restart kubelet
- remove nginx.yaml from the manifest dir
- delete the reject rules (so the apiserver is reachable again)
- crictl ps now shows the container still exists
- change the image to tomcat and watch the container status: the pod is never restarted
use master commit f0791b50143856177878e21bb44beb5e3e36cc78 to reproduce this issue