kubernetes: Pod creation fails and is never retried because the controller manager ignores a timeout error
Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug
What happened:
In some cases, the controller manager fails to create as many pods as the replicas count defined in the ReplicaSet spec. Normally, if the ReplicaSet controller fails to create a pod, it retries later.
(https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/replicaset/replica_set.go#L437)
func (rsc *ReplicaSetController) processNextWorkItem() bool {
	key, quit := rsc.queue.Get()
	if quit {
		return false
	}
	defer rsc.queue.Done(key)

	err := rsc.syncHandler(key.(string))
	if err == nil {
		rsc.queue.Forget(key)
		return true
	}

	utilruntime.HandleError(fmt.Errorf("Sync %q failed with %v", key, err))
	rsc.queue.AddRateLimited(key)

	return true
}
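As a side note, that retry behavior is easy to reproduce outside the controller. Below is a minimal, self-contained sketch (not the real controller code): the failOnce handler and the ReplicaSet key are made up, and only k8s.io/client-go/util/workqueue plus the standard library are assumed.

package main

import (
	"errors"
	"fmt"

	"k8s.io/client-go/util/workqueue"
)

// failOnce simulates a syncHandler that fails on the first attempt and
// succeeds afterwards.
var attempts int

func failOnce(key string) error {
	attempts++
	if attempts == 1 {
		return errors.New("transient failure talking to the apiserver")
	}
	return nil
}

func main() {
	queue := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())
	queue.Add("data-infra/app") // hypothetical ReplicaSet key

	// Two passes: the first fails and requeues the key, the second succeeds.
	for i := 0; i < 2; i++ {
		key, quit := queue.Get()
		if quit {
			return
		}
		err := failOnce(key.(string))
		if err == nil {
			queue.Forget(key) // success: drop retry history, do NOT requeue
			queue.Done(key)
			fmt.Println("synced", key)
			continue
		}
		// Failure: requeue with rate limiting, like processNextWorkItem does.
		queue.AddRateLimited(key)
		queue.Done(key)
		fmt.Println("requeued", key, "after error:", err)
	}
}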
But the manageReplicas func called from syncHandler intentionally ignores timeout errors (https://github.com/kubernetes/kubernetes/pull/49915), so if a create-pod request really does time out while accessing the apiserver, the failed key is not added back to the queue, and the pod that should have been created is never created.
(https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/replicaset/replica_set.go#L497)
err := rsc.podControl.CreatePodsWithControllerRef(rs.Namespace, &rs.Spec.Template, rs, controllerRef)
if err != nil && errors.IsTimeout(err) {
	// Pod is created but its initialization has timed out.
	// If the initialization is successful eventually, the
	// controller will observe the creation via the informer.
	// If the initialization fails, or if the pod keeps
	// uninitialized for a long time, the informer will not
	// receive any update, and the controller will create a new
	// pod when the expectation expires.
	return nil
}
return err
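To make the failure mode concrete, here is a minimal sketch of the two snippets combined. The error is built by hand with k8s.io/apimachinery/pkg/api/errors as a stand-in for the apiserver's "unable to return a response in the time allotted" response (which, per the FailedCreate event quoted below, is what hit the errors.IsTimeout branch): the timeout is swallowed, the sync looks successful, and processNextWorkItem calls Forget instead of AddRateLimited.

package main

import (
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
)

// createPod stands in for podControl.CreatePodsWithControllerRef; here it
// always fails with a hand-built Timeout-reason error as a proxy for an
// apiserver that could not answer in time.
func createPod() error {
	return apierrors.NewTimeoutError("the server was unable to return a response in the time allotted, but may still be processing the request", 0)
}

// manageReplicas mirrors the error handling quoted above: the timeout is
// swallowed, so the caller believes the sync succeeded.
func manageReplicas() error {
	if err := createPod(); err != nil {
		if apierrors.IsTimeout(err) {
			return nil // the retry is lost here
		}
		return err
	}
	return nil
}

func main() {
	fmt.Println("IsTimeout:", apierrors.IsTimeout(createPod())) // true
	fmt.Println("manageReplicas:", manageReplicas())            // <nil>, so the key is Forgotten, never requeued
}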
I saw this happen in a production environment.
I0808 15:42:02.252808 1 controller_utils.go:158] Updating ready status of pod app-485859248-app-v1-0-cfb4f7c97-5bqx2 to false
I0808 15:42:02.268176 1 controller_utils.go:158] Updating ready status of pod app-485859248-app-v1-0-cfb4f7c97-n5tj8 to false
I0808 15:45:32.723956 1 controller_utils.go:95] Starting deletion of pod data-infra/app-485859248-app-v1-0-cfb4f7c97-5bqx2
I0808 15:45:32.732262 1 replica_set.go:477] Too few replicas for ReplicaSet data-infra/app-485859248-app-v1-0-cfb4f7c97, need 4, creating 1
I0808 15:45:32.740614 1 controller_utils.go:95] Starting deletion of pod data-infra/app-485859248-app-v1-0-cfb4f7c97-n5tj8
I0808 15:46:32.737094 1 event.go:218] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"data-infra", Name:"app-485859248-app-v1-0-cfb4f7c97", UID:"cd2edaad-956c-11e8-abf4-00163e042223", APIVersion:"extensions", ResourceVersion:"5302543", FieldPath:""}): type: 'Warning' reason: 'FailedCreate' Error creating: the server was unable to return a response in the time allotted, but may still be processing the request (post pods)
What you expected to happen:
The controller manager should retry and eventually create the pod.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
- Kubernetes version (use kubectl version): v1.10.1
- Cloud provider or hardware configuration:
- OS (e.g. from /etc/os-release): centos 7
- Kernel (e.g. uname -a):
- Install tools:
- Others:
/cc @caesarxuchao
I am hitting this bug, too. If a ReplicaSetController create-pod request genuinely times out while accessing the apiserver, the failed key is not added back to the queue, so the pod that should have been created never is. In manageReplicas, after such a timeout the queue holds no key for the ReplicaSet; the controller only syncs when a key is present, so it never creates a new pod even once the expectation expires after 5 minutes (see the sketch after the logs below).
rs status: kubectl get rs --all-namespaces

NAMESPACE     NAME                 DESIRED   CURRENT   READY   AGE
kube-system   kubenab-6995bdbc8c   3         2         2       5d9h
kube-controller-manager log:

I1204 04:13:23.299122 1 event.go:258] Event(v1.ObjectReference{Kind:"ReplicaSet", Namespace:"kube-system", Name:"kubenab-6995bdbc8c", UID:"09bc4316-b142-4db2-a914-894659714b66", APIVersion:"apps/v1", ResourceVersion:"1174161", FieldPath:""}): type: 'Warning' reason: 'FailedCreate' Error creating: the server was unable to return a response in the time allotted, but may still be processing the request (post pods)
kube-apiserver log:

W1204 04:12:53.308340 1 dispatcher.go:136] Failed calling webhook, failing open kubenab-mutate.kubenab.com: failed calling webhook "kubenab-mutate.kubenab.com": Post https://kubenab.kube-system.svc:443/mutate?timeout=30s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
E1204 04:12:53.308372 1 dispatcher.go:137] failed calling webhook "kubenab-mutate.kubenab.com": Post https://kubenab.kube-system.svc:443/mutate?timeout=30s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
E1204 04:13:23.308698 1 writers.go:163] apiserver was unable to write a JSON response: http: Handler timeout
E1204 04:13:23.308736 1 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http: Handler timeout"}
I1204 04:13:23.309878 1 trace.go:81] Trace[874044225]: "Create /api/v1/namespaces/kube-system/pods" (started: 2019-12-04 04:12:23.298432879 +0000 UTC m=+190068.900812512) (total time: 1m0.011414107s):
Trace[874044225]: [30.010012786s] [30.00986498s] About to store object in database
Trace[874044225]: [1m0.011414107s] [30.001401321s] END
W1204 04:13:23.312196 1 dispatcher.go:105] Failed calling webhook, failing open kubenab-validate.kubenab.com: failed calling webhook "kubenab-validate.kubenab.com": Post https://kubenab.kube-system.svc:443/validate?timeout=30s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
E1204 04:13:23.312214 1 dispatcher.go:106] failed calling webhook "kubenab-validate.kubenab.com": Post https://kubenab.kube-system.svc:443/validate?timeout=30s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
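To illustrate the interplay with the 5-minute expectations timeout, here is a deliberately simplified toy (not the real ControllerExpectations type from pkg/controller): once the expectation expires, a sync would go ahead and create the missing pod, but that check only runs inside a sync, and nothing puts the key back on the queue to trigger one.

package main

import (
	"fmt"
	"time"
)

// toyExpectations is a drastically simplified stand-in for the controller's
// expectations tracking: it remembers how many creations are still pending
// and when the expectation was set, and treats an expectation older than the
// timeout as satisfied (mirroring the 5-minute expectations timeout).
type toyExpectations struct {
	pending   int
	timestamp time.Time
	timeout   time.Duration
}

func (e *toyExpectations) satisfied(now time.Time) bool {
	return e.pending <= 0 || now.Sub(e.timestamp) > e.timeout
}

func main() {
	exp := &toyExpectations{pending: 1, timestamp: time.Now(), timeout: 5 * time.Minute}

	// Right after the timed-out create: expectations are unsatisfied, so even
	// if a sync ran now it would skip pod creation.
	fmt.Println("satisfied now:", exp.satisfied(time.Now()))

	// Six minutes later the expectation has expired and a sync WOULD create
	// the missing pod -- but the key was never requeued, so no sync runs.
	fmt.Println("satisfied after 6m:", exp.satisfied(time.Now().Add(6*time.Minute)))
}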
I think I ran into this situation as well. It was caused by a situation similar to the one described by @seeruk. A change to a Terraform module resulted in ASG replacement. The change was supposed to be deployed by hand on each ASG individually, which, though not as graceful as using taints to drain the nodes in the old ASG, would probably have allowed Kubernetes to reschedule the unexpectedly terminated pods without too much difficulty. Instead, the change was inadvertently applied to a large number of ASGs at roughly the same time. This caused many ReplicaSets to report 0 current and 0 ready pods, and to have the "Error creating: the server was unable to return a response in the time allotted, but may still be processing the request (post pods)" event.
Luckily, manually scaling each ReplicaSet to 0 and then back to the original scale gave Kubernetes the kick it needed to create the pods.
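For what it's worth, the same kick can be scripted with client-go instead of kubectl. This is only a sketch under assumptions: the namespace and name below are placeholders for the stuck ReplicaSet, and the kubeconfig is read from the default home location.

package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Placeholders: point these at the ReplicaSet that is stuck.
	const namespace, name = "kube-system", "kubenab-6995bdbc8c"

	// Assumes a kubeconfig at the default location (~/.kube/config).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	ctx := context.Background()
	rsClient := client.AppsV1().ReplicaSets(namespace)

	// Remember the original scale, drop to zero, then restore it. Each scale
	// change generates a watch event, which re-enqueues the ReplicaSet and
	// forces a fresh sync.
	scale, err := rsClient.GetScale(ctx, name, metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	original := scale.Spec.Replicas

	scale.Spec.Replicas = 0
	if scale, err = rsClient.UpdateScale(ctx, name, scale, metav1.UpdateOptions{}); err != nil {
		panic(err)
	}

	scale.Spec.Replicas = original
	if _, err = rsClient.UpdateScale(ctx, name, scale, metav1.UpdateOptions{}); err != nil {
		panic(err)
	}
	fmt.Printf("kicked ReplicaSet %s/%s (restored replicas=%d)\n", namespace, name, original)
}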