kubernetes: AssumePod failed when scheduling StatefulSet

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:

A Pod created by a StatefulSet failed to schedule, with the error:

AssumePod failed: pod <namespace>/<pod> state wasn't initial but get assumed
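
For context, this message comes from the scheduler's internal cache, which refuses to assume a pod whose cached entry is not in its initial state. Below is a minimal sketch of that kind of check; the type and function names are illustrative assumptions, not the actual kube-scheduler code.

```go
package main

import "fmt"

// Illustrative sketch only: the scheduler keeps an in-memory cache of pod
// state, and AssumePod refuses to (re)assume a pod whose cached entry is not
// in its initial state. These names are assumptions, not the real
// kube-scheduler types.
type podState struct {
	nodeName string
	assumed  bool
}

type schedulerCache struct {
	pods map[string]*podState // keyed by "namespace/name"
}

// AssumePod tentatively records a scheduling decision for the pod key.
func (c *schedulerCache) AssumePod(key, node string) error {
	if st, ok := c.pods[key]; ok && st.assumed {
		// A leftover assumed entry for the same key produces the error
		// reported in this issue.
		return fmt.Errorf("pod %s state wasn't initial but get assumed", key)
	}
	c.pods[key] = &podState{nodeName: node, assumed: true}
	return nil
}

func main() {
	c := &schedulerCache{pods: map[string]*podState{}}
	key := "monitoring/alertmanager-main-0" // hypothetical pod name
	_ = c.AssumePod(key, "node-a")

	// A stale assumed entry that was never cleared makes every later attempt
	// for the same namespace/name fail.
	if err := c.AssumePod(key, "node-b"); err != nil {
		fmt.Println("AssumePod failed:", err)
	}
}
```

Because the cache is keyed by namespace/name, recreating the StatefulSet with the same name produces pods with the same keys, which keep hitting the stale entry; that is consistent with renaming the StatefulSet working around the problem.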

What you expected to happen:

Pod scheduled to node correctly.

How to reproduce it (as minimally and precisely as possible):

I don’t yet have steps to reproduce this from a fresh cluster. I’m using the Prometheus Operator to create a StatefulSet running Alertmanager. I have deleted and recreated the StatefulSet numerous times, but I always end up with this error. Changing the name of the StatefulSet fixes the problem.

Environment:

  • Kubernetes version (use kubectl version): v1.8.4-gke.0
  • Cloud provider or hardware configuration: GKE
  • OS (e.g. from /etc/os-release): Google COS (Container-Optimized OS)

I’m also wondering whether there is another workaround for this on GKE, where I can’t restart the scheduler (which I suspect would fix this).

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 28 (14 by maintainers)

Most upvoted comments

The cache having stale information or not receiving events is a known issue, but the root cause has not yet been identified, so 1.9 does not include a fix.
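
To illustrate what "stale information" can look like here: assumed pods are normally cleaned up on a TTL once their binding is reported finished, so an entry whose binding never completes (or whose events are lost) can linger indefinitely. The sketch below is loosely modelled on that behaviour; all names and the 30-second TTL are assumptions rather than the exact implementation.

```go
package main

import (
	"fmt"
	"time"
)

// Hedged sketch of TTL-based cleanup of assumed pods: an assumed entry only
// gets an expiry deadline once its binding is reported finished. If that
// never happens, the entry has no deadline and is never expired, so the
// stale state survives until the scheduler process restarts.
type assumedPod struct {
	deadline *time.Time // set when binding finishes; nil means "never expires"
}

func cleanupExpired(assumed map[string]*assumedPod, now time.Time) {
	for key, p := range assumed {
		if p.deadline != nil && now.After(*p.deadline) {
			delete(assumed, key) // normal path: dropped after the TTL
		}
		// Entries with a nil deadline are skipped forever.
	}
}

func main() {
	ttl := 30 * time.Second
	finished := time.Now().Add(ttl)

	assumed := map[string]*assumedPod{
		"monitoring/alertmanager-main-0": {deadline: nil},       // binding never finished
		"monitoring/alertmanager-main-1": {deadline: &finished}, // healthy case
	}

	cleanupExpired(assumed, time.Now().Add(time.Minute))
	fmt.Println("entries still cached:", len(assumed)) // the stale pod remains
}
```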

@gmile I see, must be GKE magic 😃

Anyway, we have a fix in 1.10 (https://github.com/kubernetes/kubernetes/pull/61069), which does not appear to be included in v1.9.4. cc @krmayankk

I was hit by this problem; restarting all the schedulers helped.

In our case the root cause was a storage problem affecting both etcd and the schedulers, so the cached information was incorrect: the cache believed the pod was scheduled, but in fact there was an error writing the new value.
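
To make that failure mode concrete: the scheduler optimistically assumes a pod onto a node before the binding is persisted, and it is supposed to forget that assumption if the write fails. A hedged sketch of that flow (illustrative names only) shows how a storage error combined with a missed rollback leaves the stale assumed entry that later triggers the AssumePod error:

```go
package main

import (
	"errors"
	"fmt"
)

// Hedged sketch of the "assume, then bind, then forget on error" flow. If
// the binding write fails (e.g. because of the etcd/storage problem described
// above) and the rollback never happens, the assumed entry stays behind and
// every retry for the same pod key hits the AssumePod error.
type cache map[string]string // pod key -> assumed node

func (c cache) Assume(key, node string) error {
	if _, ok := c[key]; ok {
		return fmt.Errorf("pod %s state wasn't initial but get assumed", key)
	}
	c[key] = node
	return nil
}

func (c cache) Forget(key string) { delete(c, key) }

// bind stands in for persisting the binding to the API server / etcd; here it
// always fails, mimicking the storage outage.
func bind(key, node string) error {
	return errors.New("error writing binding to storage")
}

func scheduleOnce(c cache, key, node string, rollbackOnError bool) {
	if err := c.Assume(key, node); err != nil {
		fmt.Println("AssumePod failed:", err)
		return
	}
	if err := bind(key, node); err != nil {
		fmt.Println("binding failed:", err)
		if rollbackOnError {
			c.Forget(key) // healthy path: undo the optimistic assumption
		}
		// If the rollback is skipped or lost, the stale entry remains.
	}
}

func main() {
	c := cache{}
	key := "monitoring/alertmanager-main-0" // hypothetical pod name
	scheduleOnce(c, key, "node-a", false) // bind fails, cache not rolled back
	scheduleOnce(c, key, "node-a", false) // retry now fails with AssumePod
}
```

Restarting the scheduler clears this in-memory state, which matches the workaround above.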