kubernetes: Kubernetes Pods not scheduled due to "Insufficient CPU" when CPU resources are available

Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.7", GitCommit:"a2cba278cba1f6881bb0a7704d9cac6fca6ed435", GitTreeState:"clean", BuildDate:"2016-09-12T23:15:30Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"3", GitVersion:"v1.3.7", GitCommit:"a2cba278cba1f6881bb0a7704d9cac6fca6ed435", GitTreeState:"clean", BuildDate:"2016-09-12T23:08:43Z", GoVersion:"go1.6.2", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: AWS, masters (Count: 3, Size: m3.medium), minions (Count 5, Size m4.xlarge)
  • OS (e.g. from /etc/os-release): Ubuntu 14.04.5 LTS (Trusty Tahr)
  • Kernel (e.g. uname -a): Master: 3.13.0-95-generic Minion: 4.4.0-38-generic
  • Install tools: Ansible using modified contrib playbooks: https://github.com/kubernetes/contrib/tree/master/ansible
  • Others:

What happened: When scheduling pods with a low CPU resource request (15m), we receive the message “Insufficient CPU” for every node the scheduler tries. We are using multi-container pods, and running a describe shows nodes with resources available to schedule the pods. However, Kubernetes refuses to schedule them on any node.
kubectl_output.txt

What you expected to happen:

How to reproduce it (as minimally and precisely as possible): Below is a sample manifest we can use to reproduce the problem.
manifest.txt

We can schedule roughly 10-14 pods before we run into this problem. See the graph below.

[screenshot: graph of scheduled pod count]

About this issue

  • State: closed
  • Created 8 years ago
  • Reactions: 17
  • Comments: 73 (6 by maintainers)

Most upvoted comments

We are having the same problem and cannot restart the master since we are in GKE.

The Kubernetes dashboard says that only 0.05 CPU is in use. Why can't the pod be scheduled?


I just removed the resource limit and request specs; that works for now…


Another pod can't be scheduled this time either…

   ...
    spec:
      containers:
        - name: default-http-backend
          image: gcr.io/xxx/default-http-backend:latest
          ports:
            - containerPort: 8000
              name: http
          resources:
            requests:
              cpu: 10m

Requesting a very small amount of CPU made it work this time.

I know this won't solve all the problems mentioned on this page, but in my case it was a typo caused by copy-and-paste.

As you can see in the documentation:

The expression 0.1 is equivalent to the expression 100m, which can be read as “one hundred millicpu”.

But some people, myself included, accidentally copy CPU limit values from memory limits and end up with the wrong syntax.

So the solution in this case is to replace “Mi” with “m”.

Wrong: cpu: 100Mi. Correct: cpu: 100m or cpu: 0.1.
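
For reference, a minimal snippet showing the two unit families side by side (the values are illustrative, not from the original manifest):

    resources:
      requests:
        cpu: 100m      # CPU uses millicores ("m") or a plain decimal such as 0.1
        memory: 100Mi  # memory uses binary suffixes such as Mi or Gi
      limits:
        cpu: 250m
        memory: 200Mi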

I had this same issue. GKE has a default LimitRange that sets the default CPU request to 100m; you can check this by running kubectl get limitrange -o=yaml -n default (or -n your-namespace).

This default is applied to every container. So, for instance, on a 4-core node, and assuming each pod has 2 containers, it allows only around ~20 pods to be scheduled; at least, that is how I understood it.

The workaround is to change the default by editing or removing the LimitRange and deleting old pods so they are recreated with the new defaults, or to set explicit requests/limits in your pod spec, which override the defaults (see the sketch below).
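
As a sketch (the object name and values below are illustrative, not GKE's actual defaults), a replacement LimitRange that lowers the per-container default CPU request could look like this:

    apiVersion: v1
    kind: LimitRange
    metadata:
      name: cpu-defaults        # illustrative name
      namespace: default
    spec:
      limits:
        - type: Container
          defaultRequest:
            cpu: 50m            # request applied when a container specifies none
          default:
            cpu: 200m           # limit applied when a container specifies none

Apply it with kubectl apply -f, then delete and recreate existing pods so they pick up the new defaults; any request or limit set explicitly in a pod spec overrides these defaults.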

Some reading material:

  • https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/#specify-a-cpu-request-and-a-cpu-limit
  • https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/cpu-default-namespace/#create-a-limitrange-and-a-pod
  • https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#how-pods-with-resource-limits-are-run
  • https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-resource-requests-and-limits

Having a similar issue: there are almost 4 whole CPUs available within the cluster, a new Pod requests 500m (half a core), and the scheduler reports insufficient CPU on all nodes. 😱 Working with GKE, Kubernetes master version 1.9.2-gke.1.

@nsidhaye So, it turns out my problem was: we had 30 cores to use, and every deployment had a default of 1 core request and 2 cores limit. Even though the apps weren't consuming more than 50m CPU, each one was “locking” 1 whole core, meaning we hit the limit of 30 apps pretty quickly.

We had to redeploy most of our apps based on real resource consumption: using Prometheus/Grafana, we checked the average CPU (and memory) consumption of each pod, calculated how much it should request, and updated those values.

If you run kubectl describe nodes you can see how much has already been requested on each node, which should point you in the right direction for fixing the issue.

Is there any official update / feedback on this topic?

This isn’t really an ‘issue’ and I think we should close this question. See here for a further explanation: https://stackoverflow.com/a/45585916/1663462

The main issue is probably that it's not very intuitive why one gets the error message, even though it is the intended and correct behavior.

In this scenario the AWS/Azure/whatever console can report misleading CPU usage. Use kubectl describe node xxxx to check each node. You'll probably find that the CPU allocation on the node is too high (see image below; it shows a healthy state, but in your case it might be e.g. 80%). You may need to delete some resources from the node (e.g. any unused pods that aren't required) in order to successfully schedule new pods onto it.

[screenshot: kubectl describe node output]

Hi,

the issue was submitted in 2016. Any idea when it will be fixed? I have OpenShift Origin 3.7 and this is killing me…

We have the same problem: unable to create more pods (“Insufficient cpu”) while all nodes are at ~5-10% CPU load and 60-70% CPU limits (per kubectl describe node). Restarting the master node seems to get the pods scheduled.

I had the same issue, but after spending some time debugging I found the cause. It is not a bug in my case. Try checking kubectl describe nodes {node name} and summing the total CPU requests of the other pods.

In my case, there genuinely were insufficient CPU resources left. Even though overall CPU utilisation is low and free cycles are available, the CPU request of each pod is reserved and dedicated on its node. Try reducing the CPU requests of the other pods, and the pending pods will be scheduled automatically…
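
To illustrate the arithmetic (numbers made up): the scheduler compares the sum of existing pod requests against the node's allocatable CPU, not against live utilisation. Suppose kubectl describe node shows:

Allocatable:
 cpu:  4
Allocated resources:
  CPU Requests  CPU Limits
  ------------  ----------
  3900m (97%)   6200m (155%)

A new pod requesting 200m is rejected with “Insufficient cpu”, because 3900m + 200m exceeds the 4000m allocatable, even if actual CPU usage on the node is close to zero.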

Nothing?

How do you restart the nodes? I’m using Google Cloud platform… Would I SSH into the compute instances and restart?

I think the reason is that no node has enough unreserved CPU to satisfy the pod's CPU request. Use kubectl describe nodes {NodeName} to check each node's CPU requests. If the pod's requested CPU plus the currently requested CPU exceeds 100%, kube-scheduler emits an event like "Kubernetes Pods not scheduled due to “Insufficient CPU”".

This issue is coming up on two years old now; we're seeing this and it's holding us back from going to production.

Same issue on GKE: I just added a fresh new instance (micro), but it won't schedule even the smallest pod on it. E.g.:

   Requests:
     cpu:		1Mi
     memory:		64Mi
...
 26m		1s		95	default-scheduler			Warning		FailedScheduling	No nodes are available that match all of the following predicates:: Insufficient cpu (5).

Though on the fresh node there is enough cpu available:

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests	CPU Limits	Memory Requests	Memory Limits
  ------------	----------	---------------	-------------
  320m (34%)	100m (10%)	262Mi (44%)	414Mi (69%)

Other nodes are pretty packed at ~95% cpu allocated on each node, though even there it should schedule a 1m cpu pod.

Seeing this too on k8s bare-metal

For me, creating all the deployments and services in a different namespace (other than default) fixed this issue. On GKE

@k82cn

Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  31s (x637 over 3h)  default-scheduler  No nodes are available that match all of the predicates: Insufficient cpu (7), Insufficient memory (3), PodToleratesNodeTaints (5)

My resources on one of my nodes (they are all pretty much the same)

Capacity:
 cpu:     32
 memory:  65690484Ki
 pods:    110
Allocatable:
 cpu:     32
 memory:  65588084Ki
 pods:    110

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits    Memory Requests  Memory Limits
  ------------  ----------    ---------------  -------------
  26817m (83%)  30617m (95%)  12784Mi (19%)    32756Mi (51%)

It doesn't make sense to use the pod limit, which should kill my pod in case of a memory leak or abnormal CPU usage, to constrain pod scheduling. If I have 100 pods, each with its own limit, it is very unlikely that they will all be running at their peak limits at the same time. Kubernetes should take the actual resource usage on my nodes into account for that.

So either I have to create pods with very small requests that won't be able to handle any burst of traffic, or I waste a huge amount of resources.
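
One common compromise (the values below are illustrative) is to set a small request, which is all the scheduler reserves, and a higher limit the container may burst into; the pod then runs in the Burstable QoS class:

    resources:
      requests:
        cpu: 100m    # what the scheduler reserves on the node
      limits:
        cpu: 1       # ceiling the container may burst up to

Scheduling is driven only by the request, so nodes are not "locked" by limits that are rarely reached.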

This seems to be a really old issue; are there any updates on it?

Did anyone resolve this issue?

I'm trying to install Elasticsearch on my AKS cluster using Helm. I get an error telling me there isn't enough CPU. I have Prometheus on my cluster, but there are no CPU spikes anywhere near 100%.

Could someone help me understand the output of kubectl describe node? What is the relationship between requests and limits, and what should the numbers look like?

My problem was caused by CPU quota limits under IAM & admin -> Quotas -> Compute Engine API (CPU). I had to request more quota for my environment; with the upgraded limits my pods scale up easily.

Also seeing this issue. We’re on GKE.

Same issue, running 1.9.6-gke.1

I might be seeing this as well (on GKE):

I have a deployment with:

[...]
        resources:
          requests:
            cpu: 1
            memory: 3G
[...]
        resources:
          requests:
            cpu: 9G
            memory: 52G
[...]

I'm trying to deploy to a cluster that has 3 nodes with 15.89 CPU allocatable and 57.65 GB memory allocatable, but I get Insufficient cpu (3), Insufficient memory (6) when scheduling.

Doing stuff like bumping the second container down to:

   requests:
      cpu:        4G
      memory:     22G

results in the same scheduling issue.

Thanks @chrissound. I have another cluster with high-CPU instances, and indeed it hasn't shown any issues.

Not sure if related, but I recently got an email from GCP pointing to these docs: https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/. GKE will start doing this after the 1.7.6 upgrade.
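
Those docs cover the kubelet's kube-reserved and system-reserved settings, which subtract CPU and memory from a node's capacity before computing its allocatable resources, so a node can offer less schedulable CPU than its raw core count. A sketch of the relevant KubeletConfiguration fields (values illustrative):

    # KubeletConfiguration excerpt (values illustrative)
    kubeReserved:
      cpu: 100m
      memory: 256Mi
    systemReserved:
      cpu: 100m
      memory: 256Mi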