kubernetes: Warning FailedGetResourceMetric horizontal-pod-autoscaler missing request for cpu
What happened:
HPA always has a target of <unknown>/70% and events that say:
```
Events:
  Type     Reason                        Age                    From                       Message
  ----     ------                        ----                   ----                       -------
  Warning  FailedComputeMetricsReplicas  36m (x12 over 38m)     horizontal-pod-autoscaler  failed to get cpu utilization: missing request for cpu
  Warning  FailedGetResourceMetric       3m50s (x136 over 38m)  horizontal-pod-autoscaler  missing request for cpu
```
- There is a single container in the pods and it has resource requests and limits set.
- The metrics-server is running
- All pods have metrics shown in `kubectl top pod`
- All pods have metrics in `kubectl get --raw "/apis/metrics.k8s.io/v1beta1/pods"`
Here’s the HPA in YAML:
```yaml
apiVersion: v1
items:
- apiVersion: autoscaling/v1
  kind: HorizontalPodAutoscaler
  metadata:
    annotations:
      autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"True","lastTransitionTime":"2019-06-25T09:56:21Z","reason":"SucceededGetScale","message":"the
        HPA controller was able to get the target''s current scale"},{"type":"ScalingActive","status":"False","lastTransitionTime":"2019-06-25T09:56:21Z","reason":"FailedGetResourceMetric","message":"the
        HPA was unable to compute the replica count: missing request for cpu"}]'
    creationTimestamp: "2019-06-25T09:56:06Z"
    labels:
      app: restaurant-monitor
      env: prd01
      grafana: saFkkx6ik
      rps_region: eu01
      team: vendor
    name: myapp
    namespace: default
    resourceVersion: "56108423"
    selfLink: /apis/autoscaling/v1/namespaces/default/horizontalpodautoscalers/myapp
    uid: 7345f8fb-972f-11e9-935d-02a07544d854
  spec:
    maxReplicas: 25
    minReplicas: 14
    scaleTargetRef:
      apiVersion: extensions/v1beta1
      kind: Deployment
      name: myapp
    targetCPUUtilizationPercentage: 70
  status:
    currentReplicas: 15
    desiredReplicas: 0
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
```
What you expected to happen:
No <unknown> in HPA target
How to reproduce it (as minimally and precisely as possible):
I can’t be sure. It’s only this single HPA in our cluster that’s affected; 10 other HPAs are working OK.
Anything else we need to know?:
Environment:
- Kubernetes version (use `kubectl version`): 1.12.6
- Cloud provider or hardware configuration: EKS
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 54
- Comments: 69 (11 by maintainers)
I ran into this as well and this fixed it for me:
I am running pods with more than one container. In my case, the other container is a linkerd sidecar. I was setting the resource requests and limits for my deployment but did not set resources for linkerd proxy.
You must set resources for all containers within a pod, otherwise you will get the error “failed to get cpu utilization”. Maybe this error message could be updated?
Hope this helps!
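For anyone hitting the sidecar case, here is a minimal sketch of what “resources on every container” looks like. The container names, image names, and numbers are illustrative, not taken from this issue:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: app                        # main application container
        image: myregistry/myapp:latest   # hypothetical image
        resources:
          requests:
            cpu: 250m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
      - name: sidecar                    # e.g. a proxy or log-shipper sidecar
        image: myregistry/sidecar:latest
        resources:                       # without this block the HPA reports "missing request for cpu"
          requests:
            cpu: 50m
            memory: 64Mi
          limits:
            cpu: 100m
            memory: 128Mi
```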
I removed the cluster and rebuilt it from scratch. The problem doesn’t appear anymore.
https://github.com/kubernetes/kubernetes/blob/cd89631620f62ea8e43b55be9c6a7b06bc39274f/pkg/controller/podautoscaler/metrics/rest_metrics_client.go#L63-L69
The HPA uses only the label selector to filter pods, without checking ownership. I think this is a horrible mistake!
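You can see the selector-only behaviour with plain kubectl (the `app=myapp` label is taken from this issue; substitute your own):

```sh
# Running pods with that label (roughly what you'd expect the HPA to care about):
kubectl get pods -l app=myapp --field-selector=status.phase=Running

# What the selector actually matches: every pod carrying the label,
# including Completed pods created by a Job that happens to reuse it.
kubectl get pods -l app=myapp
```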
Had the same issue with a deployment that could not scale because of the `failed to get cpu utilization: missing request for cpu` error that the HPA of the deployment was showing.
Finally got it fixed now.
Here the reasons & background:
My deployment consists of a “main app” container and two “sidecar” containers.
The “main app” container had “resources” set. Both “sidecar” containers did not.
So the first problem was the missing “resources” specs on both sidecar containers.
Such behavior with multiple containers in the POD is described in https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
The second problem was that the “Job” that ran before the actual app deployment ALSO has to have “resources” defined (see the sketch below).
And THAT was really unexpected.
That is something @max-rocket-internet also stumbled upon and what I then tested. @max-rocket-internet - thanks for the hint 🍺
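A minimal sketch of a Job whose pod template also sets requests/limits (names, image, and values are illustrative, not from this thread). Without the `resources` block, a `Completed` pod from such a Job that matches the HPA’s selector triggers the `missing request for cpu` error:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: myapp-migrate          # hypothetical pre-deploy job
spec:
  template:
    metadata:
      labels:
        app: myapp             # same label as the Deployment -> matched by the HPA's selector
    spec:
      restartPolicy: Never
      containers:
      - name: migrate
        image: myregistry/myapp:latest
        command: ["./migrate.sh"]
        resources:             # set requests/limits here too
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 200m
            memory: 256Mi
```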
So, TIL:
@hex108 I am not working on this. 😃
Ahhh, I deleted those `Completed` pods and suddenly the HPA is back in action:

I had the same case with sidecars, setting request/limits for all containers fixed the problem. Thanks!
@hyprnick This worked for me. I had to add resource requests/limits to a sidecar container and remove “Completed” jobs from the namespace.
But these `Completed` pods are not from the deployment that is specified in the HPA, they are created from a `Job`. Sure, they don’t have `resources` set, but they should be ignored by the HPA, right? Here’s the pod JSON from one:
I will test creating more `Completed` pods WITHOUT `resources` set and see if the issue returns. And then test creating more `Completed` pods WITH `resources` and see if it’s OK.

Issues go stale after 90d of inactivity. Mark the issue as fresh with `/remove-lifecycle stale`. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with `/close`. Send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
I suspect the reason most of us are here is that our beloved nginx-ingress-controller fails to autoscale, leading to many tears and much frustration. It seems the chart gives the defaultbackend labels that overlap with the controller deployment, causing this issue. Set this in your values yaml to get around it:
Feel free to change the labels to whatever you want if you want to be more creative. 😄
The other option is to set:
This is the cleaner option, but you cannot switch to it without a maintenance window if you are currently deployed, because the label is immutable and you need to redeploy the controller.
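A quick way to check whether this label overlap is what’s biting you: look at the selector on the HPA’s target Deployment and list the pods it matches. The names below assume a Helm release called `nginx` and the typical `stable/nginx-ingress` labels; adjust to your release:

```sh
# Show the label selector the controller Deployment (and therefore the HPA) uses:
kubectl get deployment nginx-nginx-ingress-controller \
  -o jsonpath='{.spec.selector.matchLabels}{"\n"}'

# List every pod matching those labels; if the default-backend pod shows up here
# and has no CPU request, the HPA reports "missing request for cpu".
kubectl get pods -l app=nginx-ingress,release=nginx
```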
I had a `linkerd` proxy container injected into the pods with no `requests` nor `limits` defined. Once I defined them and reinitiated the deployments, the HPA was happily working.

Reading the discussion, it seems to me the error message `missing request for cpu` can have multiple causes, which adds to the confusion. IMO a good action item would be to make the message more detailed, e.g. pointing to which pod and container didn’t have the requests set.

I did the injection on the Namespace level:
Alternatively you could apply on the Deployment level:
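The original snippets aren’t preserved above; as a sketch, Linkerd lets you set the injected proxy’s requests/limits via `config.linkerd.io/*` annotations (annotation names per the Linkerd 2.x docs; verify against your Linkerd version). On the Namespace it looks roughly like this, and on a Deployment the same annotations go under `spec.template.metadata.annotations`:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-namespace                             # hypothetical namespace
  annotations:
    linkerd.io/inject: enabled
    config.linkerd.io/proxy-cpu-request: 50m     # request for the injected linkerd-proxy container
    config.linkerd.io/proxy-cpu-limit: 100m
    config.linkerd.io/proxy-memory-request: 64Mi
    config.linkerd.io/proxy-memory-limit: 128Mi
```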
That’s great but it’s still a bug for people who only have a single container in their pods.
I’ve asked for reviews on https://github.com/kubernetes/kubernetes/pull/86044
@max-rocket-internet I submitted a tentative PR #86044
@alexvaque In my case, I had to add the resource requests to the deployment to fix the issue.
Interesting. I was able to create an HPA for a simple sample deployment but Nginx is still unable to retrieve metrics.
Looks like this is related to how HPA uses labels https://github.com/helm/charts/issues/20315#issuecomment-595324778
Encountered the same issue and was able to resolve it by making sure that the CronJob pod `labels` are different from the deployment `matchLabels`.

In @max-rocket-internet’s case, the deployment pod `matchLabels`:

CronJob pod `labels`:
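The original snippets aren’t shown above, but the pattern looks roughly like this (labels, names, and image are illustrative): if the CronJob’s pod template reuses the Deployment’s selector label, the HPA picks up the CronJob’s `Completed` pods as well, so give them a distinguishing label:

```yaml
# Deployment selector (what the HPA effectively scans for):
#   matchLabels:
#     app: myapp
#
# A CronJob pod template with the same app=myapp label would be matched too.
# The version below uses a different label value to stay out of the HPA's way.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: myapp-report            # hypothetical cron job
spec:
  schedule: "0 * * * *"
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: myapp-cron     # distinct from the Deployment's app=myapp
        spec:
          restartPolicy: OnFailure
          containers:
          - name: report
            image: myregistry/myapp:latest
            command: ["./report.sh"]
```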
Maybe worth noting that even though I got these events and the HPA is saying `failed to get cpu utilization: missing request for cpu`, it still didn’t transition to `<unknown> / 70%` after 10 minutes. But if I create the HPA while these `Completed` pods are present, then it stays in the `<unknown>` state.

I’m getting the same issue. I have enabled metrics-server in minikube; when I create an HPA it always says `FailedGetResourceMetric 4m15s (x21 over 9m16s) horizontal-pod-autoscaler failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API`. My deployment is able to scale, but not scale down even after hours.
--------Edited------
I have tried the same deployment with a kind cluster and it’s working fine; there is some issue with minikube.
I found the same issue, and in my case the reason the pod or pods are failing with the metrics is that the pod is not 100% ready… Check the health checks, security groups, etc.
Here more info: https://docs.aws.amazon.com/eks/latest/userguide/horizontal-pod-autoscaler.html
For the `stable/nginx-ingress` chart you can either set the same limits on the `defaultBackend`, or use @oba11’s idea and set `defaultBackend.deploymentLabels` and/or `defaultBackend.podLabels` specifically, if the requests/limits are much higher on the controller pods and it would be a waste of resources.

Works for me!! Thanks
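As a sketch, the first option in Helm values form; the resource numbers are placeholders and `defaultBackend.resources` / `controller.resources` are assumed to be supported by your version of the `stable/nginx-ingress` chart:

```yaml
# values.yaml for stable/nginx-ingress (sketch)
controller:
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 200m
      memory: 256Mi
defaultBackend:
  resources:            # without requests here, the controller's HPA can hit "missing request for cpu"
    requests:
      cpu: 10m
      memory: 20Mi
    limits:
      cpu: 20m
      memory: 40Mi
```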
Using the new beta API `autoscaling/v2beta2` seems to solve it.

@hyprnick It worked. Thanks!
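For reference, a sketch of the same HPA expressed with `autoscaling/v2beta2`, mirroring the min/max/target values from the original report (the scale target uses `apps/v1` here instead of the deprecated `extensions/v1beta1`):

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
  namespace: default
spec:
  minReplicas: 14
  maxReplicas: 25
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # equivalent of targetCPUUtilizationPercentage: 70
```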
experiencing the same issue
Configured HPA as
and
Deployment CPU resources as
Cluster details
@hex108
Sure. Here’s from the deployment (`kubectl get -o json deployment myapp | jq '.spec.template.spec.containers[].resources'`):

This shows there’s only a single container in these pods.
Here’s a list of pods:
And resources from all pods (`kubectl get -o json -l app=myapp pod | jq '.items[].spec.containers[].resources'`):

That just repeats 14 times, once for each pod. And then the 3 completed pods show as:
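In the same spirit as the jq commands above, a one-liner (sketch, requires jq) that prints every pod matched by the selector whose containers are missing a CPU request, handy for spotting the `Completed` Job pods that trip up the HPA:

```sh
# Prints "<pod name>\t<phase>" for pods where at least one container has no CPU request.
kubectl get pods -l app=myapp -o json \
  | jq -r '.items[]
           | select(any(.spec.containers[]; .resources.requests.cpu == null))
           | "\(.metadata.name)\t\(.status.phase)"'
```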