kubernetes: Heapster returns incorrect cpu/usage_rate, caused by overflow?
Forked from https://github.com/kubernetes/kubernetes/issues/27194
@ichekrygin reported a weird cpu/usage_rate issue with Heapster. I am running v1.3.5 and this still seems to be an issue:
```
curl http://localhost:8001/api/v1/proxy/namespaces/kube-system/services/heapster/api/v1/model/namespaces/production/pods/rain-377063289-tz0tq/metrics/cpu/usage_rate
{
  "metrics": [
    {
      "timestamp": "2016-08-18T03:04:00Z",
      "value": 18446744073709551449
    },
    {
      "timestamp": "2016-08-18T03:05:00Z",
      "value": 705
    },
    {
      "timestamp": "2016-08-18T03:06:00Z",
      "value": 18446744073709551028
    },
    {
      "timestamp": "2016-08-18T03:07:00Z",
      "value": 696
    },
    {
      "timestamp": "2016-08-18T03:08:00Z",
      "value": 18446744073709550587
    },
    {
      "timestamp": "2016-08-18T03:09:00Z",
      "value": 0
    },
```
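Note that the bogus values sit just below 2^64 = 18446744073709551616: 18446744073709551449 = 2^64 − 167, 18446744073709551028 = 2^64 − 588, and 18446744073709550587 = 2^64 − 1029. In other words, they look like small negative rates (−167, −588, −1029) reinterpreted as unsigned 64-bit integers. A tiny Go check (illustrative only, not Heapster code) reproduces the pattern exactly:

```go
package main

import "fmt"

func main() {
	rate := int64(-167)       // a slightly negative computed rate
	fmt.Println(uint64(rate)) // prints 18446744073709551449, i.e. 2^64 - 167
}
```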
Node version:

```
kubelet --version
Kubernetes v1.3.5
```
@timstclair and I checked his node stats by collecting `curl -s http://localhost:10255/stats/summary | jq '.node.cpu.usageCoreNanoSeconds'`. `usageCoreNanoSeconds` is the cumulative CPU usage, and it is the only stat used by Heapster and the HPA. It maps to Heapster's CPU usage and, together with the sample timestamps, is used to calculate cpu/usage_rate. We noticed that the usageCoreNanoSeconds values are sane, but the corresponding Heapster cpu/usage_rate values are insane, as shown above. I believe Heapster ran into the same kind of overflow issue cAdvisor encountered before; a sketch of the failure mode follows. Filing this as a record to make sure the problem is addressed at the proper layer.
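For illustration, here is a minimal sketch, assuming the rate is derived by subtracting consecutive usageCoreNanoSeconds samples and storing the result as an unsigned 64-bit value. The function names and the guarded variant are hypothetical, not Heapster's actual code:

```go
package main

import (
	"fmt"
	"time"
)

// usageRate derives cpu/usage_rate (millicores) from two samples of the
// cumulative usageCoreNanoSeconds counter. Hypothetical sketch, not
// Heapster's actual code.
func usageRate(prevNs, currNs uint64, dt time.Duration) uint64 {
	// BUG: if the counter went backwards (e.g. a container restart or a
	// stale sample), the signed delta is negative and the final
	// conversion to uint64 wraps to a value just below 2^64.
	delta := int64(currNs) - int64(prevNs)
	rate := delta * 1000 / dt.Nanoseconds()
	return uint64(rate)
}

// usageRateGuarded skips impossible samples instead of wrapping, which is
// one way a consumer of cumulative counters can avoid the symptom.
func usageRateGuarded(prevNs, currNs uint64, dt time.Duration) (uint64, bool) {
	if currNs < prevNs {
		return 0, false // counter reset: no meaningful rate for this window
	}
	return (currNs - prevNs) * 1000 / uint64(dt.Nanoseconds()), true
}

func main() {
	dt := time.Minute
	prev := uint64(42_000_000_000)                   // cumulative core-nanoseconds
	curr := prev - 167*uint64(dt.Nanoseconds())/1000 // counter went backwards

	fmt.Println(usageRate(prev, curr, dt)) // 18446744073709551449 (wrapped -167)
	if r, ok := usageRateGuarded(prev, curr, dt); ok {
		fmt.Println(r)
	} else {
		fmt.Println("sample skipped: counter reset") // printed here
	}
}
```

Guarding like this only masks the symptom; as noted above, the real question is which layer lets the counter go backwards in the first place, and that is where the fix should land.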
I believe this is no longer an issue with our 1.5 series AWS images: the Docker version is >= 1.12, the systemctl version is < 226, and Docker is not configured with custom options. Closing.
@danielfm The latest stable AMI still does not include the fix; the beta channel is fixed, however.