kubernetes: cAdvisor /stats/summary endpoint in kubelet returns incorrect cpu usage numbers

Environment

Kubernetes version: 1.2.3 Docker version: 1.10.3 3 node(c4.xLarge) cluster on AWS running CoreOS 1010.4.0.

Issue

After facing an issue with incorrect metrics being reported by heapster kubernetes/heapster#1177 I tried querying the cadvisor /stats/summary endpoint directly to see if that would give me consistent values for node cpu usage. I have one pod with cpu request=1000m and limit=1000m. In that pod I run a busy loop to consume a 100% of the cpu. This is what top shows on the node. screen shot 2016-06-10 at 10 07 52 am

I query the /stats/summary endpoint every 5 seconds, however it seems that the latest timestamps are only updated every 15 seconds or so. Checking the summary.Node.CPU.UsageNanoCores value from the summary returned gives me the following output(formatted):

TS:2016-06-09T20:05:20-07:00, Percentage:105.539566 Val:1055395658
TS:2016-06-09T20:05:31-07:00, Percentage:107.416097 Val:1074160974
TS:2016-06-09T20:05:44-07:00, Percentage:108.195877 Val:1081958770
TS:2016-06-09T20:05:59-07:00, Percentage:106.670910 Val:1066709101
TS:2016-06-09T20:06:19-07:00, Percentage:14.360576 Val:143605762
TS:2016-06-09T20:06:31-07:00, Percentage:108.000277 Val:1080002769
TS:2016-06-09T20:06:41-07:00, Percentage:108.373232 Val:1083732315
TS:2016-06-09T20:06:56-07:00, Percentage:107.025070 Val:1070250700
TS:2016-06-09T20:07:16-07:00, Percentage:13.004869 Val:130048687
TS:2016-06-09T20:07:31-07:00, Percentage:106.839146 Val:1068391461
TS:2016-06-09T20:07:48-07:00, Percentage:107.614464 Val:1076144640
TS:2016-06-09T20:08:06-07:00, Percentage:4.232330 Val:42323305
TS:2016-06-09T20:08:20-07:00, Percentage:106.009173 Val:1060091732
TS:2016-06-09T20:08:35-07:00, Percentage:108.121440 Val:1081214401
TS:2016-06-09T20:08:50-07:00, Percentage:106.659561 Val:1066595609
TS:2016-06-09T20:09:07-07:00, Percentage:1.724644 Val:17246439
TS:2016-06-09T20:09:19-07:00, Percentage:106.633227 Val:1066332268
TS:2016-06-09T20:09:38-07:00, Percentage:9.938621 Val:99386209
TS:2016-06-09T20:09:53-07:00, Percentage:107.046112 Val:1070461118
TS:2016-06-09T20:10:10-07:00, Percentage:3.373636 Val:33736361
TS:2016-06-09T20:10:25-07:00, Percentage:107.338541 Val:1073385413
TS:2016-06-09T20:10:39-07:00, Percentage:108.575783 Val:1085757834
TS:2016-06-09T20:10:54-07:00, Percentage:107.055382 Val:1070553817
TS:2016-06-09T20:11:13-07:00, Percentage:7.869509 Val:78695088
TS:2016-06-09T20:11:32-07:00, Percentage:11.476262 Val:114762620
TS:2016-06-09T20:11:45-07:00, Percentage:106.928681 Val:1069286811
TS:2016-06-09T20:12:03-07:00, Percentage:3.309632 Val:33096320
TS:2016-06-09T20:12:15-07:00, Percentage:105.832345 Val:1058323450
TS:2016-06-09T20:12:34-07:00, Percentage:5.079409 Val:50794090
TS:2016-06-09T20:12:47-07:00, Percentage:106.305439 Val:1063054389
TS:2016-06-09T20:13:05-07:00, Percentage:3.613690 Val:36136900
TS:2016-06-09T20:13:24-07:00, Percentage:9.785441 Val:97854409
TS:2016-06-09T20:13:43-07:00, Percentage:12.661783 Val:126617830

As you can see I’m not getting a steady report of near 100% cpu usage values for UsageNanoCores. Any idea why this might be the case or how I can debug this issue. Also is there any way I can change the resolution of the summary stats to get more fine grained reporting.

About this issue

  • Original URL
  • State: closed
  • Created 8 years ago
  • Reactions: 2
  • Comments: 66 (41 by maintainers)

Commits related to this issue

Most upvoted comments

Hello guys!!

I updated kubernetes to 1.3 and still having the issue.

Enviroment:

Kubernetes version: 1.3.2 Docker version: 1.10.3 Heapster: v1.1.0 (deployed in standalone mode using the ) 6 node cluster (on premises) running CoreOS 1068.8.0.

Here are some info:

heapster’s cpu/usage

curl 10.125.7.224:31580/api/v1/model/namespaces/rubis-ns/pods/rubis-m8n/metrics/cpu/usage { “metrics”: [ { “timestamp”: “2016-07-28T20:04:00Z”, “value”: 976820 }, { “timestamp”: “2016-07-28T20:05:00Z”, “value”: 362613815 }, { “timestamp”: “2016-07-28T20:06:00Z”, “value”: 976820 } ], “latestTimestamp”: “2016-07-28T20:06:00Z”

heapster’s log

I0728 20:01:05.351222 1 manager.go:152] ScrapeMetrics: time: 350.478305ms size: 442 I0728 20:02:05.000562 1 manager.go:79] Scraping metrics start: 2016-07-28 20:01:00 +0000 UTC, end: 2016-07-28 20:02:00 +0000 UTC I0728 20:02:05.603113 1 manager.go:152] ScrapeMetrics: time: 602.172438ms size: 442 I0728 20:03:05.000407 1 manager.go:79] Scraping metrics start: 2016-07-28 20:02:00 +0000 UTC, end: 2016-07-28 20:03:00 +0000 UTC I0728 20:03:05.436473 1 manager.go:152] ScrapeMetrics: time: 435.826802ms size: 442 I0728 20:04:05.000521 1 manager.go:79] Scraping metrics start: 2016-07-28 20:03:00 +0000 UTC, end: 2016-07-28 20:04:00 +0000 UTC I0728 20:04:05.389321 1 manager.go:152] ScrapeMetrics: time: 388.585729ms size: 442 I0728 20:05:05.000420 1 manager.go:79] Scraping metrics start: 2016-07-28 20:04:00 +0000 UTC, end: 2016-07-28 20:05:00 +0000 UTC I0728 20:05:05.456034 1 manager.go:152] ScrapeMetrics: time: 455.30329ms size: 442 I0728 20:06:05.000495 1 manager.go:79] Scraping metrics start: 2016-07-28 20:05:00 +0000 UTC, end: 2016-07-28 20:06:00 +0000 UTC I0728 20:06:05.405457 1 manager.go:152] ScrapeMetrics: time: 404.548733ms size: 442 I0728 20:07:05.000560 1 manager.go:79] Scraping metrics start: 2016-07-28 20:06:00 +0000 UTC, end: 2016-07-28 20:07:00 +0000 UTC I0728 20:07:05.493016 1 manager.go:152] ScrapeMetrics: time: 492.195673ms size: 442

I am using HPA and seeing some strange values on “CURRENT” cpu utilization :

image

Thanks for your attention!