prometheus: Prometheus 2.2.1 "out of memory" when starting TSDB

What did you do?

Starting Prometheus in a container.

What did you expect to see?

Normal behavior.

What did you see instead? Under which circumstances?

fatal error: runtime: out of memory

However, after I ran “docker-compose down -v” to delete the volume, it worked again.

I got the same issue when I ran Prometheus in Kubernetes as a StatefulSet. Prometheus hung at “Starting TSDB ...”. After I deleted the PersistentVolume, Prometheus came up as usual.
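The stack trace in the logs shows the out-of-memory error raised while `tsdb.(*Head).ReadWAL` is replaying write-ahead-log samples, which would be consistent with startup succeeding once the volume is gone. Before deleting the volume, it may help to check how much WAL data is waiting to be replayed. The following is only a minimal sketch: it assumes the Prometheus 2.x layout in which the WAL lives in a `wal` subdirectory of the storage path, and `dir_size_bytes` and `wal_size_bytes` are hypothetical helper names, not part of Prometheus.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip anything that is not a regular file (e.g. broken symlinks).
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def wal_size_bytes(data_dir):
    """Size of the `wal` subdirectory of a data directory, or 0 if it is absent.

    Assumption: data_dir is the directory passed to --storage.tsdb.path and
    the WAL segments are stored under data_dir/wal.
    """
    wal_dir = os.path.join(data_dir, "wal")
    if not os.path.isdir(wal_dir):
        return 0
    return dir_size_bytes(wal_dir)
```

For example, `wal_size_bytes("/prometheus")` could be run against the mounted volume, where `/prometheus` is an assumed mount point and should be replaced with the actual storage path. A WAL that is large compared with the memory available to the container would support the replay explanation, but it does not prove it.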

Environment

  • System information:

Linux 4.9.32-15.41.amzn1.x86_64 x86_64

  • Prometheus version:

    2.2.1

  • Prometheus configuration file:

global:
  scrape_interval: 5s
  scrape_timeout: 5s
  evaluation_interval: 1m

alerting:
  alertmanagers:
  - path_prefix: devops-eval-prometheus1-app1/
  - static_configs:
    - targets:
      - alertmanager:9093
    scheme: http
    timeout: 10s

rule_files:
- /srv/rules/*.yml
scrape_configs:
- job_name: nodecadvisor
  scrape_interval: 5s
  scrape_timeout: 5s
  metrics_path: /metrics
  scheme: http
  static_configs:
  - targets:
    - cadvisor:8080

- job_name: consul
  scrape_interval: 5s
  scrape_timeout: 5s
  metrics_path: /metrics
  consul_sd_configs:
  - server: asds-dev-elastic01-es-master01:8500
    datacenter: asds-dev-elastic01
    services: ['node-exporter','cadvisor']
  relabel_configs:
    - source_labels: ['__meta_consul_node']
      regex:         (.+)
      target_label:  instance
      replacement:   '${1}'
      action: replace
    - source_labels: ['__meta_consul_address']
      regex:         (.+)
      target_label:  ip
      replacement:   '${1}'
      action: replace
    - source_labels: ['__meta_consul_service_id']
      regex:         (.+)
      target_label:  service_id
      replacement:   '${1}'
      action: replace
    - source_labels: ['__meta_consul_tags']
      separator: ","
      regex:         ',(.+),(.+),'
      target_label:  version
      replacement:   '${2}'
      action: replace

  • Logs:

prometheus_1       | level=info ts=2018-04-05T09:39:28.840531407Z caller=main.go:220 msg="Starting Prometheus" version="(version=2.2.1, branch=HEAD, revision=bc6058c81272a8d938c05e75607371284236aadc)"
prometheus_1       | level=info ts=2018-04-05T09:39:28.840586103Z caller=main.go:221 build_context="(go=go1.10, user=root@149e5b3f0829, date=20180314-14:15:45)"
prometheus_1       | level=info ts=2018-04-05T09:39:28.840612693Z caller=main.go:222 host_details="(Linux 4.9.32-15.41.amzn1.x86_64 #1 SMP Thu Jun 22 06:20:54 UTC 2017 x86_64 958d383a8236 (none))"
prometheus_1       | level=info ts=2018-04-05T09:39:28.840636264Z caller=main.go:223 fd_limits="(soft=1024, hard=4096)"
prometheus_1       | level=info ts=2018-04-05T09:39:28.844252402Z caller=main.go:504 msg="Starting TSDB ..."
prometheus_1       | level=info ts=2018-04-05T09:39:28.845984174Z caller=web.go:382 component=web msg="Start listening for connections" address=0.0.0.0:9090
prometheus_1       | fatal error: runtime: out of memory
prometheus_1       |
prometheus_1       | runtime stack:
prometheus_1       | runtime.throw(0x1af4202, 0x16)
prometheus_1       |    /usr/local/go/src/runtime/panic.go:619 +0x81
prometheus_1       | runtime.sysMap(0xc5e4040000, 0x100000, 0xc420179d00, 0x28a6138)
prometheus_1       |    /usr/local/go/src/runtime/mem_linux.go:216 +0x20a
prometheus_1       | runtime.(*mheap).sysAlloc(0x288c9c0, 0x100000, 0x7fc60e202db0)
prometheus_1       |    /usr/local/go/src/runtime/malloc.go:470 +0xd4
prometheus_1       | runtime.(*mheap).grow(0x288c9c0, 0x1, 0x0)
prometheus_1       |    /usr/local/go/src/runtime/mheap.go:907 +0x60
prometheus_1       | runtime.(*mheap).allocSpanLocked(0x288c9c0, 0x1, 0x28a6148, 0x7fc60e202db0)
prometheus_1       |    /usr/local/go/src/runtime/mheap.go:820 +0x301
prometheus_1       | runtime.(*mheap).alloc_m(0x288c9c0, 0x1, 0xc42004003f, 0x7fc60e202db0)
prometheus_1       |    /usr/local/go/src/runtime/mheap.go:686 +0x118
prometheus_1       | runtime.(*mheap).alloc.func1()
prometheus_1       |    /usr/local/go/src/runtime/mheap.go:753 +0x4d
prometheus_1       | runtime.(*mheap).alloc(0x288c9c0, 0x1, 0x7fc60e01003f, 0x7fc60e202db0)
prometheus_1       |    /usr/local/go/src/runtime/mheap.go:752 +0x8a
prometheus_1       | runtime.(*mcentral).grow(0x288ecd0, 0x0)
prometheus_1       |    /usr/local/go/src/runtime/mcentral.go:232 +0x94
prometheus_1       | runtime.(*mcentral).cacheSpan(0x288ecd0, 0x7fc60e202db0)
prometheus_1       |    /usr/local/go/src/runtime/mcentral.go:106 +0x2e4
prometheus_1       | runtime.(*mcache).refill(0x7fc9f1a336c8, 0xc42004653f)
prometheus_1       |    /usr/local/go/src/runtime/mcache.go:123 +0x9c
prometheus_1       | runtime.(*mcache).nextFree.func1()
prometheus_1       |    /usr/local/go/src/runtime/malloc.go:556 +0x32
prometheus_1       | runtime.systemstack(0x0)
prometheus_1       |    /usr/local/go/src/runtime/asm_amd64.s:409 +0x79
prometheus_1       | runtime.mstart()
prometheus_1       |    /usr/local/go/src/runtime/proc.go:1170
prometheus_1       |
prometheus_1       | goroutine 170 [running]:
prometheus_1       | runtime.systemstack_switch()
prometheus_1       |    /usr/local/go/src/runtime/asm_amd64.s:363 fp=0xc424108c00 sp=0xc424108bf8 pc=0x457ee0
prometheus_1       | runtime.(*mcache).nextFree(0x7fc9f1a336c8, 0xc424108c3f, 0x441f58, 0xc5e403fe01, 0x1ff)
prometheus_1       |    /usr/local/go/src/runtime/malloc.go:555 +0xa9 fp=0xc424108c58 sp=0xc424108c00 pc=0x4101f9
prometheus_1       | runtime.mallocgc(0x400, 0x0, 0x200, 0x400)
prometheus_1       |    /usr/local/go/src/runtime/malloc.go:710 +0x79f fp=0xc424108cf8 sp=0xc424108c58 pc=0x410b4f
prometheus_1       | runtime.growslice(0x1744e80, 0xc5e2775600, 0x200, 0x200, 0x201, 0xc5e403f400, 0x3f383a8000000002, 0x400)
prometheus_1       |    /usr/local/go/src/runtime/slice.go:172 +0x21d fp=0xc424108d60 sp=0xc424108cf8 pc=0x441f0d
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/chunkenc.(*bstream).writeBit(...)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/chunkenc/bstream.go:79
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/chunkenc.(*bstream).writeBits(0xc5e1c9d9c0, 0x0, 0x1)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/chunkenc/bstream.go:118 +0x2de fp=0xc424108dd0 sp=0xc424108d60 pc=0x147b6ce
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/chunkenc.(*xorAppender).Append(0xc5e1c9ec60, 0x162793587f3, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/chunkenc/xor.go:165 +0x54f fp=0xc424108e80 sp=0xc424108dd0 pc=0x147c8ef
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*memSeries).append(0xc423498e70, 0x162793587f3, 0x0, 0x700001)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/head.go:1221 +0x126 fp=0xc424108ec0 sp=0xc424108e80 pc=0x14a8a86
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*Head).processWALSamples(0xc4201a8d20, 0x16267544a00, 0x1, 0x2, 0xc42044a300, 0xc42044a360, 0x60e57)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/head.go:219 +0x16f fp=0xc424108f58 sp=0xc424108ec0 pc=0x14a33bf
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*Head).ReadWAL.func1(0xc4201a8d20, 0x16267544a00, 0x2, 0xc4216fd3e8, 0xc4216fd3f0, 0x1, 0xc42044a300, 0xc42044a360)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/head.go:256 +0x60 fp=0xc424108fa0 sp=0xc424108f58 pc=0x14b91f0
prometheus_1       | runtime.goexit()
prometheus_1       |    /usr/local/go/src/runtime/asm_amd64.s:2361 +0x1 fp=0xc424108fa8 sp=0xc424108fa0 pc=0x45aa01
prometheus_1       | created by github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*Head).ReadWAL
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/head.go:255 +0x1e8
prometheus_1       |
prometheus_1       | goroutine 1 [chan receive, 5 minutes]:
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run(0xc420adbc88, 0xc42050c6f0, 0x8)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:43 +0xec
prometheus_1       | main.main()
prometheus_1       |    /go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:581 +0x5237
prometheus_1       |
prometheus_1       | goroutine 19 [syscall, 5 minutes]:
prometheus_1       | os/signal.signal_recv(0x0)
prometheus_1       |    /usr/local/go/src/runtime/sigqueue.go:139 +0xa6
prometheus_1       | os/signal.loop()
prometheus_1       |    /usr/local/go/src/os/signal/signal_unix.go:22 +0x22
prometheus_1       | created by os/signal.init.0
prometheus_1       |    /usr/local/go/src/os/signal/signal_unix.go:28 +0x41
prometheus_1       |
prometheus_1       | goroutine 4 [chan receive, 1 minutes]:
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/golang/glog.(*loggingT).flushDaemon(0x2883600)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/golang/glog/glog.go:879 +0x8b
prometheus_1       | created by github.com/prometheus/prometheus/vendor/github.com/golang/glog.init.0
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/golang/glog/glog.go:410 +0x203
prometheus_1       |
prometheus_1       | goroutine 46 [chan receive (nil chan), 5 minutes]:
prometheus_1       | github.com/prometheus/prometheus/prompb.RegisterAdminHandlerFromEndpoint.func1.1(0x7fc9f19dc150, 0xc420094020, 0xc4200b1380, 0x1ae4f89, 0xc)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/prompb/rpc.pb.gw.go:85 +0x4c
prometheus_1       | created by github.com/prometheus/prometheus/prompb.RegisterAdminHandlerFromEndpoint.func1
prometheus_1       |    /go/src/github.com/prometheus/prometheus/prompb/rpc.pb.gw.go:84 +0x19b
prometheus_1       |
prometheus_1       | goroutine 141 [select, 5 minutes, locked to thread]:
prometheus_1       | runtime.gopark(0x1b6e6d0, 0x0, 0x1ade841, 0x6, 0x18, 0x1)
prometheus_1       |    /usr/local/go/src/runtime/proc.go:291 +0x11a
prometheus_1       | runtime.selectgo(0xc420573f50, 0xc420088900)
prometheus_1       |    /usr/local/go/src/runtime/select.go:392 +0xe50
prometheus_1       | runtime.ensureSigM.func1()
prometheus_1       |    /usr/local/go/src/runtime/signal_unix.go:549 +0x1f4
prometheus_1       | runtime.goexit()
prometheus_1       |    /usr/local/go/src/runtime/asm_amd64.s:2361 +0x1
prometheus_1       |
prometheus_1       | goroutine 142 [select, 5 minutes]:
prometheus_1       | main.main.func6(0xc420573818, 0xc420573838)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:372 +0x121
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run.func1(0xc42012a900, 0xc4200c2380, 0xc42050c670)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:38 +0x27
prometheus_1       | created by github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:37 +0xa8


prometheus_1       | github.com/prometheus/prometheus/discovery.(*Manager).Run(0xc42041c620, 0x0, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/discovery/manager.go:93 +0x50
prometheus_1       | main.main.func8(0x0, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:393 +0x40
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run.func1(0xc42012a900, 0xc42045eca0, 0xc42045ecc0)


prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:37 +0xa8
prometheus_1       |
prometheus_1       | goroutine 144 [chan receive, 5 minutes]:
prometheus_1       | github.com/prometheus/prometheus/discovery.(*Manager).Run(0xc42041c690, 0x0, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/discovery/manager.go:93 +0x50
prometheus_1       | main.main.func10(0x0, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:406 +0x40
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run.func1(0xc42012a900, 0xc42045ed20, 0xc42045ed60)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:38 +0x27
prometheus_1       | created by github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:37 +0xa8
prometheus_1       |
prometheus_1       | goroutine 145 [chan receive, 5 minutes]:
prometheus_1       | main.main.func12(0x0, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:423 +0x5e
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run.func1(0xc42012a900, 0xc42036fc80, 0xc42045ed80)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:38 +0x27
prometheus_1       | created by github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:37 +0xa8
prometheus_1       |
prometheus_1       | goroutine 146 [chan receive, 5 minutes]:
prometheus_1       | main.main.func14(0x0, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:445 +0x8f
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run.func1(0xc42012a900, 0xc42012a840, 0xc42050c6b0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:38 +0x27
prometheus_1       | created by github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:37 +0xa8
prometheus_1       |
prometheus_1       | goroutine 147 [select, 5 minutes]:
prometheus_1       | main.main.func16(0x0, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:475 +0x113
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run.func1(0xc42012a900, 0xc42012a8a0, 0xc42050c6d0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:38 +0x27
prometheus_1       | created by github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:37 +0xa8
prometheus_1       |
prometheus_1       | goroutine 148 [runnable]:
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*walReader).entry(0xc42021f3b0, 0x1c37580, 0xc4203972e0, 0xfe1b027, 0x0, 0x0, 0xc42009be00, 0xc42044a418, 0xc420b63a00)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/wal.go:1087 +0x122
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*walReader).next(0xc42021f3b0, 0xc420b63ac0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/wal.go:1032 +0xee
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*walReader).Read(0xc42021f3b0, 0xc42047e830, 0xc4217081a0, 0xc4217081c0, 0xc42044a360, 0xc420b63be8)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/wal.go:926 +0x184
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*repairingWALReader).Read(0xc421708180, 0xc42047e830, 0xc4217081a0, 0xc4217081c0, 0x2, 0xc4216fd3e8)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/wal.go:261 +0x5f
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*Head).ReadWAL(0xc4201a8d20, 0x0, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/head.go:308 +0x321
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.Open(0x7ffcfaba6f62, 0xb, 0x1c37240, 0xc4205264e0, 0x1c475c0, 0xc42009c6c0, 0xc420526510, 0xc42043a500, 0x0, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/db.go:239 +0x52d
prometheus_1       | github.com/prometheus/prometheus/storage/tsdb.Open(0x7ffcfaba6f62, 0xb, 0x1c37240, 0xc4205264e0, 0x1c475c0, 0xc42009c6c0, 0xc42015ad88, 0x0, 0x0, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/storage/tsdb/tsdb.go:143 +0x293
prometheus_1       | main.main.func18(0x0, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:505 +0x1f6
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run.func1(0xc42012a900, 0xc4200c2400, 0xc42036fe00)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:38 +0x27
prometheus_1       | created by github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:37 +0xa8
prometheus_1       |
prometheus_1       | goroutine 149 [select, 5 minutes]:
prometheus_1       | github.com/prometheus/prometheus/web.(*Handler).Run(0xc42014dd00, 0x1c50d80, 0xc42032c480, 0x0, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/web/web.go:470 +0xe6b
prometheus_1       | main.main.func20(0x0, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:533 +0x40
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run.func1(0xc42012a900, 0xc42045eda0, 0xc42050c6e0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:38 +0x27
prometheus_1       | created by github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:37 +0xa8
prometheus_1       |
prometheus_1       | goroutine 150 [chan receive, 5 minutes]:
prometheus_1       | main.main.func22(0x1, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:552 +0x4e
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run.func1(0xc42012a900, 0xc42045edc0, 0xc42045ee20)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:38 +0x27
prometheus_1       | created by github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:37 +0xa8
prometheus_1       |
prometheus_1       | goroutine 151 [chan receive, 5 minutes]:
prometheus_1       | main.main.func24(0x0, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:570 +0x5e
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run.func1(0xc42012a900, 0xc42036ff20, 0xc42050c6f0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:38 +0x27
prometheus_1       | created by github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group.(*Group).Run
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/oklog/oklog/pkg/group/group.go:37 +0xa8
prometheus_1       |
prometheus_1       | goroutine 167 [select]:
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*SegmentWAL).run(0xc4202e2280, 0x2540be400)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/wal.go:704 +0x36e
prometheus_1       | created by github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.OpenSegmentWAL
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/wal.go:244 +0x776
prometheus_1       |
prometheus_1       | goroutine 45 [select, 5 minutes]:
prometheus_1       | github.com/prometheus/prometheus/vendor/google.golang.org/grpc.(*addrConn).transportMonitor(0xc4200b1ba0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/google.golang.org/grpc/clientconn.go:908 +0x1c0
prometheus_1       | github.com/prometheus/prometheus/vendor/google.golang.org/grpc.(*ClientConn).resetAddrConn.func1(0xc4200b1ba0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/google.golang.org/grpc/clientconn.go:637 +0x1af
prometheus_1       | created by github.com/prometheus/prometheus/vendor/google.golang.org/grpc.(*ClientConn).resetAddrConn
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/google.golang.org/grpc/clientconn.go:628 +0x6d8
prometheus_1       |
prometheus_1       | goroutine 239 [chan receive, 5 minutes]:
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.muxListener.Accept(...)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux/cmux.go:184
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.(*muxListener).Accept(0xc4203d65c0, 0xc420094020, 0x180fc40, 0x28679d0, 0x1a98bc0)
prometheus_1       |    <autogenerated>:1 +0x5b
prometheus_1       | net/http.(*Server).Serve(0xc4204c6340, 0x1c4f480, 0xc4203d65c0, 0x0, 0x0)
prometheus_1       |    /usr/local/go/src/net/http/server.go:2770 +0x1a5
prometheus_1       | github.com/prometheus/prometheus/web.(*Handler).Run.func5(0xc4204c6340, 0x1c4f480, 0xc4203d65c0, 0xc42014dd00)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/web/web.go:455 +0x43
prometheus_1       | created by github.com/prometheus/prometheus/web.(*Handler).Run
prometheus_1       |    /go/src/github.com/prometheus/prometheus/web/web.go:454 +0xced
prometheus_1       |
prometheus_1       | goroutine 240 [chan receive, 5 minutes]:
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.muxListener.Accept(...)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux/cmux.go:184
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.(*muxListener).Accept(0xc4203d63a0, 0x1b6a478, 0xc420394240, 0x1c4f480, 0xc4203d63a0)
prometheus_1       |    <autogenerated>:1 +0x5b
prometheus_1       | github.com/prometheus/prometheus/vendor/google.golang.org/grpc.(*Server).Serve(0xc420394240, 0x1c4f480, 0xc4203d63a0, 0x0, 0x0)


prometheus_1       |    /go/src/github.com/prometheus/prometheus/web/web.go:460 +0x43
prometheus_1       | created by github.com/prometheus/prometheus/web.(*Handler).Run
prometheus_1       |    /go/src/github.com/prometheus/prometheus/web/web.go:459 +0xd39
prometheus_1       |
prometheus_1       | goroutine 241 [IO wait, 5 minutes]:


prometheus_1       | internal/poll.(*pollDesc).wait(0xc42047c298, 0x72, 0xc42009d200, 0x0, 0x0)
prometheus_1       |    /usr/local/go/src/internal/poll/fd_poll_runtime.go:85 +0x9b
prometheus_1       | internal/poll.(*pollDesc).waitRead(0xc42047c298, 0xffffffffffffff00, 0x0, 0x0)
prometheus_1       |    /usr/local/go/src/internal/poll/fd_poll_runtime.go:90 +0x3d
prometheus_1       | internal/poll.(*FD).Accept(0xc42047c280, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
prometheus_1       |    /usr/local/go/src/internal/poll/fd_unix.go:372 +0x1a8
prometheus_1       | net.(*netFD).accept(0xc42047c280, 0x1c59ea0, 0xc420312058, 0x1adc177)
prometheus_1       |    /usr/local/go/src/net/fd_unix.go:238 +0x42
prometheus_1       | net.(*TCPListener).accept(0xc420096018, 0xc420312000, 0xc42091ee98, 0x1)
prometheus_1       |    /usr/local/go/src/net/tcpsock_posix.go:136 +0x2e
prometheus_1       | net.(*TCPListener).Accept(0xc420096018, 0xc42091ee98, 0x1955540, 0x1955540, 0xc420dcd290)
prometheus_1       |    /usr/local/go/src/net/tcpsock.go:259 +0x49
prometheus_1       | github.com/prometheus/prometheus/vendor/golang.org/x/net/netutil.(*limitListener).Accept(0xc4203d61e0, 0x4346f4, 0xc42091eee8, 0x4571a0, 0xc42091ef28)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/golang.org/x/net/netutil/listen.go:30 +0x53
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/mwitkow/go-conntrack.(*connTrackListener).Accept(0xc4203d6360, 0x1b678a0, 0xc42009d040, 0x1c5d9e0, 0xc420dcd320)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/mwitkow/go-conntrack/listener_wrapper.go:86 +0x37
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.(*cMux).Serve(0xc42009d040, 0x0, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux/cmux.go:124 +0x88
prometheus_1       | github.com/prometheus/prometheus/web.(*Handler).Run.func7(0xc420312180, 0x1c45ec0, 0xc42009d040)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/web/web.go:467 +0x31
prometheus_1       | created by github.com/prometheus/prometheus/web.(*Handler).Run
prometheus_1       |    /go/src/github.com/prometheus/prometheus/web/web.go:466 +0xd8e
prometheus_1       |
prometheus_1       | goroutine 243 [IO wait, 5 minutes]:
prometheus_1       | internal/poll.runtime_pollWait(0x7fc9f19d7e30, 0x72, 0xc420921b88)
prometheus_1       |    /usr/local/go/src/runtime/netpoll.go:173 +0x57
prometheus_1       | internal/poll.(*pollDesc).wait(0xc42047c718, 0x72, 0xffffffffffffff00, 0x1c3be60, 0x27b7638)
prometheus_1       |    /usr/local/go/src/internal/poll/fd_poll_runtime.go:85 +0x9b
prometheus_1       | internal/poll.(*pollDesc).waitRead(0xc42047c718, 0xc420e7c000, 0x8000, 0x8000)
prometheus_1       |    /usr/local/go/src/internal/poll/fd_poll_runtime.go:90 +0x3d
prometheus_1       | internal/poll.(*FD).Read(0xc42047c700, 0xc420e7c000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
prometheus_1       |    /usr/local/go/src/internal/poll/fd_unix.go:157 +0x17d
prometheus_1       | net.(*netFD).Read(0xc42047c700, 0xc420e7c000, 0x8000, 0x8000, 0x60, 0x0, 0xc420057740)
prometheus_1       |    /usr/local/go/src/net/fd_unix.go:202 +0x4f
prometheus_1       | net.(*conn).Read(0xc420096cd8, 0xc420e7c000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
prometheus_1       |    /usr/local/go/src/net/net.go:176 +0x6a
prometheus_1       | bufio.(*Reader).Read(0xc42007e120, 0xc420648118, 0x9, 0x9, 0xc42017afb0, 0xc420057548, 0x411959)
prometheus_1       |    /usr/local/go/src/bufio/bufio.go:216 +0x238
prometheus_1       | io.ReadAtLeast(0x1c36960, 0xc42007e120, 0xc420648118, 0x9, 0x9, 0x9, 0x1c278da, 0xc4200b1ba0, 0xc4200575d8)
prometheus_1       |    /usr/local/go/src/io/io.go:309 +0x86
prometheus_1       | io.ReadFull(0x1c36960, 0xc42007e120, 0xc420648118, 0x9, 0x9, 0x403f2c, 0xc4202e4000, 0x4)
prometheus_1       |    /usr/local/go/src/io/io.go:327 +0x58
prometheus_1       | github.com/prometheus/prometheus/vendor/golang.org/x/net/http2.readFrameHeader(0xc420648118, 0x9, 0x9, 0x1c36960, 0xc42007e120, 0x0, 0xc400000000, 0x0, 0x2)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/golang.org/x/net/http2/frame.go:237 +0x7b
prometheus_1       | github.com/prometheus/prometheus/vendor/golang.org/x/net/http2.(*Framer).ReadFrame(0xc4206480e0, 0xc42009d1c0, 0xc420057768, 0xc4203122a0, 0xc4200577a0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/golang.org/x/net/http2/frame.go:492 +0xa4
prometheus_1       | github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport.(*framer).readFrame(0xc420dcd110, 0xc4200577a0, 0x0, 0xc4203122a0, 0x53d70d)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport/http_util.go:608 +0x2f
prometheus_1       | github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport.(*http2Client).reader(0xc4202e4900)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport/http2_client.go:1080 +0x47
prometheus_1       | created by github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport.newHTTP2Client
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport/http2_client.go:267 +0xb6c
prometheus_1       |
prometheus_1       | goroutine 244 [select, 5 minutes]:
prometheus_1       | github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport.(*http2Client).controller(0xc4202e4900)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport/http2_client.go:1168 +0x122
prometheus_1       | created by github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport.newHTTP2Client
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport/http2_client.go:297 +0xca2
prometheus_1       |
prometheus_1       | goroutine 245 [IO wait, 5 minutes]:
prometheus_1       | internal/poll.runtime_pollWait(0x7fc9f19d7d60, 0x72, 0xc420e9e940)
prometheus_1       |    /usr/local/go/src/runtime/netpoll.go:173 +0x57
prometheus_1       | internal/poll.(*pollDesc).wait(0xc42047d818, 0x72, 0xffffffffffffff00, 0x1c3be60, 0x27b7638)
prometheus_1       |    /usr/local/go/src/internal/poll/fd_poll_runtime.go:85 +0x9b
prometheus_1       | internal/poll.(*pollDesc).waitRead(0xc42047d818, 0xc420648200, 0x9, 0x9)
prometheus_1       |    /usr/local/go/src/internal/poll/fd_poll_runtime.go:90 +0x3d
prometheus_1       | internal/poll.(*FD).Read(0xc42047d800, 0xc4206482d8, 0x9, 0x9, 0x0, 0x0, 0x0)
prometheus_1       |    /usr/local/go/src/internal/poll/fd_unix.go:157 +0x17d
prometheus_1       | net.(*netFD).Read(0xc42047d800, 0xc4206482d8, 0x9, 0x9, 0x4, 0x0, 0x0)
prometheus_1       |    /usr/local/go/src/net/fd_unix.go:202 +0x4f
prometheus_1       | net.(*conn).Read(0xc420096ce0, 0xc4206482d8, 0x9, 0x9, 0x0, 0x0, 0x0)
prometheus_1       |    /usr/local/go/src/net/net.go:176 +0x6a
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.(*bufferedReader).Read(0xc4201a8898, 0xc4206482d8, 0x9, 0x9, 0x100000000000000, 0x0, 0x10)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux/buffer.go:42 +0x120
prometheus_1       | io.ReadAtLeast(0x1c37180, 0xc4201a8898, 0xc4206482d8, 0x9, 0x9, 0x9, 0x410ef8, 0x10, 0x1964300)
prometheus_1       |    /usr/local/go/src/io/io.go:309 +0x86
prometheus_1       | io.ReadFull(0x1c37180, 0xc4201a8898, 0xc4206482d8, 0x9, 0x9, 0x32b5e23d542da301, 0xefff100000004, 0x7)
prometheus_1       |    /usr/local/go/src/io/io.go:327 +0x58
prometheus_1       | github.com/prometheus/prometheus/vendor/golang.org/x/net/http2.readFrameHeader(0xc4206482d8, 0x9, 0x9, 0x1c37180, 0xc4201a8898, 0x0, 0x0, 0xc420095ca0, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/golang.org/x/net/http2/frame.go:237 +0x7b
prometheus_1       | github.com/prometheus/prometheus/vendor/golang.org/x/net/http2.(*Framer).ReadFrame(0xc4206482a0, 0x1c3eb80, 0xc420095ca0, 0x0, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/golang.org/x/net/http2/frame.go:492 +0xa4
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.matchHTTP2Field(0x1c37180, 0xc4201a8898, 0x1ae59d9, 0xc, 0x1aeaa9a, 0x10, 0x7fc9f19f94b8)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux/matchers.go:145 +0x140
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.HTTP2HeaderField.func1(0x1c37180, 0xc4201a8898, 0xc4200576a0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux/matchers.go:111 +0x59
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.(*cMux).serve(0xc42009d040, 0x1c5d9e0, 0xc420dcd320, 0xc420312060, 0xc420095940)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux/cmux.go:143 +0x1f3
prometheus_1       | created by github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.(*cMux).Serve
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux/cmux.go:133 +0x15d
prometheus_1       |
prometheus_1       | goroutine 169 [runnable]:
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/chunkenc.(*bstream).writeBits(0xc5e1c63180, 0x4e20, 0x11)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/chunkenc/bstream.go:108 +0x349
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/chunkenc.(*xorAppender).Append(0xc5e1c4af30, 0x1627935af03, 0x0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/chunkenc/xor.go:166 +0x576
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*memSeries).append(0xc4234d40b0, 0x1627935af03, 0x0, 0x1)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/head.go:1221 +0x126
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*Head).processWALSamples(0xc4201a8d20, 0x16267544a00, 0x0, 0x2, 0xc42044a240, 0xc42044a300, 0x60b58)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/head.go:219 +0x16f
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*Head).ReadWAL.func1(0xc4201a8d20, 0x16267544a00, 0x2, 0xc4216fd3e8, 0xc4216fd3f0, 0x0, 0xc42044a240, 0xc42044a300)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/head.go:256 +0x60
prometheus_1       | created by github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*Head).ReadWAL
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/head.go:255 +0x1e8
prometheus_1       |
prometheus_1       | goroutine 171 [runnable]:
prometheus_1       | github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*walReader).Read.func1(0xc420312960, 0xc42044a3c0, 0xc42047e830, 0xc4217081e0, 0xc4217081a0, 0xc421708200, 0xc4217081c0, 0xc421708220, 0xc42021f3b0)
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/wal.go:901 +0x7f
prometheus_1       | created by github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*walReader).Read
prometheus_1       |    /go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/wal.go:898 +0x173
```

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 1
  • Comments: 16 (5 by maintainers)

Most upvoted comments

I had a wal/checkpoint.000029 of 159GB, and Prometheus crashed every time. After I removed the file, everything worked fine.

Another Prometheus (a replica) had a 3GB wal/checkpoint file. I suspect that every time Prometheus crashed (probably the first time due to an OOM from a huge query), the file grew, until it was too large to load.

I’m using Prometheus 2.5.0.
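
One way to spot an oversized checkpoint like the ones described above is to list everything under the WAL directory by size. This is a minimal sketch, not an official Prometheus tool: the helper names `entry_size` and `wal_entry_sizes` are made up for illustration, and it assumes the data directory has a `wal/` subdirectory whose entries are either plain files or directories.

```python
import os


def entry_size(path):
    """Return the size in bytes of a file, or the total size of a directory tree."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def wal_entry_sizes(data_dir):
    """List (name, bytes) for every entry under <data_dir>/wal, largest first.

    Assumption: data_dir is the Prometheus storage directory and contains "wal/".
    """
    wal_dir = os.path.join(data_dir, "wal")
    sizes = [
        (name, entry_size(os.path.join(wal_dir, name)))
        for name in os.listdir(wal_dir)
    ]
    return sorted(sizes, key=lambda item: item[1], reverse=True)
```

Calling `wal_entry_sizes(...)` with the directory that `--storage.tsdb.path` points to returns the largest entries first, so a checkpoint that is far bigger than the rest stands out immediately.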

@brian-brazil We have this issue again. The container is getting OOM killed upon start.

image

You can see it reaches its 30GB limit and then goes down due to an OOM kill.

Logs:

level=info ts=2018-11-06T23:54:07.953888964Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1539388800000 maxt=1539410400000 ulid=01CSPAG4D1S2PP200F5S3GJ88A
level=info ts=2018-11-06T23:54:07.954345953Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1539410400000 maxt=1539432000000 ulid=01CSPZ38NCPGFATD9TC6BYVDQK
level=info ts=2018-11-06T23:54:07.954769258Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1539432000000 maxt=1539453600000 ulid=01CSQKPF927H8YWZV020XCS58A
level=info ts=2018-11-06T23:54:07.955252927Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1539453600000 maxt=1539475200000 ulid=01CSR89QY9KFJZ5Q71CJY4N9QR
level=info ts=2018-11-06T23:54:07.95575467Z caller=repair.go:39 component=tsdb msg="found healthy block" mint=1539475200000 maxt=1539496800000 ulid=01CSRWWXW5HBBJT2G1EBW1WC02
...
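
The `mint` and `maxt` values in these entries are timestamps in milliseconds since the Unix epoch, so they can be converted to see what time range each block covers. The helper below is only an illustrative sketch; the name `block_range` is hypothetical and not part of Prometheus.

```python
from datetime import datetime, timedelta, timezone


def block_range(mint_ms, maxt_ms):
    """Convert a block's mint/maxt (milliseconds since the Unix epoch)
    into UTC datetimes plus the duration between them."""
    start = datetime.fromtimestamp(mint_ms / 1000, tz=timezone.utc)
    end = datetime.fromtimestamp(maxt_ms / 1000, tz=timezone.utc)
    return start, end, end - start


# Values taken from the first "found healthy block" entry above.
start, end, span = block_range(1539388800000, 1539410400000)
print(start.isoformat(), end.isoformat(), span)
```

For the first entry, this gives a block that starts at midnight UTC on 2018-10-13 and spans 6 hours.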

Configuration: image

Hi, has anybody found a solution to the OOM issue during startup, when Prometheus compacts a lot of data?

I am running Prometheus v2.2.1 in OpenShift with a 10GB memory limit and 27GB of data in the /data directory. During startup I see these entries in the log:

level=info ts=2018-05-16T08:45:33.61776039Z caller=compact.go:393 component=tsdb msg="compact blocks" count=1 mint=1526443200000 maxt=1526450400000
level=info ts=2018-05-16T08:45:41.121433336Z caller=head.go:348 component=tsdb msg="head GC completed" duration=599.959505ms
level=info ts=2018-05-16T08:45:42.739168664Z caller=head.go:357 component=tsdb msg="WAL truncation completed" duration=1.616577116s
level=info ts=2018-05-16T08:45:44.987052493Z caller=compact.go:393 component=tsdb msg="compact blocks" count=3 mint=1526342400000 maxt=1526364000000
level=info ts=2018-05-16T08:47:15.503732025Z caller=compact.go:393 component=tsdb msg="compact blocks" count=3 mint=1526364000000 maxt=1526385600000
level=info ts=2018-05-16T09:00:00.238214945Z caller=compact.go:393 component=tsdb msg="compact blocks" count=1 mint=1526450400000 maxt=1526457600000
level=info ts=2018-05-16T09:00:19.690017824Z caller=head.go:348 component=tsdb msg="head GC completed" duration=1.870823274s
level=info ts=2018-05-16T09:00:24.602969974Z caller=head.go:357 component=tsdb msg="WAL truncation completed" duration=4.912877555s
level=info ts=2018-05-16T09:00:26.021513094Z caller=compact.go:393 component=tsdb msg="compact blocks" count=3 mint=1526428800000 maxt=1526450400000

Prometheus consumes about 99% of the 10GB. I added --storage.tsdb.max-block-duration=6h to the command args, and my retention is --storage.tsdb.retention=168h.

Where can I find documentation on how to reduce memory consumption or the “head chunks” loaded into memory?
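
To compare the block ranges in "compact blocks" entries like the ones above with the configured maximum block duration, the log lines can be parsed and the `maxt - mint` span computed. This is a rough sketch under stated assumptions: `parse_logfmt` and `compaction_spans` are hypothetical helper names, and using `shlex.split` for the `key=value` parsing only works for lines whose quoting is well-formed (an unbalanced quote in a message would raise `ValueError`).

```python
import shlex


def parse_logfmt(line):
    """Parse a logfmt line (key=value pairs, values optionally double-quoted)."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            fields[key] = value
    return fields


def compaction_spans(lines):
    """Yield (ts, block_count, span_hours) for each "compact blocks" entry."""
    for line in lines:
        fields = parse_logfmt(line)
        if fields.get("msg") != "compact blocks":
            continue
        span_ms = int(fields["maxt"]) - int(fields["mint"])
        yield fields["ts"], int(fields["count"]), span_ms / 3_600_000


# Three entries copied from the log excerpt above.
sample = [
    'level=info ts=2018-05-16T08:45:33.61776039Z caller=compact.go:393 component=tsdb msg="compact blocks" count=1 mint=1526443200000 maxt=1526450400000',
    'level=info ts=2018-05-16T08:45:41.121433336Z caller=head.go:348 component=tsdb msg="head GC completed" duration=599.959505ms',
    'level=info ts=2018-05-16T08:45:44.987052493Z caller=compact.go:393 component=tsdb msg="compact blocks" count=3 mint=1526342400000 maxt=1526364000000',
]
spans = list(compaction_spans(sample))
for ts, count, hours in spans:
    print(ts, count, hours)
```

In this sample, the `count=1` compaction covers a 2-hour range and the `count=3` compaction covers 6 hours, which matches the `--storage.tsdb.max-block-duration=6h` setting.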

@dobesv @brian-brazil @juliusv