prometheus: tsdb Panic when reading chunk
Bug Report
What did you do? Ran Prometheus.
What did you expect to see? For it to run.
What did you see instead? Under which circumstances?
Exception on startup: panic: runtime error: index out of range (looks similar to: https://github.com/prometheus/tsdb/issues/251)
Environment
- System information:
/prometheus $ uname -srm
Linux 4.15.3-1.el7.elrepo.x86_64 x86_64
- Prometheus version:
/prometheus $ prometheus --version
prometheus, version 2.2.1 (branch: HEAD, revision: bc6058c81272a8d938c05e75607371284236aadc)
build user: root@149e5b3f0829
build date: 20180314-14:15:45
go version: go1.10
- Logs:
level=info ts=2018-05-01T12:35:26.455261083Z caller=main.go:220 msg="Starting Prometheus" version="(version=2.2.1, branch=HEAD, revision=bc6058c81272a8d938c05e75607371284236aadc)"
level=info ts=2018-05-01T12:35:26.45538143Z caller=main.go:221 build_context="(go=go1.10, user=root@149e5b3f0829, date=20180314-14:15:45)"
level=info ts=2018-05-01T12:35:26.455414873Z caller=main.go:222 host_details="(Linux 4.15.3-1.el7.elrepo.x86_64 #1 SMP Mon Feb 12 06:46:25 EST 2018 x86_64 prometheus-59bf4d7cf-sxsk5 (none))"
level=info ts=2018-05-01T12:35:26.455427991Z caller=main.go:223 fd_limits="(soft=65536, hard=65536)"
level=info ts=2018-05-01T12:35:26.459927519Z caller=web.go:382 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2018-05-01T12:35:26.45947935Z caller=main.go:504 msg="Starting TSDB ..."
level=error ts=2018-05-01T12:36:48.215666256Z caller=wal.go:275 component=tsdb msg="WAL corruption detected; truncating" err="unexpected CRC32 checksum 6cafd16a, want 0" file=/data/prometheus/wal/033372 pos=75965705
level=info ts=2018-05-01T12:36:48.950404399Z caller=main.go:514 msg="TSDB started"
level=info ts=2018-05-01T12:36:48.950585155Z caller=main.go:588 msg="Loading configuration file" filename=/etc/prometheus/config.yaml
level=info ts=2018-05-01T12:36:48.96267101Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2018-05-01T12:36:48.965331005Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2018-05-01T12:36:48.966139809Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2018-05-01T12:36:48.966845893Z caller=kubernetes.go:191 component="discovery manager scrape" discovery=k8s msg="Using pod service account via in-cluster config"
level=info ts=2018-05-01T12:36:48.96989835Z caller=main.go:491 msg="Server is ready to receive web requests."
level=info ts=2018-05-01T12:37:48.996924066Z caller=compact.go:393 component=tsdb msg="compact blocks" count=4 mint=1524960000000 maxt=1524981600000
panic: runtime error: index out of range
goroutine 403 [running]:
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/chunks.(*Reader).Chunk(0xc75cb30c40, 0x13e, 0x1c56c80, 0xc5b252e0f0, 0x0, 0x0)
/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/chunks/chunks.go:365 +0x3f8
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*compactionSeriesSet).Next(0xc75bbc0dc0, 0xc763d9c770)
/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/compact.go:691 +0x223
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.newCompactionMerger(0x1c47a80, 0xc75bbc0dc0, 0x1c47a80, 0xc75bbc0e60, 0xc7687230a0, 0x1c51880, 0xc7687230a0)
/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/compact.go:733 +0x8f
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*LeveledCompactor).populateBlock(0xc420d93220, 0xc75cb30e80, 0x4, 0x4, 0xc763d9c540, 0x1c56cc0, 0xc768766000, 0x1c3e680, 0xc7671fe640, 0x0, ...)
/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/compact.go:537 +0x5bf
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*LeveledCompactor).write(0xc420d93220, 0x7ffc588f58e3, 0x10, 0xc763d9c540, 0xc75cb30e80, 0x4, 0x4, 0x0, 0x0)
/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/compact.go:441 +0x5dd
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*LeveledCompactor).Compact(0xc420d93220, 0x7ffc588f58e3, 0x10, 0xc75cb30b40, 0x4, 0x4, 0x3023447db51b6301, 0x6431175cbef51e81, 0x0, 0x0)
/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/compact.go:339 +0x693
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*DB).compact(0xc420c7a0c0, 0x0, 0x0, 0x0)
/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/db.go:419 +0x400
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*DB).run(0xc420c7a0c0)
/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/db.go:279 +0x2cc
created by github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.Open
/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/db.go:243 +0x5ba
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Comments: 17 (11 by maintainers)
In most cases I have seen only one block is causing an issue, so if you delete the entire directory you lose that date range, but Prometheus should still read the remaining blocks. By looking at the logs and checking the meta.json file in each block folder you should be able to tell which is the offending directory. @gouthamve suggested that we should probably add this as a cleanup command to the tsdb tool, so if I find time I will try to implement this.
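For reference, here is a minimal Go sketch (not part of Prometheus or the tsdb tool) of that check: it walks a data directory, reads each block's meta.json, and prints the block's time range so it can be matched against the compactor's mint/maxt values in the log above. The data directory path and the subset of meta.json fields decoded here (ulid, minTime, maxTime) are assumptions; real meta.json files contain additional fields.

```go
// listblocks.go: a rough sketch for cross-referencing TSDB block
// directories with the compaction log's mint/maxt values.
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// blockMeta decodes only the meta.json fields needed to identify a block's
// time range; actual meta.json files contain more fields than these.
type blockMeta struct {
	ULID    string `json:"ulid"`
	MinTime int64  `json:"minTime"` // milliseconds since epoch
	MaxTime int64  `json:"maxTime"` // milliseconds since epoch
}

func main() {
	dataDir := "/data/prometheus" // assumed path, taken from the WAL log line above

	entries, err := os.ReadDir(dataDir)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read data dir:", err)
		os.Exit(1)
	}

	for _, e := range entries {
		if !e.IsDir() {
			continue
		}
		metaPath := filepath.Join(dataDir, e.Name(), "meta.json")
		raw, err := os.ReadFile(metaPath)
		if err != nil {
			// Not every subdirectory (e.g. wal/) has a meta.json; skip it.
			continue
		}
		var m blockMeta
		if err := json.Unmarshal(raw, &m); err != nil {
			fmt.Fprintf(os.Stderr, "%s: cannot parse meta.json: %v\n", e.Name(), err)
			continue
		}
		// Print the block directory and its time range so it can be matched
		// against the compactor's mint/maxt (also milliseconds since epoch).
		min := time.Unix(0, m.MinTime*int64(time.Millisecond)).UTC()
		max := time.Unix(0, m.MaxTime*int64(time.Millisecond)).UTC()
		fmt.Printf("%s  ulid=%s  mint=%d (%s)  maxt=%d (%s)\n",
			e.Name(), m.ULID,
			m.MinTime, min.Format(time.RFC3339),
			m.MaxTime, max.Format(time.RFC3339))
	}
}
```

A block whose printed range overlaps the compactor's mint=1524960000000 maxt=1524981600000 from the log above would be the candidate directory to inspect (or delete, accepting the loss of that date range).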
Closing for now, but feel free to reopen if you can provide more info on how to reproduce.