VictoriaMetrics: query_range gets slower and slower over time
Describe the bug
- query_range gets slower and slower, but recovers at 08:00 every day (00:00 UTC)
- vminsert is repeatedly OOM-killed
Expected behavior
query_range should stay fast instead of degrading over time
Version
Output of passing the -version command-line flag to the binary:
./vmstorage-prod -version
vmstorage-20200623-210915-heads-cluster-0-g46c5c077
Used command-line flags
./vminsert-prod -influxSkipSingleField -insert.maxQueueDuration 10s -storageNode=10.15.38.29:8400 -storageNode=10.15.38.30:8400 -storageNode=10.15.59.47:8400 -storageNode=10.15.59.48:8400 -storageNode=10.15.59.49:8400 -storageNode=10.15.34.61:8400 -storageNode=10.15.38.19:8400 -storageNode=10.15.38.20:8400 -storageNode=10.15.53.33:8400 -storageNode=10.15.59.20:8400 -storageNode=10.15.36.41:8400
./vmselect-prod -search.maxQueryLen 32768 -storageNode=10.15.78.25:8401 -storageNode=10.15.77.18:8401 -storageNode=10.15.77.19:8401 -storageNode=10.15.77.20:8401 -storageNode=10.15.68.27:8401 -storageNode=10.15.78.44:8401 -storageNode=10.15.81.42:8401 -storageNode=10.15.81.46:8401 -storageNode=10.15.87.23:8401 -storageNode=10.15.49.19:8401 -storageNode=10.15.58.61:8401 -storageNode=10.15.38.29:8401 -storageNode=10.15.38.30:8401 -storageNode=10.15.59.47:8401 -storageNode=10.15.59.48:8401 -storageNode=10.15.59.49:8401 -storageNode=10.15.34.61:8401 -storageNode=10.15.38.19:8401 -storageNode=10.15.38.20:8401 -storageNode=10.15.53.33:8401 -storageNode=10.15.59.20:8401 -storageNode=10.15.36.41:8401
./vmstorage-prod -storageDataPath /data1/vmdata -retentionPeriod 6 -search.maxUniqueTimeseries 5000000
Additional context
- we have 11 vmstorage instances; total ingest rate is ~2.6M points/s
- vmstorage log, filtered with `grep '^2020-07.*error' app.log`: vmstorage.error.log
- vminsert log, filtered with `grep -E '2020-07-01.*(error|warn)' vminsert.log`: vminsert.error.log.tar.gz
- vmselect log, filtered with `grep -E '2020-07-01.*(warn|error)' vmselect.log | grep -v 'slow query according' | grep -Ev 'VictoriaMetrics/app/vmselect/main.go:31[78]'`: vmselect.error.log.tar.gz
- vmstorage CPU profiles: pprof.vmstorage-prod.samples.cpu.001.pb.gz (captured 04:47 UTC) and pprof.vmstorage-prod.samples.cpu.002.pb.gz (captured 05:40 UTC)
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 25 (16 by maintainers)
Commits related to this issue
- lib/storage: the `dmis` is no longer check in `insert path` (#596) — committed to n4mine/VictoriaMetrics by n4mine 4 years ago
- lib/storage: reset MetricName->TSID cache after deleting time series This should prevent from adding new data points to deleted time series without the need to check for the deleted time series. Thi... — committed to VictoriaMetrics/VictoriaMetrics by valyala 4 years ago
FYI, the commit that removes the `dmis.Has` calls from the data ingestion path has been included in the v1.38.0 release.

@n4mine, thanks for the idea about removing the `dmis.Has` check inside `Storage.add`! It turned out to be quite easy to implement by just resetting the `MetricName->TSID` cache after the deletion of time series. This guarantees that the cache won't contain entries for deleted time series. See commit fe58462bef9f6c211a036fa1e4f9cf3ced4b9ad4.

It seems the slow part is determining whether a `MetricID` is in `DeletedMetricIDs`: https://github.com/VictoriaMetrics/VictoriaMetrics/blob/8bb3622e9dc7309f60b9412090af342d5d5d2192/lib/storage/index_db.go#L1407
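To make the discussed change concrete, here is a minimal Go sketch of the idea. All names in it (`Storage`, `add`, `deleteSeries`, the plain map-based cache and set) are simplified stand-ins for illustration; the actual VictoriaMetrics code uses specialized cache and set types and different signatures.

```go
package storage

import "sync"

// TSID identifies a time series; only MetricID matters for this sketch.
type TSID struct{ MetricID uint64 }

// Storage is a simplified stand-in for the real storage type.
type Storage struct {
	mu        sync.Mutex
	tsidCache map[string]TSID     // MetricName -> TSID cache
	deleted   map[uint64]struct{} // deleted MetricIDs (the "dmis" set)
}

// add resolves a TSID for a metric name on the ingestion hot path.
func (s *Storage) add(metricName string) (TSID, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if tsid, ok := s.tsidCache[metricName]; ok {
		// Before the fix, every cache hit also paid for a membership
		// check in the deleted set, roughly:
		//
		//	if _, isDeleted := s.deleted[tsid.MetricID]; isDeleted { ... }
		//
		// After the fix that check is gone: deleteSeries resets the
		// cache, so a cache hit can never point at a deleted series.
		return tsid, true
	}
	// Cache miss: the real code would fall back to an index lookup here.
	return TSID{}, false
}

// deleteSeries records deleted MetricIDs and resets the MetricName->TSID
// cache. Resetting is the key idea: it guarantees the cache holds no
// entries for deleted series, so the hot path never consults dmis.
func (s *Storage) deleteSeries(metricIDs []uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, id := range metricIDs {
		s.deleted[id] = struct{}{}
	}
	s.tsidCache = make(map[string]TSID)
}
```

The trade-off is a one-time burst of cache misses right after a deletion, in exchange for removing a per-sample set lookup from the ingestion path.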