VictoriaMetrics: Takes too long time to start the newly added scrapers

Describe the bug The VictoriaMetrics takes too long time to start the newly added scrapers if there are scrapers that should be removed but still trying to scrape data from a target but does not respond. Please see below scenario.

  1. There are 100 scrapers.
  2. I need to stop existing 100 scrapers but need to add 1 scraper.
  3. By current implementation, the VictoriaMetrics should wait for the stop for all the scrapers.
  4. But if the targets are not responding, a scraper waits for the response until expiration of timeout.
  5. So, it the timeout is 5 seconds, the VictoriaMetrics should wait for total 500 seconds to add 1 scraper.

So, I want to fix this issue by myself and contribute the code to this community. Please guide me how to do it. Any procedures, governance of this community or anything. 😃

To Reproduce Please refer to above scenario.

Expected behavior The VictoriaMetrics should not wait for expiration of timeout but stop the scrapers immediately.

Logs None.

Screenshots None.

Version

# /usr/sbin/victoriametrics --version
victoria-metrics-20221026-123908-pmm-6401-v1.77.1

Used command-line flags

–promscrape.config=/etc/victoriametrics-promscrape.yml --retentionPeriod=4d --storageDataPath=/srv/victoriametrics/data --httpListenAddr=127.0.0.1:9090 --search.disableCache=true --search.maxQueryLen=1MB --search.latencyOffset=5s --search.maxUniqueTimeseries=100000000 --search.maxSamplesPerQuery=1500000000 --search.maxQueueDuration=30s --search.logSlowQueryDuration=30s --search.maxQueryDuration=90s --promscrape.streamParse=true --prometheusDataPath=/srv/prometheus/data --http.pathPrefix=/prometheus --envflag.enable --envflag.prefix=VM_

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 16 (2 by maintainers)

Most upvoted comments

@chadr123 Using -promscrape.streamParse="true" will be the only way to achieve this for now.

Current implementation of scraping uses different client for streamParse mode and this client supports requests cancelling. We will continue working on it to make this also possible for default mode(without -promscrape.streamParse="true" option set).

@zekker6 OK. I see. I will wait for a fix for the default mode as well. Feel free to let me know once a fix is ready. I could validate your fixes. Thanks!

FYI, the fix, which improves config reload speed for the disabled stream parsing mode, has been included in v1.88.0 - see this issue for details. Closing this issue as fixed then.

@chadr123 , please re-open this issue or create new one if the latest release of vmagent still experience the original issue you posted here.

FYI, configuration reload should take less time in vmagent starting from v1.87.1 when stream parsing mode is enabled. The fix for the disabled stream parsing mode will be provided in the future releases.

@zekker6 It seems that once I ran the Victoriametrics with -promscrape.streamParse="true" option, it works as expected, I mean that without any delay, like below but does not without the -promscrape.streamParse="true" option.

$ ./victoria-metrics-prod --promscrape.config=current.yml -promscrape.streamParse
2023-02-09T02:42:58.510Z        info    VictoriaMetrics/lib/logger/flag.go:12   build version: victoria-metrics-20230209-004937-heads-fix-vmagent-scrape-requests-cancellationn-0-g2f7537bf7-dirty-c889c5f9
2023-02-09T02:42:58.510Z        info    VictoriaMetrics/lib/logger/flag.go:13   command-line flags
2023-02-09T02:42:58.510Z        info    VictoriaMetrics/lib/logger/flag.go:20     -promscrape.config="current.yml"
2023-02-09T02:42:58.510Z        info    VictoriaMetrics/lib/logger/flag.go:20     -promscrape.streamParse="true"
2023-02-09T02:42:58.510Z        info    VictoriaMetrics/app/victoria-metrics/main.go:60 starting VictoriaMetrics at ":8428"...
2023-02-09T02:42:58.510Z        info    VictoriaMetrics/app/vmstorage/main.go:106       opening storage at "victoria-metrics-data" with -retentionPeriod=1
2023-02-09T02:42:58.624Z        info    VictoriaMetrics/lib/memory/memory.go:42 limiting caches to 10013621452 bytes, leaving 6675747636 bytes to the OS according to -memory.allowedPercent=60
2023-02-09T02:42:58.624Z        info    VictoriaMetrics/lib/storage/storage.go:1010     loading MetricName->TSID cache from "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/cache/metricName_tsid"...
2023-02-09T02:42:58.644Z        info    VictoriaMetrics/lib/storage/storage.go:1015     loaded MetricName->TSID cache from "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/cache/metricName_tsid" in 0.020 seconds; entriesCount: 0; sizeBytes: 0
2023-02-09T02:42:58.644Z        info    VictoriaMetrics/lib/storage/storage.go:1010     loading MetricID->TSID cache from "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/cache/metricID_tsid"...
2023-02-09T02:42:58.651Z        info    VictoriaMetrics/lib/storage/storage.go:1015     loaded MetricID->TSID cache from "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/cache/metricID_tsid" in 0.006 seconds; entriesCount: 0; sizeBytes: 0
2023-02-09T02:42:58.651Z        info    VictoriaMetrics/lib/storage/storage.go:1010     loading MetricID->MetricName cache from "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/cache/metricID_metricName"...
2023-02-09T02:42:58.660Z        info    VictoriaMetrics/lib/storage/storage.go:1015     loaded MetricID->MetricName cache from "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/cache/metricID_metricName" in 0.009 seconds; entriesCount: 0; sizeBytes: 0
2023-02-09T02:42:58.660Z        info    VictoriaMetrics/lib/storage/storage.go:870      loading curr_hour_metric_ids from "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/cache/curr_hour_metric_ids"...
2023-02-09T02:42:58.660Z        info    VictoriaMetrics/lib/storage/storage.go:905      loaded curr_hour_metric_ids from "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/cache/curr_hour_metric_ids" in 0.000 seconds; entriesCount: 6; sizeBytes: 64
2023-02-09T02:42:58.661Z        info    VictoriaMetrics/lib/storage/storage.go:870      loading prev_hour_metric_ids from "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/cache/prev_hour_metric_ids"...
2023-02-09T02:42:58.661Z        info    VictoriaMetrics/lib/storage/storage.go:905      loaded prev_hour_metric_ids from "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/cache/prev_hour_metric_ids" in 0.000 seconds; entriesCount: 0; sizeBytes: 16
2023-02-09T02:42:58.661Z        info    VictoriaMetrics/lib/storage/storage.go:826      loading next_day_metric_ids from "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/cache/next_day_metric_ids"...
2023-02-09T02:42:58.661Z        info    VictoriaMetrics/lib/storage/storage.go:861      loaded next_day_metric_ids from "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/cache/next_day_metric_ids" in 0.000 seconds; entriesCount: 0; sizeBytes: 16
2023-02-09T02:42:59.032Z        info    VictoriaMetrics/lib/mergeset/table.go:319       opening table "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/indexdb/174201CB845C736C"...
2023-02-09T02:42:59.619Z        info    VictoriaMetrics/lib/mergeset/table.go:359       table "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/indexdb/174201CB845C736C" has been opened in 0.586 seconds; partsCount: 1; blocksCount: 1, itemsCount: 106; sizeBytes: 1576
2023-02-09T02:42:59.623Z        info    VictoriaMetrics/lib/mergeset/table.go:319       opening table "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/indexdb/174201CB845C736B"...
2023-02-09T02:43:00.210Z        info    VictoriaMetrics/lib/mergeset/table.go:359       table "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/indexdb/174201CB845C736B" has been opened in 0.587 seconds; partsCount: 0; blocksCount: 0, itemsCount: 0; sizeBytes: 0
2023-02-09T02:43:00.805Z        info    VictoriaMetrics/lib/storage/partition.go:1921   opened part "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/data/small/2023_02/12_6_20230209024100.361_20230209024150.600_174207ABE67D5966" in 0.002 seconds
2023-02-09T02:43:00.805Z        info    VictoriaMetrics/lib/storage/partition.go:1921   opened part "/home/chadr/hdd/project/VictoriaMetrics/bin/victoria-metrics-data/data/small/2023_02/12_6_20230209005600.361_20230209010050.622_17420237EFF749E8" in 0.001 seconds
2023-02-09T02:43:01.430Z        info    VictoriaMetrics/app/vmstorage/main.go:123       successfully opened storage "victoria-metrics-data" in 2.920 seconds; partsCount: 2; blocksCount: 12; rowsCount: 24; sizeBytes: 782
2023-02-09T02:43:01.663Z        info    VictoriaMetrics/app/vmselect/promql/rollup_result_cache.go:114  loading rollupResult cache from "victoria-metrics-data/cache/rollupResult"...
2023-02-09T02:43:01.670Z        info    VictoriaMetrics/app/vmselect/promql/rollup_result_cache.go:142  loaded rollupResult cache from "victoria-metrics-data/cache/rollupResult" in 0.007 seconds; entriesCount: 0, sizeBytes: 0
2023-02-09T02:43:01.671Z        info    VictoriaMetrics/app/victoria-metrics/main.go:70 started VictoriaMetrics in 3.161 seconds
2023-02-09T02:43:01.671Z        info    VictoriaMetrics/lib/promscrape/scraper.go:110   reading Prometheus configs from "current.yml"
2023-02-09T02:43:01.671Z        info    VictoriaMetrics/lib/httpserver/httpserver.go:99 starting http server at http://127.0.0.1:8428/
2023-02-09T02:43:01.671Z        info    VictoriaMetrics/lib/httpserver/httpserver.go:100        pprof handlers are exposed at http://127.0.0.1:8428/debug/pprof/
2023-02-09T02:43:01.677Z        info    VictoriaMetrics/lib/promscrape/config.go:120    starting service discovery routines...
2023-02-09T02:43:01.677Z        info    VictoriaMetrics/lib/promscrape/config.go:126    started service discovery routines in 0.000 seconds
2023-02-09T02:43:01.678Z        info    VictoriaMetrics/lib/promscrape/scraper.go:422   static_configs: added targets: 1, removed targets: 0; total targets: 1
2023-02-09T02:46:12.697Z        info    VictoriaMetrics/lib/promscrape/scraper.go:151   SIGHUP received; reloading Prometheus configs from "current.yml"
2023-02-09T02:46:12.698Z        info    VictoriaMetrics/lib/promscrape/config.go:131    restarting service discovery routines...
2023-02-09T02:46:12.698Z        info    VictoriaMetrics/lib/promscrape/config.go:174    restarted service discovery routines in 0.000 seconds, stopped=1, started=1, restarted=0
2023-02-09T02:46:12.698Z        info    VictoriaMetrics/lib/promscrape/scraper.go:193   found changes in "current.yml"; applying these changes
2023-02-09T02:46:12.699Z        warn    VictoriaMetrics/lib/promscrape/scrapework.go:376        cannot scrape target "http://localhost:42000/metrics?collect%5B%5D=bonding&collect%5B%5D=entropy&collect%5B%5D=textfile.lr&collect%5B%5D=uname" ({agent_id="/agent_id/014494c3-36d3-4256-8569-2e87e99e5312",agent_type="node_exporter",instance="/agent_id/014494c3-36d3-4256-8569-2e87e99e5312",job="node_exporter_agent_id_014494c3-36d3-4256-8569-2e87e99e5312_lr",node_id="/node_id/8d53380e-0c10-4297-99ae-09ba969bf236"}) 1 out of 1 times during -promscrape.suppressScrapeErrorsDelay=0s; the last error: cannot read data: cannot scrape "http://localhost:42000/metrics?collect%5B%5D=bonding&collect%5B%5D=entropy&collect%5B%5D=textfile.lr&collect%5B%5D=uname": Get "http://localhost:42000/metrics?collect%5B%5D=bonding&collect%5B%5D=entropy&collect%5B%5D=textfile.lr&collect%5B%5D=uname": context canceled
2023-02-09T02:46:12.700Z        info    VictoriaMetrics/lib/promscrape/scraper.go:422   static_configs: added targets: 1, removed targets: 1; total targets: 1