VictoriaMetrics: vmselect is killed by oom-killer when querying the export API
Describe the bug
vmselect is killed by the oom-killer when querying the export API.
To Reproduce
Set up a cluster. Start a vmselect container with an 8GB memory limit. Perform a data export:
curl -H 'Accept-Encoding: gzip' http://<vmselect>:8481/select/1/prometheus/api/v1/export -d 'match[]={host="srv1"}' --output data.jsonl.gz
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1092M 0 1092M 0 24 10.3M 0 --:--:-- 0:01:45 --:--:-- 15.0M
curl: (18) transfer closed with outstanding read data remaining
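Note that the request above carries no start/end bounds, so it asks vmselect to stream every sample for the matching series over the whole retention. If the goal is simply to get the data out, splitting the export into smaller time windows keeps the per-request working set down; a sketch using the start and end query args of /api/v1/export (the Unix timestamps below are arbitrary examples covering one day):

# Export a single day (2020-08-01 UTC, as Unix seconds) instead of the full range,
# then repeat with the next window; each request buffers far less data.
curl -H 'Accept-Encoding: gzip' http://<vmselect>:8481/select/1/prometheus/api/v1/export \
  -d 'match[]={host="srv1"}' \
  -d 'start=1596240000' \
  -d 'end=1596326400' \
  --output data-2020-08-01.jsonl.gz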
Kernel messages:
conmon: conmon c023a2e02cd2f4109ddb <ninfo>: OOM received
kernel: [28891450.687558] vmselect-prod invoked oom-killer: gfp_mask=0x6000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=0
kernel: [28891450.687562] vmselect-prod cpuset=libpod-c023a2e02cd2f4109ddbfe88a708f29ffa1f571af3e095a979010a6752953b71 mems_allowed=0
kernel: [28891450.687570] CPU: 40 PID: 2375 Comm: vmselect-prod Tainted: G O 4.19.65-1.el7.x86_64 #1
kernel: [28891450.687571] Hardware name:
kernel: [28891450.687572] Call Trace:
kernel: [28891450.687583] dump_stack+0x63/0x88
kernel: [28891450.687589] dump_header+0x78/0x2a4
kernel: [28891450.687596] ? mem_cgroup_scan_tasks+0x9c/0xf0
kernel: [28891450.687600] oom_kill_process+0x26b/0x290
kernel: [28891450.687603] out_of_memory+0x140/0x4a0
kernel: [28891450.687607] mem_cgroup_out_of_memory+0xb9/0xd0
kernel: [28891450.687610] try_charge+0x6d6/0x750
kernel: [28891450.687614] ? __alloc_pages_nodemask+0x119/0x2a0
kernel: [28891450.687617] mem_cgroup_try_charge+0xbe/0x1d0
kernel: [28891450.687619] mem_cgroup_try_charge_delay+0x22/0x50
kernel: [28891450.687624] do_anonymous_page+0x11a/0x650
kernel: [28891450.687627] __handle_mm_fault+0xc24/0xe80
kernel: [28891450.687631] handle_mm_fault+0x102/0x240
kernel: [28891450.687636] __do_page_fault+0x212/0x4e0
kernel: [28891450.687640] do_page_fault+0x37/0x140
kernel: [28891450.687645] ? page_fault+0x8/0x30
kernel: [28891450.687648] page_fault+0x1e/0x30
kernel: [28891450.687651] RIP: 0033:0x469e28
kernel: [28891450.687655] Code: 4c 01 de 48 29 c3 c5 fe 6f 06 c5 fe 6f 4e 20 c5 fe 6f 56 40 c5 fe 6f 5e 60 48 01 c6 c5 fd 7f 07 c5 fd 7f 4f 20 c5 fd 7f 57 40 <c5> fd 7f 5f 60 48 01 c7 48 29 c3 77 cf 48 01 c3 48 01 fb c4 c1 7e
kernel: [28891450.687656] RSP: 002b:000000c000c58c98 EFLAGS: 00010202
kernel: [28891450.687659] RAX: 0000000000000080 RBX: 0000000000093650 RCX: 000000c1e6d7c670
kernel: [28891450.687660] RDX: 000000000002d990 RSI: 000000c1e6ce9020 RDI: 000000c1eb062fa0
kernel: [28891450.687662] RBP: 000000c000c58cf8 R08: 0000000000000001 R09: 0000000000128000
kernel: [28891450.687663] R10: 000000c1eb00c000 R11: 0000000000000020 R12: 0000000000000002
kernel: [28891450.687665] R13: 0000000000df5660 R14: 0000000000000000 R15: 0000000000468840
kernel: [28891450.687667] Task in /cl/vmselect/pids-batch/libpod-c023a2e02cd2f4109ddbfe88a708f29ffa1f571af3e095a979010a6752953b71 killed as a result of limit of /cl/vmselect
kernel: [28891450.687678] memory: usage 8388608kB, limit 8388608kB, failcnt 712433
kernel: [28891450.687680] memory+swap: usage 8388612kB, limit 9007199254740988kB, failcnt 0
kernel: [28891450.687681] kmem: usage 63936kB, limit 8388608kB, failcnt 0
kernel: [28891450.687682] Memory cgroup stats for /cl/vmselect: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
kernel: [28891450.687699] Memory cgroup stats for /cl/vmselect/pids-batch: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
kernel: [28891450.687714] Memory cgroup stats for /cl/vmselect/pids-batch/libpod-c023a2e02cd2f4109ddbfe88a708f29ffa1f571af3e095a979010a6752953b71: cache:2460KB rss:8321004KB rss_huge:0KB shmem:0KB mapped_file:3300KB dirty:0KB writeback:0KB swap:0KB inactive_anon:20KB active_anon:8324280KB inactive_file:4KB active_file:0KB unevictable:0KB
kernel: [28891450.687731] Memory cgroup stats for /cl/vmselect/pids-idle: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
kernel: [28891450.687747] Tasks state (memory values in pages):
kernel: [28891450.687748] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
kernel: [28891450.687910] [ 8923] 0 8923 10744 1210 131072 0 0 xxxxxxx
kernel: [28891450.687914] [ 8952] 0 8952 7725 723 102400 0 0 xxxxxxx
kernel: [28891450.687917] [ 9018] 0 9018 63581 787 151552 0 0 xxxxxxx
kernel: [28891450.687962] [ 167136] 999 167136 4331147 2079099 18558976 0 0 vmselect-prod
kernel: [28891450.687974] [ 176997] 0 176997 3987 1063 81920 0 0 xxxxxxx
kernel: [28891450.688025] Memory cgroup out of memory: Kill process 167136 (vmselect-prod) score 993 or sacrifice child
kernel: [28891450.701657] Killed process 167136 (vmselect-prod) total-vm:17324588kB, anon-rss:8311840kB, file-rss:6052kB, shmem-rss:0kB
kernel: [28891451.138861] oom_reaper: reaped process 167136 (vmselect-prod), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Version
vmselect-20200727-210903-tags-v1.39.1-cluster-0-g96bc476e5
Used command-line flags
flag{name="cacheDataPath", value="/var/lib/victoriametrics/cache"} 1
flag{name="dedup.minScrapeInterval", value="1ms"} 1
flag{name="enableTCP6", value="false"} 1
flag{name="envflag.enable", value="true"} 1
flag{name="envflag.prefix", value=""} 1
flag{name="fs.disableMmap", value="false"} 1
flag{name="http.disableResponseCompression", value="false"} 1
flag{name="http.maxGracefulShutdownDuration", value="7s"} 1
flag{name="http.pathPrefix", value=""} 1
flag{name="http.shutdownDelay", value="0s"} 1
flag{name="httpListenAddr", value=":8481"} 1
flag{name="loggerErrorsPerSecondLimit", value="10"} 1
flag{name="loggerFormat", value="default"} 1
flag{name="loggerLevel", value="INFO"} 1
flag{name="loggerOutput", value="stderr"} 1
flag{name="memory.allowedPercent", value="60"} 1
flag{name="search.cacheTimestampOffset", value="5m0s"} 1
flag{name="search.denyPartialResponse", value="false"} 1
flag{name="search.disableCache", value="false"} 1
flag{name="search.latencyOffset", value="30s"} 1
flag{name="search.logSlowQueryDuration", value="5s"} 1
flag{name="search.maxConcurrentRequests", value="16"} 1
flag{name="search.maxExportDuration", value="720h0m0s"} 1
flag{name="search.maxLookback", value="0s"} 1
flag{name="search.maxPointsPerTimeseries", value="30000"} 1
flag{name="search.maxQueryDuration", value="30s"} 1
flag{name="search.maxQueryLen", value="16384"} 1
flag{name="search.maxQueueDuration", value="10s"} 1
flag{name="search.maxStalenessInterval", value="0s"} 1
flag{name="search.minStalenessInterval", value="0s"} 1
flag{name="search.resetCacheAuthKey", value="secret"} 1
flag{name="selectNode", value=""} 1
flag{name="storageNode", value="1-vmstorage:8401,2-vmstorage:8401,3-vmstorage:8401"} 1
flag{name="version", value="false"} 1
About this issue
- State: closed
- Created 4 years ago
- Comments: 21
Commits related to this issue
- app/vmselect: reduce memory usage when exporting time series with big number of samples via `/api/v1/export` if `max_rows_per_line` is set to non-zero value Updates https://github.com/VictoriaMetrics... — committed to VictoriaMetrics/VictoriaMetrics by valyala 4 years ago
- app: respect CPU limits set via cgroups Update GOMAXPROCS to limits set via cgroups. This should reduce CPU trashing and reduce memory usage for cases when VictoriaMetrics components run in container... — committed to VictoriaMetrics/VictoriaMetrics by valyala 4 years ago
- lib/cgroup: attempt to obtain available CPU cores via /sys/devices/system/cpu/online See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685#issuecomment-674423728 — committed to VictoriaMetrics/VictoriaMetrics by valyala 4 years ago
- lib/cgroup: do not adjust the number of detected CPU cores via /sys/devices/system/cpu/online The adjustement increases the resulting GOMAXPROC by 1, which looks confusing to users as outlined at htt... — committed to VictoriaMetrics/VictoriaMetrics by valyala 4 years ago
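The first commit above adds a max_rows_per_line query arg for /api/v1/export. Assuming that arg is available in the build you run (check the docs for your version; the value 1000 below is an arbitrary illustration), the export from the reproduction step could be issued as:

# Limit each exported JSON line to at most 1000 samples so vmselect does not
# have to buffer an entire long series in memory before writing it out.
curl -H 'Accept-Encoding: gzip' http://<vmselect>:8481/select/1/prometheus/api/v1/export \
  -d 'match[]={host="srv1"}' \
  -d 'max_rows_per_line=1000' \
  --output data.jsonl.gz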
With CPU quota limit
Without CPU quota limit
Looks a little bit misleading.
You may try to get the available CPUs by reading /sys/devices/system/cpu/online (see https://www.kernel.org/doc/html/latest/admin-guide/cputopology.html). The number is rounded to a whole value.
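For illustration, a minimal bash sketch of counting CPUs from that file (it contains comma-separated ranges such as 0-39 or 0,2-5; the parsing below is an assumption based on the cputopology document linked above):

# Count the CPUs listed in /sys/devices/system/cpu/online.
online=$(cat /sys/devices/system/cpu/online)
total=0
for range in ${online//,/ }; do
  first=${range%-*}
  last=${range#*-}              # same as $first when the entry is a single CPU
  total=$(( total + last - first + 1 ))
done
echo "online CPUs: $total"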
I checked some of our production containers. The CPU quota is not respected (and neither is the memory limit) due to the specifics of our cloud.
As I said, in my case the quota is set at the last-but-one level of the cgroup hierarchy, so it is not visible from inside the container. The value of /sys/fs/cgroup/cpu/cpu.cfs_quota_us is -1 and cpu.cfs_period_us is 100000. So the only way to reduce CPU thrashing in my case is to set the GOMAXPROCS env var equal to the container's CPU quota. But using cpu.cfs_quota_us and cpu.cfs_period_us should work fine in plain Docker and Kubernetes.
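A rough bash sketch of that workaround for a container entrypoint, assuming the cgroup v1 paths quoted above (the fallback value 4 is a placeholder for whatever CPU quota the cloud actually assigns to the container):

# Derive GOMAXPROCS from the cgroup CPU quota when it is visible,
# otherwise fall back to a manually configured value.
quota=$(cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us)
period=$(cat /sys/fs/cgroup/cpu/cpu.cfs_period_us)
if [ "$quota" -gt 0 ]; then
  # Round a fractional quota up to a whole number of CPUs.
  export GOMAXPROCS=$(( (quota + period - 1) / period ))
else
  # Quota is -1 here (set one level up in the hierarchy, not visible inside
  # the container), so use the quota known from the deployment config.
  export GOMAXPROCS=4
fi
echo "starting vmselect with GOMAXPROCS=$GOMAXPROCS"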