VictoriaMetrics: vmselect: data over retentionPeriod 1d is not filtered out on query time
Is your question request related to a specific component?
VictoriaMetrics cluster version v1.87.6 VMStorage component
Describe the question in detail
I am using the VictoriaMetrics cluster version with 3 replicas of every component.
The retentionPeriod=1d flag is passed to the VMStorage component, but it does not work as expected.
At the time of posting (2023-06-30 10:20), the Grafana dashboard still shows data from 2023-06-28 16:15:00, which I believe is outside the 1-day window (2023-06-29 10:20 to 2023-06-30 10:20).
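The arithmetic behind that expectation can be sketched as follows. This is plain Python with nothing VictoriaMetrics-specific; the function name `retention_cutoff` is made up for illustration, and the timestamps are the ones quoted above.

```python
from datetime import datetime, timedelta


def retention_cutoff(now: datetime, retention: timedelta) -> datetime:
    """Return the oldest timestamp that still falls inside the retention window."""
    return now - retention


now = datetime(2023, 6, 30, 10, 20)            # time the issue was posted
oldest_visible = datetime(2023, 6, 28, 16, 15)  # oldest point still shown in Grafana
cutoff = retention_cutoff(now, timedelta(days=1))

print(cutoff)                   # 2023-06-29 10:20:00
print(oldest_visible < cutoff)  # True: the point is older than the 1d window
print(cutoff - oldest_visible)  # 18:05:00 older than the cutoff
```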
I read the section about retention here: (https://docs.victoriametrics.com/#retention), but I am confused about when vmstorage removes old data when the retention is set in days, because the documentation only describes monthly partitions:
"Data is split in per-month partitions inside <-storageDataPath>/data/{small,big} folders. Data partitions outside the configured retention are deleted on the first day of the new month."
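To make the quoted rule concrete, here is a minimal sketch of how a whole-partition check interacts with a 1d retention. It models only the sentence quoted above, not any finer-grained cleanup inside a partition. The `YYYY_MM` label and all function names are assumptions for illustration, not VictoriaMetrics code.

```python
from datetime import datetime, timedelta


def partition_name(ts: datetime) -> str:
    """Month partition label for a timestamp (YYYY_MM naming is an assumption)."""
    return f"{ts.year:04d}_{ts.month:02d}"


def partition_end(ts: datetime) -> datetime:
    """First instant of the month after the one containing ts."""
    if ts.month == 12:
        return datetime(ts.year + 1, 1, 1)
    return datetime(ts.year, ts.month + 1, 1)


def partition_droppable(sample_ts: datetime, now: datetime, retention: timedelta) -> bool:
    """A whole partition is outside retention only if it ends at or before the cutoff."""
    return partition_end(sample_ts) <= now - retention


now = datetime(2023, 6, 30, 10, 20)
sample = datetime(2023, 6, 28, 16, 15)

print(partition_name(sample))                               # 2023_06
print(partition_droppable(sample, now, timedelta(days=1)))  # False
```

Under this model the June partition still overlaps the 1-day window on 2023-06-30, so it cannot be dropped as a whole even though most of its data is older than one day.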
Here is the full configuration I have set on the VMStorage component:
- '--retentionPeriod=1d'
- '--storageDataPath=/storage'
- '--envflag.enable=true'
- '--envflag.prefix=VM_'
- '--loggerFormat=json'
Thanks for your time reading my question!
About this issue
- State: open
- Created a year ago
- Comments: 32 (1 by maintainers)
Hello @tantm3! Sorry for the late response.
It turned out that this additional query-time filtering works only in the VictoriaMetrics single-server installation, because there the process knows the configured retention period and can filter results by time before returning them.
In cluster mode, only the vmstorage component is aware of the retentionPeriod setting. This is why the vmselect component isn’t able to filter data the same way the single-server installation does.
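As a rough illustration of what such query-time filtering amounts to, the sketch below drops samples older than the retention cutoff before returning them. It is not VictoriaMetrics code: the `Sample` type and `filter_by_retention` are hypothetical names, and it assumes the filtering side knows the retention period, which is exactly what vmselect lacks in cluster mode.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Sample = Tuple[datetime, float]  # (timestamp, value); hypothetical shape


def filter_by_retention(samples: List[Sample], now: datetime,
                        retention: timedelta) -> List[Sample]:
    """Keep only samples whose timestamp is inside the retention window."""
    cutoff = now - retention
    return [(ts, value) for ts, value in samples if ts >= cutoff]


now = datetime(2023, 6, 30, 10, 20)
samples = [
    (datetime(2023, 6, 28, 16, 15), 1.0),  # still on disk, but older than 1d
    (datetime(2023, 6, 29, 12, 0), 2.0),
    (datetime(2023, 6, 30, 10, 0), 3.0),
]

visible = filter_by_retention(samples, now, timedelta(days=1))
print(len(visible))  # 2
```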
The potential change here is the following:
Enhancement: make vmselect in cluster mode filter out results exceeding the configured retentionPeriod in the same way as VictoriaMetrics single-server does.
Taking into account that in both cases data parts can still be stored longer than the configured retentionPeriod, the severity of this issue doesn’t seem critical. Even if we fix that for cluster mode by, for example, explicitly specifying the expected retentionPeriod for vmselect, it would only change the visual aspect. The data will remain on the disk until deleted due to details I described earlier.
I’d add that this issue might concern installations with low retentionPeriod values, like 1d. For bigger values of retentionPeriod, it is unlikely to be a concern, which is why this issue was never a subject for discussion. Thanks for digging into the details of this!
Hi @hagen1778,
Thanks for your detailed response! I think it’s a good idea to keep the problem here as an enhancement. Hope to see the enhancement soon!
Hi @hagen1778,
Could you please try the three new backup files? I made a mistake when archiving the backup folder.
Just for information: I am trying to deploy a new cluster with test data and will give you a data snapshot soon.
Thanks for your help! I will close this issue here because it has been solved.