nats-server: NATS/JetStream server does not expire messages fast enough from the filestore when stream size is very large.
Defect
The NATS/JetStream server does not expire messages fast enough from the filestore when the stream size is very large. The stream continues to grow, and the oldest retained messages fall further and further past the max-age limit.
Make sure that these boxes are checked before submitting your issue – thank you!
- [X] Included `nats-server -DV` output
- Included a [Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve)
Versions of `nats-server` and affected client libraries used:
nats-server-2.10.0-beta.42 nats-cli version 0.0.35
OS/Container environment:
synadia/nats-server:nightly, running in a docker container (not clustered) on an HP SuperMicro SuperStorage server with 256GB of RAM and 88 CPU cores. The container is using File storage for persistence. The file storage is a RAID 0 (6 HDD @ 7200RPM) total of 76 TB.
Steps or code to reproduce the issue:
- Start the Docker container with this command: `docker run -d --name test -p 4222:4222 -p 6222:6222 -p 8222:8222 -v /data3:/data3 synadia/nats-server:nightly -js -sd=/data3`
- Create a stream with a single subject and set the limits to a max-age of 3 days (I suspect 2 days would work as well): `nats stream add --config=./config.json` (config.json attached). In my case, I’m creating 8 identical streams, each with a different name. Each stream has a single subject with the same name as the stream. Note that I have acks turned off. My publishers adhere to the NATS core API and publish to a subject only. My server uses JetStream to persist the messages and lets me replay them on demand.
- Start pushing data to the stream. In my case, I’m pushing ~1.45 GiB/minute to the stream in 400KB messages. This works out to ~87 GiB/hour and ~2 TiB/day. I need to be able to keep 3 days' worth of messages.
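For reference, the ingest arithmetic behind the last step can be sketched as follows. This is only a back-of-the-envelope calculation: it takes the ~1.45 GiB/minute figure as the accurate starting point, treats "400KB" as 400 KiB, and assumes the rate applies to the stream being discussed (the report does not say whether it is per stream or the total across all 8 streams). The function and key names are made up for illustration.

```python
def ingest_summary(gib_per_minute, retention_days, message_kib):
    """Derive hourly/daily volume and steady-state retained size from a publish rate."""
    gib_per_hour = gib_per_minute * 60          # 1.45 GiB/min -> 87 GiB/hour
    gib_per_day = gib_per_hour * 24             # 87 GiB/hour -> 2088 GiB/day
    retained_gib = gib_per_day * retention_days # data on disk if expiry keeps up
    kib_per_minute = gib_per_minute * 1024 * 1024
    msgs_per_second = kib_per_minute / message_kib / 60
    return {
        "gib_per_hour": gib_per_hour,
        "gib_per_day": gib_per_day,
        "tib_per_day": gib_per_day / 1024,
        "retained_tib": retained_gib / 1024,
        "msgs_per_second": msgs_per_second,
    }


if __name__ == "__main__":
    # Figures from the report: ~1.45 GiB/minute, 3-day max-age, 400KB messages.
    for key, value in ingest_summary(1.45, 3, 400).items():
        print(f"{key}: {value:.2f}")
```

Under these assumptions, a stream that expires on time should level off at roughly 6 TiB; anything beyond that is backlog the expiry process has not yet removed.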
Expected result:
At the 3d0H0M0S mark, old messages should start being purged so that the first message is never more than 3 days older than the most recent message.
Actual result:
At the 3 day mark, old messages are purged, but they are purged very slowly. At 3 days and 4 hours, the oldest message was 3 days and 1 hour old. After 5 days, my oldest message is now 4 days old. Stream info attached.
NOTE: I ran this same test with a 1 day retention policy and the data was purged on time as expected. This seems to be an issue with the message expiration keeping up on very large, file-based streams.
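The two observations in the actual result can be turned into a rough estimate of how fast expiry is progressing relative to ingest. The sketch below assumes that the elapsed times are measured from the first published message, that "after 5 days" and "4 days old" are exact values (they are approximate in the report), and that publishing is continuous. The function name is hypothetical.

```python
from datetime import timedelta


def expiry_progress(obs_a, obs_b):
    """Each observation is (elapsed_since_start, age_of_oldest_message).

    Returns how much stream time was expired between the two observations,
    how much wall-clock time passed, and the ratio of the two.
    A ratio of 1.0 would mean expiry keeps pace with ingest.
    """
    (t_a, age_a), (t_b, age_b) = obs_a, obs_b
    head_a = t_a - age_a   # stream-time position of the oldest retained message
    head_b = t_b - age_b
    advanced = head_b - head_a
    wall = t_b - t_a
    return advanced, wall, advanced / wall


if __name__ == "__main__":
    first = (timedelta(days=3, hours=4), timedelta(days=3, hours=1))
    second = (timedelta(days=5), timedelta(days=4))
    advanced, wall, ratio = expiry_progress(first, second)
    print(f"expired {advanced} of stream time in {wall} of wall time")
    print(f"expiry is running at about {ratio:.0%} of the ingest rate")
```

With the reported numbers, about 21 hours of messages were expired over 44 hours of wall time, so under these assumptions the backlog past max-age grew from about 1 hour to about 1 day.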
[nats_dv.txt](https://github.com/nats-io/nats-server/files/11996451/nats_dv.txt) config_json.txt stream_info.txt
About this issue
- State: open
- Created a year ago
- Reactions: 1
- Comments: 16 (8 by maintainers)
I should be able to do that. I’ll let you know how that goes
On Tue, Jul 18, 2023 at 11:55 Derek Collison @.***> wrote:
Unfortunately I had to stop it and clean up the disk as I was nearly out of space. I’m about to start up a new instance so I can grab a tree of the storage directory at some point after that. I assume you want it when I reach the max-age?