nats-server: RAFT: Resource not found on account deletion
Defect
On our production servers, we observed errors like the following (generated approximately every 20 seconds):
[82010] 2023/03/23 14:59:46.793797 [ERR] RAFT [Yjedx3Ck - S-R3F-12Jrdv1K] Resource not found: open /home/nats/store/jetstream/AAWOKMTECQYMRURLZLDPLKEMEOSXPSD5QEJST3QS5UZSRH5W7JGOKKPD/_js_/S-R3F-12Jrdv1K/tav.idx: no such file or directory
[82010] 2023/03/23 14:59:46.793825 [WRN] RAFT [Yjedx3Ck - S-R3F-12Jrdv1K] Error writing term and vote file for "S-R3F-12Jrdv1K": open /home/nats/store/jetstream/AAWOKMTECQYMRURLZLDPLKEMEOSXPSD5QEJST3QS5UZSRH5W7JGOKKPD/_js_/S-R3F-12Jrdv1K/tav.idx: no such file or directory
[82010] 2023/03/23 14:59:46.793846 [ERR] RAFT [Yjedx3Ck - S-R3F-12Jrdv1K] Resource not found: open /home/nats/store/jetstream/AAWOKMTECQYMRURLZLDPLKEMEOSXPSD5QEJST3QS5UZSRH5W7JGOKKPD/_js_/S-R3F-12Jrdv1K/tav.idx: no such file or directory
[82010] 2023/03/23 14:59:52.521153 [ERR] RAFT [Yjedx3Ck - S-R3F-12Jrdv1K] Resource not found: open /home/nats/store/jetstream/AAWOKMTECQYMRURLZLDPLKEMEOSXPSD5QEJST3QS5UZSRH5W7JGOKKPD/_js_/S-R3F-12Jrdv1K/tav.idx: no such file or directory
[82010] 2023/03/23 14:59:52.521201 [WRN] RAFT [Yjedx3Ck - S-R3F-12Jrdv1K] Error writing term and vote file for "S-R3F-12Jrdv1K": open /home/nats/store/jetstream/AAWOKMTECQYMRURLZLDPLKEMEOSXPSD5QEJST3QS5UZSRH5W7JGOKKPD/_js_/S-R3F-12Jrdv1K/tav.idx: no such file or directory
[82010] 2023/03/23 14:59:58.195319 [WRN] JetStream cluster stream 'ABF5FAZ53HINEIAX2BPKECJXSJ7R7OCJKXQNWIDAFNM5XAV5E5LZDGII > Q_Ginkgo-N1-sqs_test-95-gqurl-sqs-with-a-queue-candac4d4cb22ef-queue' has NO quorum, stalled
[82010] 2023/03/23 14:59:58.196204 [ERR] RAFT [Yjedx3Ck - S-R3F-12Jrdv1K] Resource not found: open /home/nats/store/jetstream/AAWOKMTECQYMRURLZLDPLKEMEOSXPSD5QEJST3QS5UZSRH5W7JGOKKPD/_js_/S-R3F-12Jrdv1K/tav.idx: no such file or directory
[82010] 2023/03/23 14:59:58.196239 [WRN] RAFT [Yjedx3Ck - S-R3F-12Jrdv1K] Error writing term and vote file for "S-R3F-12Jrdv1K": open /home/nats/store/jetstream/AAWOKMTECQYMRURLZLDPLKEMEOSXPSD5QEJST3QS5UZSRH5W7JGOKKPD/_js_/S-R3F-12Jrdv1K/tav.idx: no such file or directory
A server restart generally solves the issue.
Versions of nats-server and affected client libraries used:
- nats-server: v2.9.15 (we have also seen the issue with v2.9.14)
- Go libraries:
  - github.com/nats-io/jwt/v2 v2.3.0
  - github.com/nats-io/nats.go v1.23.0
OS/Container environment:
- NATS cluster (with 3 virtual machines)
- Ubuntu 20.04.5 LTS
Steps or code to reproduce the issue:
While not easy to reproduce, we managed to get the same problem on a temporary test cluster. Here is the process:
- several Go clients are run in parallel
- these Go clients:
  - create a NATS account (through JWT)
  - connect as a user to create a stream and a consumer
  - once operations on the stream / consumer are finished, delete the stream, then DELETE and PURGE the NATS account
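The shape of this workload can be sketched as a small Go harness. This is only a sketch under stated assumptions: the `Lifecycle` interface, its method names, and the in-memory `fakeLifecycle` are hypothetical and are not part of nats.go or the jwt library. A real implementation would back each method with the actual account JWT and JetStream calls against the cluster.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Lifecycle is a hypothetical abstraction over the steps listed in the report.
// None of these method names come from nats.go or nats-io/jwt.
type Lifecycle interface {
	CreateAccount(account string) error
	CreateStreamAndConsumer(account string) error
	DeleteStream(account string) error
	DeleteAndPurgeAccount(account string) error
}

// runOnce performs the four steps in order for one account and stops at the
// first failure.
func runOnce(lc Lifecycle, account string) error {
	steps := []struct {
		name string
		fn   func(string) error
	}{
		{"create account", lc.CreateAccount},
		{"create stream and consumer", lc.CreateStreamAndConsumer},
		{"delete stream", lc.DeleteStream},
		{"delete and purge account", lc.DeleteAndPurgeAccount},
	}
	for _, s := range steps {
		if err := s.fn(account); err != nil {
			return fmt.Errorf("%s for %s: %w", s.name, account, err)
		}
	}
	return nil
}

// runParallel starts `clients` goroutines, each running `rounds` lifecycles
// with a fresh account name per round, and returns every error it collected.
func runParallel(lc Lifecycle, clients, rounds int) []error {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		errs []error
	)
	for c := 0; c < clients; c++ {
		wg.Add(1)
		go func(c int) {
			defer wg.Done()
			for r := 0; r < rounds; r++ {
				account := fmt.Sprintf("acct-%d-%d", c, r)
				if err := runOnce(lc, account); err != nil {
					mu.Lock()
					errs = append(errs, err)
					mu.Unlock()
				}
			}
		}(c)
	}
	wg.Wait()
	return errs
}

// fakeLifecycle is an in-memory stand-in so the harness runs without a cluster.
type fakeLifecycle struct {
	mu       sync.Mutex
	accounts map[string]bool
	purged   int
}

func newFakeLifecycle() *fakeLifecycle {
	return &fakeLifecycle{accounts: map[string]bool{}}
}

// requireAccount reports an error when the account is unknown.
// The caller must hold f.mu.
func (f *fakeLifecycle) requireAccount(account string) error {
	if !f.accounts[account] {
		return errors.New("account not found")
	}
	return nil
}

func (f *fakeLifecycle) CreateAccount(account string) error {
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.accounts[account] {
		return errors.New("account already exists")
	}
	f.accounts[account] = true
	return nil
}

func (f *fakeLifecycle) CreateStreamAndConsumer(account string) error {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.requireAccount(account)
}

func (f *fakeLifecycle) DeleteStream(account string) error {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.requireAccount(account)
}

func (f *fakeLifecycle) DeleteAndPurgeAccount(account string) error {
	f.mu.Lock()
	defer f.mu.Unlock()
	if err := f.requireAccount(account); err != nil {
		return err
	}
	delete(f.accounts, account)
	f.purged++
	return nil
}

func main() {
	lc := newFakeLifecycle()
	errs := runParallel(lc, 4, 3)
	fmt.Printf("accounts purged: %d, errors: %d\n", lc.purged, len(errs))
}
```

Since the failure is intermittent, raising the number of clients and rounds is probably the simplest way to increase the chance of hitting it.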
Expected result:
The sequence above should succeed every time, without RAFT errors.
Actual result:
It sometimes fails with RAFT errors. Note that the directory mentioned in the logs (/home/nats/store/jetstream/AAWOKMTECQYMRURLZLDPLKEMEOSXPSD5QEJST3QS5UZSRH5W7JGOKKPD/_js_/S-R3F-12Jrdv1K in my case) no longer exists.
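To confirm this from a server log, one option is a small Go helper that pulls the file paths out of the "Resource not found" lines and checks whether their parent directories still exist. This is a minimal sketch: the regular expression is inferred from the log lines quoted in this report and may not match other nats-server versions, and `missingPaths` and `dirExists` are names chosen here for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"regexp"
)

// raftNotFound matches lines shaped like the [ERR] RAFT lines quoted in this
// report. The pattern is an assumption derived from those lines only.
var raftNotFound = regexp.MustCompile(
	`\[ERR\] RAFT \[\S+ - \S+\] Resource not found: open (\S+): no such file or directory`)

// missingPaths returns the distinct file paths named in matching log lines,
// in order of first appearance.
func missingPaths(log string) []string {
	seen := map[string]bool{}
	var paths []string
	for _, m := range raftNotFound.FindAllStringSubmatch(log, -1) {
		if p := m[1]; !seen[p] {
			seen[p] = true
			paths = append(paths, p)
		}
	}
	return paths
}

// dirExists reports whether path exists and is a directory.
func dirExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.IsDir()
}

func main() {
	// One line copied from the report; in practice, read the server log file.
	log := `[82010] 2023/03/23 14:59:46.793797 [ERR] RAFT [Yjedx3Ck - S-R3F-12Jrdv1K] Resource not found: open /home/nats/store/jetstream/AAWOKMTECQYMRURLZLDPLKEMEOSXPSD5QEJST3QS5UZSRH5W7JGOKKPD/_js_/S-R3F-12Jrdv1K/tav.idx: no such file or directory`

	for _, p := range missingPaths(log) {
		dir := filepath.Dir(p)
		fmt.Printf("%s\n  parent directory exists: %v\n", p, dirExists(dir))
	}
}
```

Run on the affected server, this shows whether the RAFT group directory itself is gone or only the tav.idx file inside it.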
Please find the full logs below and look for errors on server number 2:
If you need additional information, please ask 😃 Thanks!
About this issue
- State: open
- Created a year ago
- Comments: 15 (12 by maintainers)
Servers are built nightly from both the main and dev branches. We encourage folks to try them out as needed.
https://hub.docker.com/r/synadia/nats-server/tags
Look for nightly-main; nightly is the dev branch.
The directory is missing.
Thanks, this is helpful. This is an active bug we are tracking, and we hope to fix it before a 2.9.16 release.