nats-server: JetStream does not survive after "stale NFS file handle" error

Defect

I’m running a NATS cluster on AWS Fargate. For ~2 months it ran fine (NATS v2.6.6), but 2 days ago it suffered a storage issue with the EFS backend and JetStream became unavailable. I tried upgrading NATS to a recent version (v2.7.0) in the hope that it would heal itself automatically, but it did not.

Versions of nats-server and affected client libraries used:

  • NATS 2.6.6 or NATS 2.7.0

OS/Container environment:

  • AWS Fargate with efs-volume as storage backend
  • For the reproduction steps below I’m using a CentOS 7 VM

nats-server -DV:

[20215] 2022/01/16 19:44:28.561325 [INF] Starting nats-server
[20215] 2022/01/16 19:44:28.561508 [INF]   Version:  2.7.0
[20215] 2022/01/16 19:44:28.561512 [INF]   Git:      [not set]
[20215] 2022/01/16 19:44:28.561514 [DBG]   Go build: go1.17.6
[20215] 2022/01/16 19:44:28.561518 [INF]   Name:     NBME2FKEOX72T5L6VUKYUEXKFIWIRUBXAQHDUTJHH3XOBQJUOIQ4CP6Q
[20215] 2022/01/16 19:44:28.561523 [INF]   ID:       NBME2FKEOX72T5L6VUKYUEXKFIWIRUBXAQHDUTJHH3XOBQJUOIQ4CP6Q
[20215] 2022/01/16 19:44:28.561547 [DBG] Created system account: "$SYS"
[20215] 2022/01/16 19:44:28.561910 [INF] Listening for client connections on 0.0.0.0:4222
[20215] 2022/01/16 19:44:28.561917 [DBG] Get non local IPs for "0.0.0.0"
[20215] 2022/01/16 19:44:28.562071 [DBG]   ip=198.61.70.28
[20215] 2022/01/16 19:44:28.562097 [DBG]   ip=172.17.0.1
[20215] 2022/01/16 19:44:28.562108 [INF] Server is ready

Steps or code to reproduce the issue:

  1. Install an NFS server (a CentOS 7 VM is used here and throughout): yum install nfs-utils

  2. Configure /etc/exports and apply the changes with systemctl restart nfs:

    /tmp/nats localhost(rw,no_root_squash)
    
  3. Mount the NFS export:

    mount -t nfs -o hard localhost:/tmp/nats /mnt/nfsnats
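
     A quick sanity check that the export is visible and the client really is on NFS (standard nfs-utils tooling, paths as above):

     showmount -e localhost      # should list /tmp/nats
     df -hT /mnt/nfsnats         # filesystem type should show as nfs/nfs4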
    
  4. Start a JetStream cluster (run each command in a separate shell):

    nats-server -p 4222 -cluster nats://localhost:4248 --cluster_name test-cluster -routes nats://localhost:4248,nats://localhost:5248,nats://localhost:6248 -js -sd /mnt/nfsnats/natsA --name natsA
    nats-server -p 5222 -cluster nats://localhost:5248 --cluster_name test-cluster -routes nats://localhost:4248,nats://localhost:5248,nats://localhost:6248 -js -sd /mnt/nfsnats/natsB --name natsB
    nats-server -p 6222 -cluster nats://localhost:6248 --cluster_name test-cluster -routes nats://localhost:4248,nats://localhost:5248,nats://localhost:6248 -js -sd /mnt/nfsnats/natsC --name natsC
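
     For reference, the first command corresponds to roughly the following config file (a sketch, not copied from my deployment; natsB/natsC only differ in ports, server_name and store_dir), started with nats-server -c natsA.conf:

     # natsA.conf (hypothetical equivalent of the natsA command line above)
     port: 4222
     server_name: natsA
     jetstream {
       store_dir: /mnt/nfsnats/natsA
     }
     cluster {
       name: test-cluster
       port: 4248
       routes: [
         nats://localhost:4248
         nats://localhost:5248
         nats://localhost:6248
       ]
     }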
    
  5. Create a stream and start ingesting data:

    nats str add --replicas 3 --subjects data --storage file DATA
    # all other options with default values
    i=0; while true; do ((i++)); nats pub data $i; sleep 0.3; done
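
     To confirm messages are landing and to see which node is the current DATA leader (the one whose mount gets broken in the next step), the CLI can report it:

     nats str info DATA     # State shows message counts; Cluster Information shows the leader
     nats str report        # per-stream overview of replicas and leaders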
    
  6. Simulate a stale NFS file handle by “breaking” the mount point of the leader server for the DATA stream:

    cp -r /tmp/nats /tmp/nats.backup
    rm -rf /tmp/nats
    mv /tmp/nats.backup /tmp/nats
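
     The copy/remove/rename gives every file under the export a new inode, so the NFS file handles that the clients (including nats-server) still hold now point at inodes that no longer exist. Listing the mount from the client side should fail at this point, or after the NFS restart in the next step:

     ls /mnt/nfsnats        # expected to fail with "Stale file handle"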
    
  7. Restart NFS:

     systemctl restart nfs

  8. JetStream shuts down:

[6229] 2022/01/16 20:19:39.416387 [ERR] JetStream failed to store a msg on stream '$G > DATA': write /mnt/nfs/nats/natsB/jetstream/$G/streams/DATA/msgs/1.blk: bad file descriptor
[6229] 2022/01/16 20:19:39.737394 [ERR] JetStream failed to store a msg on stream '$G > DATA': error opening msg block file ["/mnt/nfs/nats/natsB/jetstream/$G/streams/DATA/msgs/1.blk"]: open /mnt/nfs/nats/natsB/jetstream/$G/streams/DATA/msgs/1.blk: stale NFS file handle
[6229] 2022/01/16 20:19:40.068785 [ERR] JetStream failed to store a msg on stream '$G > DATA': error opening msg block file ["/mnt/nfs/nats/natsB/jetstream/$G/streams/DATA/msgs/1.blk"]: open /mnt/nfs/nats/natsB/jetstream/$G/streams/DATA/msgs/1.blk: stale NFS file handle
[6229] 2022/01/16 20:19:40.401472 [ERR] JetStream failed to store a msg on stream '$G > DATA': error opening msg block file ["/mnt/nfs/nats/natsB/jetstream/$G/streams/DATA/msgs/1.blk"]: open /mnt/nfs/nats/natsB/jetstream/$G/streams/DATA/msgs/1.blk: stale NFS file handle
[6229] 2022/01/16 20:19:44.316535 [ERR] RAFT [wURfHZ9N - _meta_] Critical write error: open /mnt/nfs/nats/natsB/jetstream/$SYS/_js_/_meta_/tav.idx: stale NFS file handle
[6229] 2022/01/16 20:19:44.316564 [WRN] RAFT [wURfHZ9N - _meta_] Error writing term and vote file for "_meta_": open /mnt/nfs/nats/natsB/jetstream/$SYS/_js_/_meta_/tav.idx: stale NFS file handle
[6229] 2022/01/16 20:19:44.316846 [ERR] JetStream out of resources, will be DISABLED
[6229] 2022/01/16 20:19:44.317655 [ERR] RAFT [wURfHZ9N - _meta_] Critical write error: open /mnt/nfs/nats/natsB/jetstream/$SYS/_js_/_meta_/tav.idx: stale NFS file handle
[6229] 2022/01/16 20:19:44.317667 [WRN] RAFT [wURfHZ9N - _meta_] Error writing term and vote file for "_meta_": open /mnt/nfs/nats/natsB/jetstream/$SYS/_js_/_meta_/tav.idx: stale NFS file handle
[6229] 2022/01/16 20:19:45.325688 [INF] Initiating JetStream Shutdown...
[6229] 2022/01/16 20:19:45.325943 [INF] JetStream Shutdown

Any JetStream operation now fails with “JetStream system temporarily unavailable”.
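
A quick way to see the same failure from the client side (assuming the natscli context points at this cluster):

     nats str ls            # fails with "JetStream system temporarily unavailable"
     nats account info      # the JetStream section should also show it as unavailable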

  9. Fix the mount point:

    umount -f -l /mnt/nfsnats
    mount -t nfs -o hard localhost:/tmp/nats /mnt/nfsnats
    
  10. Restarting NATS doesn’t recover JetStream:

$ nats str ls
No Streams defined

All data appears to be lost, yet it is still there on disk, in /mnt/nfsnats/…!
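
For example, the message blocks written before the failure are still present on the remounted volume (paths per the setup above; the $G directory is the global account, so it needs quoting in the shell):

     ls '/mnt/nfsnats/natsB/jetstream/$G/streams/DATA/msgs/'   # 1.blk etc. are still there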

Expected result:

JetStream data should be recovered as much as possible.

Actual result:

All JetStream data is no longer available.

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 30 (14 by maintainers)

Most upvoted comments

If you can, please upgrade to at least 2.7.4. JetStream is improving rapidly and it’s best to stay as current as possible.

The upcoming 2.7.5 also has some improvements, so if you’re kicking the tires or doing testing you can use our nightly-built Docker image at synadia/nats-server:nightly.

I’m on version 2.8.1 and am experiencing the log flooding as well:

[535] 2022/05/02 15:37:09.351378 [WRN] RAFT [cnrtt3eg - C-R3F-C4IFG7Z8] Expected first catchup entry to be a snapshot and peerstate, will retry
[535] 2022/05/02 15:37:09.351751 [WRN] RAFT [cnrtt3eg - C-R3F-C4IFG7Z8] Error storing entry to WAL: raft: could not store entry to WAL
[535] 2022/05/02 15:37:09.351763 [WRN] RAFT [cnrtt3eg - C-R3F-C4IFG7Z8] Expected first catchup entry to be a snapshot and peerstate, will retry
[535] 2022/05/02 15:37:09.352074 [WRN] RAFT [cnrtt3eg - C-R3F-C4IFG7Z8] Error storing entry to WAL: raft: could not store entry to WAL
[535] 2022/05/02 15:37:09.352089 [WRN] RAFT [cnrtt3eg - C-R3F-C4IFG7Z8] Expected first catchup entry to be a snapshot and peerstate, will retry
[535] 2022/05/02 15:37:09.352511 [WRN] RAFT [cnrtt3eg - C-R3F-C4IFG7Z8] Error storing entry to WAL: raft: could not store entry to WAL

Is there a way to reduce the volume of these log warnings? I just blew through 50 GB of logs in less than 24 hours.
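
Not an answer to the RAFT retry loop itself, but for containing disk usage nats-server does support size-based log rotation in its config, something like the following (a sketch, assuming file logging is in use):

     log_file: "/var/log/nats/nats-server.log"
     logfile_size_limit: 1GB    # rotate the log file once it reaches this size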

NATS JetStream is ready for production, and we are working hard with our customers to improve the experience of running it inside of K8S.

The areas we want to watch out for are the following.

  1. Ingress controllers and overriding NATS client and server connectivity. We suggest using NodePort.
  2. K8S DNS, if you rely on it for multiple A records, etc.
  3. Memory limits and the OOM killer. This is not specific to NATS, but more of an issue for Golang programs inside K8S containers with limits. We were experimenting with a team the other day and tweaking GOGC, and also did some work this weekend on reducing NATS w/ JetStream’s memory usage, to hopefully avoid the OOM altogether.

We have also put more work into our Helm charts, specifically utilizing the /healthz endpoint for a server when upgrading, before moving on to the next server.
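
For anyone wiring that up by hand, the /healthz endpoint is served on the monitoring port, so a manual probe looks roughly like this (assuming monitoring is enabled on 8222, e.g. via -m 8222 or http_port: 8222 in the config):

     curl -s http://localhost:8222/healthz
     # {"status":"ok"}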