hornet: Cannot shut down Hornet gracefully

Describe the bug

I run Hornet in a Docker container, using my own image. I cannot stop Hornet gracefully.

To Reproduce

Steps to reproduce the behavior:

  • Create a Dockerfile with these steps:
wget -q https://github.com/gohornet/hornet/releases/download/v0.4.1/HORNET-0.4.1_Linux_x86_64.tar.gz 
...
tar xf HORNET-0.4.1_Linux_x86_64.tar.gz
...
CMD script.sh
  • script.sh:
<a few steps>
exec hornet
  • Start and stop container.
  • Message in the container log:
WARN    Tangle  HORNET was not shut down correctly, the database may be corrupted. Starting revalidation...

Then it takes several hours to revalidate the DB. Dropping the DB on every restart is definitely not an option for me.

Troubleshooting: I confirmed that the hornet process runs as PID 1:

docker exec hornet ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
hornet       1 16.9 78.6 79727852 3180880 ?    Ssl  06:54   2:12 /folder/hornet
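The `exec hornet` line in script.sh is what makes this PID-1 layout work: a shell-form CMD runs the script under `/bin/sh`, and without `exec` the stop signal would be delivered to the wrapper shell rather than to hornet. A minimal, Docker-free sketch of that signal path (all file names here are illustrative, not part of the actual image):

```shell
#!/bin/sh
# Stand-in for hornet: a child that handles SIGINT gracefully.
cat > child.sh <<'EOF'
#!/bin/sh
trap 'echo "graceful shutdown"; exit 0' INT
while true; do sleep 1; done
EOF
chmod +x child.sh

# With exec, the child replaces the wrapper shell, so a signal sent to
# the wrapper's PID lands directly in the child's handler.
sh -c 'exec ./child.sh' &
pid=$!
sleep 1              # give the child time to install its trap
kill -INT "$pid"     # same signal as STOPSIGNAL 2
wait "$pid"
status=$?
echo "exit code: $status"
```

Running this prints "graceful shutdown" and a zero exit code; dropping the `exec` would leave the wrapper shell as the signal target and the child untouched.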

I have also tried adding STOPSIGNAL 2 (SIGINT) to the Dockerfile and specifying a long stop timeout:

docker inspect -f '{{.Config.StopSignal}}' hornet
2
docker inspect -f '{{.Config.StopTimeout}}' hornet
300
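For completeness: if the container is managed via docker-compose instead of plain `docker run`, the same two knobs exist there as well. A sketch (service and image names are placeholders):

```yaml
services:
  hornet:
    image: my-hornet:0.4.1    # illustrative image name
    stop_signal: SIGINT       # equivalent of STOPSIGNAL 2
    stop_grace_period: 300s   # equivalent of --stop-timeout 300
```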

While docker stop hornet is running, the logs show this:

2020-07-10T06:46:06Z    WARN    Graceful Shutdown       Received shutdown request - waiting (max 2 seconds) to finish processing (send queue <ip>:15600, BroadcastQueue, send queue 207.180.224.65:15600, Tangle[HeartbeatEvents], Cleanup at shutdown, Peering Server, Dashboard[Visualizer], Close database, TangleProcessor[ReceiveTx], TangleProcessor[ProcessMilestone], send queue <ip>:15600, LocalSnapshots, send queue <ip>:15600, PendingRequestsEnqueuer, Tangle Solidifier, Dashboard[WSSend], MessageProcessor, TangleProcessor[UpdateMetrics], Peering Reconnect, send queue <ip>:15600, STINGRequester, WarpSync[Events], Metrics TPS Updater, TangleProcessor[MilestoneSolidifier], send queue <ip>:15600) ...
2020-07-10T06:46:07Z    WARN    Graceful Shutdown       Received shutdown request - waiting (max 1 seconds) to finish processing (send queue <ip>:15600, Tangle[HeartbeatEvents], Cleanup at shutdown, Peering Server, Dashboard[Visualizer], TangleProcessor[ProcessMilestone], Close database, TangleProcessor[ReceiveTx], send queue <ip>:15600, LocalSnapshots, send queue <ip>:15600, PendingRequestsEnqueuer, Tangle Solidifier, Dashboard[WSSend], MessageProcessor, TangleProcessor[UpdateMetrics], Peering Reconnect, send queue <ip>:15600, STINGRequester, WarpSync[Events], Metrics TPS Updater, send queue <ip>:15600, TangleProcessor[MilestoneSolidifier], send queue <ip>:15600, BroadcastQueue) ...
2020-07-10T06:46:08Z    FATAL   Graceful Shutdown       Background processes did not terminate in time! Forcing shutdown ...
github.com/gohornet/hornet/plugins/gracefulshutdown.configure.func1.1
        /__w/hornet/hornet/plugins/gracefulshutdown/plugin.go:50
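Note the timing: the FATAL fires two seconds after the first WARN, matching the "max 2 seconds" window in the plugin's own message rather than Docker's 300-second StopTimeout. In other words, the forced shutdown appears to be the graceful-shutdown plugin's internal deadline expiring, which no Docker-side setting can extend. The general pattern, sketched in plain shell (the 2-second window is taken from the log above; everything else is illustrative, not Hornet's actual code):

```shell
#!/bin/sh
# Stand-in for a background worker that drains slower than the window allows.
sleep 5 &
worker=$!

# The shutdown handler polls for up to a fixed window (2s, as in the log),
# independent of however large Docker's StopTimeout is.
window=2
elapsed=0
while kill -0 "$worker" 2>/dev/null && [ "$elapsed" -lt "$window" ]; do
    sleep 1
    elapsed=$((elapsed + 1))
done

if kill -0 "$worker" 2>/dev/null; then
    echo "Background processes did not terminate in time! Forcing shutdown ..."
    kill "$worker"
    result=forced
else
    echo "clean shutdown"
    result=clean
fi
```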

The ExitCode looks good to me:

docker inspect -f '{{.State.ExitCode}}' iota_hornet
0

Expected behavior

I should be able to use docker stop to stop the container without causing a long downtime.

Environment information:

  • RAM: 4 GB
  • Cores: 8
  • HORNET version: 0.4.1

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 27 (1 by maintainers)

Most upvoted comments

Hi @muXxer, no worries, I don’t restart Hornet that often. Actually, I haven’t experienced this issue recently, after I removed a few unnecessary processes from the machine, so your hint about IO makes a lot of sense now. About the history size I can’t say much, as I had to remove the data after each failed restart. I’ll close the issue; I think it’s OK now. Thanks for your help.