foundryvtt-docker: Node segfault (exit code 139) in container release v11.x.x
Bug description
I can neither migrate old data nor can I start a new world. I always get the same error:
{"code":"ECOMPROMISED","errno":-2,"level":"error","message":"ENOENT: no such file or directory, stat '/data/Config/options.json.lock'","path":"/data/Config/options.json.lock","stack":"Error: ENOENT: no such file or directory, stat '/data/Config/options.json.lock'","syscall":"stat","timestamp":"2023-05-26 17:39:43"}
{"level":"error","message":"A fatal error occurred while trying to start the Foundry Virtual Tabletop server: Foundry VTT cannot start in this directory which is already locked by another process.","stack":"Error: A fatal error occurred while trying to start the Foundry Virtual Tabletop server: Foundry VTT cannot start in this directory which is already locked by another process.\n at _acquireLockFile (file:///home/foundry/resources/app/dist/init.mjs:1:4799)\n at async _initializeCriticalFunctions (file:///home/foundry/resources/app/dist/init.mjs:1:2593)\n at async Module.initialize (file:///home/foundry/resources/app/dist/init.mjs:1:1564)","timestamp":"2023-05-26 17:41:11"}
{"level":"error","message":"A fatal error occurred while trying to start the Foundry Virtual Tabletop server: Foundry VTT cannot start in this directory which is already locked by another process.","stack":"Error: A fatal error occurred while trying to start the Foundry Virtual Tabletop server: Foundry VTT cannot start in this directory which is already locked by another process.\n at _acquireLockFile (file:///home/foundry/resources/app/dist/init.mjs:1:4799)\n at async _initializeCriticalFunctions (file:///home/foundry/resources/app/dist/init.mjs:1:2593)\n at async Module.initialize (file:///home/foundry/resources/app/dist/init.mjs:1:1564)","timestamp":"2023-05-26 17:44:36"}
The data is stored on the host system in a bind-mounted directory. All was well with V10.
I attempted to ‘reset’ all the data as well, by stopping the container, moving the data to a new folder, re-creating the old folder completely empty, and creating a brand new world. The error logs above are from that last attempt.
I have tried removing the options.json.lock directory (isn’t this supposed to be a file rather than a directory, by the way?) and recycling the container, to no avail. I suspect this may be a bug in Foundry itself, but I’m not certain, and figured anyone else running into this issue with this container may find their way here…
Steps to reproduce
- Start a new container instance with a bind volume to hold the container data
- Create a new world
- Attempt to launch the world
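For context, a minimal setup along the lines of the steps above might look like the following. This is a sketch only: the container name, host path, and the omitted credential variables are assumptions, and the project README documents the full invocation.

```shell
# Start the container with a host directory bind-mounted as /data.
# FOUNDRY_USERNAME / FOUNDRY_PASSWORD (or a pre-fetched release) are
# required by the image but omitted here for brevity.
docker run -d --name foundryvtt \
  -p 30000:30000 \
  -v "$(pwd)/foundry-data:/data" \
  felddy/foundryvtt:release
```

The world is then created and launched through the web UI on port 30000.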
Expected behavior
I expect the world to launch without error.
Container metadata
com.foundryvtt.version = "11.299"
org.opencontainers.image.authors = "markf+github@geekpad.com"
org.opencontainers.image.created = "2023-05-26T01:33:20.124Z"
org.opencontainers.image.description = "An easy-to-deploy Dockerized Foundry Virtual Tabletop server."
org.opencontainers.image.licenses = "MIT"
org.opencontainers.image.revision = "349bc278fa92049dd2b480b322fc30a0842221fb"
org.opencontainers.image.source = "https://github.com/felddy/foundryvtt-docker"
org.opencontainers.image.title = "foundryvtt-docker"
org.opencontainers.image.url = "https://github.com/felddy/foundryvtt-docker"
org.opencontainers.image.vendor = "Geekpad"
org.opencontainers.image.version = "11.299.0"
Relevant log output
No response
Code of Conduct
- I agree to follow this project’s Code of Conduct
About this issue
- Original URL
- State: closed
- Created a year ago
- Reactions: 1
- Comments: 51 (21 by maintainers)
Commits related to this issue
- Add hotfix for issue #697 - v11 database glibc workaround — committed to felddy/foundryvtt-docker by felddy 4 months ago
- Merge pull request #919 from felddy/patch/issue-697 Add hotfix for issue #697 - v11 database glibc workaround — committed to felddy/foundryvtt-docker by felddy 4 months ago
Exact same issue for me, except here was my System:
Thank you @felddy!
This fixed the segfault I was getting on a pi4, thank you!
(to be clear, I applied the changes to alpine, not the debian branch.)
As a workaround for this issue you can try the following:
- Create a container_patches directory in your data mount, e.g., data/container_patches.
- In this directory create a file called patch-issue-697.sh with the following contents:
- Set the environment variable CONTAINER_PATCHES=/data/container_patches in your container’s configuration.
- Start the container using the standard release image: felddy/foundryvtt:release
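The contents of the patch file are not captured in this excerpt. Purely as an illustration of what a container patch script of this kind could look like (the package list, the classic-level module name, and the rebuild step are my assumptions here, not the maintainer's actual script), a patch that rebuilds Foundry's native database bindings from source might be:

```shell
#!/bin/sh
# patch-issue-697.sh -- hypothetical sketch, NOT the actual patch from the repo.
# Assumption: the v11 segfault comes from a prebuilt native Node module
# (classic-level, Foundry v11's LevelDB layer), so rebuild it from source
# against the image's own C library.
set -e
apk add --no-cache build-base python3          # Alpine build toolchain (assumed base image)
cd /home/foundry/resources/app                 # app path taken from the stack traces above
npm rebuild classic-level --build-from-source  # recompile the native bindings locally
```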
In your container logs you will see lines similar to this after Foundry is installed but before it starts:
This should apply the steps as documented in the associated issue here:
Please let me know how this works. If it succeeds I can add this file to the repo’s patch library for others to use.
I too can confirm that this works on my Raspberry Pi 4 and solves the segfault 🥳 I will note that this increases the startup time of the container to a little over two minutes on the Pi 4 hardware due to the compilation step, perhaps a useful disclaimer to mention when publishing the patch.
Thanks for all the efforts to resolve this issue!
@mcdonnnj - not yet. Work has been killing me, and I had a bit of a rough zfs pool migration that JUST finished on the home lab. I’ve been utterly exhausted, which is why I haven’t participated since I moved FoundryVTT-docker from the ARM64 RPi4 over to the x86 server LOL. I will try it tonight, but I’m not making promises; my wife just went back into the hospital this morning and I’m running the house as well…
So, to be absolutely certain, I wanted to try running things in a barebones container, installing Foundry V11 manually, and seeing if it would still segfault. The short version is that it works just fine this way. This might indicate it’s not a Foundry bug after all, or at least that there is some weird interaction with the way the container is set up, since I got it to work in another container configuration on the same hardware.
So here is what I did:
- Ran docker run with just port 30000 mapped, dropping straight into a bash shell, no other options
- Installed sudo, wget, curl, and libssl-dev, and installed Node.js 18 using the official install script

So it’s definitely possible to run Foundry V11 in a container on a Raspberry Pi 4… not sure if this makes it easier or harder now, but I feel this takes us a step closer to a solution…
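The manual steps above can be captured as a Dockerfile sketch. This is hypothetical: the Ubuntu tag, package set, and the NodeSource setup URL are assumptions reconstructed from the description, since the original test was done interactively.

```dockerfile
# Hypothetical reconstruction of the barebones test container described above.
FROM ubuntu:22.04
RUN apt-get update && \
    apt-get install -y sudo wget curl libssl-dev ca-certificates
# "official install script" assumed to mean the NodeSource setup script
RUN curl -fsSL https://deb.nodesource.com/setup_18.x | bash - && \
    apt-get install -y nodejs
EXPOSE 30000
# Foundry itself was then unpacked and started by hand, roughly:
#   node resources/app/main.js --dataPath=/data
```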
I was having this problem and your solution worked for my system.
Problem
When upgrading or launching a world, Foundry crashed with the following logs:
Solution
Exactly as described here: https://github.com/felddy/foundryvtt-docker/issues/697#issuecomment-1597821895
My system
@felddy This container unfortunately doesn’t start, here’s the error:
@felddy I tried two more things (tell me when to stop):
So on the Raspberry Pi 4 hardware, running Foundry v11 works with an Ubuntu-based image but not with Alpine. Is there a way we could get a release of your container setup based on a distro other than Alpine for this case, or would it be possible for me to modify and build a version myself?
Thanks
Alright, I just tried the above, both with 11.300 in the 10.291.1 container and with the just-released 11.301, and in both instances the segfault is still thrown when trying to open the world. So it’s something specific to v11 and not the container…
Attaching the latest log; I created a completely fresh image. Running on a Raspberry Pi 4, btw, so this is ARM rather than x86. A completely new container won’t launch any world when running containerized, but it works fine running directly on my RPi4 host. I’m really leaning towards a container issue. _foundryvtt-test_logs.txt 😦
Ok, so I noticed something potentially of interest. Node is dying with exit code 139 (SIGSEGV), and when I went to launch using the same data config from the straight Node FoundryVTT install, all worked fine. As part of getting it to run, though, I had to get GLIBCXX_3.4.29, and since I’m running Bullseye (Raspbian) I had to upgrade to unstable (Bookworm); then I could run FoundryVTT from Node, non-containerized, perfectly fine. What GLIBCXX is available within the container? I’m trying to get into it with /bin/sh so I can check, but now that I migrated the default world, I’m just spinning and crashing the container lol. I’ll keep digging…
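Checking which GLIBCXX symbol versions a given image actually ships can be done from a shell inside the container. A generic sketch follows; the library paths are assumptions that vary by distro, and Alpine uses musl, so glibc's libstdc++ may not be present at all:

```shell
# List the GLIBCXX symbol versions the container's libstdc++ provides.
# grep -a scans the binary as text, so no binutils/strings is required.
lib="$(find /usr/lib /usr/lib64 -name 'libstdc++.so.6*' 2>/dev/null | head -n 1)"
if [ -n "$lib" ]; then
  grep -ao 'GLIBCXX_[0-9][0-9.]*' "$lib" | sort -u
else
  echo "libstdc++ not found (musl-based image such as Alpine?)"
fi
```

If the highest version printed is below GLIBCXX_3.4.29, native modules built against a newer toolchain would fail to load or crash, which would be consistent with the glibc workaround mentioned in the commits above.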
Edit - Here’s the actual log where I go to start the world (now post-migration):
BTW, can I get an invite to the discord channel?
Edit Part Deux - My concentration on the lock file was just a symptom. What’s happening here is that upon launch of a world, Node crashes with exit code 139 (SIGSEGV), and then the container restarts. Upon startup, the lock file still exists from the prior crash, and that is what I was seeing originally. So the “digging” really needs to focus on understanding the initial 139 exit code from Node…
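The arithmetic behind that exit code can be demonstrated in any POSIX shell: a process killed by signal N is reported as 128 + N, and SIGSEGV is signal 11.

```shell
# Exit status 139 is not an application error code: shells report a child
# killed by signal N as 128 + N, and SIGSEGV is signal 11, so 128 + 11 = 139.
sh -c 'kill -SEGV $$' || status=$?   # simulate a child dying from a segfault
echo "exit status: $status"          # prints: exit status: 139
```

Any status in the 129–165 range therefore points at a signal, not an error path inside Foundry itself.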
Note - Node 18.16.0 is what I used for the non-containerized test, and that matches what’s claimed in the container as well. Really thinking this may be a glibc issue…
Ok then, if that’s not you… good news then, if you believe that misery loves company. CaydenSworn#3518 is reporting the same issue in the #troubleshooting forum. I’d keep an eye on any resolutions they report.
I saw the issue that you opened with the foundryvtt project. Just be aware that there is a very strong container phobia with the support team, to the point that they will refuse to support any containerized Foundry debugging. 🤷
I can’t think of any solutions or tests that you haven’t already run. Can you think of anything novel about your setup or configuration?