nginx-proxy-manager: 2.10.0 unable to start on clean install

Checklist

  • Have you pulled and found the error with jc21/nginx-proxy-manager:latest docker image?
    • Yes
  • Are you sure you’re not using someone else’s docker image?
    • Yes
  • Have you searched for similar issues (both open and closed)?
    • Yes

Describe the bug

The :latest and 2.10.0 images fail to start, either with an existing configuration or with a clean install.

Nginx Proxy Manager Version

2.10.0

To Reproduce

Steps to reproduce the behavior:

  1. Start a container
  2. Watch it fail

Expected behavior

The container should start.

Screenshots

➜  lb-pi003 docker compose up -d && docker compose logs -f app
[+] Running 3/3
 ⠿ Network lb-pi003_default  Created                                                                                                                                                        0.8s
 ⠿ Container lb-pi003-db-1   Started                                                                                                                                                       27.7s
 ⠿ Container lb-pi003-app-1  Started                                                                                                                                                       18.7s
lb-pi003-app-1  | s6-rc: info: service s6rc-oneshot-runner: starting
lb-pi003-app-1  | s6-rc: info: service s6rc-oneshot-runner successfully started
lb-pi003-app-1  | s6-rc: info: service fix-attrs: starting
lb-pi003-app-1  | s6-rc: info: service fix-attrs successfully started
lb-pi003-app-1  | s6-rc: info: service legacy-cont-init: starting
lb-pi003-app-1  | s6-rc: info: service legacy-cont-init successfully started
lb-pi003-app-1  | s6-rc: info: service prepare: starting
lb-pi003-app-1  | ❯ Configuring npmuser ...
lb-pi003-app-1  | id: 'npmuser': no such user
lb-pi003-app-1  | ❯ Checking paths ...
lb-pi003-app-1  | ❯ Setting ownership ...
lb-pi003-app-1  | s6-rc: fatal: timed out
lb-pi003-app-1  | s6-sudoc: fatal: unable to get exit status from server: Operation timed out
lb-pi003-app-1  | /run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.

Operating System

Raspberry Pi

Additional context

About this issue

  • Original URL
  • State: open
  • Created a year ago
  • Reactions: 26
  • Comments: 135 (19 by maintainers)

Most upvoted comments

Same issue here. Ubuntu 22.04 LTS (docker). Confirmed fix on rollback to 2.9.22
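
For anyone wanting to do the same rollback, pinning the older tag in a compose file is enough. A minimal sketch (service name, ports and volume paths are placeholders, not taken from this thread):

services:
  app:
    # pin the last known-good release instead of :latest / :2.10.x
    image: 'jc21/nginx-proxy-manager:2.9.22'
    restart: unless-stopped
    ports:
      - '80:80'
      - '81:81'
      - '443:443'
    volumes:
      - ./data:/data
      - ./letsencrypt:/etc/letsencrypt

Then docker compose pull && docker compose up -d recreates the container on the pinned tag.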

Hello, I have a strange issue. My npm gets stuck at setting ownership, exactly on “chown -R 1000:1000 /opt/certbot”. The OS is TrueNAS Scale, but I use the direct docker :latest image (not the ix-sys version).

In fact, if I run that command from the shell, it never ends.

If I quickly run these commands after deploying, npm starts fine:

rm -rf /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh
touch /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh
chown -R "1000:1000" /data
chown -R "1000:1000" /etc/letsencrypt
chown -R "1000:1000" /run/nginx
chown -R "1000:1000" /tmp/nginx
chown -R "1000:1000" /var/cache/nginx
chown -R "1000:1000" /var/lib/logrotate
chown -R "1000:1000" /var/lib/nginx
chown -R "1000:1000" /var/log/nginx
chown -R "1000:1000" /etc/nginx/nginx
chown -R "1000:1000" /etc/nginx/nginx.conf
chown -R "1000:1000" /etc/nginx/conf.d

EDIT

chown -R 1000:1000 /opt/certbot -v

logs 1 file per second, so the deploy hits the timeout.

I can confirm that this is exactly the same issue I’ve been seeing from my side as well. Also running TrueNAS Scale, broken since 2.10.x.

Can confirm, I have the same issue in TrueNAS Scale with the container getting stuck at chown -R xx:xx /opt/certbot, exactly as in https://github.com/NginxProxyManager/nginx-proxy-manager/issues/2753#issuecomment-1556126836. Very annoying

I ended up bind-mounting a script I created on the host over /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh, with all of the same contents but with the problematic chown line changed to run in the background:

...
# Prevents errors when installing python certbot plugins when non-root
nohup chown -R "$PUID:$PGID" /opt/certbot -v &
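
For anyone wanting to reproduce that workaround, a rough sketch of the bind mount in docker-compose (the host-side ./30-ownership.sh path is a placeholder for your copy of the script with the backgrounded chown line; adjust to your own layout):

services:
  app:
    image: 'jc21/nginx-proxy-manager:latest'
    restart: unless-stopped
    volumes:
      - ./data:/data
      - ./letsencrypt:/etc/letsencrypt
      # host copy of the prepare script, with the slow chown moved to the background
      - ./30-ownership.sh:/etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh:ro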

When I do a Portainer recreate including “re-pull image”, I’m getting the error:

s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service prepare: starting
❯ Configuring npmuser ...
id: 'npmuser': no such user
s6-rc: fatal: timed out
s6-sudoc: fatal: unable to get exit status from server: Operation timed out
/run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.

I’m running on jc21/nginx-proxy-manager:2

Back to 2.9.22 “solves” the problem for now 😃

Same on an Armv7. Back to 2.9.22.

After updating from 2.9.22 to 2.10.0 on my Synology DS it failed to start:

nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)

I did a fresh new install with minimal configuration and got the error:

id: 'npmuser': no such user
s6-rc: fatal: timed out
s6-sudoc: fatal: unable to get exit status from server: Operation timed out

Rolling back to 2.9.22 fixed the issue.

2.10.0 works on my laptop (Pop OS). Synology OS has no user with ID 1000. Maybe that’s a hint.
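
As a quick sanity check of that hint, you can see whether UID/GID 1000 exist on the host before deciding on PUID/PGID. A sketch, assuming getent is available on the host (it is on most Linux distributions):

# does a user with UID 1000 exist on this host?
getent passwd 1000 || echo "no user with UID 1000"
# does a group with GID 1000 exist?
getent group 1000 || echo "no group with GID 1000"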

@befantasy I’ve been away on my honeymoon and it was great thanks for asking.

FWIW this was meant to be fixed and is fixed on all the architectures that I have access to. However, I do not have access to an OrangePi or Synology setup, which makes things very difficult.

As for the S6 scripts, they don’t have a timeout set by default or by me; that error message is incorrect and misleading. I doubt disk access is a contributing factor, but the ownership script can be heavy depending on the filesystem, so I can’t rule it out.

I’ve created a docker image that has verbose output of the s6 scripts so we can work out exactly where things are failing.

For those affected, please use this docker image and post the 10 or so lines prior to the error:

jc21/nginx-proxy-manager:github-s6-verbose
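
A minimal compose service for that debug image might look like the following sketch (ports are placeholders; the commented PUID/PGID lines are there so you can test with and without them, as asked below):

services:
  app:
    image: 'jc21/nginx-proxy-manager:github-s6-verbose'
    restart: unless-stopped
    ports:
      - '80:80'
      - '81:81'
      - '443:443'
    # environment:      # optional - try with and without these
    #   - PUID=1000
    #   - PGID=1000
    volumes:
      - ./data:/data
      - ./letsencrypt:/etc/letsencrypt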

Also mention:

  • the system: arm/arm64/amd64
  • if you’re using the PUID/PGID env vars (remember they are optional, maybe try with and without)
  • if you’re doing anything more than vanilla in the docker-compose config (docker secrets for example)

Thanks for testing everyone.

@blaine07 The env vars still work as before if they are specified, so if they work for you, keep using them 😃

Hi @jicho, I also rolled back to 2.9.22 but got this log, and the login shows a Bad Gateway. Did you get this log too?

proxy-manager-app-1 | [3/27/2023] [8:17:30 AM] [Global ] › ✖ error create table migrations (id int unsigned not null auto_increment primary key, name varchar(255), batch int, migration_time timestamp) - ER_CANT_CREATE_TABLE: Can’t create table proxy-mgr.migrations (errno: 13 “Permission denied”)

Hi @adammau2, after going back to tag 2.9.22 I had no issues at all. I can log in without any problems.

Some more info:

  • I run NPM with a SQLite db
  • I’m running NPM on a Synology NAS, but manage it (most of the time) with Portainer.

This is how to fix it: https://github.com/truenas/charts/issues/1212#issuecomment-1568518666

S6_STAGE2_HOOK=sed -i $d /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh

This will remove the last line that was added in version 2.10+, chown -R “$PUID:$PGID” /opt/certbot, which takes a long time on HDD pools.

Or you can just wait for 5-10 mins like I did…
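
If you set that hook from a docker-compose file rather than the TrueNAS UI, note that compose interpolates $, so the literal $d (sed's "delete last line" address) has to be written as $$d or the value will arrive empty. A sketch, with everything except the image and the script path being a placeholder:

services:
  app:
    image: 'jc21/nginx-proxy-manager:latest'
    environment:
      # $$ keeps a literal $ so sed receives "$d" (delete the last line of the script)
      - S6_STAGE2_HOOK=sed -i $$d /etc/s6-overlay/s6-rc.d/prepare/30-ownership.sh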

Mine won’t even start after one night. HDD pool.

Thanks for the ENV, they work 😃

I’m running Unraid, and get this if I use :latest, but if I use :2.10.2 (or now :2.10.3) it starts up fine.

You must be using an old “latest” then. Currently the tags “latest”, “2.10.3” and “v2” are the same image.

Ya, it’s a fresh install; v2 just removed my container completely since it failed. latest and 2.10.3 should be the exact same thing, but it only works if I pin the version.

Also, plain jc21/nginx-proxy-manager (the default on install) fails to start too. It only works when I specify a version. Weird.
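
One way to check whether a locally cached “latest” is stale is to pull both tags and compare digests (a sketch; it only uses the image names already mentioned in this thread):

docker pull jc21/nginx-proxy-manager:latest
docker pull jc21/nginx-proxy-manager:2.10.3
# if both tags point at the same image, the two digests printed here should match
docker image inspect --format '{{index .RepoDigests 0}}' jc21/nginx-proxy-manager:latest
docker image inspect --format '{{index .RepoDigests 0}}' jc21/nginx-proxy-manager:2.10.3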

@rymancl in fact yes I am seeing more information in your output as expected.

@Sungray you can run cert-prune inside the docker container to clean up those archived files. Just be sure to back up your letsencrypt folder first.
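
A rough example of how that could be run, assuming cert-prune is on the container’s PATH as described above (the container name npm-app-1 and the ./letsencrypt backup path are placeholders):

# back up the letsencrypt data on the host first, then prune inside the container
tar czf letsencrypt-backup.tgz ./letsencrypt
docker exec -it npm-app-1 cert-prune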

Not sure if I have the same issue, but when my PGID is different from my PUID, a cold start of version 2.10.2 doesn’t work.

docker-compose.yml

services:
  npm:
    image: jc21/nginx-proxy-manager:2.10.2
    container_name: npm
    environment:
      - PGID=999
      - PUID=1001
    volumes:
      - ./data:/data
      - ./etc/letsencrypt:/etc/letsencrypt
    ports:
      - 10080:80
      - 10081:81
      - 10443:443
    restart: always

s6-rc: info: service s6rc-oneshot-runner: starting
s6-rc: info: service s6rc-oneshot-runner successfully started
s6-rc: info: service fix-attrs: starting
s6-rc: info: service fix-attrs successfully started
s6-rc: info: service legacy-cont-init: starting
s6-rc: info: service legacy-cont-init successfully started
s6-rc: info: service prepare: starting
❯ Configuring npmuser ...
id: 'npmuser': no such user
usermod: group '999' does not exist
s6-rc: warning: unable to start service prepare: command exited 1
/run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.

I’ve been using these PGID and PUID for my other Dockers just fine and only got this issue with NPM.

My workaround is to set PGID to the same value as PUID.

    environment:
      - PGID=1001
      - PUID=1001
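
If you hit the usermod: group '999' does not exist error, a quick way to see which GIDs already exist inside the image before choosing a PGID (a sketch, assuming the image is Debian-based so sh and getent are available):

# prints the group entry for GID 999 if the image has one, otherwise reports it missing
docker run --rm --entrypoint sh jc21/nginx-proxy-manager:2.10.2 \
  -c 'getent group 999 || echo "GID 999 not present in the image"'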

My environment:

  • Raspberry Pi 3 Model B
  • Ubuntu 22.04.2 LTS
  • Docker version 23.0.5
  • Docker Compose version v2.17.3

Same here on a Synology NAS

  • if I reboot, NPM fails to restart, giving the "s6-rc: fatal: timed out" error
  • but if I manually restart it from Portainer, it works

It is a “cold boot” problem.

That’s correct, @barndawgie!

@antoinedelia I’m not using docker-compose files, I normally enter the docker command on the CLI for the initial creation. After that I do the updates in Portainer by changing the tag.

When I run in Portainer as a stack:

version: '3.8'
services:
  app:
    image: 'jc21/nginx-proxy-manager:2.10.2'
    restart: always
    ports:
      - '3080:80'
      - '3081:81'
      - '3443:443'

I first get an error; when I restart the container, everything starts up without any issues.

Just did a test for you, so I didn’t create any volume mappings.

When I go to the maintenance port I get the weirdest thing (see screenshot).

Strangest detail: on my production container, which upgraded successfully, I’m also seeing version v2.10.2…

And… when I log in to the container created with the compose above, I get this on the CLI:

[Nginx Proxy Manager ASCII art banner]
Version 2.10.2 (86ddd9c) 2023-03-30 23:54:10 UTC, OpenResty 1.21.4.1, debian 10 (buster), Certbot certbot 2.4.0
Base: debian:buster-slim, linux/amd64
Certbot: jc21/nginx-full:latest, linux/amd64
Node: jc21/nginx-full:certbot, linux/amd64

Ah… after a forced reload of the page I’m seeing 2.10.2 on the login page of my compose test. Grrr… browser cache 😃

I appreciate you taking the time to reply.

I realize this is a bumpy road with the changes that needed to be made to progress forward, but THANK YOU for all the hard work you’re putting into this for ALL of us. We appreciate you, your hard work and dedication. Thank YOU!!! 😀

I can confirm that github-uidgid works fine on an old install that worked on 2.9.22 and was failing on 2.10.0 and 2.10.1; I just had to move the mysql folder from the NPM data folder 😃

@jc21 Adding the NET_BIND_SERVICE capability and making the container privileged worked for me. I tried both individually and neither works alone, but together they work. I’m on Ubuntu Server 20.04 LTS.
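
For reference, this is roughly what that combination looks like in a compose file (a sketch of the setup described above, not an officially recommended configuration; ports are placeholders):

services:
  app:
    image: 'jc21/nginx-proxy-manager:latest'
    restart: unless-stopped
    cap_add:
      - NET_BIND_SERVICE   # allow binding ports below 1024 as a non-root user
    privileged: true        # only worked here in combination with the capability above
    ports:
      - '80:80'
      - '81:81'
      - '443:443'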

The only caveat is that I do get this error in the log (I don’t know if it matters since the service is working):

npm | 2023-03-28T15:08:28.551267368Z [3/28/2023] [3:08:28 PM] [SSL ] › ✖ error Error: Command failed: /usr/sbin/nginx -t -g “error_log off;”
npm | 2023-03-28T15:08:28.551309020Z nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
npm | 2023-03-28T15:08:28.551318330Z nginx: [emerg] open() “/etc/nginx/nginx/off” failed (13: Permission denied)
npm | 2023-03-28T15:08:28.551326980Z nginx: configuration file /etc/nginx/nginx.conf test failed
npm | 2023-03-28T15:08:28.551334216Z
npm | 2023-03-28T15:08:28.551342262Z     at ChildProcess.exithandler (node:child_process:402:12)
npm | 2023-03-28T15:08:28.551349330Z     at ChildProcess.emit (node:events:513:28)
npm | 2023-03-28T15:08:28.551356030Z     at maybeClose (node:internal/child_process:1100:16)
npm | 2023-03-28T15:08:28.551363741Z     at Process.ChildProcess._handle.onexit (node:internal/child_process:304:5)

Edit: formatting

I was able to reproduce the error (nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)) outside Synology DSM using Debian 10 in a VM which makes debugging easier (hopefully). Synology uses Kernel version 4 and so does Debian 10.

Docker install on Debian 10 (buster,oldstable)

apt install docker.io docker-compose

Follow the quick setup instructions: https://nginxproxymanager.com/guide/#quick-setup

Modified compose file:

version: '3.3'
services:
  app:
    image: 'jc21/nginx-proxy-manager:github-develop'
    restart: unless-stopped
    ports:
      - '80:80'
      - '81:81'
      - '443:443'

Run and analyze

docker-compose up -d
docker logs npm_app_1

Log

❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)
❯ Starting nginx ...
nginx: [emerg] bind() to 0.0.0.0:80 failed (13: Permission denied)

No HTTP services are available. Portainer is not needed to reproduce the error.
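
For this class of bind() to 0.0.0.0:80 failed (13: Permission denied) error, one workaround sometimes used (not confirmed in this thread) is to lower the unprivileged-port threshold for the container with a namespaced sysctl, so a non-root nginx can bind to 80/443. A sketch, assuming Docker permits net.* sysctls and the kernel is 4.11 or newer:

services:
  app:
    image: 'jc21/nginx-proxy-manager:github-develop'
    restart: unless-stopped
    sysctls:
      # let unprivileged processes in this container bind to ports below 1024
      - net.ipv4.ip_unprivileged_port_start=0
    ports:
      - '80:80'
      - '81:81'
      - '443:443'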

Ditto. 2.10.0 has the error “‘npmuser’: no such user” and will not start. Switch back to 2.9.22, and everything works.
Host Kernel: Linux 5.19.9-Unraid x86_64

Same issue on Ubuntu. Confirmed rollback works fine.

Hi, same issue here. Rolling back to 2.9.22 did the job for now…

Can confirm this issue on Synology for me. Rolling back to 2.9.22 worked.