podman: Container doesn't start after a system reboot

/kind bug

Description

Podman does not start a container after the system is rebooted. This is on a ZFS on Linux system.

Steps to reproduce the issue:

  1. A container is created with sudo podman run --name test -v /sys/fs/cgroup:/sys/fs/cgroup -p 80:80 -p 443:443 -p 5001:5001 --tmpfs /run -dti centos /sbin/init

  2. The system is rebooted

  3. Try to start the container via sudo podman start test (the full sequence is collected into the commands after this list)
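For convenience, here are the three steps above as plain shell commands (a sketch; rebooting via systemctl is just one way to reboot the host):

# Step 1: create a systemd container on the ZFS-backed storage.
sudo podman run --name test \
  -v /sys/fs/cgroup:/sys/fs/cgroup \
  -p 80:80 -p 443:443 -p 5001:5001 \
  --tmpfs /run -dti centos /sbin/init

# Step 2: reboot the host.
sudo systemctl reboot

# Step 3, after the reboot: try to start the container again (this fails).
sudo podman start test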

Describe the results you received:

When I try to start the container again, I receive the following error:

unable to start container "test": container create failed: container_linux.go:344: starting container process caused "exec: \"/sbin/init\": stat /sbin/init: no such file or directory": internal libpod error

Running debug-level logging with sudo podman --log-level debug start test:

DEBU[0000] Initializing boltdb state at /var/lib/containers/storage/libpod/bolt_state.db
DEBU[0000] Using graph driver zfs
DEBU[0000] Using graph root /var/lib/containers/storage
DEBU[0000] Using run root /var/run/containers/storage
DEBU[0000] Using static dir /var/lib/containers/storage/libpod
DEBU[0000] Using tmp dir /var/run/libpod
DEBU[0000] Set libpod namespace to ""
DEBU[0000] [graphdriver] trying provided driver "zfs"
DEBU[0000] [zfs] zfs get -rHp -t filesystem all containers/podman/storage
INFO[0000] Found CNI network podman (type=bridge) at /etc/cni/net.d/87-podman-bridge.conflist
DEBU[0000] Made network namespace at /var/run/netns/cni-e4dc6d29-4077-4615-5e34-f3bda7aa82fd for container 70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9
INFO[0000] Got pod network &{Name:test Namespace:test ID:70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9 NetNS:/var/run/netns/cni-e4dc6d29-4077-4615-5e34-f3bda7aa82fd PortMappings:[{HostPort:80 ContainerPort:80 Protocol:tcp HostIP:} {HostPort:443 ContainerPort:443 Protocol:tcp HostIP:} {HostPort:5001 ContainerPort:5001 Protocol:tcp HostIP:}] Networks:[] NetworkConfig:map[]}
INFO[0000] About to add CNI network cni-loopback (type=loopback)
INFO[0000] Got pod network &{Name:test Namespace:test ID:70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9 NetNS:/var/run/netns/cni-e4dc6d29-4077-4615-5e34-f3bda7aa82fd PortMappings:[{HostPort:80 ContainerPort:80 Protocol:tcp HostIP:} {HostPort:443 ContainerPort:443 Protocol:tcp HostIP:} {HostPort:5001 ContainerPort:5001 Protocol:tcp HostIP:}] Networks:[] NetworkConfig:map[]}
INFO[0000] About to add CNI network podman (type=bridge)
DEBU[0000] mounted container "70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9" at "/var/lib/containers/storage/zfs/graph/56b635ed4657a9202edd3e2ed29edc5a2ed026edc31dd1d8b7e4dbe80cb28ceb"
DEBU[0000] Created root filesystem for container 70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9 at /var/lib/containers/storage/zfs/graph/56b635ed4657a9202edd3e2ed29edc5a2ed026edc31dd1d8b7e4dbe80cb28ceb
DEBU[0000] [0] CNI result: Interfaces:[{Name:cni0 Mac:4a:79:e0:38:5a:7e Sandbox:} {Name:veth3e16c7ea Mac:8e:2b:b7:c7:17:11 Sandbox:} {Name:eth0 Mac:32:3c:14:21:01:2c Sandbox:/var/run/netns/cni-e4dc6d29-4077-4615-5e34-f3bda7aa82fd}], IP:[{Version:4 Interface:0xc00027b100 Address:{IP:10.88.0.112 Mask:ffff0000} Gateway:10.88.0.1}], Routes:[{Dst:{IP:0.0.0.0 Mask:00000000} GW:<nil>}], DNS:{Nameservers:[] Domain: Search:[] Options:[]}
DEBU[0000] /etc/system-fips does not exist on host, not mounting FIPS mode secret
DEBU[0000] parsed reference into "[zfs@/var/lib/containers/storage+/var/run/containers/storage]@1e1148e4cc2c148c6890a18e3b2d2dde41a6745ceb4e5fe94a923d811bf82ddb"
DEBU[0000] parsed reference into "[zfs@/var/lib/containers/storage+/var/run/containers/storage]@1e1148e4cc2c148c6890a18e3b2d2dde41a6745ceb4e5fe94a923d811bf82ddb"
DEBU[0000] exporting opaque data as blob "sha256:1e1148e4cc2c148c6890a18e3b2d2dde41a6745ceb4e5fe94a923d811bf82ddb"
DEBU[0000] parsed reference into "[zfs@/var/lib/containers/storage+/var/run/containers/storage]@1e1148e4cc2c148c6890a18e3b2d2dde41a6745ceb4e5fe94a923d811bf82ddb"
DEBU[0000] exporting opaque data as blob "sha256:1e1148e4cc2c148c6890a18e3b2d2dde41a6745ceb4e5fe94a923d811bf82ddb"
DEBU[0000] parsed reference into "[zfs@/var/lib/containers/storage+/var/run/containers/storage]@1e1148e4cc2c148c6890a18e3b2d2dde41a6745ceb4e5fe94a923d811bf82ddb"
DEBU[0000] Setting CGroups for container 70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9 to machine.slice:libpod:70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9
WARN[0000] failed to parse language "en_CA.utf8": language: tag is not well-formed
DEBU[0000] reading hooks from /usr/share/containers/oci/hooks.d
DEBU[0000] reading hooks from /etc/containers/oci/hooks.d
DEBU[0000] Created OCI spec for container 70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9 at /var/lib/containers/storage/zfs-containers/70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9/userdata/config.json
DEBU[0000] /usr/libexec/podman/conmon messages will be logged to syslog
DEBU[0000] running conmon: /usr/libexec/podman/conmon args=[-s -c 70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9 -u 70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9 -r /usr/bin/runc -b /var/lib/containers/storage/zfs-containers/70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9/userdata -p /var/run/containers/storage/zfs-containers/70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9/userdata/pidfile -l /var/lib/containers/storage/zfs-containers/70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9/userdata/ctr.log --exit-dir /var/run/libpod/exits --exit-command /usr/bin/podman --exit-command-arg --root --exit-command-arg /var/lib/containers/storage --exit-command-arg --runroot --exit-command-arg /var/run/containers/storage --exit-command-arg --log-level --exit-command-arg error --exit-command-arg --cgroup-manager --exit-command-arg systemd --exit-command-arg --tmpdir --exit-command-arg /var/run/libpod --exit-command-arg --storage-driver --exit-command-arg zfs --exit-command-arg container --exit-command-arg cleanup --exit-command-arg 70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9 --socket-dir-path /var/run/libpod/socket -t --log-level debug --syslog]
INFO[0000] Running conmon under slice machine.slice and unitName libpod-conmon-70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9.scope
DEBU[0000] Received container pid: -1
DEBU[0000] Cleaning up container 70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9
DEBU[0000] Tearing down network namespace at /var/run/netns/cni-e4dc6d29-4077-4615-5e34-f3bda7aa82fd for container 70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9
INFO[0000] Got pod network &{Name:test Namespace:test ID:70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9 NetNS:/var/run/netns/cni-e4dc6d29-4077-4615-5e34-f3bda7aa82fd PortMappings:[{HostPort:80 ContainerPort:80 Protocol:tcp HostIP:} {HostPort:443 ContainerPort:443 Protocol:tcp HostIP:} {HostPort:5001 ContainerPort:5001 Protocol:tcp HostIP:}] Networks:[] NetworkConfig:map[]}
INFO[0000] About to del CNI network podman (type=bridge)
DEBU[0000] unmounted container "70a2ed4ade73b6c66cb6f7f4480afd4b1ed5bc178e09ba832848022617b7d4f9"
ERRO[0000] unable to start container "test": container create failed: container_linux.go:344: starting container process caused "exec: \"/sbin/init\": stat /sbin/init: no such file or directory": internal libpod error

It appears that the container's ZFS dataset is not mounted when podman start runs after a reboot.

sudo podman inspect test | grep -i mountpoint -B 1
"Dataset": "containers/podman/storage/56b635ed4657a9202edd3e2ed29edc5a2ed026edc31dd1d8b7e4dbe80cb28ceb",
"Mountpoint": "/var/lib/containers/storage/zfs/graph/56b635ed4657a9202edd3e2ed29edc5a2ed026edc31dd1d8b7e4dbe80cb28ceb"

Below are the contents of the mount point:

sudo ls /var/lib/containers/storage/zfs/graph/56b635ed4657a9202edd3e2ed29edc5a2ed026edc31dd1d8b7e4dbe80cb28ceb
dev etc proc run sys tmp var
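A quick way to confirm the mount state from the host (a sketch; the parent dataset name containers/podman/storage comes from the podman info output below):

# Show each container dataset under the parent and whether ZFS reports it mounted.
zfs list -r -o name,mountpoint,mounted containers/podman/storage

# Cross-check against the kernel's mount table.
findmnt -t zfs | grep containers/podman/storage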

However, if I mount the dataset manually, everything works correctly afterwards:

sudo mount -t zfs containers/podman/storage/56b635ed4657a9202edd3e2ed29edc5a2ed026edc31dd1d8b7e4dbe80cb28ceb /var/lib/containers/storage/zfs/graph/56b635ed4657a9202edd3e2ed29edc5a2ed026edc31dd1d8b7e4dbe80cb28ceb

sudo podman start test
test
sudo podman exec test cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
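As a hypothetical stopgap, the manual fix above can be generalized by parsing the Dataset/Mountpoint pair out of the inspect output (a sketch only, not how podman itself manages its storage; field names are taken from the inspect output shown above):

#!/bin/sh
# Look up the ZFS dataset and mountpoint podman recorded for a container,
# remount the dataset if the kernel shows nothing mounted there, then start it.
ctr=test
dataset=$(sudo podman inspect "$ctr" | grep '"Dataset"' | head -n1 | cut -d'"' -f4)
mntpoint=$(sudo podman inspect "$ctr" | grep '"Mountpoint"' | head -n1 | cut -d'"' -f4)

if ! findmnt "$mntpoint" >/dev/null 2>&1; then
    sudo mount -t zfs "$dataset" "$mntpoint"
fi
sudo podman start "$ctr"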

Describe the results you expected:

Podman should mount the dataset and start the container as expected.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

Using the latest master (January 13):
podman version 0.12.2-dev

Output of podman info:

host:
  BuildahVersion: 1.6-dev
  Conmon:
    package: podman-0.12.1.2-1.git9551f6b.fc29.x86_64
    path: /usr/libexec/podman/conmon
    version: 'conmon version 1.14.0-dev, commit: 9b1f0a08285a7f74b21cc9b6bfd98a48905a7ba2'
  Distribution:
    distribution: fedora
    version: "29"
  MemFree: 65933111296
  MemTotal: 67545485312
  OCIRuntime:
    package: runc-1.0.0-66.dev.gitbbb17ef.fc29.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc6+dev
      commit: ead425507b6ba28278ef71ad06582df97f2d5b5f
      spec: 1.0.1-dev
  SwapFree: 0
  SwapTotal: 0
  arch: amd64
  cpus: 16
  hostname: gitlab.fusion.local
  kernel: 4.19.10-300.fc29.x86_64
  os: linux
  rootless: false
  uptime: 13m 47.36s
insecure registries:
  registries: []
registries:
  registries:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  ConfigFile: /etc/containers/storage.conf
  ContainerStore:
    number: 2
  GraphDriverName: zfs
  GraphOptions: null
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Compression: lz4
    Parent Dataset: containers/podman/storage
    Parent Quota: "no"
    Space Available: "442634854400"
    Space Used By Parent: "13794312192"
    Zpool: containers
    Zpool Health: ONLINE
  ImageStore:
    number: 8
  RunRoot: /var/run/containers/storage

Additional environment details (AWS, VirtualBox, physical, etc.): Fedora 29 Server on a physical host; SELinux is in permissive mode.

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 35 (19 by maintainers)

Most upvoted comments

Looks like that fixed it! I am able to start the container without any issues after a reboot! Shall I close the issue or wait until it is merged?