podman: Podman hangs with deadlock on /var/lib/containers/storage/storage.lock

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug

/kind feature

Description

Got podman in a locked state:

[heat-admin@openstack-controller-0 ~]$ ps aux | grep podman
root       29708  0.0  0.1 480508 15104 ?        Sl   08:47   0:00 /usr/bin/podman run --user root --name docker-puppet-horizon --env PUPPET_TAGS=file,file_line,concat,augeas,cron,horizon_config --env NAME=horizon --env HOSTNAME=openstack-controller-0 --env NO_ARCHIVE= --env STEP=6 --volume /etc/localtime:/etc/localtime:ro --volume /tmp/tmpYRgU6v:/etc/config.pp:ro,z --volume /etc/puppet/:/tmp/puppet-etc/:ro,z --volume /usr/share/openstack-puppet/modules/:/usr/share/openstack-puppet/modules/:ro --volume /var/lib/config-data:/var/lib/config-data/rw,z --volume /dev/log:/dev/log:rw --volume /etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro --volume /etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro --volume /etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro --volume /etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro --volume /var/lib/docker-puppet/docker-puppet.sh:/var/lib/docker-puppet/docker-puppet.sh:rw,z --entrypoint /var/lib/docker-puppet/docker-puppet.sh --net host --volume /etc/hosts:/etc/hosts:ro 192.168.24.1:8787/tripleomaster/centos-binary-horizon:current-tripleo
root       30967  0.7  0.1 341240 15840 ?        Sl   08:47   0:33 /usr/bin/podman run --user root --name docker-puppet-cinder --env PUPPET_TAGS=file,file_line,concat,augeas,cron,cinder_config,file,concat,file_line,cinder_config,file,concat,file_line,cinder_config,file,concat,file_line --env NAME=cinder --env HOSTNAME=openstack-controller-0 --env NO_ARCHIVE= --env STEP=6 --volume /etc/localtime:/etc/localtime:ro --volume /tmp/tmpFMcsQy:/etc/config.pp:ro,z --volume /etc/puppet/:/tmp/puppet-etc/:ro,z --volume /usr/share/openstack-puppet/modules/:/usr/share/openstack-puppet/modules/:ro --volume /var/lib/config-data:/var/lib/config-data/rw,z --volume /dev/log:/dev/log:rw --volume /etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro --volume /etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro --volume /etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro --volume /etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro --volume /var/lib/docker-puppet/docker-puppet.sh:/var/lib/docker-puppet/docker-puppet.sh:rw,z --entrypoint /var/lib/docker-puppet/docker-puppet.sh --net host --volume /etc/hosts:/etc/hosts:ro 192.168.24.1:8787/tripleomaster/centos-binary-cinder-api:current-tripleo
root       31014  0.0  0.1 331732 11892 ?        Sl   08:47   0:00 /usr/bin/podman inspect 192.168.24.1:8787/tripleomaster/centos-binary-cron:current-tripleo
heat-ad+   50808  0.0  0.0 112708   968 pts/0    S+   09:59   0:00 grep --color=auto podman

strace shows it’s waiting on the lock at “/var/lib/containers/storage/storage.lock”:

[heat-admin@openstack-controller-0 ~]$ sudo strace podman ps
[...]
stat("/var/run/containers/storage", {st_mode=S_IFDIR|0700, st_size=80, ...}) = 0
stat("/var/lib/containers/storage", {st_mode=S_IFDIR|0700, st_size=150, ...}) = 0
stat("/var/lib/containers/storage/mounts", {st_mode=S_IFDIR|0700, st_size=6, ...}) = 0
stat("/var/lib/containers/storage/tmp", {st_mode=S_IFDIR|0700, st_size=6, ...}) = 0
stat("/var/lib/containers/storage/overlay", {st_mode=S_IFDIR|0700, st_size=12288, ...}) = 0
openat(AT_FDCWD, "/var/lib/containers/storage/storage.lock", O_RDWR|O_CREAT, 0600) = 5
fcntl(5, F_SETFD, FD_CLOEXEC)           = 0
getrandom("\272\267\26?\301\310GUa\36L\255+\24I\\\251\235\2333\273\367\227\313-\326K\327\233\371\252\244", 32, 0) = 32
getpid()                                = 51161
fcntl(5, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}
^Cstrace: Process 51161 detached
 <detached ...>

[heat-admin@openstack-controller-0 ~]$ sudo lsof /var/lib/containers/storage/storage.lock
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF      NODE NAME
podman  29708 root    4u   REG  252,2       64 117440877 /var/lib/containers/storage/storage.lock
podman  30967 root    4u   REG  252,2       64 117440877 /var/lib/containers/storage/storage.lock
podman  31014 root    4uW  REG  252,2       64 117440877 /var/lib/containers/storage/storage.lock
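
In the lsof output above, the `W` after the file descriptor of PID 31014 marks a write lock on the whole file, which is what the `F_SETLKW` call in the strace is blocked on. The lock can also be probed without hanging by asking for it in non-blocking mode. The following is a minimal sketch in Python, not something podman ships; the helper name `lock_is_held` is made up for illustration, and it assumes the lock file already exists and that the script runs with enough privileges to open it (root for the storage lock).

```python
import errno
import fcntl
import os
import sys


def lock_is_held(path):
    """Return True if another process holds a conflicting fcntl lock on path.

    Hypothetical helper for illustration. It uses a non-blocking request,
    so unlike F_SETLKW it returns immediately instead of waiting.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        try:
            # Non-blocking exclusive lock request on the whole file.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            # EACCES or EAGAIN means some other process holds the lock.
            if exc.errno in (errno.EACCES, errno.EAGAIN):
                return True
            raise
        # We got the lock, so nobody else had it; release it right away.
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Paths are taken from the command line; nothing runs without arguments.
    for lock_path in sys.argv[1:]:
        state = "held" if lock_is_held(lock_path) else "free"
        print("%s: %s" % (lock_path, state))
```

Run as root with the lock file path as the argument, for example `/var/lib/containers/storage/storage.lock`. Note that when the lock is free the probe takes it for a brief moment, so it is a debugging aid rather than something to run in a tight loop on a busy host.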

Steps to reproduce the issue:

  1. Launch several containers at the same time

  2. Wait for podman to hang
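
The two steps above can be approximated with a small harness that starts the same command several times in parallel and reports which invocations did not finish in time. This is only a sketch under assumptions: the function names `run_with_timeout` and `run_concurrently` are hypothetical, the command to run is whatever you pass on the command line, and there is no guarantee that any particular command or concurrency level triggers the hang.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_with_timeout(cmd, timeout):
    """Run cmd and return 'exit N', or 'timeout' if it did not finish in time."""
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left behind.
        return "timeout"
    return "exit %d" % proc.returncode


def run_concurrently(commands, timeout):
    """Start all commands at once and return their results in input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        return list(pool.map(lambda c: run_with_timeout(c, timeout), commands))


if __name__ == "__main__" and len(sys.argv) > 2:
    # Usage (assumed): script.py COUNT COMMAND [ARGS...]
    count = int(sys.argv[1])
    command = sys.argv[2:]
    for index, result in enumerate(run_concurrently([command] * count, timeout=60)):
        print("run %d: %s" % (index, result))
```

Any run reported as `timeout` is a candidate for the hang described here; checking it with strace, as in the description, shows whether it was waiting on the storage lock.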

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

Version:       0.6.1
Go Version:    go1.9.4
OS/Arch:       linux/amd64

Output of podman info:

I can’t provide the output of podman info because the command hangs as well.

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 18 (16 by maintainers)

Most upvoted comments

I ran my buildah torture script with concurrency=16 (!) and can no longer reproduce my lockup.

[1] http://paste.openstack.org/show/729907/