podman: Podman dies during interactive session exec due to disconnect
/kind bug
Description
When a shell script is running in an interactive terminal session started with podman exec, the container dies if the SSH session is disconnected.
Steps to reproduce the issue (this is just one example; a scripted version follows the list):
- Connect to your remote host via ssh and run bash in the container, e.g. podman exec -it containerx /bin/bash
- Run a loop in that shell, e.g. while x=1; do ls; sleep 1; done
- Disconnect the ssh session
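For completeness, the steps can be scripted roughly like this (a sketch only; storage2n1-la and containerx stand in for your own host and container):

ssh -tt root@storage2n1-la \
  "podman exec -it containerx /bin/bash -c 'while x=1; do ls; sleep 1; done'" &
ssh_pid=$!
sleep 5
# Simulate the dropped connection by killing the local ssh client.
kill -9 "$ssh_pid"
sleep 2
# Expected: containerx is still listed. With this bug it is gone.
ssh root@storage2n1-la "podman ps --filter name=containerx"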
Describe the results you received:
Container dies.
May 29 07:58:26 storage2n1-la sshd[5785]: Received disconnect from 10.10.8.142 port 33360:11: disconnected by user
May 29 07:58:26 storage2n1-la sshd[5785]: Disconnected from user root 10.10.8.142 port 33360
May 29 07:58:26 storage2n1-la sshd[5785]: pam_unix(sshd:session): session closed for user root
May 29 07:58:26 storage2n1-la ceph-mon-storage2n1-la[6614]: teardown: managing teardown after SIGCHLD
May 29 07:58:26 storage2n1-la ceph-mon-storage2n1-la[6614]: teardown: Sending SIGTERM to PID 173
May 29 07:58:26 storage2n1-la ceph-mon-storage2n1-la[6614]: debug 2019-05-29 07:58:26.956 7f393d287700 -1 received signal: Terminated from (PID: 259) UID: 0
May 29 07:58:26 storage2n1-la ceph-mon-storage2n1-la[6614]: debug 2019-05-29 07:58:26.956 7f393d287700 -1 mon.storage2n1-la@0(leader) e1 *** Got Signal Terminated ***
May 29 07:58:26 storage2n1-la systemd-logind[1256]: Removed session 524.
-- Subject: Session 524 has been terminated
Describe the results you expected: Container stays alive and the bash session dies. This works if I just do exec /bin/bash and don’t run a loop in that shell.
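Concretely, after the disconnect a check like this should still succeed (a sketch; containerx is a placeholder):

# Expected: the container is still up and accepts a new exec.
podman ps --filter name=containerx --format '{{.Status}}'
podman exec containerx echo "container still accepts execs"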
Additional information you deem important (e.g. issue happens only occasionally): There is an additional problem: the container dies uncleanly, and systemd cannot restart it unless I do podman rm --force <container name>.
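For reference, the only recovery that works for me looks roughly like this (placeholders in angle brackets, as above):

# Neither podman stop/start nor a plain systemctl restart recovers it;
# only a forced removal followed by a unit restart does.
podman rm --force <container name>
systemctl restart <container unit>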
Output of podman version:
Version: 1.3.2-dev
RemoteAPI Version: 1
Go Version: go1.10.4
OS/Arch: linux/amd64
Output of podman info --debug:
debug:
  compiler: gc
  git commit: ""
  go version: go1.10.4
  podman version: 1.3.2-dev
host:
  BuildahVersion: 1.9.0-dev
  Conmon:
    package: 'conmon: /usr/libexec/crio/conmon'
    path: /usr/libexec/crio/conmon
    version: 'conmon version , commit: '
  Distribution:
    distribution: ubuntu
    version: "18.04"
  MemFree: 117319438336
  MemTotal: 134647275520
  OCIRuntime:
    package: 'cri-o-runc: /usr/bin/runc'
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 6446641152
  SwapTotal: 6446641152
  arch: amd64
  cpus: 40
  hostname: storage2n1-la
  kernel: 4.18.0-20-generic
  os: linux
  rootless: false
  uptime: 163h 34m 28.88s (Approximately 6.79 days)
registries:
  blocked: null
  insecure: null
  search:
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
store:
  ConfigFile: /etc/containers/storage.conf
  ContainerStore:
    number: 8
  GraphDriverName: overlay
  GraphOptions: null
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 3
  RunRoot: /var/run/containers/storage
  VolumePath: /var/lib/containers/storage/volumes
Additional environment details (AWS, VirtualBox, physical, etc.): Physical host
About this issue
- State: closed
- Created 5 years ago
- Comments: 32 (9 by maintainers)
Commits related to this issue
- cephadm: rm -f if necessary This ticket seems to suggest that (1) -f may be needed, sometimes, and (2) newer versions fix it. https://github.com/containers/libpod/issues/3226 Way back in 26f9fe54... — committed to liewegas/ceph by liewegas 4 years ago
- cephadm: avoid trigger old podman bug This ticket seems to suggest that (1) the root cause is related to an exec that is orphaned and screws up the container state (due to, e.g., ssh dropping, or a t... — committed to liewegas/ceph by liewegas 4 years ago
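In the spirit of the first commit above, a unit-level workaround might look like this (a sketch, not the actual cephadm change; containerx and myimage are illustrative):

[Service]
# Force-remove any stale container state left by an orphaned exec before
# starting; the leading '-' tells systemd to ignore failures here.
ExecStartPre=-/usr/bin/podman rm --force containerx
ExecStart=/usr/bin/podman run --rm --name containerx myimage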
@haircommander I understand that you sometimes need to clean up with -f, but I don’t understand why running a bash script in a container exec shell should cause the whole container to die when the terminal session disconnects. I believe the disconnect should kill the script itself and the exec session, but not the running container. Anyway, what would be the comparable docker or rkt behavior? I have used both and don’t recall either having the same issue.
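For comparison, the equivalent docker flow (as far as I recall) leaves the container running after the disconnect; containerx is again a placeholder:

docker exec -it containerx /bin/bash
# inside that shell:
while x=1; do ls; sleep 1; done
# drop the ssh session, then from a fresh login:
docker ps --filter name=containerx   # containerx should still be listed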