podman: minor mismatch with docker: container exitcode is 0 after checkpoint without --leave-running

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

In docker, the exitcode of a container checkpointed without the --leave-running option is 137 (the same as a stopped container). In podman, the exitcode is 0.
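
For context on the number itself: 137 is the conventional shell encoding of "terminated by signal 9 (SIGKILL)", that is, 128 + 9. The sketch below is only an illustration of that convention and is not taken from either tool's source; the helper name `exit_signal` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: decode a shell-style exit status.
# Values above 128 conventionally mean "terminated by signal (status - 128)".
exit_signal() {
    if [ "$1" -gt 128 ]; then
        echo $(( $1 - 128 ))
    else
        echo 0
    fi
}

# A child shell that sends itself SIGKILL is reported as 137 by its parent.
status=0
sh -c 'kill -KILL $$' || status=$?
echo "status=$status signal=$(exit_signal "$status")"
```

On a typical Linux system this reports status 137 and signal 9, which matches what docker shows for the checkpointed container.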

Steps to reproduce the issue:

podman:

$ sudo podman run --detach --name foo alpine sleep 240
d9f45d2433f3475d3bb71cb8e74b0f22824188e5be45b00502ef81f1c3d4332a
$ sudo podman container ls 
CONTAINER ID  IMAGE                            COMMAND    CREATED        STATUS            PORTS   NAMES
d9f45d2433f3  docker.io/library/alpine:latest  sleep 240  7 seconds ago  Up 6 seconds ago          foo
$ sudo podman container checkpoint foo
d9f45d2433f3475d3bb71cb8e74b0f22824188e5be45b00502ef81f1c3d4332a
$ sudo podman container ls --all
CONTAINER ID  IMAGE                                 COMMAND    CREATED             STATUS                           PORTS   NAMES
d9f45d2433f3  docker.io/library/alpine:latest       sleep 240  23 seconds ago      Exited (0) 5 seconds ago                 foo

docker:

$ docker info|grep -i experimental
WARNING: No swap limit support
 Experimental: true
$ docker run --detach --name foo alpine sleep 240
1456036f5da1688f7d8f534447c02645b5c317259bb4b6cd4ef6449384f296a4
$ docker container ls
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
1456036f5da1        alpine              "sleep 240"         5 seconds ago       Up 4 seconds                            foo
$ docker checkpoint create foo chk
chk
$ docker container ls --all
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                       PORTS               NAMES
1456036f5da1        alpine              "sleep 240"         18 seconds ago      Exited (137) 3 seconds ago                       foo

Describe the results you received: Container exitcode reflects a “clean” exit

Describe the results you expected: After creating a checkpoint that also stops the container (that is, without --leave-running), the container exitcode should reflect the stop, as it does in docker (137)

Output of podman version:

Version:      2.1.1
API Version:  2.0.0
Go Version:   go1.15.2
Built:        Wed Dec 31 19:00:00 1969
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.16.1
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: 'conmon: /usr/libexec/podman/conmon'
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.20, commit: '
  cpus: 8
  distribution:
    distribution: linuxmint
    version: "19.1"
  eventLogger: journald
  hostname: X
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 4.15.0-123-generic
  linkmode: dynamic
  memFree: 307376128
  memTotal: 33605021696
  ociRuntime:
    name: runc
    package: 'containerd.io: /usr/bin/runc'
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc8+dev
      commit: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
      spec: 1.0.1-dev
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: 'slirp4netns: /usr/bin/slirp4netns'
    version: |-
      slirp4netns version 1.1.4
      commit: unknown
      libslirp: 4.3.1-git
      SLIRP_CONFIG_VERSION_MAX: 3
  swapFree: 0
  swapTotal: 0
  uptime: 162h 2m 51.62s (Approximately 6.75 days)
registries:
  search:
  - docker.io
  - quay.io
  - registry.access.redhat.com
store:
  configFile: /home/X/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: vfs
  graphOptions: {}
  graphRoot: /home/X/.local/share/containers/storage
  graphStatus: {}
  imageStore:
    number: 14
  runRoot: /run/user/1000/containers
  volumePath: /home/X/.local/share/containers/storage/volumes
version:
  APIVersion: 2.0.0
  Built: 0
  BuiltTime: Wed Dec 31 19:00:00 1969
  GitCommit: ""
  GoVersion: go1.15.2
  OsArch: linux/amd64
  Version: 2.1.1

Package info (e.g. output of rpm -q podman or apt list podman):

$ apt list podman
podman/unknown,now 2.1.1~2 amd64 [installed]
$ apt-cache madison podman
    podman |    2.1.1~2 | http://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/xUbuntu_18.04  Packages

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

Yes

Additional environment details (AWS, VirtualBox, physical, etc.): physical machine running Linux Mint 19.1 (Ubuntu bionic base) with a kernel that supports checkpointing (4.15)

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 15 (5 by maintainers)

Most upvoted comments

Just saw this issue. As the author of the checkpoint/restore code, I do not think that this is a real problem.

The goal of the checkpoint/restore functionality was never to be compatible with docker. In Podman we are using different commands and different parameters. I do not see any benefit in having a status of Exited (137) 3 seconds ago instead of Exited (0) 5 seconds ago. Neither choice is really helpful, and neither gives the user any idea that the container has been checkpointed.

I think Exited (137) 3 seconds ago is even worse as it sounds like a problem has happened.

The correct solution, from my point of view, would be to have a state with something like Checkpointed(0). But currently we are not tracking a checkpointed state at all.

During restore we check if the container is stopped and if it has a checkpoint/ directory. The only way a user can know that a container is checkpointed is by looking at the userdata/ directory to see if there is a checkpoint, or by trying to restore it and seeing that it does not fail.
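
The manual check described above could be sketched roughly as follows. This is only an illustration: the helper name `has_checkpoint` is made up, and the real location of a container's userdata/ directory depends on the storage configuration and is not given here, so the example runs against a throwaway directory instead.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given container userdata directory
# contains a checkpoint/ subdirectory.
has_checkpoint() {
    [ -d "$1/checkpoint" ]
}

# Demonstration with a temporary directory standing in for userdata/
# (an assumption; substitute the real path for an actual container).
demo=$(mktemp -d)
mkdir "$demo/checkpoint"
if has_checkpoint "$demo"; then
    echo "checkpoint present"
else
    echo "no checkpoint"
fi
rm -rf "$demo"
```

A check like this only shows that a checkpoint directory exists. It says nothing about whether the container is stopped, which is the other condition tested during restore.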