podman unable to limit memory (-m flag) on Ubuntu/Debian distros

/kind bug

Description

I was trying to test drive the HashiCorp Nomad podman driver on Debian/Ubuntu machines and ran into an interesting issue. Deploying containers via Nomad failed, yet running podman run <image> directly worked fine.

The error I was getting was "failed to write -1 to /sys/fs/cgroup/memory/machine.slice/libpod-2e61c7b46bc2aeed6dadecb07583e97e03ffcc694.scope/memory.memsw.limit_in_bytes".

I looked into that libpod-*.scope cgroup directory and it's missing the memory.memsw.limit_in_bytes resource control file.

I then deployed to a CentOS 8 machine and it worked just fine. I followed up by checking whether that file was present, and sure enough it was.

Due to this difference in the cgroup control files, podman, and thus Nomad, is unable to limit memory on the Debian/Ubuntu distros.

Steps to reproduce the issue:

  1. sudo podman run -m=40m hello-world (on Ubuntu 16.04 through 20.04, or Debian 9/10)

Describe the results you received:

Your kernel does not support swap limit capabilities,or the cgroup is not mounted. Memory limited without swap. Error: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:415: setting cgroup config for procHooks process caused \\\"failed to write \\\\\\\"-1\\\\\\\" to \\\\\\\"/sys/fs/cgroup/memory/machine.slice/libpod-8639bb6811f6bec5282ee72dac48abb6a647ce7daf84dd000b195dc4fe6f4df7.scope/memory.memsw.limit_in_bytes\\\\\\\": open /sys/fs/cgroup/memory/machine.slice/libpod-8639bb6811f6bec5282ee72dac48abb6a647ce7daf84dd000b195dc4fe6f4df7.scope/memory.memsw.limit_in_bytes: permission denied\\\"\"": OCI runtime permission denied error

Describe the results you expected:

Output of hello-world container/successful job deployment status on Nomad

Additional information you deem important (e.g. issue happens only occasionally):

I’ve tried multiple versions of podman and checked every Ubuntu release from 16.04 onward; the same issue appears in all of them, so it seems to be a divergence in the cgroup resource-control interfaces between the RPM and Deb distros.

And I want to repeat: running podman without the -m flag WORKS, both rootless and as root.

Output of podman version:

Version:            1.9.2
RemoteAPI Version:  1
Go Version:         go1.10.1
OS/Arch:            linux/amd64

Output of podman info --debug:

debug:
  compiler: gc
  gitCommit: ""
  goVersion: go1.10.1
  podmanVersion: 1.9.2
host:
  arch: amd64
  buildahVersion: 1.14.8
  cgroupVersion: v1
  conmon:
    package: 'conmon: /usr/libexec/podman/conmon'
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.16, commit: '
  cpus: 1
  distribution:
    distribution: ubuntu
    version: "18.04"
  eventLogger: file
  hostname: nomad-server01
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
  kernel: 4.15.0-101-generic
  memFree: 137125888
  memTotal: 1033011200
  ociRuntime:
    name: runc
    package: 'runc: /usr/sbin/runc'
    path: /usr/sbin/runc
    version: 'runc version spec: 1.0.1-dev'
  os: linux
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: 'slirp4netns: /usr/bin/slirp4netns'
    version: |-
      slirp4netns version 0.4.3
      commit: unknown
  swapFree: 0
  swapTotal: 0
  uptime: 11h 50m 16.22s (Approximately 0.46 days)
registries:
  search:
  - docker.io
  - quay.io
store:
  configFile: /home/vagrant/.config/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 0
    stopped: 2
  graphDriverName: vfs
  graphOptions: {}
  graphRoot: /home/vagrant/.local/share/containers/storage
  graphStatus: {}
  imageStore:
    number: 1
  runRoot: /run/user/1000/containers
  volumePath: /home/vagrant/.local/share/containers/storage/volumes

Package info (e.g. output of rpm -q podman or apt list podman):

podman/unknown,now 1.9.2~3 amd64 [installed]

Additional environment details (AWS, VirtualBox, physical, etc.):

VirtualBox, GCP compute instances

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 45 (20 by maintainers)

Most upvoted comments

Opened a PR to add the same check on cgroup v2: https://github.com/containers/podman/pull/8197

Did you try adding swapaccount=1 and cgroup_enable=memory as kernel boot parameters? (Probably by modifying the GRUB configuration.)

I think I remember having the same problem a while ago and fixing it this way.
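
For anyone else hitting this, here is a minimal sketch of that change, assuming a Debian/Ubuntu-style GRUB setup (the file location and the update-grub step are assumptions and vary by distro):

# /etc/default/grub: append to the existing kernel command line
GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"

# regenerate the bootloader config and reboot
sudo update-grub
sudo reboot

After rebooting, the memory.memsw.* control files should appear under /sys/fs/cgroup/memory/.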

Not yet - fell through the cracks (I forgot to self-assign so I didn’t see it when I reviewed open issues - oops). I’ll try and get to it tomorrow.

I’ll try and get to this one this week

@mheon Awesome. When swap limiting is disabled, the cgroup resource control file for the combined swap+memory limit, memory.memsw.limit_in_bytes, is not present. The code provided by @afbjorklund checks for the presence of that control file as a condition, i.e.:

if runtime.GOOS == "linux" {
        // memory.memsw.limit_in_bytes only exists when swap accounting is enabled
        if _, err := os.Stat("/sys/fs/cgroup/memory/memory.memsw.limit_in_bytes"); os.IsNotExist(err) {
                // swap limiting unsupported: skip the swap limit instead of failing
        }
}

For the fix, we would need to switch from using memory.memsw.limit_in_bytes to memory.limit_in_bytes when swap accounting is unavailable.
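
A minimal sketch of that fallback, using a hypothetical applyMemoryLimit helper to stand in for the real podman/runc code path (cgroup v1 paths assumed):

package cgroupmem

import (
	"os"
	"path/filepath"
	"strconv"
)

// applyMemoryLimit illustrates the proposed fallback: always set
// memory.limit_in_bytes, and only write memory.memsw.limit_in_bytes
// when the kernel actually exposes it.
func applyMemoryLimit(cgroupDir string, limitBytes int64) error {
	val := []byte(strconv.FormatInt(limitBytes, 10))

	// Lower the plain memory limit first; the kernel requires
	// memsw >= memory at all times, so this order always works.
	if err := os.WriteFile(filepath.Join(cgroupDir, "memory.limit_in_bytes"), val, 0o644); err != nil {
		return err
	}

	memsw := filepath.Join(cgroupDir, "memory.memsw.limit_in_bytes")
	if _, err := os.Stat(memsw); os.IsNotExist(err) {
		// Swap accounting is disabled (the Debian/Ubuntu default):
		// skip the combined mem+swap limit instead of failing.
		return nil
	}
	return os.WriteFile(memsw, val, 0o644)
}

With that guard in place, podman could keep printing the existing "Memory limited without swap" warning and continue, instead of erroring out.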