youki: rust checkpoint integration tests fail on kernel version 5.11

On my system the checkpointing tests fail, even though they pass in CI. This might be because the kernel was not compiled with checkpointing support, or because of some other issue. If it is a support issue, the tests should be made conditional on a check for that support. In any case, the error message should include the name of the file it is trying to read, so the error is easier to understand when encountered.
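
As a minimal sketch of the kind of error message requested above, the helper below attaches the path to an I/O error while preserving its `ErrorKind`. The name `read_with_path` is hypothetical and is not part of youki or rust-criu; it only illustrates the idea using the standard library.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Reads a file and, on failure, returns an error whose message names the path.
/// Hypothetical helper for illustration; not an existing youki or rust-criu API.
fn read_with_path(path: &Path) -> io::Result<Vec<u8>> {
    fs::read(path).map_err(|err| {
        // Keep the original ErrorKind so callers can still match on it,
        // but add the path so the reader knows which file was missing.
        io::Error::new(
            err.kind(),
            format!("failed to read {}: {}", path.display(), err),
        )
    })
}
```

With a wrapper like this, a missing file is reported together with its path instead of only the bare `NotFound` error.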

Update : Modify test to copy the log file in case of failure

Error message :

3 / 7 : checkpoint and leave running with --work-path /tmp : not ok
        Error :
stdout :
stderr : Error: failed to checkpoint container 558aea4b-ece9-a4a3-eada-40afee9f28ca

Caused by:
    checkpointing container 558aea4b-ece9-a4a3-eada-40afee9f28ca failed with Os { code: 2, kind: NotFound, message: "No such file or directory" }. Please check CRIU logfile /tmp//dump.log

4 / 7 : checkpoint and leave running : not ok
        Error :
stdout :
stderr : Error: failed to checkpoint container 558aea4b-ece9-a4a3-eada-40afee9f28ca

Caused by:
    checkpointing container 558aea4b-ece9-a4a3-eada-40afee9f28ca failed with Os { code: 2, kind: NotFound, message: "No such file or directory" }. Please check CRIU logfile /tmp/fef46b54-651d-49b8-ef89-bca5dd998f8a/checkpoint/dump.log

System info :

Version           0.0.2
Commit            f66d389
Kernel-Release    5.11.0-49-generic
Kernel-Version    #55-Ubuntu SMP Wed Jan 12 17:36:34 UTC 2022
Architecture      x86_64
Operating System  Ubuntu 21.04
Cores             8
Total Memory      7859
Cgroup setup      hybrid
Cgroup mounts
  blkio           /sys/fs/cgroup/blkio
  cpu             /sys/fs/cgroup/cpu,cpuacct
  cpuacct         /sys/fs/cgroup/cpu,cpuacct
  cpuset          /sys/fs/cgroup/cpuset
  devices         /sys/fs/cgroup/devices
  freezer         /sys/fs/cgroup/freezer
  hugetlb         /sys/fs/cgroup/hugetlb
  memory          /sys/fs/cgroup/memory
  net_cls         /sys/fs/cgroup/net_cls,net_prio
  net_prio        /sys/fs/cgroup/net_cls,net_prio
  perf_event      /sys/fs/cgroup/perf_event
  pids            /sys/fs/cgroup/pids
  unified         /sys/fs/cgroup/unified
CGroup v2 controllers
  cpu             detached
  cpuset          detached
  hugetlb         detached
  io              detached
  memory          detached
  pids            detached
  device          attached
Namespaces        enabled
  mount           enabled
  uts             enabled
  ipc             enabled
  user            enabled
  pid             enabled
  network         enabled

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 25 (14 by maintainers)

Most upvoted comments

Closing, as the main issue was not really related to youki itself, more of a dependency/setup issue. And yes, maybe we should walk away this time 😄

I was thinking the same thing. Along with the missing-binary error, other errors are also reported as-is and could benefit from additional context, especially when trying to locate exactly where the error occurs. I have opened https://github.com/checkpoint-restore/rust-criu/issues/1 ; if that is fine, I will work on adding context messages to the errors that are propagated from the library to its users.

Maybe we should provide a better error message if the binary is missing at some level of the stack.
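
One way such a check could look, as a sketch only: search the directories of a `PATH`-style variable for the binary before using it, and fail with a message that names it. Both `find_in_path` and `require_binary` are hypothetical names, and the assumption that the binary is located by name via `PATH` is mine, not something confirmed in this thread.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Searches the directories in `path_var` for a regular file called `name`.
/// Note: this does not check the executable bit, only that the file exists.
fn find_in_path(name: &str, path_var: &OsStr) -> Option<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(name))
        .find(|candidate| candidate.is_file())
}

/// Returns the binary's location, or an error message that names the binary.
fn require_binary(name: &str, path_var: &OsStr) -> Result<PathBuf, String> {
    find_in_path(name, path_var)
        .ok_or_else(|| format!("required binary `{}` was not found in PATH", name))
}
```

A caller would typically pass the value of `std::env::var_os("PATH")` as `path_var` and surface the returned message to the user instead of a bare "No such file or directory".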

Hey, so my distribution had 1.0.3, which is why the issue was happening. I have tested on 1.1.0 and it works perfectly.

Thanks for all the help, and apologies for the confusion caused just because of my setup issues 😓 😓

If @utam0k's issue was due to the same reason, I will open a PR to update the README to mention these requirements, and then we can close this issue.

Thanks 😃

It is strange as it works in CI and on my system.

The error that is returned is "file not found", so it appears that the library is attempting to read or write a file that does not exist. As rust-criu is an external crate to the project, I cannot trace execution in it with printlns or other means, as cargo does not trigger recompilation of external crates.

What I did during development was to make youki point to a local copy of the rust-criu crate, using a file path instead of a URL to GitHub. That way I was able to insert print statements.
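
A sketch of what that could look like in `Cargo.toml`, assuming the dependency key is `rust-criu` and that the crate has been cloned next to the youki checkout. The path `../rust-criu` is a placeholder, not the actual layout used.

```toml
[dependencies]
# Assumed local checkout of https://github.com/checkpoint-restore/rust-criu;
# adjust the path to wherever the clone actually lives.
rust-criu = { path = "../rust-criu" }
```

Because a path dependency is rebuilt when its sources change, print statements added to the local copy take effect on the next build.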

Also important, rust-criu is taken from https://github.com/checkpoint-restore/rust-criu

What is the value of /proc/sys/kernel/yama/ptrace_scope? That should be 0.
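
For reference, a small sketch of how a test precondition could check this value. The function names are hypothetical; the only facts taken from this thread are the file path and the expected value of 0.

```rust
use std::fs;

/// Parses the contents of /proc/sys/kernel/yama/ptrace_scope.
/// Returns None if the contents are not a small non-negative integer.
fn parse_ptrace_scope(contents: &str) -> Option<u8> {
    contents.trim().parse().ok()
}

/// True only when the value is 0, as asked for above.
fn ptrace_scope_is_zero(contents: &str) -> bool {
    parse_ptrace_scope(contents) == Some(0)
}

/// Reads the live value; returns None if the file is missing or unreadable.
fn read_ptrace_scope() -> Option<u8> {
    fs::read_to_string("/proc/sys/kernel/yama/ptrace_scope")
        .ok()
        .and_then(|s| parse_ptrace_scope(&s))
}
```

A test could call `read_ptrace_scope()` up front and report a value other than 0 clearly, rather than failing later with a less specific error.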

Are you running the test as root?