k8s-device-plugin fails with k8s static CPU policy
1. Issue or feature description
Kubelet configured with a static CPU policy (e.g. `--cpu-manager-policy=static --kube-reserved cpu=0.1`) will cause nvidia-smi to fail after a short delay.
Configure a test pod to request a `nvidia.com/gpu` resource and run a simple nvidia-smi command as `sleep 30; nvidia-smi`; this always fails with: "Failed to initialize NVML: Unknown Error".
Running the same command without the sleep works, and nvidia-smi returns the expected info.
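The kubelet side of this setup looks roughly as follows; only the CPU-manager flags are taken from the description above, everything else is illustrative:

```sh
# Illustrative kubelet flags; only the CPU-manager related ones come from this
# report, the remaining configuration is whatever the node already uses.
kubelet \
  --cpu-manager-policy=static \
  --kube-reserved=cpu=0.1 \
  --cpu-manager-reconcile-period=10s   # default value; the CPU manager reconcile loop runs on this period
```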
2. Steps to reproduce the issue
Kubernetes: 1.14 (`kubelet --version` reports Kubernetes v1.14.8)
Device plugin: nvidia/k8s-device-plugin:1.11 (also with 1.0.0.0-beta4)
Apply the DaemonSet for the nvidia device plugin, then apply a pod YAML for a pod requesting one device:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gputest
spec:
  containers:
  - command:
    - /bin/bash
    args:
    - -c
    - "sleep 30; nvidia-smi"
    image: nvidia/cuda:8.0-runtime-ubuntu16.04
    name: app
    resources:
      limits:
        cpu: "1"
        memory: 1Gi
        nvidia.com/gpu: "1"
      requests:
        cpu: "1"
        memory: 1Gi
        nvidia.com/gpu: "1"
  restartPolicy: Never
  tolerations:
  - effect: NoSchedule
    operator: Exists
  nodeSelector:
    beta.kubernetes.io/arch: amd64
```
Then follow the pod logs:
```
Failed to initialize NVML: Unknown Error
```
The pod persists in this state.
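Putting the steps together, the repro sequence looks roughly like this (manifest file names are illustrative):

```sh
# Illustrative file names; the DaemonSet manifest is the one shipped with the
# k8s-device-plugin version in use.
kubectl apply -f nvidia-device-plugin.yml   # deploy the device plugin DaemonSet
kubectl apply -f gputest.yaml               # the pod spec shown above
kubectl logs -f gputest                     # prints "Failed to initialize NVML: Unknown Error" after ~30s
```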
3. Information to attach (optional if deemed irrelevant)
Common error checking:
- The output of `nvidia-smi -a` on your host:
```
==============NVSMI LOG==============
Timestamp : Tue Nov 12 12:22:08 2019
Driver Version : 390.30
Attached GPUs : 1
GPU 00000000:03:00.0
Product Name : Tesla M2090
Product Brand : Tesla
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : Disabled
Accounting Mode : N/A
Accounting Mode Buffer Size : N/A
Driver Model
Current : N/A
Pending : N/A
Serial Number : 0320512020115
GPU UUID : GPU-f473d23b-0a01-034e-933b-58d52ca40425
Minor Number : 0
VBIOS Version : 70.10.46.00.01
MultiGPU Board : No
Board ID : 0x300
GPU Part Number : N/A
Inforom Version
Image Version : N/A
OEM Object : 1.1
ECC Object : 2.0
Power Management Object : 4.0
GPU Operation Mode
Current : N/A
Pending : N/A
GPU Virtualization Mode
Virtualization mode : None
PCI
Bus : 0x03
Device : 0x00
Domain : 0x0000
Device Id : 0x109110DE
Bus Id : 00000000:03:00.0
Sub System Id : 0x088710DE
GPU Link Info
PCIe Generation
Max : 2
Current : 1
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays since reset : N/A
Tx Throughput : N/A
Rx Throughput : N/A
Fan Speed : N/A
Performance State : P12
Clocks Throttle Reasons : N/A
FB Memory Usage
Total : 6067 MiB
Used : 0 MiB
Free : 6067 MiB
BAR1 Memory Usage
Total : N/A
Used : N/A
Free : N/A
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 0 %
Encoder : N/A
Decoder : N/A
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
Ecc Mode
Current : Disabled
Pending : Disabled
ECC Errors
Volatile
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Aggregate
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending : N/A
Temperature
GPU Current Temp : N/A
GPU Shutdown Temp : N/A
GPU Slowdown Temp : N/A
GPU Max Operating Temp : N/A
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
Power Readings
Power Management : Supported
Power Draw : 29.81 W
Power Limit : 225.00 W
Default Power Limit : N/A
Enforced Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 50 MHz
SM : 101 MHz
Memory : 135 MHz
Video : 135 MHz
Applications Clocks
Graphics : N/A
Memory : N/A
Default Applications Clocks
Graphics : N/A
Memory : N/A
Max Clocks
Graphics : 650 MHz
SM : 1301 MHz
Memory : 1848 MHz
Video : 540 MHz
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Processes : None
```
- Your docker configuration file (e.g. `/etc/docker/daemon.json`):
```json
{
  "experimental": true,
  "storage-driver": "overlay2",
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```
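To confirm the configuration above is actually in effect on the node, something along these lines can be checked (output formatting varies slightly across Docker versions):

```sh
# Quick sanity check that the nvidia runtime is registered and set as default.
docker info 2>/dev/null | grep -i runtime
```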
- The k8s-device-plugin container logs:
```
2019/11/11 19:10:56 Loading NVML
2019/11/11 19:10:56 Fetching devices.
2019/11/11 19:10:56 Starting FS watcher.
2019/11/11 19:10:56 Starting OS watcher.
2019/11/11 19:10:56 Starting to serve on /var/lib/kubelet/device-plugins/nvidia.sock
2019/11/11 19:10:56 Registered device plugin with Kubelet
```
- The kubelet logs on the node (e.g. `sudo journalctl -r -u kubelet`), the following repeated:
```
Nov 12 12:32:21 dal1k8s-worker-06 kubelet[8053]: E1112 12:32:21.880196 8053 cpu_manager.go:252] [cpumanager] reconcileState: failed to add container (pod: kube-proxy-bm82q, container: kube-proxy, container id: 92273ce7687ead38fb1c59b18934179183ea1b9e4f59107e92eec2f987bb91be, error: rpc error: code = Unknown desc
Nov 12 12:32:21 dal1k8s-worker-06 kubelet[8053]: I1112 12:32:21.880175 8053 policy_static.go:195] [cpumanager] static policy: RemoveContainer (container id: 92273ce7687ead38fb1c59b18934179183ea1b9e4f59107e92eec2f987bb91be)
Nov 12 12:32:21 dal1k8s-worker-06 kubelet[8053]: : unknown
Nov 12 12:32:21 dal1k8s-worker-06 kubelet[8053]: E1112 12:32:21.880153 8053 cpu_manager.go:183] [cpumanager] AddContainer error: rpc error: code = Unknown desc = failed to update container "92273ce7687ead38fb1c59b18934179183ea1b9e4f59107e92eec2f987bb91be": Error response from daemon: Cannot update container 92273
Nov 12 12:32:21 dal1k8s-worker-06 kubelet[8053]: : unknown
Nov 12 12:32:21 dal1k8s-worker-06 kubelet[8053]: E1112 12:32:21.880081 8053 remote_runtime.go:350] UpdateContainerResources "92273ce7687ead38fb1c59b18934179183ea1b9e4f59107e92eec2f987bb91be" from runtime service failed: rpc error: code = Unknown desc = failed to update container "92273ce7687ead38fb1c59b1893417918
```
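These errors come from the CPU manager's reconcile loop calling UpdateContainerResources on running containers. The symptom in the GPU pod (NVML working at first, then failing after a delay) is consistent with the container losing access to the `/dev/nvidia*` device nodes when its cgroup is rewritten. A hedged way to check this from the host, assuming cgroup v1 and Docker's cgroupfs driver (paths differ under the systemd driver):

```sh
# Assumption: cgroup v1 with Docker's cgroupfs driver; adjust the path for
# systemd-managed cgroups. NVIDIA character devices use major number 195.
CID=$(docker ps --no-trunc -q --filter name=gputest | head -n1)
grep 'c 195:' "/sys/fs/cgroup/devices/docker/${CID}/devices.list" \
  || echo "no nvidia device rules present for container ${CID}"
```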
Additional information that might help better understand your environment and reproduce the bug:
- Docker version from `docker version`: 18.09.1
- Docker command, image and tag used:
- Kernel version from `uname -a`:
```
Linux dal1k8s-worker-06 4.4.0-135-generic #161-Ubuntu SMP Mon Aug 27 10:45:01 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
```
- Any relevant kernel output lines from `dmesg`:
```
[ 2.840610] nvidia: module license 'NVIDIA' taints kernel.
[ 2.879301] nvidia-nvlink: Nvlink Core is being initialized, major device number 245
[ 2.911779] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 390.30 Wed Jan 31 21:32:48 PST 2018
[ 2.912960] [drm] [nvidia-drm] [GPU ID 0x00000300] Loading driver
[ 13.893608] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 242
```
- NVIDIA packages version from `dpkg -l '*nvidia*'` or `rpm -qa '*nvidia*'`:
```
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-=========================================================================-=========================================-=========================================-=======================================================================================================================================================
ii libnvidia-container-tools 1.0.1-1 amd64 NVIDIA container runtime library (command-line tools)
ii libnvidia-container1:amd64 1.0.1-1 amd64 NVIDIA container runtime library
ii nvidia-390 390.30-0ubuntu1 amd64 NVIDIA binary driver - version 390.30
ii nvidia-container-runtime 2.0.0+docker18.09.1-1 amd64 NVIDIA container runtime
ii nvidia-container-runtime-hook 1.4.0-1 amd64 NVIDIA container runtime hook
un nvidia-current <none> <none> (no description available)
un nvidia-docker <none> <none> (no description available)
ii nvidia-docker2 2.0.3+docker18.09.1-1 all nvidia-docker CLI wrapper
un nvidia-driver-binary <none> <none> (no description available)
un nvidia-legacy-340xx-vdpau-driver <none> <none> (no description available)
un nvidia-libopencl1-390 <none> <none> (no description available)
un nvidia-libopencl1-dev <none> <none> (no description available)
un nvidia-opencl-icd <none> <none> (no description available)
ii nvidia-opencl-icd-390 390.30-0ubuntu1 amd64 NVIDIA OpenCL ICD
un nvidia-persistenced <none> <none> (no description available)
ii nvidia-prime 0.8.2 amd64 Tools to enable NVIDIA's Prime
ii nvidia-settings 410.79-0ubuntu1 amd64 Tool for configuring the NVIDIA graphics driver
un nvidia-settings-binary <none> <none> (no description available)
un nvidia-smi <none> <none> (no description available)
un nvidia-vdpau-driver <none> <none> (no description available)
```
- NVIDIA container library version from `nvidia-container-cli -V`:
```
version: 1.0.1
build date: 2019-01-15T23:24+00:00
build revision: 038fb92d00c94f97d61492d4ed1f82e981129b74
build compiler: gcc-5 5.4.0 20160609
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections
```
- [ ] NVIDIA container library logs (see [troubleshooting](https://github.com/NVIDIA/nvidia-docker/wiki/Troubleshooting))

About this issue
- State: closed
- Created 5 years ago
- Comments: 15 (9 by maintainers)

I see. I think I can picture what the issue might be. Let me confirm it later today and I'll provide an update here. Thanks.

PR to fix this is tested and ready to be merged. It will be included in the upcoming v0.9.0 release: https://gitlab.com/nvidia/kubernetes/device-plugin/-/merge_requests/80

Yes, I can confirm that this is an issue.

MIG support in the `k8s-device-plugin` was tested together with the `compatWithCPUManager` option when it first came out and it worked just fine. However, since that time, the way that the underlying GPU driver exposes MIG to a container has changed. It was originally based on something called `/proc` based `nvidia-capabilities` and now it's based on something called `/dev` based `nvidia-capabilities` (more info on this here).

Without going into too much detail, when the underlying driver switched its implementation for this, it broke `compatWithCPUManager` in the `k8s-device-plugin` when MIG is enabled.

The fix should be fairly straightforward and will involve listing out the set of device nodes associated with the `nvidia-capabilities` that grant access to the MIG device being allocated, and sending them back to the kubelet (the same way the device nodes for full GPUs are sent back here).
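For context, on a driver that uses `/dev` based `nvidia-capabilities`, the device nodes in question look roughly like the following (a hedged illustration, not output from the reporter's system; exact entries vary by driver version and MIG layout):

```sh
# Hedged illustration of where the capability device nodes live on a host
# with a recent driver and MIG enabled.
ls -l /dev/nvidia-caps/                 # nvidia-cap<N> character devices
ls /proc/driver/nvidia/capabilities/    # capability hierarchy exposed by the driver
```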

I have added this to our list of tasks for `v0.8.0`, which will be released sometime in January.

In the meantime, if you need this to work today, you can follow the advice in "Working with nvidia-capabilities" and flip your driver settings from `/dev` based `nvidia-capabilities` back to `/proc` based `nvidia-capabilities` (see the sketch below for one possible form).

That should get things working again until a fix comes out. It is not a long-term fix, however, as support for `/proc` based `nvidia-capabilities` will disappear in a future driver release.

Thanks for reporting!
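One possible form of that switch, assuming the `nv_cap_enable_devfs` module parameter described in the driver documentation is what controls the capability mechanism (treat this as a sketch and defer to the "Working with nvidia-capabilities" docs):

```sh
# Assumption: nv_cap_enable_devfs=0 selects /proc based nvidia-capabilities and
# nv_cap_enable_devfs=1 selects /dev based ones. Requires reloading the nvidia
# kernel module with no GPU workloads running (or a reboot).
sudo rmmod nvidia_drm nvidia_modeset nvidia_uvm nvidia
sudo modprobe nvidia nv_cap_enable_devfs=0
```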

This is a known issue and has been reported before: https://github.com/NVIDIA/nvidia-container-toolkit/issues/138

Unfortunately, there is no upstream fix for this yet. The plan is to address it as part of the upcoming redesign of the device plugins: https://docs.google.com/document/d/1wPlJL8DsVpHnbVbTaad35ILB-jqoMLkGFLnQpWWNduc/edit