kubernetes: standalone kubelet panics with a nil pointer dereference in the VolumeManager

What happened?

I tried to run a standalone kubelet (no kube-apiserver, so the kubelet has no kubeClient), but it panics with “invalid memory address or nil pointer dereference”.

The corresponding logs:

E0211 15:50:57.552748  794906 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 60 [running]:
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime.logPanic({0x3bd69e0, 0x72c4480})
	/home/xiejinwei/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x85
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xc00020ccc0})
	/home/xiejinwei/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x75
panic({0x3bd69e0, 0x72c4480})
	/home/xiejinwei/.gvm/gos/go1.17/src/runtime/panic.go:1038 +0x215
k8s.io/kubernetes/pkg/kubelet/volumemanager/reconciler.(*reconciler).updateDevicePath(0xc00020ccc0, 0x24)
	/home/xiejinwei/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubelet/volumemanager/reconciler/reconciler.go:590 +0x40
k8s.io/kubernetes/pkg/kubelet/volumemanager/reconciler.(*reconciler).updateStates(0xc00020ccc0, 0xc00185fc68)
	/home/xiejinwei/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubelet/volumemanager/reconciler/reconciler.go:621 +0x3e
k8s.io/kubernetes/pkg/kubelet/volumemanager/reconciler.(*reconciler).syncStates(0xc00020ccc0)
	/home/xiejinwei/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubelet/volumemanager/reconciler/reconciler.go:425 +0x1a5
k8s.io/kubernetes/pkg/kubelet/volumemanager/reconciler.(*reconciler).sync(0x0)
	/home/xiejinwei/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubelet/volumemanager/reconciler/reconciler.go:339 +0x50
k8s.io/kubernetes/pkg/kubelet/volumemanager/reconciler.(*reconciler).reconciliationLoopFunc.func1()
	/home/xiejinwei/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubelet/volumemanager/reconciler/reconciler.go:158 +0xd6
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc000ee6f00)
	/home/xiejinwei/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x67
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x410c67, {0x4af7180, 0xc00034a960}, 0x1, 0xc00005e4e0)
	/home/xiejinwei/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0xb6
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00020ccc0, 0x5f5e100, 0x0, 0x0, 0x442ea5)
	/home/xiejinwei/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x89
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.Until(...)
	/home/xiejinwei/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90
k8s.io/kubernetes/pkg/kubelet/volumemanager/reconciler.(*reconciler).Run(0xc00020ccc0, 0x0)
	/home/xiejinwei/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubelet/volumemanager/reconciler/reconciler.go:146 +0x45
created by k8s.io/kubernetes/pkg/kubelet/volumemanager.(*volumeManager).Run
	/home/xiejinwei/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubelet/volumemanager/volume_manager.go:292 +0x35f

What did you expect to happen?

I expect the standalone kubelet to run without panicking.

How can we reproduce it (as minimally and precisely as possible)?

Run a standalone kubelet (the full command line is given in the comments below).

Anything else we need to know?

I think the problem is that the kubelet passes its kubeClient into the VolumeManager without checking whether it is nil; the reconciler then dereferences it in updateDevicePath.

Related code snippet:
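Based on the stack trace (updateDevicePath at reconciler.go:590), the failing pattern looks roughly like the sketch below. This is a paraphrase of the release-1.22 reconciler, not a verbatim copy; names and details are approximate.

// Paraphrased sketch of pkg/kubelet/volumemanager/reconciler/reconciler.go
// (release-1.22 era), not verbatim source.
func (rc *reconciler) updateDevicePath(volumesNeedUpdate map[v1.UniqueVolumeName]*reconstructedVolume) {
	// rc.kubeClient is a nil interface in standalone mode, so this method
	// call is the nil pointer dereference reported in the panic.
	node, fetchErr := rc.kubeClient.CoreV1().Nodes().Get(context.TODO(), string(rc.nodeName), metav1.GetOptions{})
	if fetchErr != nil {
		klog.ErrorS(fetchErr, "Could not get node status while updating device paths")
		return
	}
	// Copy devicePaths reported in node.Status back into the volumes that
	// were reconstructed from disk after a kubelet restart.
	for _, attachedVolume := range node.Status.VolumesAttached {
		if volume, exists := volumesNeedUpdate[attachedVolume.Name]; exists {
			volume.devicePath = attachedVolume.DevicePath
			volumesNeedUpdate[attachedVolume.Name] = volume
		}
	}
}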

Kubernetes version

I’m only running the kubelet; the kubelet version is Kubernetes v1.22.7-rc.0.17+559b798e25c8f8-dirty.

Cloud provider

Not using a cloud provider.

OS version

$ cat /etc/os-release

NAME="Ubuntu"
VERSION="20.04.3 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.3 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

Install tools

I built from source on the release-1.22 branch:

git clone https://github.com/kubernetes/kubernetes.git
git fetch origin release-1.22
git checkout release-1.22
make WHAT=cmd/kubelet
cp ./_output/local/bin/linux/amd64/kubelet /usr/local/bin/kubelet

Container runtime (CRI) and version (if applicable)

containerd 1.4.12

Related plugins (CNI, CSI, …) and versions (if applicable)

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 17 (14 by maintainers)

Most upvoted comments

Is api-server unavailability permanent or just during startup?

api-server unavailability does not result in a nil kubeClient… a nil kubeClient means the kubelet was started in standalone mode and was told not to talk to a kube-apiserver
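
For context on that distinction, here is a rough sketch of how a standalone kubelet ends up with a nil client, paraphrased from the kubelet server startup path in cmd/kubelet/app; names are approximate, not verbatim source.

// Paraphrased sketch, not verbatim kubelet code: without a kubeconfig the
// kubelet is in standalone mode and never builds a clientset, so components
// that receive kubeDeps.KubeClient (the volume manager included) get nil.
standaloneMode := len(kubeFlags.KubeConfig) == 0 && len(kubeFlags.BootstrapKubeconfig) == 0
if standaloneMode {
	kubeDeps.KubeClient = nil
} else {
	kubeDeps.KubeClient, err = clientset.NewForConfig(clientConfig)
	if err != nil {
		return fmt.Errorf("failed to initialize kubelet client: %w", err)
	}
}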

We can’t just skip updating node objects and still do the rest of the reconstruction if kubeClient is nil. That could be unsafe, because volumes which are in use may not be reported correctly via the node APIs. So we need to rethink: what volume functions are expected to work if the api-server is unavailable? Is standalone mode supposed to mount any volumes at all? cc @jingxu97

If the kubelet is not supposed to be mounting any volumes at all, then shouldn’t it skip starting the volume manager entirely in standalone mode? Is api-server unavailability permanent, or just during startup? I am not familiar with the kubelet’s standalone mode, hence the question.

/sig node

Just to make sure, is the command to run the standalone kubelet just “./kubelet”?

@raphminkyu Hi, the full command used to run the standalone kubelet is as follows:

/home/xiejinwei/cfc-workspace/standalone-kubelet/kubelet/kubelet \
  --config /home/xiejinwei/cfc-workspace/standalone-kubelet/kubelet/config.yaml \
  --container-runtime-endpoint unix:///home/xiejinwei/cfc-workspace/standalone-kubelet/containerd/containerd.sock \
  --cni-bin-dir /home/xiejinwei/cfc-workspace/standalone-kubelet/containerd/cri/cni/bin \
  --cni-conf-dir /home/xiejinwei/cfc-workspace/standalone-kubelet/containerd/cri/cni/net.d \
  --resolv-conf /run/systemd/resolve/resolv.conf \
  --runtime-cgroups /kubeletreserved.slice/kubeletreserved.runtime.slice \
  --kubelet-cgroups /kubeletreserved.slice/kubeletreserved.kubelet.slice \
  --kube-reserved-cgroup=kubeletreserved.slice

/sig storage

The key is knowing where to put the

	if kl.kubeClient != nil {

check 😃
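
One possible placement is sketched below: a guard inside the reconciler that skips only the API-dependent step when there is no client. Per the discussion above, this trades the panic for possibly incomplete devicePath information during reconstruction, so treat it as an illustration rather than the agreed fix.

// Hypothetical guard (sketch only): skip the node-status lookup when the
// kubelet runs in standalone mode and rc.kubeClient is nil. The rest of the
// volume reconstruction proceeds, but devicePaths from node.Status will be
// missing, which is the safety concern raised earlier in the thread.
func (rc *reconciler) updateDevicePath(volumesNeedUpdate map[v1.UniqueVolumeName]*reconstructedVolume) {
	if rc.kubeClient == nil {
		klog.V(2).InfoS("No kubeClient (standalone mode), skipping devicePath update from node status")
		return
	}
	// ... existing node.Status.VolumesAttached handling ...
}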