kubernetes: Cannot deploy GlusterFS through Kubernetes - "Couldn't find an alternative telinit implementation to spawn"
BUG REPORT:
Uncomment only one, leave it on its own line:
/kind bug
/kind feature
What happened: I’m running a simple Kubernetes cluster with a master and one node. All basic pods (e.g. weave-net) deploy successfully, but when I try to deploy GlusterFS as a pod, container creation fails with the error:
Couldn’t find an alternative telinit implementation to spawn.
This is the result of the container failing to run
CMD ["/usr/sbin/init"]
from the gluster/gluster-centos Dockerfile.
The weird part is that if I run this image directly on the node through Docker, it runs smoothly.
What you expected to happen:
I’d expect to be able to deploy GlusterFS through Kubernetes properly.
How to reproduce it (as minimally and precisely as possible):
Set up a minimal cluster and try to deploy the following pod:
apiVersion: v1
kind: Pod
metadata:
  name: glusterfs
  labels:
    glusterfs-node: pod
spec:
  hostNetwork: true
  nodeSelector:
    storagenode: glusterfs
  restartPolicy: Never
  containers:
  - name: glusterfs
    image: gluster/gluster-centos
    imagePullPolicy: Always
    volumeMounts:
    - name: glusterfs-cgroup
      mountPath: "/sys/fs/cgroup"
    securityContext:
      capabilities: {}
      privileged: true
    readinessProbe:
      timeoutSeconds: 3
      initialDelaySeconds: 60
      exec:
        command:
        - "/bin/bash"
        - "-c"
        - systemctl status glusterd.service
    livenessProbe:
      timeoutSeconds: 3
      initialDelaySeconds: 60
      exec:
        command:
        - "/bin/bash"
        - "-c"
        - systemctl status glusterd.service
  volumes:
  - name: glusterfs-cgroup
    hostPath:
      path: "/sys/fs/cgroup"
Anything else we need to know?:
Environment:
- Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.0", GitCommit:"d3ada0119e776222f11ec7945e6d860061339aad", GitTreeState:"clean", BuildDate:"2017-06-29T23:15:59Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.1", GitCommit:"1dc5c66f5dd61da08412a74221ecc79208c2165b", GitTreeState:"clean", BuildDate:"2017-07-14T01:48:01Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}
- Cloud provider or hardware configuration: virtualbox
- OS (e.g. from /etc/os-release):
NAME="Ubuntu"
VERSION="16.04.2 LTS (Xenial Xerus)"
- Kernel (e.g. uname -a):
Linux k8s-master 4.4.0-83-generic #106-Ubuntu SMP Mon Jun 26 17:54:43 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
- Docker version
/home/ubuntu# docker version
Client:
 Version: 17.05.0-ce
 API version: 1.29
 Go version: go1.7.5
 Git commit: 89658be
 Built: Thu May 4 22:10:54 2017
 OS/Arch: linux/amd64
Server:
 Version: 17.05.0-ce
 API version: 1.29 (minimum version 1.12)
 Go version: go1.7.5
 Git commit: 89658be
 Built: Thu May 4 22:10:54 2017
 OS/Arch: linux/amd64
 Experimental: false
- Install tools: vagrant
- Others:
About this issue
- State: closed
- Created 7 years ago
- Comments: 39 (22 by maintainers)
Commits related to this issue
- Merge pull request #51634 from verb/sharedpid-default-off Automatic merge from submit-queue (batch tested with PRs 51984, 51351, 51873, 51795, 51634) Revert to using isolated PID namespaces in Docke... — committed to kubernetes/kubernetes by deleted user 7 years ago
- Modify template to work with kubernetes 1.7 In 1.7 release the shared namespace is enabled by default. This is causing a problem when running pbench container which runs systemd. Related issue in up... — committed to chaitanyaenr/svt by chaitanyaenr 7 years ago
- Modify template to work with kubernetes 1.7 In 1.7 release the shared PID namespaces is enabled by default. This is causing a problem when running pbench container which runs systemd. Related issue ... — committed to chaitanyaenr/svt by chaitanyaenr 7 years ago
- Modify template to work with kubernetes 1.7 (#345) In 1.7 release the shared PID namespaces is enabled by default. This is causing a problem when running pbench container which runs systemd. Rela... — committed to openshift/svt by chaitanyaenr 7 years ago
- enabled PID=1 for pause container with this the pause container can handle zombie processes see: https://www.ianlewis.org/en/almighty-pause-container sorry for glusterfs VM cotainer: https://github.... — committed to utopia-planitia/kubernetes by damoon 7 years ago
In Kubernetes 1.7, PID 1 in the pod is occupied by the “/pause” process, so init, which needs to run as PID 1, can no longer get it. In earlier versions of Kubernetes, PID 1 was free.
Kubernetes now shares a single PID namespace among all containers in a pod when running with docker >= 1.13.1. This means processes can now signal processes in other containers in a pod, but it also means that the kubectl exec {pod} kill 1 pattern will cause the pod to be restarted rather than a single container. https://github.com/kubernetes/kubernetes/pull/45236
I can work around the problem using the --docker-disable-shared-pid option in the kubelet; however, this is not the desired solution, of course.
I’m actually getting this same error with 1.7.1 and docker 1.13.1. Anyone have other thoughts on what the issue might be here? I found this SO question (https://stackoverflow.com/questions/36545105/docker-couldnt-find-an-alternative-telinit-implementation-to-spawn), but it looks like everything is configured correctly.
/priority critical-urgent
gluster/gluster-centos is a container image that runs systemd so that it can run ntpd, crond, gssproxy, glusterd & sshd inside a single container. This isn’t how Kubernetes is intended to be used and I don’t know how much we should go out of our way to support it (this is a question for @dchen1107 & @yujuhong)
https://developers.redhat.com/blog/2016/09/13/running-systemd-in-a-non-privileged-container/ seems to be a good write-up of the challenges of running systemd inside of Docker. It also prevents Kubernetes from doing process management like:
Ideally we could change this many-processes-in-a-container to be many-containers-in-a-pod, something like this:
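A minimal sketch of what such a pod could look like (an illustration, not the commenter's original manifest). It assumes the daemons can be started directly from the gluster/gluster-centos image at the paths shown, that glusterd's --debug flag and rpcbind's -d flag keep each process in the foreground with console logging, and that rpcbind is one of the helper daemons gluster needs. The container names and the choice of daemons are hypothetical, and any volumes gluster needs for its state are left out.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: glusterfs
spec:
  hostNetwork: true
  containers:
  # One daemon per container, each started directly instead of via systemd.
  - name: glusterd
    image: gluster/gluster-centos
    command:
    - /usr/sbin/glusterd   # assumed path inside the image
    - --debug              # stay in the foreground and log to the console
  - name: rpcbind
    image: gluster/gluster-centos
    command:
    - /usr/sbin/rpcbind    # assumed path inside the image
    - -d                   # debug mode: do not fork into the background
```

With one process per container, Kubernetes can restart and probe each daemon independently.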
This doesn’t need a privileged container and doesn’t run sshd or ntpd, neither of which are needed in Kubernetes. This also doesn’t run crond, but that may actually be needed by gluster. I’ve never used gluster. (I used --debug flags so I didn’t have to investigate how to make these processes log to stdout.)
If you must run systemd in a container, adding this to the container config will get systemd to run:
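A minimal sketch of such a container config, reconstructed from the description rather than copied from the original comment. It assumes the systemd binary lives at /usr/lib/systemd/systemd in the image and that the image's systemd version honors the SYSTEMD_IGNORE_CHROOT environment variable.

```yaml
# Fragment of a container entry under spec.containers (sketch only).
- name: glusterfs
  image: gluster/gluster-centos
  command:
  - /usr/lib/systemd/systemd   # assumed path; invoked as "systemd", not "init"
  - --system                   # run a system instance even when not PID 1
  env:
  - name: SYSTEMD_IGNORE_CHROOT
    value: "1"                 # skip systemd's chroot detection
```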
This tells systemd to run in system mode even though it’s not PID 1 and disables its chroot detection (which is unrelated to PID 1 but checked when invoked as “systemd”, I guess).