kubeflow: Jupyterlab, Rstudio and VSCode do not run as non root in Kubeflow 1.8
/kind bug
[julius@fedora 1.3]$ kubectl -n kubeflow-user logs pod/julius1-0
s6-overlay-preinit: fatal: unable to mkdir /var/run/s6: Permission denied
It also does not work with podman
podman run --user 1:0 public.ecr.aws/j1r0q0g6/notebooks/notebook-servers/jupyter-scipy:v1.3.0-rc.0
[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...s6-chown: fatal: unable to chown /var/run/s6/etc/cont-init.d/01-copy-tmp-home: Operation not permitted
s6-chmod: fatal: unable to change mode of /var/run/s6/etc/cont-init.d/01-copy-tmp-home: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/services.d/jupyterlab/run: Operation not permitted
s6-chmod: fatal: unable to change mode of /var/run/s6/etc/services.d/jupyterlab/run: Operation not permitted
exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 01-copy-tmp-home: executing...
cp: cannot access '/tmp_home/jovyan/.jupyter': Permission denied
[cont-init.d] 01-copy-tmp-home: exited 1.
[cont-init.d] done.
[services.d] starting services
s6-supervise (child): fatal: unable to exec run: Permission denied
s6-supervise jupyterlab: warning: unable to spawn ./run - waiting 10 seconds
[services.d] done.
s6-supervise (child): fatal: unable to exec run: Permission denied
s6-supervise jupyterlab: warning: unable to spawn ./run - waiting 10 seconds
s6-supervise (child): fatal: unable to exec run: Permission denied
s6-supervise jupyterlab: warning: unable to spawn ./run - waiting 10 seconds
s6-supervise (child): fatal: unable to exec run: Permission denied
The user should be set in the StatefulSet itself if the image cannot run as an arbitrary user. With UID 1000 the container starts:
podman run --user 1000:0 public.ecr.aws/j1r0q0g6/notebooks/notebook-servers/jupyter-scipy:v1.3.0-rc.0
[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...s6-chown: fatal: unable to chown /var/run/s6/etc/cont-init.d/01-copy-tmp-home: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/services.d/jupyterlab/run: Operation not permitted
exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 01-copy-tmp-home: executing...
[cont-init.d] 01-copy-tmp-home: exited 0.
[cont-init.d] done.
[services.d] starting services
[services.d] done.
[I 2021-04-06 11:16:26.827 ServerApp] jupyterlab | extension was successfully linked.
[I 2021-04-06 11:16:26.834 ServerApp] Writing notebook server cookie secret to /home/jovyan/.local/share/jupyter/runtime/jupyter_cookie_secret
[I 2021-04-06 11:16:27.103 ServerApp] nbclassic | extension was successfully linked.
[W 2021-04-06 11:16:27.186 ServerApp] All authentication is disabled. Anyone who can connect to this server will be able to run code.
[I 2021-04-06 11:16:27.203 LabApp] JupyterLab extension loaded from /opt/conda/lib/python3.8/site-packages/jupyterlab
[I 2021-04-06 11:16:27.203 LabApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 2021-04-06 11:16:27.207 ServerApp] jupyterlab | extension was successfully loaded.
[I 2021-04-06 11:16:27.224 ServerApp] nbclassic | extension was successfully loaded.
[I 2021-04-06 11:16:27.225 ServerApp] Serving notebooks from local directory: /home/jovyan
[I 2021-04-06 11:16:27.225 ServerApp] Jupyter Server 1.4.1 is running at:
[I 2021-04-06 11:16:27.225 ServerApp] http://1ecd45fc4e8d:8888/lab
[I 2021-04-06 11:16:27.225 ServerApp] or http://127.0.0.1:8888/lab
[I 2021-04-06 11:16:27.225 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
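The two podman runs above differ only in the UID (1 fails, 1000 works), so one way to read the suggestion is to pin the UID in the pod template. The following is a minimal sketch, not a confirmed fix: the StatefulSet name `julius1` is only inferred from the pod name `julius1-0`, the patch file location is arbitrary, and the Kubeflow notebook controller may well overwrite a manual patch like this. `runAsUser` and `runAsGroup` are the standard pod-level `securityContext` fields.

```shell
#!/bin/sh
# Write a strategic-merge patch that pins the UID/GID used in the
# successful podman run (1000:0). The file path is a throwaway temp file.
patch_file=$(mktemp)
cat > "$patch_file" <<'EOF'
spec:
  template:
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 0
EOF

echo "patch written to $patch_file"

# Assumed resource name and namespace; uncomment to apply against a cluster:
# kubectl -n kubeflow-user patch statefulset julius1 --patch-file "$patch_file"
```

Keeping the patch in a file rather than inline JSON makes it easier to review before applying.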
About this issue
- State: open
- Created 3 years ago
- Comments: 41 (31 by maintainers)
Our team encountered the error discussed here, but in a slightly different context (I think). We were able to resolve the issue; read on for more information.
tl;dr
- Our team has to use a custom `securityContext` in order to access an NFS share. This prevents us from running as the default jovyan user (our UID/GID are different).
- We ran `chmod 775 s6/services.d/jupyterlab/run` before building our custom Jupyter image. This corresponds to this file in the kubeflow/kubeflow repo: https://github.com/kubeflow/kubeflow/blob/master/components/example-notebook-servers/jupyter/s6/services.d/jupyterlab/run
- We also added `chmod -R 775 /tmp_home` to the relevant place in the `jupyter` Dockerfile while building our custom image, but it is not clear whether that had any impact on fixing the issue.
Problem Description / Environment
We created a custom notebook image (called `custom-jupyter:1` in this discussion) based on the `base`, `jupyter`, and `jupyter-tensorflow` Dockerfiles. Our custom Dockerfile combines the key parts of these into one, and reorganizes some of the steps slightly, but overall it is essentially the same logic condensed into a single file. We were able to spin up an instance of this custom image and access JupyterLab without problems.
We are using Kubeflow 1.4.1 on Kubernetes 1.21.9. The Kubernetes cluster was provisioned through Rancher v2.6.3-patch1. We are using Docker 20.10.12 as our container engine under the hood.
We need to mount an NFS share, and in order to do that, we need to set a `securityContext` that changes the UID & GID. We currently do this by injecting the `securityContext` block and the relevant NFS volume mounts via `kubectl patch`. For reference, assume we spun up a notebook called `test`, which creates pod `test-0`; here is a heavily-redacted example of what `kubectl get -n myuser -o yaml pod test-0` gives us, showcasing the injected data in the `securityContext`, `volumeMounts`, and `volumes` areas:
After patching our `custom-jupyter:1` notebook image, we are unable to connect to JupyterLab. As indicated throughout this thread, clicking the “CONNECT” button in the Kubeflow UI takes us to a page that just says:
Our `kubectl logs` output is also similar to the output shown above:
Problem Solution
Mounting an `emptyDir` to `/var/run/s6` as suggested above did NOT fix the problem.
I zeroed in on those initial error messages from s6 – specifically, I noticed that the permissions change of the `run` file for JupyterLab failed. I went back to our source for the Docker image and saw that `s6/services.d/jupyterlab/run` was not executable. This file was copied as-is from the kubeflow/kubeflow repo’s `jupyter` image example: https://github.com/kubeflow/kubeflow/blob/2d347e97d37b290d5764e84fc26f4d9870ba06ce/components/example-notebook-servers/jupyter/s6/services.d/jupyterlab/run (this URL is pinned to the latest master commit; the file is the same as the one we’re using).
My assumption is that JupyterLab was failing to start because s6 failed to make the `run` script executable, and it wasn’t already executable to begin with.
To fix this, I ran (in my local development environment) `chmod 775 s6/services.d/jupyterlab/run`.
That made the file executable.
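The failure mode described here can be reproduced outside of any container. The following is a small sketch under stated assumptions: the temp directory and the one-line script are hypothetical stand-ins for `s6/services.d/jupyterlab/run`, and the scratch directory is assumed not to be mounted `noexec`. It shows that executing a script without the execute bit fails with a permission error, much like the `unable to exec run: Permission denied` lines in the logs, and that `chmod 775` resolves it.

```shell
#!/bin/sh
# Scratch directory standing in for s6/services.d/jupyterlab/ (hypothetical).
workdir=$(mktemp -d)
run_script="$workdir/run"

# A trivial service script, created without the execute bit.
printf '#!/bin/sh\necho "service started"\n' > "$run_script"
chmod 644 "$run_script"

# Executing it now fails with a permission error (exit status 126).
if "$run_script" 2>/dev/null; then
    before="ran"
else
    before="permission denied"
fi

# Same mode as used in the fix above.
chmod 775 "$run_script"
after=$("$run_script")

echo "before chmod: $before"
echo "after chmod:  $after"

rm -rf "$workdir"
```

If the file is tracked in git, the execute bit is part of what gets committed; `git update-index --chmod=+x <path>` records it even on filesystems that do not preserve file modes.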
I also noticed this error in the `kubectl logs`:
After some investigation, it looks like after Jupyter gets installed, that particular directory has `600` permissions. I don’t know if this is even relevant to the issue at hand, but I also updated the Dockerfile to set mode `775` on everything in `/tmp_home`. (I probably should/could do something like 664 instead, but this doesn’t seem to have broken anything, so I probably will leave it as-is.)
This change corresponds to lines 70-72 here: https://github.com/kubeflow/kubeflow/blob/2d347e97d37b290d5764e84fc26f4d9870ba06ce/components/example-notebook-servers/jupyter/Dockerfile
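On the 775-versus-664 question: a plain `664` applied recursively would also strip the execute (search) bit from directories, which makes them untraversable. One possible middle ground, sketched below, is the symbolic `X` permission, which adds execute only to directories and to files that are already executable. This is an illustration rather than what the Dockerfile actually does: the temp directory stands in for `/tmp_home`, `config.json` is a made-up file name, and `stat -c` assumes GNU coreutils or BusyBox.

```shell
#!/bin/sh
# Hypothetical stand-in for /tmp_home with a restrictive .jupyter directory.
tmp_home=$(mktemp -d)
mkdir -p "$tmp_home/jovyan/.jupyter"
touch "$tmp_home/jovyan/.jupyter/config.json"   # made-up file name
chmod 600 "$tmp_home/jovyan/.jupyter/config.json"
chmod 600 "$tmp_home/jovyan/.jupyter"

# Capital X: execute/search only for directories and already-executable files.
chmod -R u=rwX,g=rwX,o=rX "$tmp_home"

dir_mode=$(stat -c '%a' "$tmp_home/jovyan/.jupyter")
file_mode=$(stat -c '%a' "$tmp_home/jovyan/.jupyter/config.json")

echo "directory mode: $dir_mode"
echo "file mode:      $file_mode"

rm -rf "$tmp_home"
```

With this form, directories end up group-writable and traversable while ordinary data files stay non-executable.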
After this, I rebuilt the `custom-jupyter` Docker image. Deploying an instance of this new image along with the patches described above fixed the issue: I am able to access JupyterLab, and everything seems to work normally, despite the fact that I am running as a non-jovyan user with a different UID/GID.
I don’t 100% understand why this works, but I think that since Istio injects the GID 1337 into my user as an additional GID (so I have both my custom 1234 GID shown in my sample YAML, and the Istio 1337 GID), the fact that everything is group-accessible allows me to have full access to jovyan’s things, despite not being jovyan.
Recommendation
My recommendation for now would be for the Kubeflow dev team to mark the `components/example-notebook-servers/jupyter/s6/services.d/jupyterlab/run` file as executable and then commit that to the repo and rebuild the sample images. Consider also making the change to `/tmp_home`, although as stated it is not clear that that does anything useful.
Yes, build your own Jupyterlab in a proper way without s6.