kale: RPC Permissions Errors - Kubeflow + Kale

Running on a local Kubernetes 1.17 cluster (installed using kubeadm). Kubeflow was installed using kfctl v1.1.0-0-g9a3621e with kfctl_k8s_istio.v1.1.0.yaml. Kale was installed with the following Dockerfile:

FROM gcr.io/kubeflow-images-public/tensorflow-1.15.2-notebook-gpu:1.0.0

USER root

RUN cd /tmp && git clone https://github.com/kubeflow-kale/kale
RUN cd /tmp/kale/backend && python3 ./setup.py install

RUN pip3 install --upgrade pip
RUN pip3 install kubeflow-kale
RUN jupyter labextension install kubeflow-kale-launcher
RUN jupyter labextension list

RUN chown jovyan -R /home/jovyan

USER 1000

The labextension list step output the following:

Step 8/10 : RUN jupyter labextension list
 ---> Running in c050c1c7b6a4
   app dir: /usr/local/share/jupyter/lab
        kubeflow-kale-launcher v1.4.0  enabled  OK
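
The Dockerfile installs the Kale backend twice (once from the git checkout, once from PyPI), so it may be worth confirming which backend version actually ends up in the image and comparing it with the labextension version listed above. A minimal sketch, assuming the distribution is registered under the name `kubeflow-kale` (the name used in the `pip3 install` step); the helper name `installed_version` is made up for illustration:

```python
def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        # Available on Python 3.8 and later.
        from importlib.metadata import version, PackageNotFoundError
    except ImportError:
        # Fallback for older interpreters such as the 3.6 one in this image.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(dist_name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Run inside the notebook container to see what the backend resolves to.
    print("kubeflow-kale backend:", installed_version("kubeflow-kale"))
```

Whether a given backend version is compatible with a given labextension version is not something this check can tell you; it only makes a mismatch visible.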

Launching the Jupyter Notebook results in RPC errors / RBAC issues.

I can still navigate the UI, but when I attempt to “Compile and Run” a notebook, I get:

An RPC Error has occurred
Browser: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36

Type: RPC

Method: nb.compile_notebook()

Code: 6 (UnhandledError)

Transaction ID: j48hs35tcc

Message: compile_notebook() got an unexpected keyword argument 'auto_snapshot'

Details: You can find more information under /home/jovyan/kale.log

kale.log:

2020-09-13 21:35:49 run:83 [[DEBUG]] [TID=j48hs35tcc] [] Decoding ctx of RPC function 'nb.compile_notebook'
2020-09-13 21:35:49 run:95 [[DEBUG]] [TID=j48hs35tcc] [/home/jovyan/Untitled.ipynb] Decoding kwargs of RPC function 'nb.compile_notebook'
2020-09-13 21:35:49 run:104 [[DEBUG]] [TID=j48hs35tcc] [/home/jovyan/Untitled.ipynb] Importing RPC function 'nb.compile_notebook'
2020-09-13 21:35:49 run:114 [[INFO]] [TID=j48hs35tcc] [/home/jovyan/Untitled.ipynb] Executing RPC function 'compile_notebook(source_notebook_path=Untitled.ipynb, notebook_metadata_overrides={'experiment': {'name': 'Test Run', 'id': 'new'}, 'experiment_name': 'Test Run', 'pipeline_name': 'test-pipeline', 'pipeline_description': 'First Try Building a Pipeline Using Kale', 'docker_image': 'nexus1.technicalabs.com:8123/kubeflow-images/tensorflow-1.15.2-notebook-gpu-kale:1.0.0', 'volumes': []}, debug=False, auto_snapshot=False)'
2020-09-13 21:35:49 run:125 [[ERROR]] [TID=j48hs35tcc] [/home/jovyan/Untitled.ipynb] RPC function 'compile_notebook' raised an unhandled exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/kubeflow_kale-0.5.1-py3.6.egg/kale/rpc/run.py", line 116, in run
    result = func(request, **kwargs)
TypeError: compile_notebook() got an unexpected keyword argument 'auto_snapshot'
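
The traceback shows the caller passing an `auto_snapshot` keyword that the installed backend's `compile_notebook()` does not accept. That would be consistent with the labextension (v1.4.0) and the backend (the 0.5.1 egg in the path above) coming from different releases, although the log alone does not prove it. A minimal sketch of this failure mode, using a hypothetical stand-in function rather than Kale's actual code:

```python
import inspect


def compile_notebook(request, source_notebook_path,
                     notebook_metadata_overrides=None, debug=False):
    """Hypothetical stand-in for a backend function that predates auto_snapshot."""
    return {"source": source_notebook_path, "debug": debug}


def unexpected_kwargs(func, kwargs):
    """Return the sorted keyword names that func would reject."""
    params = inspect.signature(func).parameters
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []  # func takes **kwargs, so nothing is rejected by name
    return sorted(name for name in kwargs if name not in params)


# Keyword arguments modelled on the ones shown in the log entry above.
kwargs = {
    "source_notebook_path": "Untitled.ipynb",
    "notebook_metadata_overrides": {"pipeline_name": "test-pipeline"},
    "debug": False,
    "auto_snapshot": False,
}

# Report which keywords the stand-in would refuse before calling it.
print(unexpected_kwargs(compile_notebook, kwargs))

# Calling it anyway raises the same kind of TypeError seen in kale.log.
try:
    compile_notebook(None, **kwargs)
except TypeError as exc:
    print(exc)
```

The point of the sketch is that this error is raised by Python's argument binding before any pipeline or RBAC logic runs, so it is a separate problem from the permissions errors.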

I do not have Rok or any Rok-related clients installed. I also had both of these options set to False:

  • Use this notebook's volumes: False
  • Take Rok Snapshots before each step: False

I expected that, having installed Kale per the instructions at https://pypi.org/project/kubeflow-kale/0.3.2/, I would be able to create and run a pipeline from a given notebook.

What RBAC / RPC configurations must be included?

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 18 (9 by maintainers)

Most upvoted comments

Following the comment by yhwang, I was able to get Kale to create and upload a pipeline. https://github.com/kubeflow/pipelines/issues/4440#issuecomment-687689294

For the default admin@kubeflow.org user in the admin namespace, I used this for the ServiceRoleBinding:

cat <<EOF | kubectl apply -f -
apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRoleBinding
metadata:
  name: bind-ml-pipeline-nb-admin
  namespace: kubeflow
spec:
  roleRef:
    kind: ServiceRole
    name: ml-pipeline-services
  subjects:
  - properties:
      source.principal: cluster.local/ns/admin/sa/default-editor
EOF

And this for the EnvoyFilter:

cat <<EOF | kubectl apply -f -
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: add-header
  namespace: admin
spec:
  configPatches:
  - applyTo: VIRTUAL_HOST
    match:
      context: SIDECAR_OUTBOUND
      routeConfiguration:
        vhost:
          name: ml-pipeline.kubeflow.svc.cluster.local:8888
          route:
            name: default
    patch:
      operation: MERGE
      value:
        request_headers_to_add:
        - append: true
          header:
            key: kubeflow-userid
            value: admin@kubeflow.org
  workloadSelector:
    labels:
      notebook-name: mzquality
EOF
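
Both manifests hard-code values specific to this setup: the `admin` namespace, the `admin@kubeflow.org` user, and the `mzquality` notebook. For other namespaces or notebooks, the same two resources could be generated from parameters. A rough sketch, assuming the structure above is what you want to reproduce; the function names and the `bind-ml-pipeline-nb-<namespace>` naming pattern are illustrative choices, not something Kale or Kubeflow provides:

```python
import json


def service_role_binding(namespace, service_account="default-editor"):
    """Build the ServiceRoleBinding shown above for a given user namespace."""
    principal = f"cluster.local/ns/{namespace}/sa/{service_account}"
    return {
        "apiVersion": "rbac.istio.io/v1alpha1",
        "kind": "ServiceRoleBinding",
        "metadata": {"name": f"bind-ml-pipeline-nb-{namespace}",
                     "namespace": "kubeflow"},
        "spec": {
            "roleRef": {"kind": "ServiceRole", "name": "ml-pipeline-services"},
            "subjects": [{"properties": {"source.principal": principal}}],
        },
    }


def envoy_filter(namespace, user_email, notebook_name):
    """Build the EnvoyFilter shown above, adding the kubeflow-userid header."""
    header = {"key": "kubeflow-userid", "value": user_email}
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "EnvoyFilter",
        "metadata": {"name": "add-header", "namespace": namespace},
        "spec": {
            "configPatches": [{
                "applyTo": "VIRTUAL_HOST",
                "match": {
                    "context": "SIDECAR_OUTBOUND",
                    "routeConfiguration": {"vhost": {
                        "name": "ml-pipeline.kubeflow.svc.cluster.local:8888",
                        "route": {"name": "default"},
                    }},
                },
                "patch": {
                    "operation": "MERGE",
                    "value": {"request_headers_to_add": [
                        {"append": True, "header": header},
                    ]},
                },
            }],
            "workloadSelector": {"labels": {"notebook-name": notebook_name}},
        },
    }


def render(namespace, user_email, notebook_name):
    """Return both resources as one JSON document wrapped in a v1 List."""
    items = [service_role_binding(namespace),
             envoy_filter(namespace, user_email, notebook_name)]
    return json.dumps({"apiVersion": "v1", "kind": "List", "items": items},
                      indent=2)


if __name__ == "__main__":
    # Values taken from the manifests above.
    print(render("admin", "admin@kubeflow.org", "mzquality"))
```

Saved under a filename of your choosing, the output can be piped to `kubectl apply -f -`, since kubectl accepts JSON as well as YAML. Note that the EnvoyFilter keeps the fixed name `add-header`, so applying it for a second notebook in the same namespace would overwrite the first unless the name is varied.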

EDIT

This actually seems to be similar to the setup @elikatsis mentioned Arrikto uses for their Kubeflow Enterprise deployments.

https://docs.arrikto.com/integrations/kubeflow.html#set-up-namespaces