kale: Kale Pipeline list experiments raised an unhandled exception "Namespace is empty"

After applying the fix I described in https://github.com/kubeflow-kale/kale/issues/204#issuecomment-694771168, which lets Kale compile, upload and run pipelines in a multi-user environment, Kale fails to list experiments because the namespace field is empty (see the error below). Despite this error, Kale is still able to upload the pipeline. However, manually running that pipeline fails with an HTTP 403 error during the "Getting workflow" step.

kale.log, showing that the RPC function 'list_experiments' raised an unhandled exception:

2020-09-18 12:16:20 run:124 [[ERROR]] [TID=e9ow93ylkm] [/home/jovyan/mzquality.ipynb] RPC function 'list_experiments' raised an unhandled exception
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/kale/rpc/run.py", line 116, in run
    result = func(request, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/kale/rpc/kfp.py", line 37, in list_experiments
    for e in c.list_experiments().experiments or []]
  File "/opt/conda/lib/python3.8/site-packages/kfp/_client.py", line 330, in list_experiments
    response = self._experiment_api.list_experiment(
  File "/opt/conda/lib/python3.8/site-packages/kfp_server_api/api/experiment_service_api.py", line 581, in list_experiment
    return self.list_experiment_with_http_info(**kwargs)  # noqa: E501
  File "/opt/conda/lib/python3.8/site-packages/kfp_server_api/api/experiment_service_api.py", line 682, in list_experiment_with_http_info
    return self.api_client.call_api(
  File "/opt/conda/lib/python3.8/site-packages/kfp_server_api/api_client.py", line 378, in call_api
    return self.__call_api(resource_path, method,
  File "/opt/conda/lib/python3.8/site-packages/kfp_server_api/api_client.py", line 202, in __call_api
    raise e
  File "/opt/conda/lib/python3.8/site-packages/kfp_server_api/api_client.py", line 195, in __call_api
    response_data = self.request(
  File "/opt/conda/lib/python3.8/site-packages/kfp_server_api/api_client.py", line 403, in request
    return self.rest_client.GET(url,
  File "/opt/conda/lib/python3.8/site-packages/kfp_server_api/rest.py", line 244, in GET
    return self.request("GET", url,
  File "/opt/conda/lib/python3.8/site-packages/kfp_server_api/rest.py", line 238, in request
    raise ApiException(http_resp=r)
kfp_server_api.exceptions.ApiException: (400)
Reason: Bad Request
HTTP response headers: HTTPHeaderDict({'content-type': 'application/json', 'trailer': 'Grpc-Trailer-Content-Type', 'date': 'Fri, 18 Sep 2020 12:16:19 GMT', 'x-envoy-upstream-service-time': '3', 'server': 'envoy', 'transfer-encoding': 'chunked'})
HTTP response body: {"error":"Invalid input error: Invalid resource references for experiment. Namespace is empty.","message":"Invalid input error: Invalid resource references for experiment. Namespace is empty.","code":3,"details":[{"@type":"type.googleapis.com/api.Error","error_message":"Invalid resource references for experiment. Namespace is empty.","error_details":"Invalid input error: Invalid resource references for experiment. Namespace is empty."}]}
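
The response body above is JSON, so this particular failure can be told apart from other 400 responses programmatically. What follows is a minimal sketch that assumes only the body format shown in this log; the helper name `is_missing_namespace_error` is hypothetical and is not part of Kale or the KFP SDK.

```python
import json


def is_missing_namespace_error(body: str) -> bool:
    """Return True if a KFP API error body reports an empty namespace.

    Hypothetical helper: it only inspects the "error" and "message" fields
    seen in the log above and assumes nothing else about the API.
    """
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return False  # not JSON, so not the error we are looking for
    if not isinstance(payload, dict):
        return False
    text = f"{payload.get('error', '')} {payload.get('message', '')}"
    return "Namespace is empty" in text


# Body shortened to the relevant fields of the 400 response above.
sample_body = (
    '{"error":"Invalid input error: Invalid resource references for '
    'experiment. Namespace is empty.","code":3}'
)
print(is_missing_namespace_error(sample_body))
```

A check like this could be used to show a hint about the missing namespace instead of the raw traceback.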

Error when running the Kale-created pipeline:

2020-09-18 12:42:15 Kale mlmdutils:108        [INFO]     Getting workflow...
Traceback (most recent call last):
  File "<string>", line 36, in <module>
  File "<string>", line 3, in install_mzquality
  File "/opt/conda/lib/python3.8/site-packages/kale/common/mlmdutils.py", line 532, in init_metadata
    mlmd_instance = MLMetadata()
  File "/opt/conda/lib/python3.8/site-packages/kale/common/mlmdutils.py", line 109, in __init__
    self.workflow = podutils.get_workflow(self.workflow_name,
  File "/opt/conda/lib/python3.8/site-packages/kale/common/podutils.py", line 333, in get_workflow
    return co_client.get_namespaced_custom_object(api_group, api_version,
  File "/opt/conda/lib/python3.8/site-packages/kubernetes/client/api/custom_objects_api.py", line 954, in get_namespaced_custom_object
    (data) = self.get_namespaced_custom_object_with_http_info(group, version, namespace, plural, name, **kwargs)  # noqa: E501
  File "/opt/conda/lib/python3.8/site-packages/kubernetes/client/api/custom_objects_api.py", line 1043, in get_namespaced_custom_object_with_http_info
    return self.api_client.call_api(
  File "/opt/conda/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 340, in call_api
    return self.__call_api(resource_path, method,
  File "/opt/conda/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 172, in __call_api
    response_data = self.request(
  File "/opt/conda/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 362, in request
    return self.rest_client.GET(url,
  File "/opt/conda/lib/python3.8/site-packages/kubernetes/client/rest.py", line 237, in GET
    return self.request("GET", url,
  File "/opt/conda/lib/python3.8/site-packages/kubernetes/client/rest.py", line 231, in request
    raise ApiException(http_resp=r)
kubernetes.client.rest.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'a7860fa0-ee24-438c-9283-b80569d9175e', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'Date': 'Fri, 18 Sep 2020 12:42:15 GMT', 'Content-Length': '411'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"workflows.argoproj.io \"mzquality-test-gw7px-jrnvp\" is forbidden: User \"system:serviceaccount:admin:default-editor\" cannot get resource \"workflows\" in API group \"argoproj.io\" in the namespace \"admin\"","reason":"Forbidden","details":{"name":"mzquality-test-gw7px-jrnvp","group":"argoproj.io","kind":"workflows"},"code":403}

About this issue

  • State: open
  • Created 4 years ago
  • Comments: 16 (10 by maintainers)

Most upvoted comments

@mr-yaky You need to set the namespace for KFP. I am working on a PR to do this automatically, but it has not been going very smoothly so far. The quick and lazy way is to create the file ~/.config/kfp/context.json with the contents {"namespace": "your_namespace"}.
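
The file described above can also be written from a notebook cell. This is a minimal sketch using only the standard library; the function name `write_kfp_context` and its `config_path` parameter are hypothetical, and "your_namespace" remains a placeholder for your own profile namespace.

```python
import json
from pathlib import Path
from typing import Optional


def write_kfp_context(namespace: str, config_path: Optional[Path] = None) -> Path:
    """Store the namespace in the KFP context file and return the file path."""
    if config_path is None:
        # Default location named in the comment above.
        config_path = Path.home() / ".config" / "kfp" / "context.json"
    config_path.parent.mkdir(parents=True, exist_ok=True)

    # Keep any other keys that are already stored in the file.
    context = {}
    if config_path.exists():
        try:
            loaded = json.loads(config_path.read_text() or "{}")
            if isinstance(loaded, dict):
                context = loaded
        except ValueError:
            pass  # unreadable file: start from an empty context

    context["namespace"] = namespace
    config_path.write_text(json.dumps(context))
    return config_path
```

Calling `write_kfp_context("your_namespace")` without a path writes to ~/.config/kfp/context.json; passing an explicit path is mainly useful for trying the function out without touching the real file.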

Happy to hear it is working. I haven’t seen that issue before so I will look at it once I get back to Kale development.

@elikatsis @StefanoFioravanzo I have gotten a bit further. After applying the RoleBinding below in the admin namespace, the pipeline no longer fails at the "Getting workflow" step.

cat <<EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: allow-workflow-nb-admin
  namespace: admin
subjects:
- kind: ServiceAccount
  name: default-editor
  namespace: admin
roleRef:
  kind: ClusterRole
  name: argo
  apiGroup: rbac.authorization.k8s.io
EOF
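
The same binding would presumably be needed in every profile namespace that runs Kale pipelines. Here is a minimal sketch that builds the manifest above for an arbitrary namespace and prints it as JSON, which `kubectl apply -f -` also accepts. The function name `workflow_rolebinding` and the `allow-workflow-nb-<namespace>` naming pattern are assumptions generalized from the single example above.

```python
import json


def workflow_rolebinding(namespace: str,
                         service_account: str = "default-editor") -> dict:
    """Build the RoleBinding shown above for the given namespace.

    Assumption: a ClusterRole named "argo" exists in the cluster, as in the
    original manifest.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {
            # Naming pattern generalized from "allow-workflow-nb-admin".
            "name": f"allow-workflow-nb-{namespace}",
            "namespace": namespace,
        },
        "subjects": [
            {
                "kind": "ServiceAccount",
                "name": service_account,
                "namespace": namespace,
            }
        ],
        "roleRef": {
            "kind": "ClusterRole",
            "name": "argo",
            "apiGroup": "rbac.authorization.k8s.io",
        },
    }


# Reproduces the manifest above for the "admin" namespace.
print(json.dumps(workflow_rolebinding("admin"), indent=2))
```

Piping the printed JSON into `kubectl apply -f -` should have the same effect as the heredoc above.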

However, now I have the following error:

2020-09-23 08:53:15 Kale mlmdutils:88         [INFO]     ---------- Initializing MLMD context... ----------
2020-09-23 08:53:15 Kale mlmdutils:89         [INFO]     Connecting to MLMD...
2020-09-23 08:53:15 Kale mlmdutils:91         [INFO]     Successfully connected to MLMD
2020-09-23 08:53:15 Kale mlmdutils:92         [INFO]     Getting step details...
2020-09-23 08:53:15 Kale mlmdutils:93         [INFO]     Getting pod name...
2020-09-23 08:53:15 Kale mlmdutils:95         [INFO]     Successfully retrieved pod name: mzquality-test-o215f-4p29m-3194236941
2020-09-23 08:53:15 Kale mlmdutils:96         [INFO]     Getting pod namespace...
2020-09-23 08:53:15 Kale mlmdutils:98         [INFO]     Successfully retrieved pod namespace: admin
2020-09-23 08:53:15 Kale mlmdutils:100        [INFO]     Getting pod...
2020-09-23 08:53:15 Kale mlmdutils:102        [INFO]     Successfully retrieved pod
2020-09-23 08:53:15 Kale mlmdutils:103        [INFO]     Getting workflow name from pod...
2020-09-23 08:53:15 Kale mlmdutils:106        [INFO]     Successfully retrieved workflow name: mzquality-test-o215f-4p29m
2020-09-23 08:53:15 Kale mlmdutils:108        [INFO]     Getting workflow...
2020-09-23 08:53:15 Kale mlmdutils:111        [INFO]     Successfully retrieved workflow
2020-09-23 08:53:15 Kale mlmdutils:116        [INFO]     Successfully retrieved KFP run ID: 1a14a8ed-37ef-49ff-85ee-7a5777a360d2
2020-09-23 08:53:15 Kale mlmdutils:123        [INFO]     Successfully retrieved KFP pipeline_name: mzquality-test-o215f
2020-09-23 08:53:15 Kale podutils:343         [INFO]     Computing component ID for pod admin/mzquality-test-o215f-4p29m-3194236941...
2020-09-23 08:53:15 Kale podutils:354         [INFO]     Computed component ID: Install mzquality@sha256=0c770aed28976cd171960a69b318c3d8c6fe0c4cae930043c3b9a6a0bd379f86
2020-09-23 08:53:15 Kale mlmdutils:136        [INFO]     Failed to retrieve execution hash. Generating random string...: x67u2gxtt9
2020-09-23 08:53:15 Kale mlmdutils:258        [INFO]     Creating context 'mzquality-test-o215f-4p29m' of type 'KfpRun'...
2020-09-23 08:53:15 Kale mlmdutils:273        [INFO]     Context already exists
2020-09-23 08:53:15 Kale mlmdutils:274        [INFO]     ContextType ID: 6 - Context ID: 14
2020-09-23 08:53:15 Kale mlmdutils:222        [INFO]     Creating execution of type 'Install mzquality@sha256=0c770aed28976cd171960a69b318c3d8c6fe0c4cae930043c3b9a6a0bd379f86'...
2020-09-23 08:53:15 Kale mlmdutils:230        [INFO]     Successfully created execution
2020-09-23 08:53:15 Kale mlmdutils:231        [INFO]     ExecutionType ID: 15 - Execution ID: 12
2020-09-23 08:53:15 Kale mlmdutils:142        [INFO]     ---------- Successfully initialized MLMD context ----------
2020-09-23 08:53:15 Kale jputils:241          [INFO]     ---------- Running user code... ----------
2020-09-23 08:53:17 Kale marshalling          [ERROR]    
During data passing, Kale experienced an error.
The error was: 
During data passing, Kale could not load the following file:
  - name: 'NULL'
The error was: No file or folder was found with the requested name.
Please help us improve Kale by opening a new issue at
https://github.com/kubeflow-kale/kale/issues
.
Please help us improve Kale by opening a new issue at
https://github.com/kubeflow-kale/kale/issues
.
2020-09-23 08:53:17 Kale jputils:219          [ERROR]    Received a KaleGracefulExit exception. Exiting...
2020-09-23 08:53:17 Kale jputils:299          [ERROR]    ---------- Failed to run user code ----------

This seems to be specific to the workflow I am trying to run, since the candies-sharing notebook pipeline executes successfully. However, placing all the data the pipeline needs on a separate volume and adding that volume in Kale still gave the same error.