katib: Failed to launch Katib experiment - 404 page not found

/kind bug

What steps did you take and what happened: I set up Kubeflow and installed Katib manually, as described in https://github.com/kubeflow/katib/issues/1415. I then started a Katib experiment with Kale from a Jupyter notebook. The experiment was created and the pipeline was uploaded, but the Katib experiment was not launched.

Type: RPC

Method: katib.create_katib_experiment()

Code: 6 (UnhandledError)

Transaction ID: ylpewg72bh

Message: Failed to launch Katib experiment

Details: (404)
Reason: Not Found
HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'text/plain; charset=utf-8', 'X-Content-Type-Options': 'nosniff', 'Date': 'Mon, 08 Mar 2021 12:14:43 GMT', 'Content-Length': '19'})
HTTP response body: 404 page not found

kale.log:

2021-03-08 12:14:42 run:83 [[DEBUG]] [TID=axkqxleth9] [] Decoding ctx of RPC function 'kfp.create_experiment'
2021-03-08 12:14:42 run:95 [[DEBUG]] [TID=axkqxleth9] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Decoding kwargs of RPC function 'kfp.create_experiment'
2021-03-08 12:14:42 run:104 [[DEBUG]] [TID=axkqxleth9] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Importing RPC function 'kfp.create_experiment'
2021-03-08 12:14:42 run:114 [[INFO]] [TID=axkqxleth9] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Executing RPC function 'create_experiment(experiment_name=test-v1ef9)'
2021-03-08 12:14:43 _client:352 [[INFO]] Creating experiment test-v1ef9.
2021-03-08 12:14:43 run:83 [[DEBUG]] [TID=ylpewg72bh] [] Decoding ctx of RPC function 'katib.create_katib_experiment'
2021-03-08 12:14:43 run:95 [[DEBUG]] [TID=ylpewg72bh] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Decoding kwargs of RPC function 'katib.create_katib_experiment'
2021-03-08 12:14:43 run:104 [[DEBUG]] [TID=ylpewg72bh] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Importing RPC function 'katib.create_katib_experiment'
2021-03-08 12:14:43 run:114 [[INFO]] [TID=ylpewg72bh] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Executing RPC function 'create_katib_experiment(pipeline_id=832dfc28-61be-4fb5-af12-7877778b26ef, pipeline_metadata={'autosnapshot': True, 'docker_image': 'jupyter-kale:latest', 'experiment': {'id': '7f611f1b-bf8e-4709-80ef-c55d6644931c', 'name': 'test'}, 'experiment_name': 'test-v1ef9', 'katib_metadata': {'parameters': [{'feasibleSpace': {'max': '2000', 'min': '100', 'step': '100'}, 'name': 'N_ESTIMATORS', 'parameterType': 'int'}, {'feasibleSpace': {'list': ['10', '20', '30', '40', '50', '100']}, 'name': 'MAX_DEPTH', 'parameterType': 'categorical'}, {'feasibleSpace': {'max': '4', 'min': '1', 'step': '1'}, 'name': 'MIN_SAMPLES_LEAF', 'parameterType': 'int'}, {'feasibleSpace': {'list': ['2', '5', '10']}, 'name': 'MIN_SAMPLES_SPLIT', 'parameterType': 'categorical'}], 'objective': {'additionalMetricNames': [], 'goal': 0.85, 'objectiveMetricName': 'random-forest-accuracy', 'type': 'maximize'}, 'algorithm': {'algorithmName': 'random', 'algorithmSettings': [{'name': 'random_state', 'value': '10'}, {'name': 'acq_optimizer', 'value': 'auto'}, {'name': 'acq_func', 'value': 'gp_hedge'}, {'name': 'base_estimator', 'value': 'GP'}]}, 'maxTrialCount': 10, 'maxFailedTrialCount': 3, 'parallelTrialCount': 5}, 'katib_run': True, 'pipeline_description': 'Fine tune a RF classifier on the Titanic dataset', 'pipeline_name': 'titanic-hp-tuning', 'snapshot_volumes': True, 'steps_defaults': [], 'volumes': []}, output_path=/home/jovyan/medium/minikf)'
2021-03-08 12:14:43 katib:181 [[INFO]] [TID=ylpewg72bh] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Saving Katib experiment definition at /home/jovyan/medium/minikf/test-v1ef9.katib.yaml
2021-03-08 12:14:43 katib:91 [[DEBUG]] [TID=ylpewg72bh] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Launching Katib Experiment 'test-v1ef9'...
2021-03-08 12:14:43 katib:97 [[ERROR]] [TID=ylpewg72bh] [/home/jovyan/medium/minikf/titanic-katib.ipynb] Failed to launch Katib experiment
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/kale/rpc/katib.py", line 95, in _launch_katib_experiment
    katib_experiment)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/apis/custom_objects_api.py", line 178, in create_namespaced_custom_object
    (data) = self.create_namespaced_custom_object_with_http_info(group, version, namespace, plural, body, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/apis/custom_objects_api.py", line 277, in create_namespaced_custom_object_with_http_info
    collection_formats=collection_formats)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/api_client.py", line 334, in call_api
    _return_http_data_only, collection_formats, _preload_content, _request_timeout)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/api_client.py", line 168, in __call_api
    _request_timeout=_request_timeout)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/api_client.py", line 377, in request
    body=body)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/rest.py", line 265, in POST
    body=body)
  File "/usr/local/lib/python3.6/dist-packages/kubernetes/client/rest.py", line 221, in request
    raise ApiException(http_resp=r)
kubernetes.client.rest.ApiException: (404)
Reason: Not Found
HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'text/plain; charset=utf-8', 'X-Content-Type-Options': 'nosniff', 'Date': 'Mon, 08 Mar 2021 12:14:43 GMT', 'Content-Length': '19'})
HTTP response body: 404 page not found

What did you expect to happen: The Katib experiment is launched.

Anything else you would like to add: Can I figure out which DNS address is being requested?
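On the question of which address is requested: the traceback goes through `create_namespaced_custom_object`, which in the Kubernetes Python client POSTs to the API server host from the client configuration, followed by the path `/apis/{group}/{version}/namespaces/{namespace}/{plural}`. The sketch below only rebuilds that URL so it can be compared with what the cluster serves. The host, namespace, and the two candidate versions are assumptions for illustration; the values Kale actually passes are not shown in the log above.

```python
def custom_object_url(host, group, version, namespace, plural):
    """Rebuild the URL that create_namespaced_custom_object POSTs to.

    The path template matches the one used by the Kubernetes Python
    client for namespaced custom objects.
    """
    path = f"/apis/{group}/{version}/namespaces/{namespace}/{plural}"
    return host.rstrip("/") + path


if __name__ == "__main__":
    # Hypothetical values: replace with the host from your client
    # configuration and the namespace your notebook runs in.
    host = "https://kubernetes.default.svc"
    namespace = "kubeflow-user"

    # Assumed candidates: the caller and the installed Katib may
    # disagree on the API version of the Experiment resource.
    for version in ("v1alpha3", "v1beta1"):
        print(custom_object_url(host, "kubeflow.org", version,
                                namespace, "experiments"))
```

If the cluster does not serve the group/version in the printed URL, a 404 from the API server would be consistent with the error above. Comparing the `apiVersion` in the saved `test-v1ef9.katib.yaml` with the versions listed by `kubectl api-versions` is one way to check this.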

Environment:

  • Kubeflow version: kfctl v1.2.0-0-gbc038f9
  • On-premise Kubernetes cluster
  • Kubernetes version: v1.17.0
  • OS: Ubuntu 18.04.5 LTS

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 16 (7 by maintainers)

Most upvoted comments

@Siddarth-Pattnaik I think you should update Kubeflow to at least version 1.1 to use the Katib SDK. Kubeflow 1.0.1 shipped Katib v1alpha3, which the SDK doesn't support.