katib: ERROR:grpc._server:Exception calling application: Method not implemented!

/kind bug

Hi, I’m having trouble using Katib v1alpha3. First, I installed Katib as follows:

  1. git clone https://github.com/kubeflow/katib
  2. sh katib/scripts/v1alpha3/deploy.sh

Then I tried to apply random-example.yaml (the example in katib/examples/v1alpha3):

    kubectl apply -f random-example.yaml

Results:

kubectl get pods -n kubeflow

    NAME                                     READY   STATUS    RESTARTS   AGE
    katib-controller-6c6974678d-zsnlc        1/1     Running   1          24m
    katib-db-558f649dc6-8cd9t                1/1     Running   0          24m
    katib-manager-5f74bdff84-4d78z           1/1     Running   0          24m
    katib-ui-6568bd6b44-qbq5k                1/1     Running   0          24m
    random-example-random-846dc99654-bxb8j   1/1     Running   0          23m

kubectl get trials -n kubeflow

    NAME                      TYPE      STATUS   AGE
    random-example-drpkvb4b   Running   True     23m
    random-example-k7xv6ktt   Running   True     23m
    random-example-w6jlwdp2   Running   True     23m

kubectl get experiment -n kubeflow -oyaml

    apiVersion: v1
    items:
    - apiVersion: kubeflow.org/v1alpha3
      kind: Experiment
      metadata:
        annotations:
          kubectl.kubernetes.io/last-applied-configuration: |
            {"apiVersion":"kubeflow.org/v1alpha3","kind":"Experiment","metadata":{"annotations":{},"labels":{"controller-tools.k8s.io":"1.0"},"name":"random-example","namespace":"kubeflow"},"spec":{"algorithm":{"algorithmName":"random"},"maxFailedTrialCount":3,"maxTrialCount":12,"objective":{"additionalMetricNames":["accuracy"],"goal":0.99,"objectiveMetricName":"Validation-accuracy","type":"maximize"},"parallelTrialCount":3,"parameters":[{"feasibleSpace":{"max":"0.03","min":"0.01"},"name":"--lr","parameterType":"double"},{"feasibleSpace":{"max":"5","min":"2"},"name":"--num-layers","parameterType":"int"},{"feasibleSpace":{"list":["sgd","adam","ftrl"]},"name":"--optimizer","parameterType":"categorical"}],"trialTemplate":{"goTemplate":{"rawTemplate":"apiVersion: batch/v1\nkind: Job\nmetadata:\n  name: {{.Trial}}\n  namespace: {{.NameSpace}}\nspec:\n  template:\n    spec:\n      containers:\n      - name: {{.Trial}}\n        image: docker.io/kubeflowkatib/mxnet-mnist-example\n        command:\n        - \"python\"\n        - \"/mxnet/example/image-classification/train_mnist.py\"\n        - \"--batch-size=64\"\n        {{- with .HyperParameters}}\n        {{- range .}}\n        - \"{{.Name}}={{.Value}}\"\n        {{- end}}\n        {{- end}}\n      restartPolicy: Never"}}}}
        creationTimestamp: "2019-12-20T07:58:52Z"
        finalizers:
        - update-prometheus-metrics
        generation: 2
        labels:
          controller-tools.k8s.io: "1.0"
        name: random-example
        namespace: kubeflow
        resourceVersion: "11682124"
        selfLink: /apis/kubeflow.org/v1alpha3/namespaces/kubeflow/experiments/random-example
        uid: 9005bab0-22fe-11ea-8cf0-0679676001a5
      spec:
        algorithm:
          algorithmName: random
          algorithmSettings: null
        maxFailedTrialCount: 3
        maxTrialCount: 12
        metricsCollectorSpec:
          collector:
            kind: StdOut
        objective:
          additionalMetricNames:
          - accuracy
          goal: 0.99
          objectiveMetricName: Validation-accuracy
          type: maximize
        parallelTrialCount: 3
        parameters:
        - feasibleSpace:
            max: "0.03"
            min: "0.01"
          name: --lr
          parameterType: double
        - feasibleSpace:
            max: "5"
            min: "2"
          name: --num-layers
          parameterType: int
        - feasibleSpace:
            list:
            - sgd
            - adam
            - ftrl
          name: --optimizer
          parameterType: categorical
        trialTemplate:
          goTemplate:
            rawTemplate: |-
              apiVersion: batch/v1
              kind: Job
              metadata:
                name: {{.Trial}}
                namespace: {{.NameSpace}}
              spec:
                template:
                  spec:
                    containers:
                    - name: {{.Trial}}
                      image: docker.io/kubeflowkatib/mxnet-mnist-example
                      command:
                      - "python"
                      - "/mxnet/example/image-classification/train_mnist.py"
                      - "--batch-size=64"
                      {{- with .HyperParameters}}
                      {{- range .}}
                      - "{{.Name}}={{.Value}}"
                      {{- end}}
                      {{- end}}
                    restartPolicy: Never
      status:
        conditions:
        - lastTransitionTime: "2019-12-20T07:58:52Z"
          lastUpdateTime: "2019-12-20T07:58:52Z"
          message: Experiment is created
          reason: ExperimentCreated
          status: "True"
          type: Created
        - lastTransitionTime: "2019-12-20T08:00:22Z"
          lastUpdateTime: "2019-12-20T08:00:22Z"
          message: Experiment is running
          reason: ExperimentRunning
          status: "True"
          type: Running
        currentOptimalTrial:
          observation:
            metrics: null
          parameterAssignments: null
        startTime: "2019-12-20T07:58:52Z"
        trials: 3
        trialsRunning: 3
    kind: List
    metadata:
      resourceVersion: ""
      selfLink: ""
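For readers scanning the status block, here is a minimal Python sketch of how the conditions can be summarized. The values are copied from the Experiment output in this report; the helper name `current_condition` is hypothetical and is not part of Katib or kubectl.

```python
# Status fields copied from the Experiment output in this report;
# only the fields needed for the summary are kept.
status = {
    "conditions": [
        {
            "type": "Created",
            "status": "True",
            "reason": "ExperimentCreated",
            "lastTransitionTime": "2019-12-20T07:58:52Z",
        },
        {
            "type": "Running",
            "status": "True",
            "reason": "ExperimentRunning",
            "lastTransitionTime": "2019-12-20T08:00:22Z",
        },
    ],
    "trials": 3,
    "trialsRunning": 3,
}


def current_condition(status):
    """Hypothetical helper: return the most recent condition whose
    status is "True", or None if there is none."""
    active = [c for c in status.get("conditions", []) if c.get("status") == "True"]
    if not active:
        return None
    # UTC timestamps in this fixed format sort correctly as plain strings.
    return max(active, key=lambda c: c["lastTransitionTime"])


latest = current_condition(status)
summary = f"{latest['type']}: {status['trialsRunning']}/{status['trials']} trials running"
print(summary)
```

In this report the experiment is still in the Running condition with all three trials running and no optimal trial recorded yet, which is consistent with the trials not progressing.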

kubectl logs -n kubeflow random-example-random-846dc99654-bxb8j

    INFO:hyperopt.utils:Failed to load dill, try installing dill via "pip install dill" for enhanced pickling support.
    INFO:hyperopt.fmin:Failed to load dill, try installing dill via "pip install dill" for enhanced pickling support.
    ERROR:grpc._server:Exception calling application: Method not implemented!
    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/site-packages/grpc/_server.py", line 434, in _call_behavior
        response_or_iterator = behavior(argument, context)
      File "/usr/src/app/github.com/kubeflow/katib/pkg/apis/manager/v1alpha3/python/api_pb2_grpc.py", line 135, in ValidateAlgorithmSettings
        raise NotImplementedError('Method not implemented!')
    NotImplementedError: Method not implemented!
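The traceback shows the error coming from the generated api_pb2_grpc.py stub, not from algorithm code: the base servicer method for ValidateAlgorithmSettings raises NotImplementedError unless the running service overrides it. The sketch below imitates that pattern in plain Python without the grpc package. All class and function names in it are hypothetical stand-ins, and it does not claim to identify the actual cause or fix in this thread. One plausible reading, stated here as an assumption, is that the suggestion image does not implement an RPC that the caller expects.

```python
class SuggestionServicerBase:
    """Hypothetical, simplified stand-in for a generated gRPC servicer base.

    Real generated stubs also set an UNIMPLEMENTED status on the context
    before raising; that part is omitted to keep this sketch dependency-free.
    """

    def ValidateAlgorithmSettings(self, request, context):
        raise NotImplementedError("Method not implemented!")


class ServiceWithoutValidation(SuggestionServicerBase):
    """Hypothetical service that never overrides the RPC, so every call
    falls through to the base class and raises."""


class ServiceWithValidation(SuggestionServicerBase):
    """Hypothetical service that overrides the RPC and returns a reply."""

    def ValidateAlgorithmSettings(self, request, context):
        return {"valid": True}


def call_rpc(servicer, request=None, context=None):
    """Imitate how a server logs an exception raised by a handler."""
    try:
        return servicer.ValidateAlgorithmSettings(request, context)
    except NotImplementedError as exc:
        return f"ERROR: Exception calling application: {exc}"


print(call_rpc(ServiceWithoutValidation()))
print(call_rpc(ServiceWithValidation()))
```

If this reading applies, comparing the image tags of the controller and the suggestion deployment would be a reasonable first check, since a caller and a service built from different revisions can disagree on which RPCs exist.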

What can I do to fix it? Thank you for your help in solving this problem.

  • Kubernetes version: (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5+icp", GitCommit:"903c3b31caddc675ce2d8bddf62aa0f875c2a3bc", GitTreeState:"clean", BuildDate:"2019-05-08T06:16:32Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5+icp", GitCommit:"903c3b31caddc675ce2d8bddf62aa0f875c2a3bc", GitTreeState:"clean", BuildDate:"2019-05-08T06:16:32Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}

  • OS (e.g. from /etc/os-release): CentOS Linux release 7.7.1908 (Core)

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 1
  • Comments: 38 (17 by maintainers)

Most upvoted comments

Bingo! That was it. Thank you @andreyvelich .

We decided to roll back to v1alpha2 at this point. Thanks for the help, though! @nrchakradhar @andreyvelich