mlrun: Getting error while running workflow on kubernetes

Hi,

I’m using minikube to run Kubernetes on my local system and am trying to run the workflow defined in demos/sklearn-pipe/sklearn-project.ipynb, but I get the error message below.

Jupyter Cell:

from os import path

artifact_path = path.abspath('./pipe/{{workflow.uid}}')

run_id = skproj.run(
    'main',
    arguments={},
    artifact_path=artifact_path,
    dirty=True)

Error message: MaxRetryError: HTTPConnectionPool(host='ml-pipeline.default.svc.cluster.local', port=8888): Max retries exceeded with url: /apis/v1beta1/experiments (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fea36705a90>: Failed to establish a new connection: [Errno -2] Name or service not known'))

I followed the instructions in this README file: https://github.com/mlrun/mlrun/blob/master/hack/local/README.md

Can anyone help me resolve this error?
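
"Name or service not known" in the traceback above is a hostname lookup failure, and names ending in .svc.cluster.local are normally resolvable only through the cluster DNS, i.e. from inside a pod. A minimal sketch for checking this from wherever the notebook runs (the helper name can_resolve is made up for illustration; the host and port are copied from the error message):

```python
import socket


def can_resolve(host, port):
    """Return True if the hostname resolves from this machine or pod."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        # Same class of failure as "[Errno -2] Name or service not known"
        return False


if __name__ == "__main__":
    host = "ml-pipeline.default.svc.cluster.local"
    print(host, "resolves:", can_resolve(host, 8888))
```

If this reports False from the Jupyter environment, the connection error probably comes from name resolution (for example, the notebook running outside the cluster, or the pipeline service living in a different namespace) rather than from the workflow itself.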

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 26 (12 by maintainers)

Most upvoted comments

For issue 1, try this (if you followed the instructions):

from mlrun.platforms import mount_pvc
fn.apply(mount_pvc("nfsvol", "nfsvol", "/home/joyan/data"))

@narendra36 you should set both DEFAULT_DOCKER_REGISTRY (the registry URL, e.g. https://index.docker.io/v1/) and DEFAULT_DOCKER_SECRET (the name of a k8s secret in the same namespace). Another way is to set the registry per function by calling this method on your function object: fn.build_config(image='target/image:tag', secret='my_docker')

If it is the docker setup, then you might try the following:

  • get an access token for your docker hub account
  • copy it into the mlrun-local.yaml file:
        - name: DEFAULT_DOCKER_REGISTRY
          value: "https://index.docker.io/v1/"
        - name: DEFAULT_DOCKER_SECRET
          value: "<your-access-token>"
  • create a docker secret in your cluster:
kubectl create -n kubeflow secret docker-registry my-docker --docker-server=https://index.docker.io/v1/ --docker-username=<docker user> --docker-password=<docker access token> --docker-email=<email>
  • restart the mlrun-api
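
For reference, a docker-registry secret like the one created above holds a JSON document under the .dockerconfigjson key. Here is a rough Python sketch of how that document is put together, assuming the standard kubernetes.io/dockerconfigjson layout; the function name and the sample values in the usage part are hypothetical:

```python
import base64
import json


def docker_config_json(server, username, password, email):
    """Build the JSON document stored under the .dockerconfigjson key."""
    # The "auth" field is base64("username:password").
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    return json.dumps({
        "auths": {
            server: {
                "username": username,
                "password": password,
                "email": email,
                "auth": auth,
            }
        }
    })


if __name__ == "__main__":
    # Placeholder credentials, for illustration only.
    print(docker_config_json(
        "https://index.docker.io/v1/", "someuser", "sometoken", "me@example.com"))
```

Comparing this against the decoded content of the secret in the cluster is one way to check that the server URL and the access token were stored as intended.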

@narendra36 it looks like the builder pod (the first step) failed to store the image. Can you share the log of the first step (from the Kubeflow Pipelines UI)? I guess it may be related to the build/registry setup (you need to configure the docker registry).