kubeflow: Spark-operator, spartakus and tensorboard are missing when installing to existing Kubernetes cluster

/kind bug

What steps did you take and what happened:

I’m trying to install Kubeflow on an existing Kubernetes (1.15.12) cluster, following the instructions in the docs:

  1. Download the kfctl v1.1.0 release from the Kubeflow releases page.
  2. Unpack the tarball.
  3. Set the environment variables.
  4. Run:
    mkdir -p ${KF_DIR}
    cd ${KF_DIR}
    kfctl apply -V -f ${CONFIG_URI}

Where CONFIG_URI="https://raw.githubusercontent.com/kubeflow/manifests/v1.1-branch/kfdef/kfctl_k8s_istio.v1.1.0.yaml"

  5. Things start to deploy, and then I get the following error:
INFO[0087] Successfully applied application bootstrap    filename="kustomize/kustomize.go:273"
INFO[0087] Deploying application spark-operator          filename="kustomize/kustomize.go:248"
2020/08/27 09:33:38 absolute path error in '/home/me/kubeflow/foobar/.cache/manifests/manifests-1.1-branch/stacks/kubernetes/application/spark-operator' : evalsymlink failure on '/home/me/kubeflow/foobar/.cache/manifests/manifests-1.1-branch/stacks/kubernetes/application/spark-operator' : lstat /home/me/kubeflow/foobar/.cache/manifests/manifests-1.1-branch/stacks/kubernetes/application/spark-operator: no such file or directory
ERRO[0087] Error evaluating kustomization manifest for spark-operator: accumulating resources: accumulating resources from '../../.cache/manifests/manifests-1.1-branch/stacks/kubernetes/application/spark-operator': open /home/me/kubeflow/foobar/.cache/manifests/manifests-1.1-branch/stacks/kubernetes/application/spark-operator: no such file or directory  filename="kustomize/kustomize.go:155"
Error: failed to apply:  (kubeflow.error): Code 500 with message: kfApp Apply failed for kustomize:  (kubeflow.error): Code 500 with message: error evaluating kustomization manifest for spark-operator: accumulating resources: accumulating resources from '../../.cache/manifests/manifests-1.1-branch/stacks/kubernetes/application/spark-operator': open /home/me/kubeflow/foobar/.cache/manifests/manifests-1.1-branch/stacks/kubernetes/application/spark-operator: no such file or directory
Usage:
  kfctl apply -f ${CONFIG} [flags]

Flags:
      --context string   Optional kubernetes context to use when applying resources. Currently not used by KFDef resources.
  -f, --file string      Static config file to use. Can be either a local path:
                                        export CONFIG=./kfctl_gcp_iap.yaml
                                or a URL:
                                        export CONFIG=https://raw.githubusercontent.com/kubeflow/manifests/v1.0-branch/kfdef/kfctl_gcp_iap.v1.0.0.yaml
                                        export CONFIG=https://raw.githubusercontent.com/kubeflow/manifests/v1.0-branch/kfdef/kfctl_istio_dex.v1.0.0.yaml
                                        export CONFIG=https://raw.githubusercontent.com/kubeflow/manifests/v1.0-branch/kfdef/kfctl_aws.v1.0.0.yaml
                                        export CONFIG=https://raw.githubusercontent.com/kubeflow/manifests/v1.0-branch/kfdef/kfctl_k8s_istio.v1.0.0.yaml
                                kfctl apply -V --file=${CONFIG}
  -h, --help             help for apply
  -V, --verbose          verbose output default is false

kfctl exited with error: failed to apply:  (kubeflow.error): Code 500 with message: kfApp Apply failed for kustomize:  (kubeflow.error): Code 500 with message: error evaluating kustomization manifest for spark-operator: accumulating resources: accumulating resources from '../../.cache/manifests/manifests-1.1-branch/stacks/kubernetes/application/spark-operator': open /home/me/kubeflow/foobar/.cache/manifests/manifests-1.1-branch/stacks/kubernetes/application/spark-operator: no such file or directory

That spark-operator directory indeed does not exist locally. It is not on GitHub either, at https://github.com/kubeflow/manifests/tree/v1.1-branch/stacks/kubernetes/application . I could, however, find it under the ibm stack dir: https://github.com/kubeflow/manifests/tree/v1.1-branch/stacks/ibm/application/spark-operator
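
One way to spot this class of failure before running kfctl is to compare the paths a KfDef file references against a local checkout of the manifests repository. The sketch below is a hypothetical helper (the name list_missing_paths is not part of kfctl) and assumes every application path sits on its own uncommented `path:` line, as in kfctl_k8s_istio.v1.1.0.yaml.

```shell
# Hypothetical helper, not part of kfctl.
# Prints every uncommented "path:" value in a KfDef file that does not
# exist as a directory under the given manifests checkout.
# Usage: list_missing_paths <kfdef.yaml> <manifests-root>
list_missing_paths() {
  kfdef_file=$1
  manifests_root=$2
  # Keep only the value of lines that start with "path:" (commented lines are skipped).
  sed -n 's/^[[:space:]]*path:[[:space:]]*\([^[:space:]#]*\).*$/\1/p' "$kfdef_file" |
    while IFS= read -r rel_path; do
      [ -d "$manifests_root/$rel_path" ] || printf '%s\n' "$rel_path"
    done
}
```

Assuming KF_DIR is the deployment directory from the log above, calling list_missing_paths kfctl_k8s_istio.v1.1.0.yaml "${KF_DIR}/.cache/manifests/manifests-1.1-branch" would be expected to list the spark-operator, spartakus and tensorboard paths reported in this issue.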

What did you expect to happen:

Kubeflow is deployed successfully.

Anything else you would like to add:

After I removed the spark-operator section from kfctl_k8s_istio.v1.1.0.yaml, I got the same error for spartakus and then for tensorboard. Removing all three from that yaml allowed the install process to complete successfully.

Environment:

  • Kubeflow version: dashboard shows: build version dev_local
  • kfctl version: (use kfctl version): kfctl v1.1.0-0-g9a3621e
  • Kubernetes platform: Kubernetes cluster at Scaleway
  • Kubernetes version: (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.6", GitCommit:"dff82dc0de47299ab66c83c626e08b245ab19037", GitTreeState:"clean", BuildDate:"2020-07-16T14:19:25Z", GoVersion:"go1.13.13", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.12", GitCommit:"e2a822d9f3c2fdb5c9bfbe64313cf9f657f0a725", GitTreeState:"clean", BuildDate:"2020-05-06T05:09:48Z", GoVersion:"go1.12.17", Compiler:"gc", Platform:"linux/amd64"}
  • OS (e.g. from /etc/os-release): Ubuntu 20.04.1 LTS

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 8
  • Comments: 22 (8 by maintainers)

Most upvoted comments

Hello there, I just wanted to let you know that I tried to deploy Kubeflow using the kfctl_k8s_istio.v1.1.0.yaml from the master branch, but I hit the same errors for spark-operator, spartakus and tensorboard. Inspecting the file, I found that the repos section references the v1.1-branch branch, so I had to edit it to point to the master branch; after that the deployment went OK. Here is the edit:

  repos:
  - name: manifests
    uri: https://github.com/kubeflow/manifests/archive/master.tar.gz
    #uri: https://github.com/kubeflow/manifests/archive/v1.1-branch.tar.gz
  version: master
  #version: v1.1-branch

Best regards.
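
The edit above can also be applied to a local copy of the KfDef file with sed. This is only a sketch: the function name use_master_manifests is hypothetical, and it assumes the file contains the uri and version lines exactly as shown in the comment. Since master keeps moving, it is a workaround rather than a pinned configuration.

```shell
# Hypothetical helper, not part of kfctl.
# Prints a copy of the KfDef file with the manifests repo switched
# from v1.1-branch to master; the input file itself is left unchanged.
# Usage: use_master_manifests <kfdef.yaml> > <new-kfdef.yaml>
use_master_manifests() {
  sed -e 's#\(manifests/archive/\)v1\.1-branch\(\.tar\.gz\)#\1master\2#' \
      -e 's#^\([[:space:]]*version:[[:space:]]*\)v1\.1-branch[[:space:]]*$#\1master#' \
      "$1"
}
```

The result is written to stdout so the original download stays intact for comparison, for example: use_master_manifests kfctl_k8s_istio.v1.1.0.yaml > kfctl_k8s_istio.master.yaml.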

Hey, yeah, that change needs to be cherry-picked onto the v1.1 branch. I’ll do that right now.

I think for now you can remove spark-operator, tensorboard and spartakus from the stack and re-install. I’ll add instructions for setting them up without the stack/kubernetes in a couple of hours.

@swiftdiaries, any progress on these docs? The install worked perfectly after dropping those components, but they were also the entire reason I was looking to leverage Kubeflow. Any insight would be welcome!

@cliveseldon Thanks! I’ll switch k8s versions.

In that case, we should update https://www.kubeflow.org/docs/started/k8s/kfctl-k8s-istio/ to make the K8s version deps explicit. And https://www.kubeflow.org/docs/started/k8s/overview/#minimum-system-requirements should be updated, as it’s wrong 😄

@swiftdiaries Thanks! and no worries 😄

Thanks for testing it out @crosvera 😃 I’m going to mark this as resolved because the change has been merged into the v1.1 branch in https://github.com/kubeflow/manifests/pull/1540.

Please feel free to re-open if anybody else is facing the same issue.

Thanks @arshashi. I can see that in the master branch the file manifests/kfdef/kfctl_k8s_istio.v1.1.0.yaml was updated yesterday. But the tutorial that I’m following points to the v1.1-branch branch, which was updated 15 days ago.

I initially had the same issue. After I raised the request, @swiftdiaries fixed the issue.

If you download the new v1.1 yaml file, the fix is already included. You don’t even need to comment out the paths. It worked for me, without commenting anything out, once I downloaded the .yaml file again.

Thanks, Shashi

Quoted comment from Carlos Ríos (Tuesday, September 1, 2020, 9:54 PM):

Hello there,

I’m new to Kubeflow and ran into the same problem, but I made the following changes in kfctl_k8s_istio.v1.1.0.yaml so that the deployment installs spark-operator, spartakus and tensorboard:

  - kustomizeConfig:
      repoRef:
        name: manifests
        # path: stacks/kubernetes/application/spark-operator
        path: spark/spark-operator/overlays/application
    name: spark-operator
  - kustomizeConfig:
      repoRef:
        name: manifests
        # path: stacks/kubernetes/application/spartakus
        path: common/spartakus/overlays/application
    name: spartakus
  - kustomizeConfig:
      repoRef:
        name: manifests
        path: tensorboard/overlays/istio
        # path: stacks/kubernetes/application/tensorboard
    name: tensorboard

Then I ran kfctl apply -V -f kfctl_k8s_istio.v1.1.0.yaml and no errors were found 😄. I hope it helps.

Best regards.
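
The three path changes from the quoted comment can be sketched as a single sed pass over a local KfDef copy. The function name remap_stack_paths is hypothetical; the old and new paths are taken from the comment, and the sketch assumes each path sits on its own uncommented `path:` line.

```shell
# Hypothetical helper, not part of kfctl.
# Prints a copy of the KfDef file in which the three missing
# stacks/kubernetes/application paths are replaced by the component
# paths given in the comment. Commented-out path lines are not touched.
# Usage: remap_stack_paths <kfdef.yaml> > <new-kfdef.yaml>
remap_stack_paths() {
  old_prefix='stacks/kubernetes/application'
  sed -e "s#^\([[:space:]]*path:[[:space:]]*\)${old_prefix}/spark-operator[[:space:]]*\$#\1spark/spark-operator/overlays/application#" \
      -e "s#^\([[:space:]]*path:[[:space:]]*\)${old_prefix}/spartakus[[:space:]]*\$#\1common/spartakus/overlays/application#" \
      -e "s#^\([[:space:]]*path:[[:space:]]*\)${old_prefix}/tensorboard[[:space:]]*\$#\1tensorboard/overlays/istio#" \
      "$1"
}
```

Unlike the manual edit, this does not leave the old paths behind as comments; keeping the original file next to the rewritten one serves the same purpose.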


RUN wget -O /workdir/yaml/kfctl_k8s_istio.v1.1.0.yaml https://raw.githubusercontent.com/pachyderm/kfdata/e5ccd9f6aef49c8c1687eee7b51cfb6102b1b4fe/kfctl_k8s_istio.v1.1.0.yaml

Thanks @lukemarsden, I followed this and the installation was successful. What is the impact of not installing those components?

@lukemarsden I believe the above error occurs if you are installing on k8s >=1.18. At present we support >=1.12 <1.18.
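
The range stated in this comment (>=1.12 and <1.18) can be checked against a server version string before installing. This is a minimal sketch: the function name k8s_version_supported is hypothetical, and it assumes a version of the form v1.15.12 such as the GitVersion field printed by kubectl version.

```shell
# Hypothetical helper, not part of kfctl or kubectl.
# Returns 0 when the given Kubernetes version is >=1.12 and <1.18,
# the range quoted in the comment above; returns non-zero otherwise.
# Usage: k8s_version_supported v1.15.12
k8s_version_supported() {
  version=${1#v}               # drop a leading "v"
  major=${version%%.*}         # text before the first dot
  rest=${version#*.}           # text after the first dot
  minor=${rest%%.*}            # text up to the next dot
  minor=${minor%%[!0-9]*}      # drop any non-digit suffix such as "+"
  [ -n "$minor" ] || return 1
  [ "$major" = "1" ] && [ "$minor" -ge 12 ] && [ "$minor" -lt 18 ]
}
```

The server version from the original report, v1.15.12, falls inside this range, while the reporter's v1.18.6 client version would not.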

Exact same error and situation here: Kubernetes version v1.15.12, tried with kfctl versions v1.0.2-0-ga476281 and v1.1.0-0-g9a3621e, OS: Ubuntu 20.04.1 LTS.