kubeflow: the kubeflow installation is stuck on the microk8s.enable kubeflow
/kind bug kubeflow can not install on my VM, which is stuck on the microk8s.enable kubeflow, apperently the problem is juju. What steps did you take and what happened: [A clear and concise description of what the bug is.] I followed the steps under the official tutorial, which is https://ubuntu.com/kubeflow/install
What did you expect to happen: so the ErrImagePull happened in many pods, the describe infomation shows that the image can not pull.
Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]
root@Video04:~# microk8s.kubectl get pods --all-namespaces NAMESPACE NAME READY STATUS RESTARTS AGE controller-microk8s-localhost controller-0 2/2 Running 4 83m controller-microk8s-localhost modeloperator-56d4bb6587-stbg5 1/1 Running 0 64m ingress nginx-ingress-microk8s-controller-kw2kh 1/1 Running 0 3h42m kube-system coredns-588fd544bf-pwncg 1/1 Running 0 103m kube-system dashboard-metrics-scraper-59f5574d4-7mvl8 1/1 Running 0 88m kube-system hostpath-provisioner-75fdc8fccd-c2rph 1/1 Running 0 3h42m kube-system kubernetes-dashboard-6d97855997-lw2zg 1/1 Running 0 88m kube-system metrics-server-c65c9d66-fgztk 1/1 Running 0 88m kubeflow ambassador-6dcd6bf4c9-f9dgb 0/1 ErrImagePull 0 56m kubeflow ambassador-operator-0 1/1 Running 0 56m kubeflow argo-controller-operator-0 1/1 Running 0 51m kubeflow argo-ui-6bd579bc67-s5wrp 0/1 ImagePullBackOff 0 45m kubeflow argo-ui-operator-0 1/1 Running 0 45m kubeflow dex-auth-operator-0 1/1 Running 0 39m kubeflow jupyter-controller-f6cbc6889-bzpn5 0/1 ImagePullBackOff 0 33m kubeflow jupyter-controller-operator-0 1/1 Running 0 33m kubeflow jupyter-web-operator-0 1/1 Running 0 26m kubeflow katib-controller-7547bdb7cd-x4252 0/1 PodInitializing 0 17m kubeflow katib-controller-operator-0 1/1 Running 0 19m kubeflow katib-db-manager-operator-0 1/1 Running 0 8m29s kubeflow katib-db-operator-0 1/1 Running 0 14m kubeflow modeloperator-77587bd695-5krjt 1/1 Running 0 62m
and the describe info (I do not show every error pod’s info, but the errors are the same, all of are owned to the registry.jujucharms.com
)
`root@Video04:~# microk8s.kubectl describe -n kubeflow pod/ambassador-6dcd6bf4c9-f9dgb
Name: ambassador-6dcd6bf4c9-f9dgb
Namespace: kubeflow
Priority: 0
Node: video04/192.168.1.25
Start Time: Mon, 29 Jun 2020 14:44:36 +0800
Labels: juju-app=ambassador
pod-template-hash=6dcd6bf4c9
Annotations: apparmor.security.beta.kubernetes.io/pod: runtime/default
juju.io/controller: 376c9ef1-cde7-4caa-8cb3-aeb53691585a
juju.io/model: f9409cec-a0c8-4737-8d69-9610347cb59c
juju.io/unit: ambassador/0
seccomp.security.beta.kubernetes.io/pod: docker/default
Status: Pending
IP: 10.1.97.11
IPs:
IP: 10.1.97.11
Controlled By: ReplicaSet/ambassador-6dcd6bf4c9
Init Containers:
juju-pod-init:
Container ID: containerd://0e36c9782c18286a7f142c35c3093eef4df970537e1a7d2b1424438b1f4de163
Image: jujusolutions/jujud-operator:2.9-beta1.3843
Image ID: docker.io/jujusolutions/jujud-operator@sha256:e6c084c7754c4d906b9dff293a2b0d42df4adf65cf8c2659435c1ac4928dda74
Port: <none>
Host Port: <none>
Command:
/bin/sh
Args:
-c
export JUJU_DATA_DIR=/var/lib/juju
export JUJU_TOOLS_DIR=$JUJU_DATA_DIR/tools
mkdir -p $JUJU_TOOLS_DIR
cp /opt/jujud $JUJU_TOOLS_DIR/jujud
initCmd=$($JUJU_TOOLS_DIR/jujud help commands | grep caas-unit-init)
if test -n "$initCmd"; then
$JUJU_TOOLS_DIR/jujud caas-unit-init --debug --wait;
else
exit 0
fi
State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 29 Jun 2020 14:44:38 +0800
Finished: Mon, 29 Jun 2020 14:44:42 +0800
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/lib/juju from juju-data-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-f6z9l (ro)
Containers:
ambassador:
Container ID:
Image: registry.jujucharms.com/kubeflow-charmers/ambassador/oci-image@sha256:ac028903c79f6913e132522b0e088bb4a1ef312a37d30eb37473cad513434b32
Image ID:
Port: 80/TCP
Host Port: 0/TCP
State: Waiting
Reason: ErrImagePull
Ready: False
Restart Count: 0
Liveness: http-get http://:8877/ambassador/v0/check_alive delay=30s timeout=1s period=30s #success=1 #failure=3
Readiness: http-get http://:8877/ambassador/v0/check_ready delay=30s timeout=1s period=30s #success=1 #failure=3
Environment:
AMBASSADOR_NAMESPACE: kubeflow
Mounts:
/usr/bin/juju-run from juju-data-dir (rw,path=“tools/jujud”)
/var/lib/juju from juju-data-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-f6z9l (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
juju-data-dir:
Type: EmptyDir (a temporary directory that shares a pod’s lifetime)
Medium:
SizeLimit: <unset>
default-token-f6z9l:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-f6z9l
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
Normal Scheduled 46m default-scheduler Successfully assigned kubeflow/ambassador-6dcd6bf4c9-f9dgb to video04 Normal Pulled 46m kubelet, video04 Container image “jujusolutions/jujud-operator:2.9-beta1.3843” already present on machine Normal Created 46m kubelet, video04 Created container juju-pod-init Normal Started 45m kubelet, video04 Started container juju-pod-init Normal BackOff 37m kubelet, video04 Back-off pulling image “registry.jujucharms.com/kubeflow-charmers/ambassador/oci-image@sha256:ac028903c79f6913e132522b0e088bb4a1ef312a37d30eb37473cad513434b32” Warning Failed 37m kubelet, video04 Error: ImagePullBackOff Warning Failed 21m (x2 over 37m) kubelet, video04 Failed to pull image “registry.jujucharms.com/kubeflow-charmers/ambassador/oci-image@sha256:ac028903c79f6913e132522b0e088bb4a1ef312a37d30eb37473cad513434b32”: rpc error: code = Unknown desc = failed to pull and unpack image “registry.jujucharms.com/kubeflow-charmers/ambassador/oci-image@sha256:ac028903c79f6913e132522b0e088bb4a1ef312a37d30eb37473cad513434b32”: failed to copy: unexpected EOF Warning Failed 21m (x2 over 37m) kubelet, video04 Error: ErrImagePull Normal Pulling 21m (x3 over 45m) kubelet, video04 Pulling image “registry.jujucharms.com/kubeflow-charmers/ambassador/oci-image@sha256:ac028903c79f6913e132522b0e088bb4a1ef312a37d30eb37473cad513434b32”`
what is more, I tried to pull the image by docker or microk8s.ctr, which I get are:
**Error response from daemon: pull access denied for registry.jujucharms.com/kubeflow-charmers/ambassador/oci-image, repository does not exist or may require 'docker login': denied: requested access to the resource is denied**
so I wonder whther the source is fine, or I need more account to access, thank you all.
Environment:
- Kubeflow version: (version number can be found at the bottom left corner of the Kubeflow dashboard): microk8s 1.0.2
- kfctl version: (use
kfctl version
): microk8s 1.18.4 - Kubernetes platform: (e.g.
minikube
) microk8s - Kubernetes version: (use
kubectl version
): microk8s 1.18.4 - OS (e.g. from
/etc/os-release
): ubuntu 18.04 LTS
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 4
- Comments: 16 (6 by maintainers)
Did anyone ever find the fix? I am getting this issue.
@rushins: that password is auto-generated as a default option. You can press enter to accept it, or type in your own instead. I’ll update that text to make that more clear
Any solution or workaround to this issue? The same as Failed to pull from registry.jujucharms.com/kubeflow-charmers
@jiaozhentian: I just tried
microk8s.enable kubeflow
, and it worked fine for me. The error you’re getting looks like a transient network error, are you able to try again and see if that fixes it?idiot…… the link has down