kind: Rare "failed to init node with kubeadm" on Kind v0.7.0
What happened: Occasionally I see errors like https://prow.istio.io/view/gcs/istio-prow/pr-logs/pull/istio_istio/20244/integ-telemetry-k8s-tests_istio/7089 failing to start up the kind cluster. I have seen similar errors before, but it's very rare; anecdotally, I see roughly one per week across thousands of tests. (Meta comment: is there a good way to grep across all test logs in GCS? I cannot find one. I used to download them locally but ran out of disk space.)
What you expected to happen: ideally no errors
How to reproduce it (as minimally and precisely as possible): I cannot repro
Anything else we need to know?: I don't really expect too much here. As I see errors in the future, I'll add more context to help root-cause this. Right now I am not too concerned, since it is extremely uncommon; I am mostly opening this to track it, or in case it helps surface some other issue.
Environment:
- kind version: (use `kind version`): v0.7.0
- Kubernetes version: (use `kubectl version`): v1.17.0
- Docker version: (use `docker info`): 19.03.2
- OS (e.g. from `/etc/os-release`): linux
About this issue
- State: closed
- Created 4 years ago
- Comments: 15 (15 by maintainers)
hi,
We encountered this issue too and found that the `sync` command in the `entrypoint` script may sometimes run for a long time (under high system load?): https://github.com/kubernetes-sigs/kind/blob/c0a7803bc09961b7a6b84f48fa98fed172812320/images/base/files/usr/local/bin/entrypoint#L28-L31

Maybe we can skip this command when the storage driver is not `aufs`?

Example fail logs: on this machine, `sync` takes more than 5 minutes to finish:
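The suggestion above could be sketched roughly as follows. This is a hypothetical sketch, not the actual kind entrypoint: the `maybe_sync` helper and its message are made up for illustration, and it assumes the storage driver name is obtained elsewhere (e.g. via `docker info -f '{{.Driver}}'`).

```shell
#!/bin/sh
# Hypothetical sketch: only run the global `sync` when the Docker
# storage driver is aufs, where the flush is assumed to matter most.
maybe_sync() {
  driver="$1"   # e.g. the output of: docker info -f '{{.Driver}}'
  if [ "$driver" = "aufs" ]; then
    sync        # flush filesystem buffers; can take minutes under heavy I/O
  else
    echo "skipping sync for storage driver: $driver"
  fi
}

maybe_sync "overlay2"
# prints: skipping sync for storage driver: overlay2
```

Whether skipping `sync` is actually safe for non-aufs drivers would need to be confirmed by the maintainers; the sketch only shows the conditional structure.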