kind: overlay network cannot be applied when host is behind a proxy
Environment
- Host OS: RHEL 7.4
- Host Docker version: 18.09.0
- Host go version: go1.11.2
- Node Image: kindest/node:v1.12.2
kind create cluster
[root@localhost bin]# kind create cluster
Creating cluster 'kind-1' ...
✓ Ensuring node image (kindest/node:v1.12.2) 🖼
✓ [kind-1-control-plane] Creating node container 📦
✓ [kind-1-control-plane] Fixing mounts 🗻
✓ [kind-1-control-plane] Starting systemd 🖥
✓ [kind-1-control-plane] Waiting for docker to be ready 🐋
✗ [kind-1-control-plane] Starting Kubernetes (this may take a minute) ☸
FATA[07:20:43] Failed to create cluster: failed to apply overlay network: exit status 1
The code below, in pkg/cluster/context.go, tries to extract the Kubernetes version with the kubectl version command in order to download the version-specific Weave Net YAML. This code fails when the host is behind a proxy:
// TODO(bentheelder): support other overlay networks
if err = node.Command(
    "/bin/sh", "-c",
    `kubectl apply --kubeconfig=/etc/kubernetes/admin.conf -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version --kubeconfig=/etc/kubernetes/admin.conf | base64 | tr -d '\n')"`,
).Run(); err != nil {
    return kubeadmConfig, errors.Wrap(err, "failed to apply overlay network")
}
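For debugging, the same step can be rerun by hand from inside the node container. A minimal sketch, assuming the node container is named kind-1-control-plane as in the log above:
docker exec -it kind-1-control-plane bash
# inside the node container, repeat what kind runs:
kubectl apply --kubeconfig=/etc/kubernetes/admin.conf \
  -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version --kubeconfig=/etc/kubernetes/admin.conf | base64 | tr -d '\n')"
# the -f URL is fetched by kubectl itself, so it only goes through a proxy
# if proxy settings are visible inside the node container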
Why is the output of the kubectl version command base64 encoded?
About this issue
- State: closed
- Created 6 years ago
- Reactions: 1
- Comments: 53 (29 by maintainers)
Some updates on this. I have the privilege to work with extremely bright people here, and the problem seems to lie in TLS negotiation (although not 1.3): our proxy policy hasn’t been updated in a while, and none of the algorithms proposed by the Go TLS client is supported at the moment…
We’re working with network and security to update this policy, and I will keep you posted if that solves our problem!
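If the suspicion is the proxy's TLS support, a quick check from the host can show what actually gets negotiated through it. A rough sketch; http://127.0.0.1:3129/ is the local cntlm proxy mentioned further down in the thread, so substitute your own proxy address:
# force a request through the proxy and print the negotiated TLS details
curl -v -x http://127.0.0.1:3129/ https://cloud.weave.works/ 2>&1 | grep -i 'SSL connection\|TLS'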
I think this is good now… looking at the logs before sending them, I have noticed that:
And so I decided to give it a try by unsetting all my *_proxy env variables, and suddenly it worked! I can finally enjoy kind on my pro workstation.
Thanks a lot @pablochacin and @BenTheElder !
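For anyone wanting to reproduce that workaround, a minimal sketch (covering the usual lower- and upper-case spellings of the variables):
# clear proxy settings for this shell only, then create the cluster
unset http_proxy https_proxy no_proxy HTTP_PROXY HTTPS_PROXY NO_PROXY
kind create cluster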
so our actual podspec is ~ the contents of the pod_spec field in this prowjob (a few things get added for git checkout, environment variables…):
@BenTheElder Thanks for the hint! After investigating more, I have seen that kubelet was constantly being killed with SIGKILL (9). I have checked dstat --top-oom and it showed that the whole control plane is constantly being killed by the system.
EDIT: Unfortunately, after increasing available resources nothing changed. The control plane keeps getting restarted for no reason. What might be important is that when I am testing kind locally inside a docker container (docker run --privileged -it --rm ... sh) and I run kind create cluster, it works, but when I try to do the same inside a kubernetes cluster while exec’d into a pod, the same kind create cluster fails with the above error.
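A small sketch of the kind of checks mentioned above for confirming OOM kills (dstat needs the dstat package installed; the kernel-log commands are alternatives):
dstat --top-oom                           # live view of the most likely OOM-kill target
dmesg -T | grep -i 'out of memory'        # past OOM-killer activity in the kernel log
journalctl -k | grep -i 'killed process'  # same, via the journal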
I opened issue #270 for implementing this.
Huh. I can’t spot anything relevant in there 🤔 the plot thickens 🙃
I think this week I’ll take a stab at pre-loading the CNI images and using a fixed manifest which should help avoid this sort of issue entirely 🤞
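To illustrate the pre-loading idea, a rough sketch of what baking the CNI images into the node image ahead of time could look like; the Weave image names and the 2.5.0 tag are illustrative assumptions, not necessarily what kind ends up shipping:
# pull the overlay network images on a machine with internet access
docker pull weaveworks/weave-kube:2.5.0
docker pull weaveworks/weave-npc:2.5.0
# save them to a tarball that can be loaded into the node image at build time
docker save -o weave-images.tar weaveworks/weave-kube:2.5.0 weaveworks/weave-npc:2.5.0
# a fixed (pinned) manifest would then be applied instead of the cloud.weave.works URL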
Ah, that’s almost definitely it! kind does nothing special regarding proxies; the rest of the bringup only works because everything else (besides the overlay network config and its images) is pre-packed into the node image and doesn’t need to go out to the internet.
We can either try to get these packed into the image ahead of time (which is probably quite doable, and possibly desirable, but maybe a little tricky), or we can try to make this step respect proxy information on the host machine.
It looks like http_proxy and HTTPS_PROXY are mostly a convention that curl and a few others happen to follow to varying degrees; we’d probably need to also set the docker daemon on the “nodes” to respect this as well.
Both approaches are probably worth doing. I’ll update this issue to track.
I will test on Monday since I don’t have our corporate proxy at home… thanks for the update!
the next release will contain this fix, but in the meantime it can be installed from the current source 😬
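For reference, a sketch of installing kind from the current source at that time (pre-Go-modules style; adjust for your Go setup):
# fetch and build the current development version into $GOPATH/bin
go get -u sigs.k8s.io/kind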
Yes, this works:
However, I have reached out to IT, and it seems our corporate proxy (which requires a local cntlm for AD authentication) uses an old protocol for the man in the middle… and for this reason we cannot upgrade our Docker past 18.06.1-ce. Do you think we could be hitting the same issue here?
@matthyx from the log I see that the proxy has been set to http://127.0.0.1:3129/. This is localhost on the host machine, but inside the kind node container this address is the container’s own loopback (not the host’s loopback). Therefore, you should set your proxy to an address which is reachable from the kind node container.
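A small sketch of that suggestion: point the proxy variables at an address the node container can actually reach before creating the cluster. Here 172.17.0.1 is the default docker0 bridge gateway on Linux and is an assumption; your host’s bridge address and proxy port may differ:
# use the host's docker bridge address instead of 127.0.0.1
export HTTP_PROXY=http://172.17.0.1:3129/
export HTTPS_PROXY=http://172.17.0.1:3129/
kind create cluster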
@floreks hmm, took a quick peek, nothing leapt out 😞 We do run kind extensively inside a docker-in-docker setup (not the standard image, though) for k8s CI. We have seen kubelet continually evicting the API server in a few cases due to low disk / memory, but I didn’t see that in the logs.
@endzyme I suspect some variant on #270 may help. I am also further exploring #200.
I wonder what was fixed.
Upgraded docker from 18.09 to 18.09.1 and the problem went away 🎉.