rancher: Rancher 2.4.2 - Windows Cluster using flannel host-gw not working
What kind of request is this (question/bug/enhancement/feature request): Bug
Steps to reproduce (least amount of steps as possible):
- Create a Windows Cluster using flannel / host-gw
Result:
The pod cattle-node-agent-windows continuously crashes:
WARN: Default docker named pipe is not found
WARN: Please bind mount in the docker named pipe to //./pipe/docker_engine if docker errors occur
WARN: example: docker run -v //./pipe/custom_docker_named_pipe://./pipe/docker_engine ...
FATA: https://rancher.xxx.com is not accessible
Other details that may be helpful:
Exec’ing into the pod before it crashes and trying to use curl results in timeouts for all addresses. curl from the kubelet and/or other pods works fine.
Setting up a cluster using flannel / vxlan works perfectly fine.
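For anyone who wants to repeat the reachability comparison described above without curl, here is a minimal sketch of a TCP check that can be run both inside the failing pod and on the host. The function name `can_connect`, the default port 443, and the 5-second timeout are assumptions made for illustration; they are not part of Rancher or the node agent.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    DNS failures, refused connections, and timeouts all count as "not reachable",
    since socket.gaierror and socket.timeout are both subclasses of OSError.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_connect("rancher.xxx.com")` from inside the pod and again from the host should show whether the failure is specific to pod networking, which is what the curl results above suggest.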
Environment information
- Rancher version (rancher/rancher / rancher/server image tag or shown bottom left in the UI): 2.4.2
- Installation option (single install/HA): HA
Cluster information
- Cluster type (Hosted/Infrastructure Provider/Custom/Imported): Custom
- Machine type (cloud/VM/metal) and specifications (CPU/memory): Bare metal, 2 CPU cores, 8 GB RAM
- Kubernetes version (use kubectl version):
(Switched to 1.15.11 to see if it'll work on that version. Have also tried 1.16.9)
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.1", GitCommit:"4485c6f18cee9a5d3c3b4e523bd27972b1b53892", GitTreeState:"clean", BuildDate:"2019-07-18T09:18:22Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"windows/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.11", GitCommit:"d94a81c724ea8e1ccc9002d89b7fe81d58f89ede", GitTreeState:"clean", BuildDate:"2020-03-12T21:00:06Z", GoVersion:"go1.12.17", Compiler:"gc", Platform:"linux/amd64"}
- Docker version (use docker version):
(This is on the windows node)
Client: Docker Engine - Enterprise
Version: 19.03.5
API version: 1.40
Go version: go1.12.12
Git commit: 2ee0c57608
Built: 11/13/2019 08:00:16
OS/Arch: windows/amd64
Experimental: false
Server: Docker Engine - Enterprise
Engine:
Version: 19.03.5
API version: 1.40 (minimum version 1.24)
Go version: go1.12.12
Git commit: 2ee0c57608
Built: 11/13/2019 07:58:51
OS/Arch: windows/amd64
Experimental: false
About this issue
- State: closed
- Created 4 years ago
- Reactions: 2
- Comments: 32
This sounds ridiculous. There is no response as to why this is, and my company is not planning on following this.
@maxisam what I’ve discovered is that it will work if your master nodes are on a different VLAN. Can’t tell you why though.
@maxisam We are close. We have 3 clusters stood up and are planning to do a load test over the next week or two to prove out that Kubernetes and Windows containers can handle our load.
We are running 1809 in all our clusters.
17763.1457 is the cluster that is broken due to the Windows update. 17763.1282 is the cluster that is working.
VMware Tools v11.1.0
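To compare a node's build against the two builds mentioned above, a small helper like the following could be used. This is only a sketch: the function names and category labels are hypothetical, and the only data points are the two builds reported in this comment, so anything in between or on a different release is left undetermined.

```python
from typing import Tuple


def parse_build(build: str) -> Tuple[int, int]:
    """Split a Windows build string such as '17763.1457' into (build, revision)."""
    major, _, revision = build.strip().partition(".")
    return int(major), int(revision or 0)


# The two builds reported in this thread (Windows 1809); not an official list.
REPORTED_WORKING = parse_build("17763.1282")
REPORTED_BROKEN = parse_build("17763.1457")


def classify(build: str) -> str:
    """Place a build relative to the two builds reported in this thread."""
    parsed = parse_build(build)
    if parsed[0] != REPORTED_WORKING[0]:
        return "different_release"  # not a 17763 build; the thread says nothing about it
    if parsed <= REPORTED_WORKING:
        return "at_or_below_reported_working"
    if parsed >= REPORTED_BROKEN:
        return "at_or_above_reported_broken"
    return "between_reports"  # the thread does not say where the break occurs
```

For example, `classify("17763.1457")` returns `"at_or_above_reported_broken"`. The result is a hint for where to look, not a confirmation that a given node is affected.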
The issue is most likely a Windows update. That is what I found. Everything was working until the latest Windows updates were applied. Then the Windows agent started crashing. Uninstalling the Windows update fixed the issue.
Here are two that are applying the same “fix” but end up breaking connectivity…
https://support.microsoft.com/en-us/help/4571748/windows-10-update-kb4571748
https://support.microsoft.com/en-us/help/4570333/windows-10-update-kb4570333
Hi @AMoghrabi
Have you been able to troubleshoot this problem? I’m having a similar problem using Flannel/VXLAN; my setup uses a Windows host on VMware.
Best,
Giang
Get-HnsNetwork | Select-Object -Property Name