kubernetes: HTTP readiness/liveness probes always fail from Windows node against container running on same node

What happened:

httpGet readiness/liveness probes issued from a Windows node against a container running on that same node always seem to fail, causing a never-ending loop of pods being terminated and recreated.

What you expected to happen:

If I configure a readiness/liveness probe for a Windows container in a pod and that container is functioning as expected, I expect the probe to succeed and the pod to continue running.

How to reproduce it (as minimally and precisely as possible):

Add a Windows node to a K8s cluster running in overlay mode
Create a pod with a Windows container that exposes an HTTP endpoint, and give it a liveness/readiness httpGet probe that queries that endpoint
Note that the pod gets terminated due to the failing probe
Update/patch the pod to remove the probe
Grab the pod IP
Query the probe target using the pod IP from a Linux node and confirm it succeeds
Query the probe target using the pod IP from the Windows node and note that it times out
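
For the two manual queries in the last steps, a small script can stand in for the kubelet's httpGet check so the same test runs unchanged on the Linux node and on the Windows node. This is only a sketch, not the kubelet's implementation: the function name `http_probe` and the default timeout are assumptions made for illustration, and it relies on the documented rule that an httpGet probe counts as successful when the response status is at least 200 and below 400.

```python
import urllib.error
import urllib.request


def http_probe(url: str, timeout_seconds: float = 1.0) -> bool:
    """Approximate an httpGet probe: True if the status is in [200, 400).

    Hypothetical helper for manual testing; it is not kubelet code.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            return 200 <= response.status < 400
    except urllib.error.HTTPError:
        # The server answered, but with a 4xx/5xx status: probe failure.
        return False
    except OSError:
        # Connection refused, unreachable network, or timeout. A timeout is
        # the symptom described above when querying from the Windows node.
        return False
```

Calling `http_probe("http://<pod-ip>:<port>/<path>")` with the pod IP grabbed earlier (the address, port, and path here are placeholders, not values from this report) would be expected to return `True` from the Linux node and `False` from the Windows node if the behavior described above is reproduced.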

Anything else we need to know?:

I’ve uploaded the route table for the Windows node just in case that helps. I assume there should be routes configured for the pod network CIDR but I don’t see any?

Environment:

Kubernetes version: v1.15.1
Cloud provider or hardware configuration: Local Hyper-V VMs
OS: Windows Server 2019 1809 (17763)
Network plugin and version: flannel 0.11.0

About this issue

State: closed
Created 5 years ago
Reactions: 3
Comments: 29 (15 by maintainers)

Commits related to this issue

Switch to a prerelease of flanneld. v0.11.0 (the latest release) is now quite old and doesn't create the HostRoute policy for vxlan networks. https://github.com/kubernetes/kubernetes/issues/81938 (committed to benmoss/sig-windows-tools by deleted user 5 years ago)

Most upvoted comments

It is a Windows bug; see https://github.com/Azure/AKS/issues/1014.

zhiweiv on Aug 27, 2019