kind: [WSL2] Sync failed errors in kube-proxy for Service with SessionAffinity: ClientIP

What happened: iptables rules fail to update on the nodes after a Service with sessionAffinity: ClientIP is created. As a result, requests to any Service created after the one with session affinity are dropped.

kube-proxy pod is logging the following error:

E0720 14:29:10.934607       1 proxier.go:1507] Failed to execute iptables-restore: exit status 2 (iptables-restore v1.8.3 (legacy): Couldn't load match `recent':No such file or directory

Error occurred at line: 96
Try `iptables-restore -h' or 'iptables-restore --help' for more information.
)
I0720 14:29:10.934636       1 proxier.go:779] Sync failed; retrying in 30s

What you expected to happen: iptables to be updated correctly so that requests could be routed to any Service in the cluster.

How to reproduce it (as minimally and precisely as possible): Create a Service with sessionAffinity: ClientIP

apiVersion: v1
kind: Service
metadata:
  labels:
    alertmanager: main
  name: alertmanager-main
  namespace: monitoring
spec:
  ports:
  - name: web
    port: 9093
    targetPort: web
  selector:
    alertmanager: main
    app: alertmanager
  sessionAffinity: ClientIP
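
After applying the manifest above, the failure can be confirmed by filtering the kube-proxy logs for the messages quoted earlier. This is only a sketch: `find_sync_failures` is a hypothetical helper name, and the `k8s-app=kube-proxy` label in the usage line assumes a kubeadm-style cluster such as the one kind creates.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only the log lines that point at the failed
# iptables sync described in this issue. Reads log text from stdin.
find_sync_failures() {
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -E "Failed to execute iptables-restore|Couldn't load match|Sync failed" || true
}
```

Possible usage (assumes kubectl points at the kind cluster):

    kubectl -n kube-system logs -l k8s-app=kube-proxy --tail=200 | find_sync_failures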

Anything else we need to know?: Issue is reproducible with both kubeProxyMode: iptables (default) and kubeProxyMode: ipvs

Environment:

  • kind version:
    • kind v0.8.1 go1.13.8 linux/amd64
    • kind v0.9.0-alpha+95753c11434213 go1.15beta1 linux/amd64
  • Kubernetes version:
    • Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.5", GitCommit:"e0fccafd69541e3750d460ba0f9743b90336f24f", GitTreeState:"clean", BuildDate:"2020-05-01T02:11:15Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
    • also tried v1.18.2 (default with kind v0.8.1) and v1.18.6 (default with kind v0.9.0-alpha)
  • Docker version: Docker Desktop with WSL2
Client:
 Debug Mode: false

Server:
 Containers: 1
  Running: 1
  Paused: 0
  Stopped: 0
 Images: 1
 Server Version: 19.03.8
 Storage Driver: overlay2
  Backing Filesystem: <unknown>
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.19.104-microsoft-standard
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 24
 Total Memory: 25GiB
 Name: docker-desktop
 ID: D4I2:L4Y5:PGPS:CEUY:H3TU:C33L:HASQ:VZKB:53SE:SHQG:OOQV:BZMQ
 Docker Root Dir: /var/lib/docker
 Debug Mode: true
  File Descriptors: 48
  Goroutines: 57
  System Time: 2020-07-20T15:35:34.5632783Z
  EventsListeners: 3
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
  • OS: Windows 10 (Build: 19041.388)

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 17 (9 by maintainers)

Most upvoted comments

Yes, it looks like the current WSL2 Kernel is built without xt_recent, needed by iptables -m recent ... which kube-proxy uses to implement sessionAffinity: ClientIP. Custom Kernel built with CONFIG_NETFILTER_XT_MATCH_RECENT=y fixed it for me. Submitted https://github.com/microsoft/WSL2-Linux-Kernel/pull/198 (4.19.y) and https://github.com/microsoft/WSL2-Linux-Kernel/pull/199 (5.4.y)
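
To check whether a given kernel already has this option, the kernel config can be inspected. The sketch below uses a hypothetical helper, `check_xt_recent`, and assumes the running kernel exposes its config at /proc/config.gz (this depends on CONFIG_IKCONFIG_PROC); a config file path such as Microsoft/config-wsl can be passed instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a kernel config enables xt_recent.
# With an argument, reads that config file; without one, falls back to
# /proc/config.gz (assumption: the kernel was built with CONFIG_IKCONFIG_PROC).
check_xt_recent() {
  local cfg="${1:-}"
  local line
  if [ -n "$cfg" ]; then
    line=$(grep -E '^CONFIG_NETFILTER_XT_MATCH_RECENT=' "$cfg" || true)
  else
    line=$(zcat /proc/config.gz 2>/dev/null | grep -E '^CONFIG_NETFILTER_XT_MATCH_RECENT=' || true)
  fi
  if [ -n "$line" ]; then
    echo "enabled: $line"
    return 0
  fi
  echo "xt_recent not enabled in this kernel config"
  return 1
}
```

Run `check_xt_recent` inside the WSL2 distro to inspect the running kernel, or `check_xt_recent Microsoft/config-wsl` inside a kernel source checkout.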

I can confirm. The issue is still reproducible. The solution with custom kernel works. I compiled 5.4.72 to check (the version currently used by WSL2).

The solution

  1. Build a kernel with the xt_recent kernel module enabled
    docker run --name wsl-kernel-builder --rm -it ubuntu:latest bash
    
    WSL_COMMIT_REF=linux-msft-5.4.72 # change this line to the version you want to build
    
    # Install dependencies
    apt update
    apt install -y git build-essential flex bison libssl-dev libelf-dev bc
    
    # Checkout WSL2 Kernel repo
    mkdir src
    cd src
    git init
    git remote add origin https://github.com/microsoft/WSL2-Linux-Kernel.git
    git config --local gc.auto 0
    git -c protocol.version=2 fetch --no-tags --prune --progress --no-recurse-submodules --depth=1 origin +${WSL_COMMIT_REF}:refs/remotes/origin/build/linux-msft-wsl-5.4.y
    git checkout --progress --force -B build/linux-msft-wsl-5.4.y refs/remotes/origin/build/linux-msft-wsl-5.4.y
    
    # Enable xt_recent kernel module
    sed -i 's/# CONFIG_NETFILTER_XT_MATCH_RECENT is not set/CONFIG_NETFILTER_XT_MATCH_RECENT=y/' Microsoft/config-wsl
    
    # Compile the kernel 
    make -j2 KCONFIG_CONFIG=Microsoft/config-wsl
    
    # From a host terminal (while the builder container is still running), copy the built kernel
    docker cp wsl-kernel-builder:/src/arch/x86/boot/bzImage .
    
  2. Configure WSL to use the newly built kernel: https://docs.microsoft.com/en-us/windows/wsl/wsl-config#configure-global-options-with-wslconfig
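
As a rough illustration of step 2, a .wslconfig file in the Windows user profile directory might look like the fragment below. The path is a placeholder, not taken from this thread; the linked documentation describes the `kernel` key and notes that backslashes in the path must be escaped.

```ini
[wsl2]
# Placeholder path: point this at wherever the copied bzImage was saved.
kernel=C:\\Users\\<your-user>\\bzImage
```

WSL has to be restarted for the change to take effect, for example with `wsl --shutdown` from a Windows terminal.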

If someone wants to compile the 5.10 LTS kernel for WSL2 with this option enabled, take a look at https://github.com/WSLUser/WSL2-Linux-Kernel/blob/linux-msft-wsl-5.10.y/Microsoft/config-wsl. For the steps to compile your own kernel, follow https://wsl.dev/wsl2-kernel-zfs/.

I haven’t needed SessionAffinity for a while now, so I’m not sure if the issue is resolved. I can try to check when I get some time to do so.

@thavlik Did you run into this issue recently? Is it still reproducible?

If so, it might be worth adding the information about the custom kernel to the WSL2 docs.

@thavlik Rebuilt

@hawk29 - that would be a question to WSL2 maintainers; as far as I can tell it is not included in any recent releases. (And I don’t see any PR merging activity at microsoft/WSL2-Linux-Kernel - so maybe they just don’t accept contributions …)

FWIW, in the tallaxes/WSL2-Linux-Kernel fork I have configured a GitHub Action to build it, so you should be able to get a built kernel image from there without worrying about downloading or running “mystery meat” bits, since the build process is transparent. The kernel image is captured as a build artifact: click on a build run, scroll to Artifacts, and look for bzImage. Then follow the instructions for configuring global options in .wslconfig, setting the kernel key to point to the custom kernel. (Obviously, use at your own risk, #include <disclaimer.h> …)

I don’t recall if this existed then, but https://kind.sigs.k8s.io/docs/user/using-wsl2/ is where we host what we know needs to be done for WSL2. Since the maintainers don’t use WSL2, we can really use any missing bits contributed there: https://kind.sigs.k8s.io/docs/contributing/development/#documentation

thanks!

OP: if your issue is not resolved, please file a new one, I’ve eliminated that bot from this repo, but I think maybe this issue is now stale anyhow 🤔