rancher: Unable to ping pods over a non-default service record on a k8s 1.21 cluster
What kind of request is this (question/bug/enhancement/feature request): bug
Steps to reproduce (least amount of steps as possible):
TL;DR - pods in one namespace cannot ping each other over a custom DNS record that points to a workload; pings do pass over the records created automatically when you deploy a Deployment. The same test case described below works fine on a custom RKE1 cluster with k8s `v1.20.7-rancher1-1` deployed under the same local cluster.
Reproducer:
- deploy a custom RKE1 cluster with k8s `v1.21.1-rancher1-1` on a single node with all roles, under a local cluster running `rancher/rancher:master-head 7d7dd99` (on RKE1 with k8s v1.20.6)
- on the downstream cluster deploy a `busybox1` deployment in the `default` namespace, enter the `busybox` container image and specify Command: `/bin/sh -c "while true; do echo blah; sleep 1; done"` - do not expose any port (a `kubectl` equivalent of the deployment steps is sketched after this list)
- clone the `busybox1` deployment and name it `busybox2`
- create an `arecord` custom service in the `default` namespace under Default project -> Resources -> Workload -> Service Discovery, select the `Resolves To:` -> `One or more workloads` bullet and point the DNS entry to the `busybox2` deployment/workload
- execute shell in the `busybox1-xxx` pod and perform `ping arecord` - it will return the error `ping: bad address 'arecord'`
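For reference, a minimal `kubectl` sketch of the deployment steps above (assumes kubectl >= 1.19 for the `create deployment -- COMMAND` form; the `arecord` Service Discovery entry itself was created through the UI and is not reproduced here):

```sh
# Create the two busybox deployments from the steps above (no ports exposed)
kubectl -n default create deployment busybox1 --image=busybox \
  -- /bin/sh -c "while true; do echo blah; sleep 1; done"
kubectl -n default create deployment busybox2 --image=busybox \
  -- /bin/sh -c "while true; do echo blah; sleep 1; done"

# Reproduce the failing lookup from inside a busybox1 pod
kubectl -n default exec -it deploy/busybox1 -- ping -c 3 arecord
```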
Result:
```
/ # ping arecord
ping: bad address 'arecord'
/ # ping arecord.default.svc.cluster.local
ping: bad address 'arecord.default.svc.cluster.local'
/ # ping busybox1
PING busybox1 (10.42.3.71): 56 data bytes
64 bytes from 10.42.3.71: seq=0 ttl=64 time=0.052 ms
/ # ping busybox2
PING busybox2 (10.42.3.72): 56 data bytes
64 bytes from 10.42.3.72: seq=0 ttl=64 time=0.026 ms
/ # ping busybox1.default.svc.cluster.local
PING busybox1.default.svc.cluster.local (10.42.3.71): 56 data bytes
64 bytes from 10.42.3.71: seq=0 ttl=63 time=0.054 ms
```
Other details that may be helpful:
- the very same test case works fine on an RKE1 cluster with k8s `v1.20.6`
- the default DNS entries created automatically for deployments are working (see the resolution check sketched below)
- seems to be similar to the v1.21 k3s issue https://github.com/rancher/rancher/issues/32329
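To narrow down whether the record exists as a Service object and whether CoreDNS resolves it, a quick check (standard `kubectl` and busybox tooling; the object name `arecord` comes from the reproducer above):

```sh
# Does the custom record exist as a Service, and does it have endpoints?
kubectl -n default get svc,endpoints arecord -o wide

# Query CoreDNS from inside the busybox1 pod for the custom record...
kubectl -n default exec -it deploy/busybox1 -- nslookup arecord.default.svc.cluster.local
# ...and for an auto-created record that is known to resolve
kubectl -n default exec -it deploy/busybox1 -- nslookup busybox2.default.svc.cluster.local
```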
Environment information
- Rancher version (`rancher/rancher`/`rancher/server` image tag or shown bottom left in the UI): `rancher/rancher:master-head 7d7dd99`
- Installation option (single install/HA): single
Cluster information
- Cluster type (Hosted/Infrastructure Provider/Custom/Imported): custom
- Machine type (cloud/VM/metal) and specifications (CPU/memory): any
- Kubernetes version (use `kubectl version`): `v1.21.1-rancher1-1`
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 21 (16 by maintainers)
Anything created through the Ember UI is validated as working. The part that is currently stuck is that we have automated tests that use `type: dnsRecord`, which the Ember UI stopped using a while ago (https://github.com/rancher/ui/commit/1ed161e60c2282e7e18a058838b58410a205aecc); those tests create Service resources that do not go through the Store we modified to make this work. Currently looking into the dnsRecord functionality. A hedged sketch of the kind of Service such a record produces follows.
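Based on the description above, a workload-targeting record is ultimately backed by a plain Kubernetes Service object. A rough, hypothetical sketch of its shape (the annotation name, its value format, and the headless `clusterIP: None` setup are assumptions here; compare against an object created through the UI before relying on them):

```yaml
# Hypothetical shape of the Service behind a workload-targeting DNS record.
# Annotation name/value format are assumptions; verify against a UI-created object.
apiVersion: v1
kind: Service
metadata:
  name: arecord
  namespace: default
  annotations:
    # assumed: tells a Rancher controller which workload(s) the record resolves to
    field.cattle.io/targetWorkloadIds: '["deployment:default:busybox2"]'
spec:
  clusterIP: None   # assumed headless; endpoints managed by the controller
```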