edgemesh: "Failed to ensure portal" running kind cluster

What happened: I’m trying to run edgemesh on a kind cluster. I have one control-plane node and two workers. I want default-worker to act as a relay server.

$ kubectl get nodes -o wide        
NAME                    STATUS   ROLES                  AGE    VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE       KERNEL-VERSION    CONTAINER-RUNTIME
default-control-plane   Ready    control-plane,master   3d9h   v1.22.9   10.5.0.2      <none>        Ubuntu 21.10   5.17.0-1016-oem   containerd://1.6.4
default-worker          Ready    <none>                 3d9h   v1.22.9   10.5.0.4      <none>        Ubuntu 21.10   5.17.0-1016-oem   containerd://1.6.4
default-worker2         Ready    <none>                 3d9h   v1.22.9   10.5.0.3      <none>        Ubuntu 21.10   5.17.0-1016-oem   containerd://1.6.4

This is my values.yaml:

agent:
  image: kubeedge/edgemesh-agent:latest
  affinity: {}
  nodeSelector: {}
  tolerations: {}
  resources:
    limits:
      cpu: 1
      memory: 256Mi
    requests:
      cpu: 0.5
      memory: 128Mi
  psk: dAc+kaXv1dLeDNB4JR79LwBQCwvBx6k6t5UtinL6OiU=
  relayNodes:
    - nodeName: default-worker
      advertiseAddress:
        - 10.5.0.4
  modules:
    edgeProxy:
      enable: true
    edgeTunnel:
      enable: true

When I deploy the Helm chart using my values.yaml, I get one edgemesh-agent pod on each of my two worker nodes:

$ kubectl get pods -o wide
NAME                                READY   STATUS    RESTARTS      AGE     IP            NODE                    NOMINATED NODE   READINESS GATES
cloudcore-55f44b557f-zsf9f          2/2     Running   2 (14m ago)   3d9h    10.5.0.4      default-worker          <none>           <none>
edgemesh-agent-2fsjb                1/1     Running   0             6s      10.5.0.3      default-worker2         <none>           <none>
edgemesh-agent-jgwk9                1/1     Running   0             6s      10.5.0.4      default-worker          <none>           <none>
iptables-manager-fwbvb              1/1     Running   0             3d9h    10.5.0.2      default-control-plane   <none>           <none>

However, the logs show several errors:

$ kubectl logs edgemesh-agent-2fsjb    
I1010 06:38:34.199247       1 server.go:55] Version: v1.12.0-dirty
I1010 06:38:34.199286       1 server.go:89] [1] Prepare agent to run
I1010 06:38:34.199419       1 netif.go:96] bridge device edgemesh0 already exists
I1010 06:38:34.199473       1 server.go:93] edgemesh-agent running on CloudMode
I1010 06:38:34.199481       1 server.go:96] [2] New clients
W1010 06:38:34.199492       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I1010 06:38:34.199903       1 server.go:103] [3] Register beehive modules
W1010 06:38:34.199915       1 module.go:37] Module EdgeDNS is disabled, do not register
I1010 06:38:34.200252       1 server.go:66] Using userspace Proxier.
I1010 06:38:34.292646       1 module.go:34] Module EdgeProxy registered successfully
I1010 06:38:34.362580       1 module.go:159] I'm {12D3KooWNLAPNyViyXgHyAoTtvPc9D8fW3R5RJCAF8iiwpALCUQY: [/ip4/10.5.0.3/tcp/20006 /ip4/127.0.0.1/tcp/20006]}
I1010 06:38:34.362659       1 module.go:181] Bootstrapping the DHT
I1010 06:38:34.362689       1 tunnel.go:387] [Bootstrap] bootstrapping to 12D3KooWP6xc3WDcvWNT51M7vMQKvi4wtspKzMT2tEhgpmi7XjSw
E1010 06:38:34.363161       1 tunnel.go:391] [Bootstrap] failed to bootstrap with {12D3KooWP6xc3WDcvWNT51M7vMQKvi4wtspKzMT2tEhgpmi7XjSw: [/ip4/10.5.0.4/tcp/20006]}: failed to dial 12D3KooWP6xc3WDcvWNT51M7vMQKvi4wtspKzMT2tEhgpmi7XjSw:
  * [/ip4/10.5.0.4/tcp/20006] dial tcp4 10.5.0.4:20006: connect: connection refused
E1010 06:38:34.363282       1 tunnel.go:402] [Bootstrap] Not all bootstrapDail connected, continue bootstrapDail...
I1010 06:38:44.364467       1 tunnel.go:387] [Bootstrap] bootstrapping to 12D3KooWP6xc3WDcvWNT51M7vMQKvi4wtspKzMT2tEhgpmi7XjSw
I1010 06:38:44.372216       1 tunnel.go:397] [Bootstrap] success bootstrapped with {12D3KooWP6xc3WDcvWNT51M7vMQKvi4wtspKzMT2tEhgpmi7XjSw: [/ip4/10.5.0.4/tcp/20006]}
I1010 06:38:44.373418       1 tunnel.go:63] Starting MDNS discovery service
I1010 06:38:44.373442       1 tunnel.go:76] Starting DHT discovery service
I1010 06:38:44.373510       1 module.go:34] Module EdgeTunnel registered successfully
I1010 06:38:44.373524       1 server.go:109] [4] Start all modules
I1010 06:38:44.373601       1 tunnel.go:447] Starting relay finder
I1010 06:38:44.373624       1 core.go:24] Starting module EdgeProxy
I1010 06:38:44.373692       1 core.go:24] Starting module EdgeTunnel
I1010 06:38:44.373950       1 config.go:317] "Starting service config controller"
I1010 06:38:44.374019       1 shared_informer.go:240] Waiting for caches to sync for service config
I1010 06:38:44.373959       1 config.go:135] "Starting endpoints config controller"
I1010 06:38:44.375185       1 shared_informer.go:240] Waiting for caches to sync for endpoints config
I1010 06:38:44.375746       1 loadbalancer.go:239] "Starting loadBalancer destinationRule controller"
I1010 06:38:44.376100       1 shared_informer.go:240] Waiting for caches to sync for loadBalancer destinationRule
I1010 06:38:44.382517       1 tunnel.go:175] Discovery service got a new stream from {12D3KooWP6xc3WDcvWNT51M7vMQKvi4wtspKzMT2tEhgpmi7XjSw: [/ip4/10.5.0.4/tcp/20006]}
I1010 06:38:44.382894       1 tunnel.go:204] [MDNS] Discovery from default-worker : {12D3KooWP6xc3WDcvWNT51M7vMQKvi4wtspKzMT2tEhgpmi7XjSw: [/ip4/10.5.0.4/tcp/20006]}
I1010 06:38:44.383297       1 tunnel.go:118] [MDNS] Discovery found peer: {12D3KooWP6xc3WDcvWNT51M7vMQKvi4wtspKzMT2tEhgpmi7XjSw: [/ip4/10.5.0.4/tcp/20006 /ip4/127.0.0.1/tcp/20006]}
I1010 06:38:44.383478       1 tunnel.go:130] [MDNS] New stream between peer {12D3KooWP6xc3WDcvWNT51M7vMQKvi4wtspKzMT2tEhgpmi7XjSw: [/ip4/127.0.0.1/tcp/20006 /ip4/10.244.1.1/tcp/20006 /ip4/10.244.1.1/tcp/20006 /ip4/10.244.1.1/tcp/20006 /ip4/10.244.1.1/tcp/20006 /ip4/10.244.1.1/tcp/20006 /ip4/10.244.1.1/tcp/20006 /ip4/10.244.1.1/tcp/20006 /ip4/10.244.1.1/tcp/20006 /ip4/169.254.96.16/tcp/20006 /ip4/10.5.0.4/tcp/20006]} success
I1010 06:38:44.455824       1 tunnel.go:166] [MDNS] Discovery to default-worker : {12D3KooWP6xc3WDcvWNT51M7vMQKvi4wtspKzMT2tEhgpmi7XjSw: [/ip4/127.0.0.1/tcp/20006 /ip4/10.244.1.1/tcp/20006 /ip4/10.244.1.1/tcp/20006 /ip4/10.244.1.1/tcp/20006 /ip4/10.244.1.1/tcp/20006 /ip4/10.244.1.1/tcp/20006 /ip4/10.244.1.1/tcp/20006 /ip4/10.244.1.1/tcp/20006 /ip4/10.244.1.1/tcp/20006 /ip4/169.254.96.16/tcp/20006 /ip4/10.5.0.4/tcp/20006]}
I1010 06:38:44.475928       1 shared_informer.go:247] Caches are synced for endpoints config 
I1010 06:38:44.475991       1 shared_informer.go:247] Caches are synced for service config 
I1010 06:38:44.476404       1 shared_informer.go:247] Caches are synced for loadBalancer destinationRule 
E1010 06:38:44.668305       1 proxier.go:552] "Failed to open portal" err="can't open node port for <nil>:30972/TCP: listen tcp :30972: bind: address already in use" serviceName="ingress-nginx/nginx-ingress-nginx-ingress:http"
E1010 06:38:44.688464       1 proxier.go:552] "Failed to open portal" err="can't open node port for <nil>:31163/TCP: listen tcp :31163: bind: address already in use" serviceName="ingress-nginx/nginx-ingress-nginx-ingress:https"
E1010 06:38:44.786637       1 proxier.go:552] "Failed to open portal" err="can't open node port for <nil>:30615/TCP: listen tcp :30615: bind: address already in use" serviceName="metallb-system/nginx"
E1010 06:38:44.824920       1 proxier.go:552] "Failed to open portal" err="can't open node port for <nil>:30550/TCP: listen tcp :30550: bind: address already in use" serviceName="vault/vault:vault"
E1010 06:38:44.845249       1 proxier.go:552] "Failed to open portal" err="can't open node port for <nil>:30738/TCP: listen tcp :30738: bind: address already in use" serviceName="vault/vault:vault-cluster"
E1010 06:38:44.889235       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30550/TCP: listen tcp :30550: bind: address already in use" servicePortName="vault/vault:vault"
E1010 06:38:44.964074       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30738/TCP: listen tcp :30738: bind: address already in use" servicePortName="vault/vault:vault-cluster"
E1010 06:38:45.029171       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30972/TCP: listen tcp :30972: bind: address already in use" servicePortName="ingress-nginx/nginx-ingress-nginx-ingress:http"
E1010 06:38:45.044993       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:31163/TCP: listen tcp :31163: bind: address already in use" servicePortName="ingress-nginx/nginx-ingress-nginx-ingress:https"
E1010 06:38:45.077913       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30615/TCP: listen tcp :30615: bind: address already in use" servicePortName="metallb-system/nginx"
E1010 06:39:14.443733       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30738/TCP: listen tcp :30738: bind: address already in use" servicePortName="vault/vault:vault-cluster"
E1010 06:39:14.476872       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30972/TCP: listen tcp :30972: bind: address already in use" servicePortName="ingress-nginx/nginx-ingress-nginx-ingress:http"
E1010 06:39:14.490054       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:31163/TCP: listen tcp :31163: bind: address already in use" servicePortName="ingress-nginx/nginx-ingress-nginx-ingress:https"
E1010 06:39:14.504344       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30615/TCP: listen tcp :30615: bind: address already in use" servicePortName="metallb-system/nginx"
E1010 06:39:14.537277       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30550/TCP: listen tcp :30550: bind: address already in use" servicePortName="vault/vault:vault
...
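
For context, "bind: address already in use" is what the operating system returns when a process tries to listen on a TCP port that another listener already holds. The following is a minimal Python sketch of that mechanism only; it is not edgemesh code, the helper name try_listen is hypothetical, and it binds on loopback with an OS-chosen port rather than the node ports from the log. Which process holds the node ports in this cluster (for example another host-network proxy) is an assumption the logs do not confirm.

```python
import errno
import socket


def try_listen(port, host="127.0.0.1"):
    """Open a listening TCP socket on host:port, or raise OSError."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen()
    except OSError:
        sock.close()
        raise
    return sock


# Port 0 asks the OS for any free port, so the demo does not collide
# with services that are really running on the machine.
first = try_listen(0)
port = first.getsockname()[1]

try:
    # A second listener on the same address and port is rejected.
    second = try_listen(port)
    second.close()
    result = "bound twice"
except OSError as exc:
    result = errno.errorcode.get(exc.errno, str(exc.errno))
finally:
    first.close()

print(f"second bind on port {port}: {result}")
```

The second bind fails because the first socket is still listening, which matches the shape of the proxier errors: the agent asks for a node port that some other listener on the host already owns.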

And the other pod:

$ kubectl  logs edgemesh-agent-jgwk9 
I1010 06:38:34.175432       1 server.go:55] Version: v1.12.0-dirty
I1010 06:38:34.175470       1 server.go:89] [1] Prepare agent to run
I1010 06:38:34.175594       1 netif.go:96] bridge device edgemesh0 already exists
I1010 06:38:34.175632       1 server.go:93] edgemesh-agent running on CloudMode
I1010 06:38:34.175645       1 server.go:96] [2] New clients
W1010 06:38:34.175656       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I1010 06:38:34.176073       1 server.go:103] [3] Register beehive modules
W1010 06:38:34.176085       1 module.go:37] Module EdgeDNS is disabled, do not register
I1010 06:38:34.176347       1 server.go:66] Using userspace Proxier.
I1010 06:38:34.360353       1 module.go:34] Module EdgeProxy registered successfully
I1010 06:38:34.364760       1 module.go:159] I'm {12D3KooWP6xc3WDcvWNT51M7vMQKvi4wtspKzMT2tEhgpmi7XjSw: [/ip4/10.5.0.4/tcp/20006 /ip4/127.0.0.1/tcp/20006 /ip4/10.5.0.4/tcp/20006]}
I1010 06:38:34.364852       1 module.go:168] Run as a relay node
I1010 06:38:34.364937       1 module.go:181] Bootstrapping the DHT
I1010 06:38:34.366393       1 tunnel.go:63] Starting MDNS discovery service
I1010 06:38:34.366424       1 tunnel.go:76] Starting DHT discovery service
I1010 06:38:34.366470       1 module.go:34] Module EdgeTunnel registered successfully
I1010 06:38:34.366488       1 server.go:109] [4] Start all modules
I1010 06:38:34.366559       1 tunnel.go:447] Starting relay finder
I1010 06:38:34.366582       1 core.go:24] Starting module EdgeProxy
I1010 06:38:34.366621       1 core.go:24] Starting module EdgeTunnel
I1010 06:38:34.366980       1 config.go:135] "Starting endpoints config controller"
I1010 06:38:34.367093       1 shared_informer.go:240] Waiting for caches to sync for endpoints config
I1010 06:38:34.366984       1 config.go:317] "Starting service config controller"
I1010 06:38:34.367164       1 shared_informer.go:240] Waiting for caches to sync for service config
I1010 06:38:34.367200       1 loadbalancer.go:239] "Starting loadBalancer destinationRule controller"
I1010 06:38:34.367225       1 shared_informer.go:240] Waiting for caches to sync for loadBalancer destinationRule
I1010 06:38:34.468202       1 shared_informer.go:247] Caches are synced for loadBalancer destinationRule 
I1010 06:38:34.468234       1 shared_informer.go:247] Caches are synced for service config 
I1010 06:38:34.468246       1 shared_informer.go:247] Caches are synced for endpoints config 
E1010 06:38:34.599795       1 proxier.go:552] "Failed to open portal" err="can't open node port for <nil>:30972/TCP: listen tcp :30972: bind: address already in use" serviceName="ingress-nginx/nginx-ingress-nginx-ingress:http"
E1010 06:38:34.678098       1 proxier.go:552] "Failed to open portal" err="can't open node port for <nil>:31163/TCP: listen tcp :31163: bind: address already in use" serviceName="ingress-nginx/nginx-ingress-nginx-ingress:https"
E1010 06:38:34.885149       1 proxier.go:552] "Failed to open portal" err="can't open node port for <nil>:30615/TCP: listen tcp :30615: bind: address already in use" serviceName="metallb-system/nginx"
E1010 06:38:35.004706       1 proxier.go:552] "Failed to open portal" err="can't open node port for <nil>:30550/TCP: listen tcp :30550: bind: address already in use" serviceName="vault/vault:vault"
E1010 06:38:35.027619       1 proxier.go:552] "Failed to open portal" err="can't open node port for <nil>:30738/TCP: listen tcp :30738: bind: address already in use" serviceName="vault/vault:vault-cluster"
E1010 06:38:35.062077       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30738/TCP: listen tcp :30738: bind: address already in use" servicePortName="vault/vault:vault-cluster"
E1010 06:38:35.074164       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:31163/TCP: listen tcp :31163: bind: address already in use" servicePortName="ingress-nginx/nginx-ingress-nginx-ingress:https"
E1010 06:38:35.090126       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30615/TCP: listen tcp :30615: bind: address already in use" servicePortName="metallb-system/nginx"
E1010 06:38:35.113323       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30972/TCP: listen tcp :30972: bind: address already in use" servicePortName="ingress-nginx/nginx-ingress-nginx-ingress:http"
E1010 06:38:35.141939       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30550/TCP: listen tcp :30550: bind: address already in use" servicePortName="vault/vault:vault"
E1010 06:38:35.214655       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30738/TCP: listen tcp :30738: bind: address already in use" servicePortName="vault/vault:vault-cluster"
E1010 06:38:35.227935       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:31163/TCP: listen tcp :31163: bind: address already in use" servicePortName="ingress-nginx/nginx-ingress-nginx-ingress:https"
E1010 06:38:35.245275       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30615/TCP: listen tcp :30615: bind: address already in use" servicePortName="metallb-system/nginx"
E1010 06:38:35.272512       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30972/TCP: listen tcp :30972: bind: address already in use" servicePortName="ingress-nginx/nginx-ingress-nginx-ingress:http"
E1010 06:38:35.297487       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30550/TCP: listen tcp :30550: bind: address already in use" servicePortName="vault/vault:vault"
I1010 06:38:44.381790       1 tunnel.go:118] [MDNS] Discovery found peer: {12D3KooWNLAPNyViyXgHyAoTtvPc9D8fW3R5RJCAF8iiwpALCUQY: [/ip4/127.0.0.1/tcp/20006 /ip4/10.5.0.3/tcp/20006]}
I1010 06:38:44.381987       1 tunnel.go:130] [MDNS] New stream between peer {12D3KooWNLAPNyViyXgHyAoTtvPc9D8fW3R5RJCAF8iiwpALCUQY: [/ip4/127.0.0.1/tcp/20006 /ip4/10.244.2.1/tcp/20006 /ip4/10.244.2.1/tcp/20006 /ip4/10.244.2.1/tcp/20006 /ip4/10.244.2.1/tcp/20006 /ip4/10.244.2.1/tcp/20006 /ip4/10.244.2.1/tcp/20006 /ip4/10.244.2.1/tcp/20006 /ip4/169.254.96.16/tcp/20006 /ip4/10.244.2.1/tcp/20006 /ip4/10.5.0.3/tcp/20006]} success
I1010 06:38:44.383678       1 tunnel.go:166] [MDNS] Discovery to default-worker2 : {12D3KooWNLAPNyViyXgHyAoTtvPc9D8fW3R5RJCAF8iiwpALCUQY: [/ip4/127.0.0.1/tcp/20006 /ip4/10.244.2.1/tcp/20006 /ip4/10.244.2.1/tcp/20006 /ip4/10.244.2.1/tcp/20006 /ip4/10.244.2.1/tcp/20006 /ip4/10.244.2.1/tcp/20006 /ip4/10.244.2.1/tcp/20006 /ip4/10.244.2.1/tcp/20006 /ip4/169.254.96.16/tcp/20006 /ip4/10.244.2.1/tcp/20006 /ip4/10.5.0.3/tcp/20006]}
I1010 06:38:44.384062       1 tunnel.go:175] Discovery service got a new stream from {12D3KooWNLAPNyViyXgHyAoTtvPc9D8fW3R5RJCAF8iiwpALCUQY: [/ip4/10.5.0.3/tcp/20006]}
I1010 06:38:44.384175       1 tunnel.go:204] [MDNS] Discovery from default-worker2 : {12D3KooWNLAPNyViyXgHyAoTtvPc9D8fW3R5RJCAF8iiwpALCUQY: [/ip4/10.5.0.3/tcp/20006]}
E1010 06:39:04.470536       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:31163/TCP: listen tcp :31163: bind: address already in use" servicePortName="ingress-nginx/nginx-ingress-nginx-ingress:https"
E1010 06:39:04.509817       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30615/TCP: listen tcp :30615: bind: address already in use" servicePortName="metallb-system/nginx"
E1010 06:39:04.578256       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30738/TCP: listen tcp :30738: bind: address already in use" servicePortName="vault/vault:vault-cluster"
E1010 06:39:04.604133       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30972/TCP: listen tcp :30972: bind: address already in use" servicePortName="ingress-nginx/nginx-ingress-nginx-ingress:http"
E1010 06:39:04.730338       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30550/TCP: listen tcp :30550: bind: address already in use" servicePortName="vault/vault:vault"
E1010 06:39:34.835684       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30550/TCP: listen tcp :30550: bind: address already in use" servicePortName="vault/vault:vault"
E1010 06:39:34.924507       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30738/TCP: listen tcp :30738: bind: address already in use" servicePortName="vault/vault:vault-cluster"
E1010 06:39:34.944426       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:31163/TCP: listen tcp :31163: bind: address already in use" servicePortName="ingress-nginx/nginx-ingress-nginx-ingress:https"
E1010 06:39:34.979393       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30615/TCP: listen tcp :30615: bind: address already in use" servicePortName="metallb-system/nginx"
E1010 06:39:35.024317       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30972/TCP: listen tcp :30972: bind: address already in use" servicePortName="ingress-nginx/nginx-ingress-nginx-ingress:http"
E1010 06:40:05.104791       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30738/TCP: listen tcp :30738: bind: address already in use" servicePortName="vault/vault:vault-cluster"
E1010 06:40:05.116168       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:31163/TCP: listen tcp :31163: bind: address already in use" servicePortName="ingress-nginx/nginx-ingress-nginx-ingress:https"
E1010 06:40:05.134939       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30615/TCP: listen tcp :30615: bind: address already in use" servicePortName="metallb-system/nginx"
E1010 06:40:05.156607       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30972/TCP: listen tcp :30972: bind: address already in use" servicePortName="ingress-nginx/nginx-ingress-nginx-ingress:http"
E1010 06:40:05.193246       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30550/TCP: listen tcp :30550: bind: address already in use" servicePortName="vault/vault:vault"
E1010 06:40:35.300506       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30738/TCP: listen tcp :30738: bind: address already in use" servicePortName="vault/vault:vault-cluster"
E1010 06:40:35.323012       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:31163/TCP: listen tcp :31163: bind: address already in use" servicePortName="ingress-nginx/nginx-ingress-nginx-ingress:https"
E1010 06:40:35.364481       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30615/TCP: listen tcp :30615: bind: address already in use" servicePortName="metallb-system/nginx"
E1010 06:40:35.415594       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30972/TCP: listen tcp :30972: bind: address already in use" servicePortName="ingress-nginx/nginx-ingress-nginx-ingress:http"
E1010 06:40:35.434967       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30550/TCP: listen tcp :30550: bind: address already in use" servicePortName="vault/vault:vault"
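
The errors in both pods repeat every 30 seconds for the same five node ports. A small Python sketch for condensing such output into one entry per port is shown here, assuming the agent logs were saved to a string. The function name summarize_portal_errors is hypothetical, and the regular expression is derived only from the lines in this report, so it may not match other edgemesh versions.

```python
import re
from collections import defaultdict

# Pattern built from the proxier error lines in this report only.
PORTAL_ERROR = re.compile(
    r'"Failed to (?:open|ensure) portal" err="can\'t open node port for '
    r'<nil>:(?P<port>\d+)/(?P<proto>\w+):.*?" '
    r'service(?:Port)?Name="(?P<service>[^"\n]+)'
)


def summarize_portal_errors(log_text):
    """Group portal errors by (port, protocol) with services and a count."""
    summary = defaultdict(lambda: {"services": set(), "count": 0})
    for match in PORTAL_ERROR.finditer(log_text):
        key = (int(match["port"]), match["proto"])
        summary[key]["services"].add(match["service"])
        summary[key]["count"] += 1
    return dict(summary)


# Three lines copied from the edgemesh-agent-2fsjb log above.
sample = '''
E1010 06:38:44.668305       1 proxier.go:552] "Failed to open portal" err="can't open node port for <nil>:30972/TCP: listen tcp :30972: bind: address already in use" serviceName="ingress-nginx/nginx-ingress-nginx-ingress:http"
E1010 06:38:44.889235       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30550/TCP: listen tcp :30550: bind: address already in use" servicePortName="vault/vault:vault"
E1010 06:38:45.029171       1 proxier.go:422] "Failed to ensure portal" err="can't open node port for <nil>:30972/TCP: listen tcp :30972: bind: address already in use" servicePortName="ingress-nginx/nginx-ingress-nginx-ingress:http"
'''

for (port, proto), info in sorted(summarize_portal_errors(sample).items()):
    services = ", ".join(sorted(info["services"]))
    print(f"{port}/{proto}: {info['count']} error(s) for {services}")
```

Feeding it the full output of kubectl logs would list each conflicting node port once, which makes it easier to compare against the NodePort services defined in the cluster.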

Moreover, once edgemesh is rolled out, DNS resolution from a newly deployed pod is broken for both external hostnames and cluster-local services.

This is an attempt before rolling out edgemesh:

$ kubectl run alpine --rm -ti --image=alpine -- /bin/sh
Found existing alias for "kubectl". You should use: "k"
If you don't see a command prompt, try pressing enter.
/ # nslookup www.google.com
Server:         10.96.0.10
Address:        10.96.0.10:53

Non-authoritative answer:
Name:   www.google.com
Address: 142.250.186.36

Non-authoritative answer:
Name:   www.google.com
Address: 2a00:1450:4001:827::2004

/ # nslookup cloudcore.kubeedge.svc.cluster.local
Server:         10.96.0.10
Address:        10.96.0.10:53


Name:   cloudcore.kubeedge.svc.cluster.local
Address: 10.96.163.34

This is the same attempt after edgemesh is rolled out and is logging the errors above:

$ kubectl run alpine --rm -ti --image=alpine -- /bin/sh
Found existing alias for "kubectl". You should use: "k"
If you don't see a command prompt, try pressing enter.
/ # nslookup www.google.com
;; connection timed out; no servers could be reached

/ # nslookup cloudcore.kubeedge.svc.cluster.local
;; connection timed out; no servers could be reached

What you expected to happen:

I’d expect that edgemesh does not show such errors. I’d also expect that DNS resolution still works, even if edgemesh has some errors. This totally breaks cluster-wide DNS resolution!

How to reproduce it (as minimally and precisely as possible): I guess just run a kind cluster and deploy the Helm chart (?)

Anything else we need to know?:

Environment:

  • EdgeMesh version: v1.12.0
  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:38:50Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.9", GitCommit:"6df4433e288edc9c40c2e344eb336f63fad45cd2", GitTreeState:"clean", BuildDate:"2022-05-19T19:53:08Z", GoVersion:"go1.16.15", Compiler:"gc", Platform:"linux/amd64"}
  • KubeEdge version(e.g. cloudcore --version and edgecore --version): 1.11.0

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 17 (17 by maintainers)

Most upvoted comments

So basically cacheDNS is just a setting that lets you configure some fallback DNS servers in case the clusterDNS isn’t reachable? So in my case it’s not needed, right?

You are right.

BTW, you can exec into the edgemesh-agent pod on the edge node and then cat /Corefile to see some info.

@Poorunga is there any note about this in the docs? It seems to me that there are plenty of obstacles that may be obvious to some people, but aren’t at all to others.