k8s-bigip-ctlr: CIS nodepoller stops working and ARP processing halts if the VtepMAC of any one node in the cluster cannot be obtained
Setup Details
CIS Version: 2.7.0
Build: f5networks/k8s-bigip-ctlr:latest
BIGIP Version: BIG-IP 15.1.4 Build 0.0.47 Final
AS3 Version: none
Agent Mode: CCCL
Orchestration: K8S
Orchestration Version: kubernetes v1.21.5
Pool Mode: Cluster
Additional Setup details:
Platform: CentOS Linux release 8.4.2105
Kernel: 4.18.0-305.19.1.el8_4.x86_64
CNI Plugins: flannel
Description
When the VtepMAC of one node in the cluster cannot be obtained, the CIS nodepoller stops working and ARP entries are no longer updated, including those for pods on healthy nodes.
Steps To Reproduce
- To simulate a node losing its VtepMAC, edit the node object and remove the annotation that flannel inserted automatically:
# kubectl edit node cluster1-w1
Remove the annotation
flannel.alpha.coreos.com/backend-data: '{"VtepMAC":"5a:de:e9:80:38:7e"}'
and save.
- Scale a deployment that is watched by CIS, then wait long enough to see whether the VE has refreshed its configuration.
- View the CIS log. The healthy worker node's CIDR is 10.42.0.0/24. The affected worker node's CIDR is 10.42.1.0/24.
2022/01/07 01:25:00 [INFO] [INIT] Starting: Container Ingress Services - Version: 2.7.0, BuildInfo: azure-1697-0dd06d23f0761fd29b1f614a52ed4b3695653cdd
2022/01/07 01:25:01 [INFO] ConfigWriter started: 0xc000369020
2022/01/07 01:25:01 [INFO] Started config driver sub-process at pid: 17
2022/01/07 01:25:01 [INFO] [INIT] Creating Agent for cccl
2022/01/07 01:25:01 [INFO] [CCCL] Initializing CCCL Agent
2022/01/07 01:25:01 [INFO] [CCCL] Removing Partition p1_AS3
2022/01/07 01:25:02 [INFO] [CORE] NodePoller (0xc0002645a0) registering new listener: 0x17a6700
2022/01/07 01:25:02 [INFO] [CORE] NodePoller (0xc0002645a0) registering new listener: 0x1757a40
2022/01/07 01:25:02 [INFO] [CORE] NodePoller started: (0xc0002645a0)
2022/01/07 01:25:02 [INFO] [CORE] Not watching Ingress resources.
2022/01/07 01:25:02 [INFO] [CORE] Watching ConfigMap resources.
2022/01/07 01:25:02 [INFO] [CORE] Handling ConfigMap resource events.
2022/01/07 01:25:02 [INFO] [CORE] Not handling Ingress resource events.
2022/01/07 01:25:02 [INFO] [CORE] Registered BigIP Metrics
2022/01/07 01:25:03 [INFO] [2022-01-07 01:25:03,585 __main__ INFO] entering inotify loop to watch /tmp/k8s-bigip-ctlr.config334657854/config.json
2022/01/07 01:25:05 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:25:06 [INFO] [2022-01-07 01:25:06,589 f5_cccl.resource.resource INFO] Deleting IcrArp: /Common/k8s-10.42.1.4
2022/01/07 01:25:06 [INFO] [2022-01-07 01:25:06,731 f5_cccl.resource.resource INFO] Deleting IcrArp: /Common/k8s-10.42.1.5
2022/01/07 01:25:07 [INFO] [2022-01-07 01:25:07,664 f5_cccl.resource.resource INFO] Creating ApiArp: /Common/k8s-10.42.1.5
2022/01/07 01:25:07 [INFO] [2022-01-07 01:25:07,737 f5_cccl.resource.resource INFO] Creating ApiArp: /Common/k8s-10.42.1.4
2022/01/07 01:29:06 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:29:06 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.4's node.
2022/01/07 01:29:06 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:29:06 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.5's node.
2022/01/07 01:29:06 [INFO] [2022-01-07 01:29:06,480 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
2022/01/07 01:29:07 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:29:07 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.5's node.
2022/01/07 01:29:07 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:29:07 [INFO] [2022-01-07 01:29:07,723 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_cafevs1
2022/01/07 01:29:07 [INFO] [2022-01-07 01:29:07,952 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
2022/01/07 01:29:08 [INFO] [2022-01-07 01:29:08,342 f5_cccl.resource.resource INFO] Deleting IcrNode: /p1/10.42.1.4%0
2022/01/07 01:29:08 [INFO] [2022-01-07 01:29:08,420 f5_cccl.resource.resource INFO] Deleting IcrNode: /p1/10.42.1.5%0
2022/01/07 01:29:08 [INFO] [2022-01-07 01:29:08,702 f5_cccl.resource.resource INFO] Creating ApiArp: /Common/k8s-10.42.0.247
2022/01/07 01:29:08 [INFO] [2022-01-07 01:29:08,767 f5_cccl.resource.resource INFO] Creating ApiArp: /Common/k8s-10.42.0.246
2022/01/07 01:29:08 [INFO] [2022-01-07 01:29:08,826 f5_cccl.resource.resource INFO] Deleting IcrArp: /Common/k8s-10.42.1.4
2022/01/07 01:29:08 [INFO] [2022-01-07 01:29:08,894 f5_cccl.resource.resource INFO] Deleting IcrArp: /Common/k8s-10.42.1.5
2022/01/07 01:29:57 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:29:57 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.6's node.
2022/01/07 01:29:57 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:29:57 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.6's node.
2022/01/07 01:29:57 [INFO] [2022-01-07 01:29:57,478 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
2022/01/07 01:29:58 [INFO] [2022-01-07 01:29:58,489 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
2022/01/07 01:29:59 [INFO] [2022-01-07 01:29:59,025 f5_cccl.resource.resource INFO] Deleting IcrNode: /p1/10.42.0.246%0
2022/01/07 01:30:02 [INFO] [2022-01-07 01:30:02,954 f5_cccl.resource.resource INFO] Updating ApiFDBTunnel: /Common/flannel_vxlan
2022/01/07 01:30:06 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:30:06 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.7's node.
2022/01/07 01:30:06 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:30:06 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.7's node.
2022/01/07 01:30:06 [INFO] [2022-01-07 01:30:06,774 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_cafevs1
2022/01/07 01:30:07 [INFO] [2022-01-07 01:30:07,722 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_cafevs1
2022/01/07 01:30:08 [INFO] [2022-01-07 01:30:08,130 f5_cccl.resource.resource INFO] Deleting IcrNode: /p1/10.42.0.247%0
2022/01/07 01:31:36 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:31:37 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.7's node.
2022/01/07 01:31:37 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:31:37 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.6's node.
2022/01/07 01:31:37 [INFO] [2022-01-07 01:31:37,345 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_cafevs1
2022/01/07 01:31:38 [INFO] [2022-01-07 01:31:38,490 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_cafevs1
2022/01/07 01:31:39 [INFO] [2022-01-07 01:31:39,007 f5_cccl.resource.resource INFO] Deleting IcrNode: /p1/10.42.1.7%0
2022/01/07 01:31:50 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:31:50 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.6's node.
2022/01/07 01:31:50 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:31:50 [INFO] [2022-01-07 01:31:50,416 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
2022/01/07 01:31:51 [INFO] [2022-01-07 01:31:51,401 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
2022/01/07 01:31:51 [INFO] [2022-01-07 01:31:51,765 f5_cccl.resource.resource INFO] Deleting IcrNode: /p1/10.42.1.6%0
2022/01/07 01:31:52 [INFO] [2022-01-07 01:31:52,003 f5_cccl.resource.resource INFO] Creating ApiArp: /Common/k8s-10.42.0.249
2022/01/07 01:31:52 [INFO] [2022-01-07 01:31:52,053 f5_cccl.resource.resource INFO] Creating ApiArp: /Common/k8s-10.42.0.248
2022/01/07 01:31:52 [INFO] [2022-01-07 01:31:52,105 f5_cccl.resource.resource INFO] Deleting IcrArp: /Common/k8s-10.42.0.247
2022/01/07 01:31:52 [INFO] [2022-01-07 01:31:52,167 f5_cccl.resource.resource INFO] Deleting IcrArp: /Common/k8s-10.42.0.246
2022/01/07 01:32:05 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:32:05 [INFO] [2022-01-07 01:32:05,723 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
2022/01/07 01:32:07 [INFO] [2022-01-07 01:32:07,175 f5_cccl.resource.resource INFO] Creating ApiArp: /Common/k8s-10.42.0.250
2022/01/07 01:36:53 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:36:54 [INFO] [2022-01-07 01:36:54,201 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
2022/01/07 01:36:55 [INFO] [2022-01-07 01:36:55,496 f5_cccl.resource.resource INFO] Creating ApiArp: /Common/k8s-10.42.0.251
2022/01/07 01:37:05 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:37:05 [INFO] [2022-01-07 01:37:05,406 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
2022/01/07 01:37:06 [INFO] [2022-01-07 01:37:06,698 f5_cccl.resource.resource INFO] Creating ApiArp: /Common/k8s-10.42.0.252
2022/01/07 01:37:16 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:37:16 [INFO] [2022-01-07 01:37:16,493 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
2022/01/07 01:37:17 [INFO] [2022-01-07 01:37:17,999 f5_cccl.resource.resource INFO] Creating ApiArp: /Common/k8s-10.42.0.253
2022/01/07 01:37:27 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:37:28 [INFO] [2022-01-07 01:37:28,080 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
2022/01/07 01:37:29 [INFO] [2022-01-07 01:37:29,410 f5_cccl.resource.resource INFO] Creating ApiArp: /Common/k8s-10.42.0.254
2022/01/07 01:38:35 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:38:35 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.8's node.
2022/01/07 01:38:36 [INFO] [2022-01-07 01:38:36,166 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_cafevs1
2022/01/07 01:38:38 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:38:38 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.8's node.
2022/01/07 01:38:39 [INFO] [2022-01-07 01:38:39,086 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_cafevs1
2022/01/07 01:38:42 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:38:42 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.10's node.
2022/01/07 01:38:42 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:38:42 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.10's node.
2022/01/07 01:38:43 [INFO] [2022-01-07 01:38:43,260 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_cafevs1
2022/01/07 01:38:44 [INFO] [2022-01-07 01:38:44,198 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_cafevs1
2022/01/07 01:38:44 [INFO] [2022-01-07 01:38:44,605 f5_cccl.resource.resource INFO] Deleting IcrNode: /p1/10.42.0.248%0
2022/01/07 01:40:24 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:40:24 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.10's node.
2022/01/07 01:40:24 [INFO] [2022-01-07 01:40:24,964 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
2022/01/07 01:40:25 [INFO] [2022-01-07 01:40:25,728 f5_cccl.resource.resource INFO] Deleting IcrNode: /p1/10.42.0.254%0
2022/01/07 01:40:35 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:40:35 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.10's node.
2022/01/07 01:40:35 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:40:35 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.10's node.
2022/01/07 01:40:35 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:40:36 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.10's node.
2022/01/07 01:40:36 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:40:36 [INFO] [2022-01-07 01:40:36,280 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
2022/01/07 01:40:36 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.10's node.
2022/01/07 01:40:38 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:40:38 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.10's node.
2022/01/07 01:40:38 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:40:38 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.10's node.
2022/01/07 01:40:38 [INFO] [2022-01-07 01:40:38,897 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
2022/01/07 01:40:39 [INFO] [2022-01-07 01:40:39,492 f5_cccl.resource.resource INFO] Deleting IcrNode: /p1/10.42.0.252%0
2022/01/07 01:40:39 [INFO] [2022-01-07 01:40:39,577 f5_cccl.resource.resource INFO] Deleting IcrNode: /p1/10.42.0.253%0
2022/01/07 01:40:40 [INFO] [2022-01-07 01:40:40,201 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
2022/01/07 01:40:40 [INFO] [2022-01-07 01:40:40,580 f5_cccl.resource.resource INFO] Deleting IcrNode: /p1/10.42.0.251%0
2022/01/07 01:40:44 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:40:45 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.10's node.
2022/01/07 01:40:45 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:40:45 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.10's node.
2022/01/07 01:40:45 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:40:45 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.10's node.
2022/01/07 01:40:45 [INFO] [2022-01-07 01:40:45,531 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
2022/01/07 01:40:46 [INFO] [2022-01-07 01:40:46,813 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
2022/01/07 01:40:47 [INFO] [CCCL] Wrote 0 Virtual Server and 2 IApp configs
2022/01/07 01:40:47 [INFO] [2022-01-07 01:40:47,417 f5_cccl.resource.resource INFO] Deleting IcrNode: /p1/10.42.0.249%0
2022/01/07 01:40:47 [ERROR] [VxLAN] Vxlan manager could not get VtepMac for 10.42.1.10's node.
2022/01/07 01:40:47 [INFO] [2022-01-07 01:40:47,539 f5_cccl.resource.resource INFO] Deleting IcrNode: /p1/10.42.0.250%0
2022/01/07 01:40:47 [INFO] [2022-01-07 01:40:47,976 f5_cccl.resource.resource INFO] Updating ApiApplicationService: /p1/default_tea
- View the pod IP
[root@cluster1-m1 1]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coffee-87b9987b4-lzch9 1/1 Running 0 3m58s 10.42.1.10 cluster1-w1 <none> <none>
coffee-87b9987b4-nmsk2 1/1 Running 0 4m6s 10.42.1.8 cluster1-w1 <none> <none>
coffee-87b9987b4-q778v 1/1 Running 0 4m2s 10.42.1.9 cluster1-w1 <none> <none>
tea-67977d68b-4qbcz 1/1 Running 0 117s 10.42.0.6 cluster1-m1 <none> <none>
tea-67977d68b-664f9 1/1 Running 0 116s 10.42.0.7 cluster1-m1 <none> <none>
tea-67977d68b-8xtwl 1/1 Running 0 2m8s 10.42.0.5 cluster1-m1 <none> <none>
tea-67977d68b-j2lb8 1/1 Running 0 114s 10.42.0.8 cluster1-m1 <none> <none>
tea-67977d68b-rxwkl 1/1 Running 0 2m8s 10.42.0.3 cluster1-m1 <none> <none>
tea-67977d68b-tsc6k 1/1 Running 0 2m8s 10.42.0.2 cluster1-m1 <none> <none>
- View the VE ARP list. The new pod IPs have not been added, even for pods running on the healthy worker node.
Expected Result
CIS logs the error, but the nodepoller and ARP processing continue to work.
Actual Result
CIS logs the error, but the nodepoller and ARP processing stop working.
Diagnostic Information
The CIS yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: k8s-bigip-ctlr1
  name: cc-k8s-to-bigip1
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: k8s-bigip-ctlr1
  template:
    metadata:
      labels:
        app: k8s-bigip-ctlr1
      name: k8s-bigip-ctlr1
    spec:
      containers:
      - args:
        - --bigip-username=$(BIGIP_USERNAME)
        - --bigip-password=$(BIGIP_PASSWORD)
        - --manage-ingress=false
        - --bigip-partition=partition1
        - --bigip-url=https://10.1.20.252
        - --pool-member-type=cluster
        - --flannel-name=/Common/flannel_vxlan
        - --insecure=true
        - --agent=cccl
        command:
        - /app/bin/k8s-bigip-ctlr
        env:
        - name: BIGIP_USERNAME
          valueFrom:
            secretKeyRef:
              key: username
              name: bigip-login1
              optional: false
        - name: BIGIP_PASSWORD
          valueFrom:
            secretKeyRef:
              key: password
              name: bigip-login1
              optional: false
        image: f5networks/k8s-bigip-ctlr:2.7.0
        imagePullPolicy: Always
        name: k8s-bigip-ctlr1
      serviceAccount: bigip-ctlr
      serviceAccountName: bigip-ctlr
Observations (if any)
It may be related to the flannel bug "Flannel Annotations 'flannel.alpha.coreos.com' issue" (#1122).
About this issue
- State: open
- Created 2 years ago
- Comments: 15 (10 by maintainers)
@myf5
I guess a quick hack/fix is to remove log.Errorf and return, and use log.Infof("[VxLAN] %v", err) instead:
https://github.com/F5Networks/k8s-bigip-ctlr/blob/master/pkg/vxlan/vxlanMgr.go#L229-L233