kube-router: Sporadic Connection Refused for services on Network Policy sync

Some time ago I informally mentioned on Slack that the NetworkPolicyController might be causing intermittent connection failures while reconciling. I can now confirm that this happens, although it can be hard to reproduce.

I think I narrowed it down to this chunk of code in network_policy_controller.go:

// TODO use iptables-restore to better implement the logic, than flush and add rules
err = iptablesCmdHandler.ClearChain("filter", policyChainName)
if err != nil && err.(*iptables.Error).ExitStatus() != 1 {
	return nil, nil, fmt.Errorf("Failed to run iptables command: %s", err.Error())
}

err = npc.processIngressRules(policy, targetDestPodIpSetName, activePolicyIpSets)
if err != nil {
	return nil, nil, err
}

err = npc.processEgressRules(policy, targetSourcePodIpSetName, activePolicyIpSets)
if err != nil {
	return nil, nil, err
}

I tried commenting out the ClearChain line and made sure that ingress/egress rules were processed only once (on kube-router startup). With that change I could no longer reproduce the problem.

My guess is that in some cases incoming packets are dropped during the gap between clearing the chain and rebuilding it, although I’m not entirely sure. What I am sure of is that I get intermittent Connection Refused errors when talking to services, and they always occur at the exact time of the network policy Sync().

Reproduce

To reproduce the error, I run something like:

time while true ; do curl -s -o /dev/null http://myservice.default.svc.cluster.local/api || break ; sleep 0.1 ; done

Hint: set --iptables-sync-period to a low value, or apply continuous changes to the cluster (triggering a sync), so that you don’t have to wait long to see the error. You may also need a certain number of network policies for the gap between chain flush and rule creation to be noticeable.

Suggestions

// TODO use iptables-restore to better implement the logic, than flush and add rules

A fix is already suggested in the comment above. An alternative, although I don’t know whether it would provide better atomicity, would be:

  1. Building a new chain
  2. Changing the chain reference
  3. (defer) Deleting the old chain

There is also the question of atomicity/consistency between different network policies. For example, when service A talks to service B, with A having an egress rule and B the corresponding ingress rule, perhaps it’s better to rebuild all chains in one loop and then change all references in a second loop?

Of course, best of all would be not to touch anything that hasn’t changed. Perhaps iptables-restore can help with that.
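As a rough sketch of how that might look: with iptables-restore --noflush, only the chains declared in the input are flushed and refilled, and the changes to a table are applied together at COMMIT. The Go function below renders such an input for the filter table and skips chains whose rules are identical to what was rendered on the previous sync. The renderRestore function, the hash-based bookkeeping, and the chain names are assumptions for illustration, not kube-router code.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
	"strings"
)

// renderRestore builds input for `iptables-restore --noflush` that covers only
// the chains whose desired rules differ from the last rendered state. seen maps
// a chain name to the hash of its previously rendered rules and is updated in
// place. It returns "" when nothing has changed.
//
// Assumption: a real implementation should record the hash only after the
// restore command has succeeded; this sketch updates it eagerly.
func renderRestore(desired map[string][]string, seen map[string][32]byte) string {
	names := make([]string, 0, len(desired))
	for name := range desired {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic output

	var decls, rules strings.Builder
	for _, name := range names {
		sum := sha256.Sum256([]byte(strings.Join(desired[name], "\n")))
		if prev, ok := seen[name]; ok && prev == sum {
			continue // unchanged since the last sync: leave the chain alone
		}
		seen[name] = sum
		// Declaring the chain makes iptables-restore --noflush flush it.
		fmt.Fprintf(&decls, ":%s - [0:0]\n", name)
		for _, r := range desired[name] {
			fmt.Fprintf(&rules, "-A %s %s\n", name, r)
		}
	}
	if decls.Len() == 0 {
		return ""
	}
	return "*filter\n" + decls.String() + rules.String() + "COMMIT\n"
}

func main() {
	seen := map[string][32]byte{}
	desired := map[string][]string{
		"CHAIN-A": {"-s 10.233.6.3/32 -j ACCEPT"},
		"CHAIN-B": {},
	}
	fmt.Print(renderRestore(desired, seen)) // first sync: both chains rendered
	desired["CHAIN-B"] = []string{"-j RETURN"}
	fmt.Print(renderRestore(desired, seen)) // second sync: only CHAIN-B rendered
}
```

The returned text would be written to the standard input of iptables-restore --noflush; when it is empty, the sync can skip calling iptables altogether.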

Let me know your thoughts.

Also, a nitpick:

glog.V(1).Info("Starting periodic sync of iptables")

… should probably just be Starting sync of iptables, since this function is invoked both for APIserver events and for the periodic reconcile.

Environment

Kubernetes: v1.9.6
Kube-router: v0.2.0-beta.6

Below is a slightly redacted dump of iptables -t filter -L from the node running the pod I test against:

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
KUBE-FIREWALL  all  --  anywhere             anywhere            

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
KUBE-POD-FW-2T5M5FYYBTRHPIYV  all  --  10.233.6.51          anywhere             PHYSDEV match --physdev-is-bridged /* rule to jump traffic from POD name:node-exporter-m4t4m namespace: monitoring to chain KUBE-POD-FW-2T5M5FYYBTRHPIYV */
KUBE-POD-FW-2T5M5FYYBTRHPIYV  all  --  10.233.6.51          anywhere             /* rule to jump traffic from POD name:node-exporter-m4t4m namespace: monitoring to chain KUBE-POD-FW-2T5M5FYYBTRHPIYV */
KUBE-POD-FW-5HZOC6QEWDN2OQQY  all  --  10.233.6.3           anywhere             PHYSDEV match --physdev-is-bridged /* rule to jump traffic from POD name:ors-maintenance-65bf64cf67-5v6c9 namespace: iscrum-dit to chain KUBE-POD-FW-5HZOC6QEWDN2OQQY */
KUBE-POD-FW-5HZOC6QEWDN2OQQY  all  --  10.233.6.3           anywhere             /* rule to jump traffic from POD name:ors-maintenance-65bf64cf67-5v6c9 namespace: iscrum-dit to chain KUBE-POD-FW-5HZOC6QEWDN2OQQY */
KUBE-POD-FW-6Q423ANJ5FJYSYLE  all  --  10.233.6.50          anywhere             PHYSDEV match --physdev-is-bridged /* rule to jump traffic from POD name:filebeat-pxzfh namespace: platform to chain KUBE-POD-FW-6Q423ANJ5FJYSYLE */
KUBE-POD-FW-6Q423ANJ5FJYSYLE  all  --  10.233.6.50          anywhere             /* rule to jump traffic from POD name:filebeat-pxzfh namespace: platform to chain KUBE-POD-FW-6Q423ANJ5FJYSYLE */
KUBE-POD-FW-5HZOC6QEWDN2OQQY  all  --  anywhere             10.233.6.3           PHYSDEV match --physdev-is-bridged /* rule to jump traffic destined to POD name:ors-maintenance-65bf64cf67-5v6c9 namespace: iscrum-dit to chain KUBE-POD-FW-5HZOC6QEWDN2OQQY */
KUBE-POD-FW-5HZOC6QEWDN2OQQY  all  --  anywhere             10.233.6.3           /* rule to jump traffic destined to POD name:ors-maintenance-65bf64cf67-5v6c9 namespace: iscrum-dit to chain KUBE-POD-FW-5HZOC6QEWDN2OQQY */
KUBE-POD-FW-6Q423ANJ5FJYSYLE  all  --  anywhere             10.233.6.50          PHYSDEV match --physdev-is-bridged /* rule to jump traffic destined to POD name:filebeat-pxzfh namespace: platform to chain KUBE-POD-FW-6Q423ANJ5FJYSYLE */
KUBE-POD-FW-6Q423ANJ5FJYSYLE  all  --  anywhere             10.233.6.50          /* rule to jump traffic destined to POD name:filebeat-pxzfh namespace: platform to chain KUBE-POD-FW-6Q423ANJ5FJYSYLE */
KUBE-POD-FW-2T5M5FYYBTRHPIYV  all  --  anywhere             10.233.6.51          PHYSDEV match --physdev-is-bridged /* rule to jump traffic destined to POD name:node-exporter-m4t4m namespace: monitoring to chain KUBE-POD-FW-2T5M5FYYBTRHPIYV */
KUBE-POD-FW-2T5M5FYYBTRHPIYV  all  --  anywhere             10.233.6.51          /* rule to jump traffic destined to POD name:node-exporter-m4t4m namespace: monitoring to chain KUBE-POD-FW-2T5M5FYYBTRHPIYV */
ACCEPT     all  --  anywhere             anywhere             /* allow outbound traffic from pods */
ACCEPT     all  --  anywhere             anywhere             /* allow inbound traffic to pods */
ACCEPT     all  --  anywhere             anywhere             /* allow outbound node port traffic on node interface with which node ip is associated */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
KUBE-FIREWALL  all  --  anywhere             anywhere            
KUBE-POD-FW-5HZOC6QEWDN2OQQY  all  --  anywhere             10.233.6.3           /* rule to jump traffic destined to POD name:ors-maintenance-65bf64cf67-5v6c9 namespace: iscrum-dit to chain KUBE-POD-FW-5HZOC6QEWDN2OQQY */
KUBE-POD-FW-6Q423ANJ5FJYSYLE  all  --  anywhere             10.233.6.50          /* rule to jump traffic destined to POD name:filebeat-pxzfh namespace: platform to chain KUBE-POD-FW-6Q423ANJ5FJYSYLE */
KUBE-POD-FW-2T5M5FYYBTRHPIYV  all  --  anywhere             10.233.6.51          /* rule to jump traffic destined to POD name:node-exporter-m4t4m namespace: monitoring to chain KUBE-POD-FW-2T5M5FYYBTRHPIYV */

Chain DOCKER-USER (0 references)
target     prot opt source               destination         

Chain KUBE-FIREWALL (2 references)
target     prot opt source               destination         
DROP       all  --  anywhere             anywhere             /* kubernetes firewall for dropping marked packets */ mark match 0x8000/0x8000

Chain KUBE-NWPLCY-AJ4HEMWOQH4WRYTX (1 references)
target     prot opt source               destination         

Chain KUBE-NWPLCY-GIIJQFEHBJRDJKP7 (1 references)
target     prot opt source               destination         

Chain KUBE-NWPLCY-JZALGAZ2SIGPNZPY (1 references)
target     prot opt source               destination         

Chain KUBE-NWPLCY-R3ES7SXPUHQHIML7 (1 references)
target     prot opt source               destination         

Chain KUBE-NWPLCY-VYCJOSP4PDSOS2DK (1 references)
target     prot opt source               destination         

Chain KUBE-NWPLCY-XTNPBDVHIRENHZ3T (1 references)
target     prot opt source               destination         

Chain KUBE-POD-FW-2T5M5FYYBTRHPIYV (5 references)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere             /* rule for stateful firewall for pod */ ctstate RELATED,ESTABLISHED
KUBE-NWPLCY-R3ES7SXPUHQHIML7  all  --  anywhere             anywhere             /* run through nw policy denyall */
ACCEPT     all  --  anywhere             10.233.6.51          /* rule to permit the traffic traffic to pods when source is the pod's local node */ ADDRTYPE match src-type LOCAL
REJECT     all  --  anywhere             anywhere             /* default rule to REJECT traffic destined for POD name:node-exporter-m4t4m namespace: monitoring */ reject-with icmp-port-unreachable

Chain KUBE-POD-FW-5HZOC6QEWDN2OQQY (5 references)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere             /* rule for stateful firewall for pod */ ctstate RELATED,ESTABLISHED
KUBE-NWPLCY-AJ4HEMWOQH4WRYTX  all  --  anywhere             anywhere             /* run through nw policy outgoing */
KUBE-NWPLCY-JZALGAZ2SIGPNZPY  all  --  anywhere             anywhere             /* run through nw policy outgoing-ors */
KUBE-NWPLCY-XTNPBDVHIRENHZ3T  all  --  anywhere             anywhere             /* run through nw policy denyall */
KUBE-NWPLCY-VYCJOSP4PDSOS2DK  all  --  anywhere             anywhere             /* run through nw policy payara */
ACCEPT     all  --  anywhere             10.233.6.3           /* rule to permit the traffic traffic to pods when source is the pod's local node */ ADDRTYPE match src-type LOCAL
REJECT     all  --  anywhere             anywhere             /* default rule to REJECT traffic destined for POD name:ors-maintenance-65bf64cf67-5v6c9 namespace: iscrum-dit */ reject-with icmp-port-unreachable

Chain KUBE-POD-FW-6Q423ANJ5FJYSYLE (5 references)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere             /* rule for stateful firewall for pod */ ctstate RELATED,ESTABLISHED
KUBE-NWPLCY-GIIJQFEHBJRDJKP7  all  --  anywhere             anywhere             /* run through nw policy denyall */
ACCEPT     all  --  anywhere             10.233.6.50          /* rule to permit the traffic traffic to pods when source is the pod's local node */ ADDRTYPE match src-type LOCAL
REJECT     all  --  anywhere             anywhere             /* default rule to REJECT traffic destined for POD name:filebeat-pxzfh namespace: platform */ reject-with icmp-port-unreachable

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 17 (15 by maintainers)

Most upvoted comments

Sure. We will do a release over the weekend.