rancher: Rancher canal network provider has wrong MTU

Rancher: 2.0.2

Creating a cluster with Canal seems to leave the default MTU at 1500.

However, Calico+Flannel should use 1450 to account for the 50-byte VXLAN overhead, as documented in https://docs.projectcalico.org/v3.1/usage/configuration/mtu
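(The 50 bytes are the VXLAN-over-IPv4 encapsulation: 14-byte outer Ethernet header + 20-byte outer IPv4 header + 8-byte UDP header + 8-byte VXLAN header = 50 bytes, hence 1500 - 50 = 1450.)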

Example container:

# ip -d link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
3: eth0@if7: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP mode DEFAULT 
    link/ether 2e:88:0a:f0:64:ff brd ff:ff:ff:ff:ff:ff
    veth 

This is a veth pair to the docker0 bridge, which has an MTU of 1500, but it should instead match the 1450 MTU of the flannel.1 host interface.

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 promiscuity 0 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether fa:16:3e:4c:a6:72 brd ff:ff:ff:ff:ff:ff promiscuity 0 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default 
    link/ether 02:42:96:c3:98:89 brd ff:ff:ff:ff:ff:ff promiscuity 0 
    bridge forward_delay 1500 hello_time 200 max_age 2000 ageing_time 30000 stp_state 0 priority 32768 vlan_filtering 0 vlan_protocol 802.1Q bridge_id 8000.2:42:96:c3:98:89 designated_root 8000.2:42:96:c3:98:89 root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0 hello_timer    0.00 tcn_timer    0.00 topology_change_timer    0.00 gc_timer  134.54 vlan_default_pvid 1 vlan_stats_enabled 0 group_fwd_mask 0 group_address 01:80:c2:00:00:00 mcast_snooping 1 mcast_router 1 mcast_query_use_ifaddr 0 mcast_querier 0 mcast_hash_elasticity 4 mcast_hash_max 512 mcast_last_member_count 2 mcast_startup_query_count 2 mcast_last_member_interval 100 mcast_membership_interval 26000 mcast_querier_interval 25500 mcast_query_interval 12500 mcast_query_response_interval 1000 mcast_startup_query_interval 3125 mcast_stats_enabled 0 mcast_igmp_version 2 mcast_mld_version 1 nf_call_iptables 0 nf_call_ip6tables 0 nf_call_arptables 0 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default 
    link/ether 02:23:2e:e5:d3:a4 brd ff:ff:ff:ff:ff:ff promiscuity 0 
    vxlan id 1 local 10.130.179.96 dev eth0 srcport 0 0 dstport 8472 nolearning ageing 300 noudpcsum noudp6zerocsumtx noudp6zerocsumrx addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 
5: cali3f0ed62327c@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default 
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 0 promiscuity 0 
    veth addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 
6: calia2b582fff19@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default 
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 1 promiscuity 0 
    veth addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 
7: calid5d19988e17@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default 
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netnsid 2 promiscuity 0 
    veth addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

Not having the correct MTU can cause intermittent TCP retransmission errors and poor performance.

The fix can be done simply in the RKE Canal ConfigMap template (https://github.com/rancher/rke/blob/master/templates/canal.go) by adding "mtu": "1450" to the cni_network_conf plugins.
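For illustration, the calico plugin entry inside cni_network_conf would end up looking roughly like this (a minimal sketch; the surrounding fields are abbreviated from the stock template and may differ by RKE version, and depending on the Calico CNI version the value may need to be a number rather than a quoted string):

    {
      "name": "k8s-pod-network",
      "cniVersion": "0.3.0",
      "plugins": [
        {
          "type": "calico",
          "mtu": 1450,
          "policy": { "type": "k8s" },
          "ipam": { "type": "host-local", "subnet": "usePodCidr" }
        }
      ]
    }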

Ideally this should become a template variable that we define in Rancher type objects somewhere.

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Reactions: 8
  • Comments: 22 (4 by maintainers)

Most upvoted comments

Currently, RKE does not support configuring the MTU, but it can be configured manually:

  1. Edit configmaps through kubectl

kubectl edit configmaps -n kube-system canal-config

  1. Add "mtu": "1450" under "type": "calico"
  3. Delete the canal pod and let it be recreated

kubectl get pod -n kube-system | grep canal | awk '{print $1}' | xargs kubectl delete -n kube-system pod

  4. Create an application and verify the MTU value
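Roughly, the edited canal-config ConfigMap should then contain something like this (a sketch only; the stock template has more plugins and fields than shown here, and some versions quote the MTU as a string):

    data:
      cni_network_config: |-
        {
          "name": "k8s-pod-network",
          "cniVersion": "0.3.0",
          "plugins": [
            {
              "type": "calico",
              "mtu": 1450,
              "ipam": { "type": "host-local", "subnet": "usePodCidr" }
            }
          ]
        }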

Related: https://github.com/coreos/flannel/issues/1011

Just changing the MTU value in the configmap and recreating the flannel pods doesn’t reset the MTU on the flannel.1 interface.

The workaround is to change the MTU value of the primary network interface using ifcfg-* scripts.
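For example, on RHEL/CentOS-style nodes this means adding an MTU= line to the interface’s ifcfg file and restarting networking (the interface name and value below are only placeholders for whatever your environment needs):

    # /etc/sysconfig/network-scripts/ifcfg-eth0
    DEVICE=eth0
    BOOTPROTO=dhcp
    ONBOOT=yes
    MTU=1400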

Note: It’s very important to configure the same MTU (the lowest MTU value among all the nodes) on ALL the nodes of the Kubernetes cluster; otherwise there is a high chance of running into strange networking problems.

@krwrang yes, I tested that too; I forgot to mention it back on this ticket. It worked, but we had to roll through all the nodes in the cluster and restart them. For us the issue was two sites connected over an IPsec VPN, so we had to lower the MTU to make all packets get through. Before that, some HTTP requests larger than 450 bytes simply disappeared.

@galal-hussein we need a way to pass the MTU setting from rkeconfig.yml.

I’m closing this issue as we have introduced the ability to set your own MTU as of v2.3.4. If this doesn’t satisfy what is needed for your use case, please open a new issue.

MTU is specific to each person’s environment, so there is no way for us to predict what the value should be on everyone’s setup.
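For anyone landing here later: with the v2.3.4 change the MTU should be settable from the cluster configuration; in an RKE cluster.yml that would look roughly like this (a hedged sketch, so check the network provider docs for your exact version, and 1450 is only an example value):

    network:
      plugin: canal
      mtu: 1450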

FWIW, I found I had to update the /etc/cni/net.d/calico-kubeconfig file on the host node, add the ‘mtu’ attribute there, and reboot the machine in order to get the cali* interfaces to have a custom MTU.

As @sboulkour notes, simply adding it to the K8S ConfigMap for ‘canal-config’ didn’t affect the cali* interfaces.
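A quick way to check whether the cali* interfaces actually picked up the new MTU after the reboot (just a convenience one-liner):

    ip link show | awk '/: cali/ {print $2, $5}'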

My problem is the opposite: the MTU ends up at 1450, while the default of 1500 would be fine for me, but I run Canal.

With Terraform + RKE there is no way to set the MTU, so Canal is set up at 1450. I change it in the configmap, as is currently done in https://github.com/rancher/rke/blob/master/templates/calico.go.

My setup is calico+flannel and I cannot change this. So in the end I set eth0 to MTU 1550, and flannel.1 automatically gets an MTU of 1500.

The main problem is that the Azure or OpenStack load balancer does not talk correctly across the mismatch between the source MTU and the target MTU (eth0 vs flannel.1).
