kubernetes: Topology Aware Hints fail without explanation

After enabling the TopologyAwareHints feature gate, labeling the nodes with zones, adding the annotation to the Service, and satisfying all the conditions specified in the documentation, the EndpointSlice controller does not add any hints to the relevant endpoints. Nothing is logged to explain the behavior.
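
For reference, when the controller does add hints, each endpoint in the slice carries a hints block (discovery.k8s.io/v1) shaped roughly like this. The sketch below reuses this cluster's names but is not actual output:

endpoints:
  - addresses:
      - "10.11.175.69"
    conditions:
      ready: true
    nodeName: quokka
    zone: quokka
    hints:
      forZones:
        - name: quokka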

Service:

apiVersion: v1
kind: Service
metadata:
  name: maxscale
  annotations:
    service.kubernetes.io/topology-aware-hints: auto
spec:
  type: ClusterIP
  selector:
    app: maxscale
  ports:
    - protocol: TCP
      port: 3308
      targetPort: 3308 
      name: mysql-split
    - protocol: TCP
      port: 3307
      targetPort: 3307 
      name: mysql-slave
    - protocol: TCP
      port: 3306
      name: mysql-master
      targetPort: 3306 
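
The annotation is set declaratively in the manifest above; the imperative equivalent would be:

kubectl annotate service maxscale service.kubernetes.io/topology-aware-hints=auto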

kube-controller-manager manifest (truncated):

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=10.11.0.0/16
    - --cluster-name=kubernetes
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
    - --port=0
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.96.0.0/12
    - --use-service-account-credentials=true
    - --feature-gates=TopologyAwareHints=true

....

The kube-apiserver and kube-scheduler static pod manifests carry the same --feature-gates=TopologyAwareHints=true flag.
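
To confirm the flag is present on the running static pods (the pod name below assumes kubeadm's component-nodename naming, e.g. kube-controller-manager-frege):

kubectl -n kube-system get pod kube-controller-manager-frege -o yaml | grep feature-gates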

Each node is labeled with its own name as its zone, for example:

frege     Ready    control-plane,master   219d   v1.21.3   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes-host=true,kubernetes.io/arch=amd64,kubernetes.io/hostname=frege,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,speed=fast,topology.kubernetes.io/zone=frege
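
For reference, a zone label like that can be applied with:

kubectl label node frege topology.kubernetes.io/zone=frege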

The single EndpointSlice corresponding to the Service above:

Name:         maxscale-5pgg4
Namespace:    default
Labels:       endpointslice.kubernetes.io/managed-by=endpointslice-controller.k8s.io
              kubernetes.io/service-name=maxscale
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2021-07-24T01:28:15Z
AddressType:  IPv4
Ports:
  Name          Port  Protocol
  ----          ----  --------
  mysql-master  3306  TCP
  mysql-split   3308  TCP
  mysql-slave   3307  TCP
Endpoints:
  - Addresses:  10.11.175.69
    Conditions:
      Ready:    true
    Hostname:   <unset>
    TargetRef:  Pod/maxscale-lght2
    NodeName:   quokka
    Zone:       quokka
  - Addresses:  10.11.152.224
    Conditions:
      Ready:    true
    Hostname:   <unset>
    TargetRef:  Pod/maxscale-p6jt2
    NodeName:   papaki
    Zone:       papaki
…(remaining endpoints omitted)

Other notes:

  • this cluster currently has 5 nodes in total, each with a zone label equal to its node name, as shown above
  • all nodes have similar allocatable CPU counts (four report 32 in kubectl describe node, one reports 24)
  • the endpoint pods belong to a DaemonSet, so there are 5 endpoints for 5 nodes
  • after enabling the feature gate on the kube-apiserver, kube-scheduler, and kube-controller-manager, I made sure all three pods restarted on each node, but kubelet has not been restarted, and neither has kube-proxy
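
For reference, enabling the gate on kube-proxy as well (not yet done here) would mean something like the following in its configuration, which in a kubeadm cluster lives under the config.conf key of the kube-proxy ConfigMap in kube-system, followed by restarting the kube-proxy pods. Field names are per kubeproxy.config.k8s.io/v1alpha1; this is a sketch, not what is currently applied:

apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
featureGates:
  TopologyAwareHints: true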

The only potentially relevant log entries I see from the kube components are many lines like this from the kube-proxy pods:

W0724 00:27:37.725694 1 warnings.go:70] discovery.k8s.io/v1beta1 EndpointSlice is deprecated in v1.21+, unavailable in v1.25+; use discovery.k8s.io/v1 EndpointSlice

I hope I’m just doing something wrong.

Most upvoted comments

Silly me! I misread “kube-proxy” as “kube-scheduler”. I didn’t notice this because I never got to the point of worrying about kube-proxy. The problem I reported above is that the EndpointSlice controller isn’t adding hints to my EndpointSlice. That is still the case after adding the feature gate to kube-proxy; the only change is that kube-proxy now tells me there are no zone hints on the EndpointSlice.
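
A quick way to check for hints directly on the API object, independent of kube-proxy's logging:

kubectl get endpointslices -l kubernetes.io/service-name=maxscale -o yaml | grep -A2 hints

If the controller had added hints, this would print forZones entries; here it prints nothing.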

Here are my relevant manifests, better formatted:

The maxscale service:

Name:              maxscale
Namespace:         default
Labels:            <none>
Annotations:       service.kubernetes.io/topology-aware-hints: auto
Selector:          app=maxscale
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                10.99.175.138
IPs:               10.99.175.138
Port:              mysql-split  3308/TCP
TargetPort:        3308/TCP
Endpoints:         10.11.152.250:3308,10.11.175.119:3308,10.11.191.57:3308 + 2 more...
Port:              mysql-slave  3307/TCP
TargetPort:        3307/TCP
Endpoints:         10.11.152.250:3307,10.11.175.119:3307,10.11.191.57:3307 + 2 more...
Port:              mysql-master  3306/TCP
TargetPort:        3306/TCP
Endpoints:         10.11.152.250:3306,10.11.175.119:3306,10.11.191.57:3306 + 2 more...
Session Affinity:  None
Events:            <none>

EndpointSlice:

Name:         maxscale-5pgg4
Namespace:    default  
Labels:       endpointslice.kubernetes.io/managed-by=endpointslice-controller.k8s.io
              kubernetes.io/service-name=maxscale
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2021-07-24T19:13:25Z
AddressType:  IPv4
Ports:
  Name          Port  Protocol
  ----          ----  --------
  mysql-master  3306  TCP
  mysql-split   3308  TCP
  mysql-slave   3307  TCP
Endpoints:
  - Addresses:  10.11.175.119
    Conditions:
      Ready:    true
    Hostname:   <unset>
    TargetRef:  Pod/maxscale-g7f8r
    NodeName:   quokka
    Zone:       quokka
  - Addresses:  10.11.152.250
    Conditions:
      Ready:    true
    Hostname:   <unset>
    TargetRef:  Pod/maxscale-sfdqb
    NodeName:   papaki
    Zone:       papaki
  - Addresses:  10.11.191.57
    Conditions:
      Ready:    true
    Hostname:   <unset>
    TargetRef:  Pod/maxscale-n8qw5
    NodeName:   possum
    Zone:       possum
  - Addresses:  10.11.211.156
    Conditions:
      Ready:    true
    Hostname:   <unset>
    TargetRef:  Pod/maxscale-56bjv
    NodeName:   frege
    Zone:       frege
  - Addresses:  10.11.58.226
    Conditions:
      Ready:    true
    Hostname:   <unset>
    TargetRef:  Pod/maxscale-phttg
    NodeName:   russell
    Zone:       russell
Events:         <none>

Nodes with their zone labels:

NAME      STATUS   ROLES                  AGE    VERSION   LABELS
frege     Ready    control-plane,master   221d   v1.21.3   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes-host=true,kubernetes.io/arch=amd64,kubernetes.io/hostname=frege,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,speed=fast,topology.kubernetes.io/zone=frege
papaki    Ready    control-plane,master   27d    v1.21.3   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=papaki,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,speed=medium,topology.kubernetes.io/zone=papaki
possum    Ready    control-plane,master   157d   v1.21.3   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=possum,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,speed=slow,topology.kubernetes.io/zone=possum
quokka    Ready    control-plane,master   33d    v1.21.3   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=quokka,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,speed=medium,topology.kubernetes.io/zone=quokka
russell   Ready    control-plane,master   164d   v1.21.3   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,datamaster=yes,kubernetes.io/arch=amd64,kubernetes.io/hostname=russell,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,speed=fast,topology.kubernetes.io/zone=russell

About the safeguards

  1. Sufficient number of endpoints. I have 5 endpoints for 5 nodes.
  2. Impossible to achieve balanced allocation. Four nodes have 32 allocatable CPUs and one has 24. Could that be it? (Rough arithmetic after this list.)
  3. One or more Nodes has insufficient information. All nodes have zone labels, as shown above.
  4. One or more endpoints does not have a zone hint. <== This one is about kube-proxy; it is what I’m trying to fix.
  5. A zone is not represented in hints. <== Ditto.
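
Rough arithmetic for point 2, assuming the controller distributes the expected endpoints per zone in proportion to allocatable CPU and only tolerates a limited overload per zone (the ~20% tolerance below is an assumption about the implementation, not something I have verified):

  total allocatable CPU              = 4*32 + 24   = 152
  expected share of a 32-CPU zone    = 5 * 32/152  ≈ 1.05 endpoints
  expected share of the 24-CPU zone  = 5 * 24/152  ≈ 0.79 endpoints
  actual endpoints per zone (DaemonSet) = 1, so the 24-CPU zone sits ~27% above its expected share

If the tolerance really is around 20%, that alone could trip the "impossible to achieve balanced allocation" safeguard and explain the missing hints.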

About the constraints that could be relevant

  1. Topology Aware Hints are not used when either externalTrafficPolicy or internalTrafficPolicy is set to Local on a Service. <== Neither is in use, as shown above.
  2. The EndpointSlice controller ignores unready nodes when it calculates the proportions for each zone. <== All nodes are Ready.