kubernetes: Topology Aware Hints fail without explanation
After enabling the TopologyAwareHints feature gate, labeling the nodes with zones, adding the annotation to the Service, and satisfying all the conditions listed in the documentation, the EndpointSlice controller does not add any hints to the relevant endpoints, and nothing is logged to explain why.
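For anyone trying to reproduce this, the quickest way I know to confirm that no hints were written is to read them straight off the slice (the slice name is the one from my cluster, shown further down; the jsonpath just prints each endpoint's zone next to whatever hints it carries, so empty output after the arrow means no hints):
# list the slices backing the service, then dump zone -> hints per endpoint
kubectl get endpointslices -l kubernetes.io/service-name=maxscale
kubectl get endpointslice maxscale-5pgg4 -o jsonpath='{range .endpoints[*]}{.zone}{" -> "}{.hints}{"\n"}{end}'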
Service:
apiVersion: v1
kind: Service
metadata:
  name: maxscale
  annotations:
    service.kubernetes.io/topology-aware-hints: auto
spec:
  type: ClusterIP
  selector:
    app: maxscale
  ports:
    - protocol: TCP
      port: 3308
      targetPort: 3308
      name: mysql-split
    - protocol: TCP
      port: 3307
      targetPort: 3307
      name: mysql-slave
    - protocol: TCP
      port: 3306
      name: mysql-master
      targetPort: 3306
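(For what it's worth, the same annotation can also be set and checked imperatively; nothing here is specific to my setup beyond the service name:)
# set the annotation without editing the manifest, then read it back
kubectl annotate service maxscale service.kubernetes.io/topology-aware-hints=auto --overwrite
kubectl get service maxscale -o yaml | grep topology-aware-hints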
kube-controller-manager manifest (truncated):
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=10.11.0.0/16
    - --cluster-name=kubernetes
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
    - --port=0
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --service-cluster-ip-range=10.96.0.0/12
    - --use-service-account-credentials=true
    - --feature-gates=TopologyAwareHints=true
    ....
The kube-apiserver and kube-scheduler manifests carry the same --feature-gates=TopologyAwareHints=true flag.
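To double-check that the flag really landed on the running control-plane pods, the static pod specs can be grepped (the pod names below assume the usual kubeadm <component>-<node-name> naming; frege is one of my control-plane nodes):
# confirm the feature gate is present on each control-plane component
kubectl -n kube-system get pod kube-controller-manager-frege -o yaml | grep feature-gates
kubectl -n kube-system get pod kube-apiserver-frege -o yaml | grep feature-gates
kubectl -n kube-system get pod kube-scheduler-frege -o yaml | grep feature-gates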
Each node is labeled with its own name as its zone, like so:
frege Ready control-plane,master 219d v1.21.3 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes-host=true,kubernetes.io/arch=amd64,kubernetes.io/hostname=frege,kubernetes.io/os=linux,node-role.kubernetes.io/control-plane=,node-role.kubernetes.io/master=,node.kubernetes.io/exclude-from-external-load-balancers=,speed=fast,topology.kubernetes.io/zone=frege
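(The zone labels were applied with plain kubectl label, roughly like this, once per node:)
# give each node its own hostname as its zone
kubectl label node frege topology.kubernetes.io/zone=frege --overwrite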
The single EndpointSlice corresponding to the service above:
Name:         maxscale-5pgg4
Namespace:    default
Labels:       endpointslice.kubernetes.io/managed-by=endpointslice-controller.k8s.io
              kubernetes.io/service-name=maxscale
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2021-07-24T01:28:15Z
AddressType:  IPv4
Ports:
  Name          Port  Protocol
  ----          ----  --------
  mysql-master  3306  TCP
  mysql-split   3308  TCP
  mysql-slave   3307  TCP
Endpoints:
  - Addresses:  10.11.175.69
    Conditions:
      Ready:    true
    Hostname:   <unset>
    TargetRef:  Pod/maxscale-lght2
    NodeName:   quokka
    Zone:       quokka
  - Addresses:  10.11.152.224
    Conditions:
      Ready:    true
    Hostname:   <unset>
    TargetRef:  Pod/maxscale-p6jt2
    NodeName:   papaki
    Zone:       papaki
etc..
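I'm not sure whether describe would even render hints on this client version, so to rule that out the raw discovery.k8s.io/v1 object can be dumped; what matters is that there is no hints field under any endpoint:
# fetch the slice via the v1 API and inspect it directly
kubectl get endpointslices.v1.discovery.k8s.io maxscale-5pgg4 -o yaml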
Other notes:
- this cluster currently has 5 nodes in total, each with a zone label equal to its own name, as above
- all nodes have similar allocatable CPU (4 report 32 in "kubectl describe node", one reports 24)
- the endpoints belong to a DaemonSet, so there are 5 endpoints for 5 nodes
- after enabling the feature gate in the apiserver, kube-scheduler, and controller-manager, I made sure all 3 pods restarted on each node, but kubelet hasn't been restarted, and neither has kube-proxy (a sketch of how the gate could be added there follows below)
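On a kubeadm cluster kube-proxy isn't a static pod, so the gate can't just be added to a manifest on disk; my understanding is it belongs in the KubeProxyConfiguration stored in the kube-proxy ConfigMap (assuming the standard kubeadm layout), followed by restarting the daemonset:
# add "featureGates: {TopologyAwareHints: true}" under the config.conf key, then bounce the pods
kubectl -n kube-system edit configmap kube-proxy
kubectl -n kube-system rollout restart daemonset kube-proxy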
The only potentially relevant log output I see from the kube components is many lines like this from the kube-proxy pods:
W0724 00:27:37.725694 1 warnings.go:70] discovery.k8s.io/v1beta1 EndpointSlice is deprecated in v1.21+, unavailable in v1.25+; use discovery.k8s.io/v1 EndpointSlice
I hope I’m just doing something wrong.
About this issue
- State: closed
- Created 3 years ago
- Comments: 20 (11 by maintainers)
Commits related to this issue
- Tests to recreate issue #103888 — committed to robscott/kubernetes by robscott 3 years ago
Silly me! I misread “kube-proxy” as “kube-scheduler”. I didn’t notice this because I didn’t get to the point of worrying about kube-proxy. The problem I was reporting above is that the endpointslice controller isn’t adding hints to my endpointslice. That’s still the case after adding the feature gate to kube-proxy. The only change is that now I have kube-proxy telling me that there are no zone hints on the endpointslice.
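(If anyone wants to see kube-proxy's side of this, grepping its logs is enough; k8s-app=kube-proxy is the label kubeadm puts on the daemonset pods, adjust if your distribution labels them differently:)
# surface any topology / hint related messages from all kube-proxy pods
kubectl -n kube-system logs -l k8s-app=kube-proxy --tail=-1 | grep -iE 'topolog|hint'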
Here are my relevant manifests, better formatted:
The maxscale service:
Endpointslice:
Nodes with their zone labels
About the safeguards
About the constraints that could be relevant