kubernetes: Inconsistent reset of changes to kubernetes endpoints

What happened:

If you change attributes of the kubernetes endpoint in the default namespace, only some of them are reset.

Attributes that are reset (correctly)
  • IP address
Attributes that are not reset (incorrect)
  • Port
  • Port name
  • Protocol

If you change the port, port name, or protocol of the kubernetes endpoint in the default namespace, it is not reset to the correct value. This differs from the IP address, which is reset rapidly.

What you expected to happen:

All attributes to be reset to the correct values.

Example of what I expect to happen

Here’s an example of the IP being reset to 192.168.65.4 rather than being left at the wrong value (1.1.1.1):

$ kubectl patch endpoints kubernetes -n default --type='json' -p='[{"op": "replace", "path": "/subsets/0/addresses/0/ip", "value":"1.1.1.1"}]' && sleep 60 && kubectl -n default get endpoints kubernetes -ojsonpath="{.subsets[0].addresses[0].ip}{'\n'}"
endpoints/kubernetes patched
192.168.65.4
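
To see how quickly the reset happens, instead of sleeping for a fixed 60 seconds, the check above could poll. This is only a minimal sketch: `wait_for_value` is a hypothetical helper name, not part of kubectl, and it assumes a POSIX-like shell with one poll per second.

```shell
# wait_for_value EXPECTED TIMEOUT CMD [ARGS...]
# Hypothetical helper: runs CMD about once per second until its stdout
# equals EXPECTED. On a match it prints how many polls failed before the
# match and returns 0; after TIMEOUT failed polls it returns 1.
wait_for_value() {
  wfv_expected="$1"
  wfv_timeout="$2"
  wfv_waited=0
  shift 2
  while [ "$wfv_waited" -lt "$wfv_timeout" ]; do
    if [ "$("$@" 2>/dev/null)" = "$wfv_expected" ]; then
      echo "$wfv_waited"
      return 0
    fi
    sleep 1
    wfv_waited=$((wfv_waited + 1))
  done
  return 1
}
```

After patching the IP to 1.1.1.1, it could be called as follows (192.168.65.4 is the value from this cluster and will differ elsewhere):

    wait_for_value 192.168.65.4 120 kubectl -n default get endpoints kubernetes -ojsonpath='{.subsets[0].addresses[0].ip}'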

How to reproduce it (as minimally and precisely as possible):

Example to patch the port to a nonsense value that won’t work

kubectl patch endpoints kubernetes -n default --type='json' -p='[{"op": "replace", "path": "/subsets/0/ports/0/port", "value":443}]'

To reset to default

kubectl patch endpoints kubernetes -n default --type='json' -p='[{"op": "replace", "path": "/subsets/0/ports/0/port", "value":6443}]'

Example to patch the port name

kubectl patch endpoints kubernetes -n default --type='json' -p='[{"op": "replace", "path": "/subsets/0/ports/0/name", "value":"http"}]'

To reset to default

kubectl patch endpoints kubernetes -n default --type='json' -p='[{"op": "replace", "path": "/subsets/0/ports/0/name", "value":"https"}]'

Example to patch the protocol

kubectl patch endpoints kubernetes --type='json' -n default -p='[{"op": "replace", "path": "/subsets/0/ports/0/protocol", "value":"UDP"}]'

To reset to default

kubectl patch endpoints kubernetes --type='json' -n default -p='[{"op": "replace", "path": "/subsets/0/ports/0/protocol", "value":"TCP"}]'
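
The three reproductions above differ only in the JSON path and the value, so they can be wrapped in one patch-then-check helper that mirrors the IP example. This is a sketch under assumptions: `json_replace_patch` and `patch_and_check` are hypothetical names, and the 60-second default wait is simply the delay used in the IP example, not a documented reconcile interval.

```shell
# json_replace_patch PATH VALUE
# Prints a one-operation JSON Patch document. VALUE must already be valid
# JSON: a bare number such as 443, or a quoted string such as '"UDP"'.
json_replace_patch() {
  printf '[{"op": "replace", "path": "%s", "value": %s}]\n' "$1" "$2"
}

# patch_and_check PATH VALUE JSONPATH [WAIT_SECONDS]
# Patches the kubernetes endpoint in the default namespace, waits
# (60 seconds unless WAIT_SECONDS is given), then prints the current
# value of the field selected by JSONPATH.
patch_and_check() {
  kubectl patch endpoints kubernetes -n default --type='json' \
    -p="$(json_replace_patch "$1" "$2")" || return 1
  sleep "${4:-60}"
  kubectl -n default get endpoints kubernetes -ojsonpath="$3"
  echo
}
```

For example, to test the port:

    patch_and_check /subsets/0/ports/0/port 443 '{.subsets[0].ports[0].port}'

If the last line printed is still the patched value, the attribute was not reset within the wait, which is the behaviour this report describes for the port, port name, and protocol.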

Anything else we need to know?:

The resetting behaviour of the IP address prevents #97076 being used to MITM traffic to the kubernetes API, so that’s good.

Environment:

  • Kubernetes version (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:59:11Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"darwin/amd64"}
    Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:53:14Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}

  • Cloud provider or hardware configuration: Kubernetes enabled on docker desktop on macos

  • OS (e.g: cat /etc/os-release): image

  • Kernel (e.g. uname -a): N/A

  • Install tools:

  • Network plugin and version (if this is a network-related bug):

  • Others:

About this issue

  • Original URL
  • State: open
  • Created 3 years ago
  • Comments: 19 (8 by maintainers)

Most upvoted comments

I confirm I can repro this. From pkg/controlplane/reconcilers/instancecount.go:

// Requirements:
//  * All apiservers MUST use the same ports for their {rw, ro} services.
//  * All apiservers MUST use ReconcileEndpoints and only ReconcileEndpoints to manage the
//      endpoints for their {rw, ro} services.
//  * All apiservers MUST know and agree on the number of apiservers expected
//      to be running (c.masterCount).
//  * ReconcileEndpoints is called periodically from all apiservers.

That’s not super helpful. It looks like that code INTENDS to fix ports, but it’s not obvious at a glance why it doesn’t. Oh. That’s WAY up-stack in pkg/controlplane/controller.go:

        // Service definition is not reconciled after first     
        // run, ports and type will be corrected only during
        // start.   
        if err := c.UpdateKubernetesService(false); err != nil {

I don’t immediately see why that can’t be true (or frankly, why it is a param at all). Some archaeology may be needed, but this doesn’t seem particularly hard.

/triage accepted

This is the same behavior as for the endpoints currently generated by the endpoints controller: users with permission can modify endpoints, and the changes are not reconciled. If you have powerful permissions, the system doesn’t stop you from doing “bad things” (e.g. sudo rm -rf /) 😄

See related discussion here #98066 (comment)

Moreover, Jordan’s comment at #107773 (comment) is spot on: right now the convention is that all apiservers use the same port, and this change would make it impossible to move to extra ports in the future.

In my opinion, we should not modify this behavior.

Thanks for chiming in. I take your point about the API server port, and that k8s doesn’t usually stop you from doing something bad, but I do think this is a case where I would expect the API server to reconcile this back to the “correct” state.

I’m sure there are alternative ways of fixing this that will enable heterogeneous endpoint ports?

/sig network /kind support for triage

you might get more responses on: http://git.k8s.io/kubernetes/SUPPORT.md