operator: The `controlPlaneNodeSelector` installation field doesn't affect Typha
Expected Behavior
Based on the documentation, `controlPlaneNodeSelector` applies to all components which aren’t DaemonSets. That means it should apply to the Typha deployment.
Current Behavior
The `controlPlaneNodeSelector` doesn’t apply to the Typha deployment. I suspect this might be because the `typhaAffinity` field exists, but affinity and node selectors can be used in parallel.
Possible Solution
Use `controlPlaneNodeSelector` with the Typha deployment.
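For illustration, a minimal sketch of the kind of manifest this report is about, assuming the `operator.tigera.io/v1` `Installation` resource named `default`; the node label is only illustrative:

```yaml
# A minimal sketch, assuming the operator.tigera.io/v1 Installation resource;
# the node label used here is only illustrative.
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Documented as applying to all non-DaemonSet components.
  # Expected: also rendered as nodeSelector on the calico-typha Deployment's
  # pod template. Observed: the Typha Deployment does not receive it.
  controlPlaneNodeSelector:
    kubernetes.io/role: infra
```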
Steps to Reproduce (for bugs)
n/a
Context
n/a
Your Environment
n/a
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 18 (14 by maintainers)
I think we should fill this out:
Consistent and covers all of the bases 😅
@caseydavenport I agree that “Add component-specific fields for Typha and calico/node.” is the best approach from an end user aspect. Is it worth considering whether Calico Node needs affinity or customisable selectors? Also, the Calico Node configuration could use a daemonset prefix, e.g. `daemonsetTolerations`.

Slightly related, could you point me at the documentation that defines the data plane and control plane? I incorrectly assumed that Typha was part of the control plane; mainly because it’s usually only the daemonsets that are considered data plane, but also because my understanding of Typha is that it caches the K8s API, and I’d consider the K8s API to be more control plane than data plane (possibly incorrectly).
Hey @stevehipwell @aquam8 @sarthakjain271095 and @aarondav,

We’ve put up an outline of proposed changes to operator component configuration. Among other things, this will allow overriding `tolerations` and `nodeAffinity`/`nodeSelectors`. Please take a look if you can. We’d appreciate your input on the proposed changes: https://github.com/tigera/operator/issues/1990

That would work too! The one caveat we’ve heard of around affinity is that it is much more expensive for the scheduler to enforce compared to nodeSelectors, which could limit the size of the Kubernetes cluster in terms of number of pods (around 10k pods per cluster). But we haven’t hit that limit in our use-case yet.
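For context on that caveat, here is a rough pod-spec-level sketch (not operator API) of the same constraint expressed both ways; the node label is hypothetical:

```yaml
# Illustration only; the node label is hypothetical.
# 1) nodeSelector: a flat key/value match, cheap for the scheduler to evaluate.
nodeSelector:
  node-role.example.com/system: "true"

# 2) The equivalent required node affinity: more expressive, but more verbose
#    and heavier for the scheduler to evaluate per pod.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: node-role.example.com/system
              operator: In
              values: ["true"]
```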
Agreed
I think the options here are:
I think the latter is probably the right path forward.
`controlPlane` makes sense for controllers and such that are not on the critical path for applications to function (kube-controllers, apiserver, etc). Those can be bunched up.

calico/node and calico/typha are, unfortunately but necessarily, special system components that require fine-tuning.
So, for typha I think we should have:
I’m not a huge fan of encoding component names into the API - I think it leaks implementation details, but in this case the implementation is part of the feature that is relevant to the end user, so there might be no way around that.
`calico/node` is even more awkward, because it is named pretty vaguely…

^ These all seem non-obvious for the new user - e.g., is it “CalicoNode affinity” or “Calico NodeAffinity”? I think better names are needed for those.
Yeah, it was a bad decision to have the two `controlPlane*` configs not apply to the same set of components. @caseydavenport WDYT, should we create a `typhaTolerations` so that the `controlPlane*` ones are consistent in where they apply?
I expect the user to take on that cognitive load, because using affinity for typha should be a deliberate decision. We are talking about a component that, if it cannot be deployed, leaves pod networking non-functional in the cluster, so if someone wants node-selector-type behavior for typha, it should not be an easy or quick decision.
@tmjd I’ve configured `typhaAffinity`, but as Typha uses the `controlPlaneTolerations` value it really doesn’t make sense that it doesn’t use the `controlPlaneNodeSelector` value. Also, for a simple node selector, `typhaAffinity` is overkill and adds significant cognitive load.

When building Kubernetes platforms it’s really important to have control over scheduling decisions for central components; this is where a lack of flexibility in operators can make them unusable. It’s a common pattern to run system node pools for all the central components, leaving user-provisioned nodes to run only daemonsets and user workloads.
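To make the asymmetry concrete, a hedged sketch of the Installation fields discussed here, assuming `typhaAffinity` accepts a standard `nodeAffinity` block with a required term; the labels and taint key are illustrative:

```yaml
# Sketch only; assumes typhaAffinity takes a standard nodeAffinity block,
# and the label/taint values are illustrative.
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Reaches the Typha deployment today.
  controlPlaneTolerations:
    - key: dedicated
      operator: Equal
      value: system
      effect: NoSchedule
  # Does not reach the Typha deployment, even though it is the one-line way
  # to pin it to a system node pool.
  controlPlaneNodeSelector:
    kubernetes.io/role: infra
  # The workaround: express the same simple constraint as node affinity.
  typhaAffinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/role
                operator: In
                values: ["infra"]
```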