kubernetes: kube-proxy ipvs session affinity is always sticky and doesn't honor the timeout

Which jobs are failing:

ci-kubernetes-e2e-gci-gce-ipvs

Which test(s) are failing:

.*Services should have session affinity timeout work.*

Since when has it been failing:

These tests were added recently in https://github.com/kubernetes/kubernetes/pull/88409/commits/64c4876ccd013b309777d594ed3b14bee8cabf6e

They test that the affinity timeout works and that, once the timeout is reached, the session stops being sticky.
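For reference, the behavior under test is configured on the Service itself. A minimal sketch (the names and the 10-second timeout here are illustrative, not taken from the actual test):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: affinity-svc        # illustrative name
spec:
  selector:
    app: affinity-backend   # illustrative selector
  ports:
  - port: 80
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      # Stickiness is expected to end once the client has been
      # idle for this long; the failing test asserts exactly that.
      timeoutSeconds: 10
```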

Testgrid link:

https://testgrid.k8s.io/sig-network-gce#gci-gce-ipvs&include-filter-by-regex=affinity

Reason for failure:

Mar 23 07:16:29.046: FAIL: Session is sticky after reaching the timeout

Anything else we need to know:

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 40 (36 by maintainers)

Most upvoted comments

I get this locally when running e2e, but when I check it manually it seems to work. A theory I have is that the stickiness remains as long as IPVS has “inactive” connections, meaning that a timeout lower than the TIME_WAIT (2 minutes) will not work as expected. I will try to find time to investigate further.

@andrewsykim We are not, except for the kubernetes service on older clusters (it used to be set by default). We patched it a while back when we added support for graceful termination, and it seemed to work as expected (we set expire_quiescent_template, see https://github.com/kubernetes/kubernetes/pull/71834). I’m not very surprised it does not work well for very low timeouts (which would not be very useful in real scenarios, I think, but I understand it makes testing harder).

We had a discussion for this on the April 16th SIG Network call.

The conclusion is that this is not really a bug, since the minimum session affinity timeout is still respected, although it is probably unexpected behavior for users that the stickiness lasts longer than the specified timeout.

I have an action item to add documentation for this limitation. As for the failing tests we agreed to increase the session affinity timeout in the tests to be > 2m for the IPVS proxy case (https://github.com/kubernetes/kubernetes/pull/89854). Unfortunately this makes the session affinity tests really slow but I think it’s okay since the IPVS presubmit is optional and only runs when triggered via comment.

cc @thockin

/assign

From http://www.linuxvirtualserver.org/docs/persistence.html I think the key point here is:

The template expires in a configurable time, and the template won't expire until all its connections expire. 

I don’t think “inactive connections” are considered “expired”, so as long as there are inactive connections, the persistence timer is not started. I think if we lower tcpFinTimeout enough (< 15s, because that’s the interval between connections) it should work. I’ll verify shortly.
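The interaction described above can be sketched as a toy model. This is an illustrative simulation of the persistence-template rule quoted from the LVS docs, not IPVS code; the assumptions are that each request refreshes the template timer, that a finished TCP connection stays “inactive” for TIME_WAIT (120 s by default, tunable via tcpFinTimeout), and that the template cannot expire while any of its connections has not expired:

```python
TIME_WAIT = 120  # seconds a finished connection lingers in the IPVS tables


def sticky_until(affinity_timeout, request_times, fin_timeout=TIME_WAIT):
    """Return when the persistence template finally expires, given the
    instants (in seconds) at which short-lived requests are made."""
    # Each request refreshes the template timer...
    template_timer_expiry = max(t + affinity_timeout for t in request_times)
    # ...and leaves behind an inactive connection that expires later.
    last_conn_expiry = max(t + fin_timeout for t in request_times)
    # Per the LVS docs, the template won't expire until all of its
    # connections have expired.
    return max(template_timer_expiry, last_conn_expiry)


# A 10 s affinity timeout probed every 15 s for a minute: the template
# outlives the configured timeout because inactive connections pin it.
probes = [0, 15, 30, 45, 60]
print(sticky_until(10, probes))                 # 180 (= 60 + TIME_WAIT)

# In this model, a fin_timeout below the probe interval would let the
# configured timeout win (= 60 + 10) -- though comments below report
# that lowering tcpFinTimeout did not help in practice.
print(sticky_until(10, probes, fin_timeout=5))  # 70
```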

Maybe we should always set the sessionAffinity timeout to be greater than 2MSL.

Yeah I was thinking about this but given we check 10 times that would make the session affinity tests take a really long time. If there’s another way to do this we should consider that.

@andrewsykim I set tcpFinTimeout=30 but connections are still sticky after 60s, please see https://github.com/kubernetes/kubernetes/issues/89358#issuecomment-606731216

I can confirm the theory: sessionAffinity with a timeout < 120s effectively gets a timeout of 120 seconds.

I don’t know any way to fix this. IPVS has a number of sysctls, but I can’t find any obvious candidate (I have not investigated that in any depth).