kubernetes: pull-kubernetes-e2e-gce is nearing its timeout

I forget which timeout applies to which layer, but there are both:

Tests are near 60m at this point. https://testgrid.k8s.io/presubmits-kubernetes-blocking#pull-kubernetes-e2e-gce&graph-metrics=test-duration-minutes&include-filter-by-regex=Timeout|Overall

It’s difficult to tell how we’re doing on the CI equivalent of this job because it seems to be flaking so badly that it’s perpetually failing? https://testgrid.k8s.io/sig-release-master-blocking#gce-cos-master-default&width=5 (this seems like a separate issue)

I asked BigQuery and exported the results into Data Studio to get a chart of the job duration since 2018:

SELECT
  timestamp_trunc(started, day) day,
  avg(elapsed) avg_elapsed  -- mean job duration per day, in seconds
FROM
  `k8s-gubernator.build.all`
WHERE
  job = "ci-kubernetes-e2e-gci-gce"
  AND started >= timestamp('2018-01-01')
GROUP BY
  day
ORDER BY
  day asc

So yeah, it’s been steadily going up with some notable bumps here and there. [screenshot of the chart, 2019-02-21]
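To put a rough number on "steadily going up", one option is to fit a straight line to the daily averages and see how long it would take to reach the timeout. The following is only a sketch: the sample values and the 3600-second threshold are made up for illustration and are not taken from the job's real data or configuration.

```python
def linear_trend(values):
    """Least-squares slope and intercept for values sampled at x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until(values, threshold):
    """Days after the last sample until the fitted line reaches threshold.

    Returns None when the fitted trend is flat or decreasing.
    """
    slope, intercept = linear_trend(values)
    if slope <= 0:
        return None
    crossing = (threshold - intercept) / slope
    return max(0.0, crossing - (len(values) - 1))


# Hypothetical daily averages in seconds (assumption, not real results).
daily_avg = [3000.0, 3010.0, 3020.0, 3030.0]
# Hypothetical timeout of 3600 seconds (assumption).
print(days_until(daily_avg, threshold=3600.0))
```

A straight-line fit ignores the bumps visible in the chart, so treat the result as an order-of-magnitude estimate at best.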

Opening this because I believe it’s more productive to hold the current threshold than it is to just raise the timeout. We should identify some of the top offenders in slowness and kick them out.
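As a starting point for finding the top offenders, per-test durations from several runs can be averaged and ranked. This is a minimal sketch that assumes the durations have already been extracted as (test name, seconds) pairs; the function name and the sample data are hypothetical, not part of any existing tooling.

```python
from collections import defaultdict
from statistics import mean


def slowest_tests(records, top_n=3):
    """Rank tests by mean duration across runs, slowest first.

    records: iterable of (test_name, duration_seconds) pairs,
    one entry per test per run.
    """
    durations = defaultdict(list)
    for name, seconds in records:
        durations[name].append(seconds)
    averages = {name: mean(values) for name, values in durations.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:top_n]


# Made-up sample data purely for illustration; not real job results.
sample = [
    ("test-a", 120.0), ("test-a", 140.0),
    ("test-b", 30.0), ("test-b", 50.0),
    ("test-c", 600.0),
    ("test-d", 5.0),
]
print(slowest_tests(sample, top_n=2))
```

Averaging across runs keeps a single slow outlier run from dominating the ranking; a median would be another reasonable choice.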

/kind cleanup
/sig release
as owners of this job
/priority important-soon
We may need to bump this up if we start hitting the timeout more

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 2
  • Comments: 54 (36 by maintainers)

Most upvoted comments

/remove-lifecycle rotten
I re-ran the query listed in the description, which shows the duration of the CI job ci-kubernetes-e2e-gci-gce. [chart: duration (seconds) vs day]

It looks like we might need to reconsider this in a year but it’s probably fine as closed for now

Another problem we have is not that any particular test is slow but just that we only add more tests.

With lots of features trying to go GA, and every one of them adding a conformance test, we’ve added a bunch more tests and times have gone up for presubmits cc @aojea.

/remove-lifecycle stale
/milestone clear
I don’t think this should be tracked against the release cycle

correct on the x/y, I can provide a link to something that has more concrete numbers but the main point was to illustrate that it’s been continually going up and to the right