katib: Go Test fails intermittently

/kind bug

What steps did you take and what happened: [A clear and concise description of what the bug is.]

One of the GitHub Actions jobs for Go Test fails intermittently, but it succeeds when rerun.

I've checked several cases:

Every failure was one of these two:

    trial_controller_test.go:261: 
        Timed out after 40.000s.
        Expected
            <bool>: false
        to be true
FAIL
    experiment_controller_test.go:350: 
        Timed out after 40.000s.
        Expected
            <bool>: false
        to be true

What did you expect to happen:

If these failures are caused only by network problems, how about increasing the timeout? Does that make sense? WDYT?

Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]

Environment:

  • Kubeflow version (kfctl version):
  • Minikube version (minikube version):
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):

About this issue

  • Original URL
  • State: open
  • Created 3 years ago
  • Comments: 16 (13 by maintainers)

Most upvoted comments

We need to investigate whether Ginkgo v2 offers a better solution.

We should use a distinct function name for each test Experiment. For example, TestReconcileJobFailed(t *testing.T), TestReconcileJobAvailableMetrics(t *testing.T), TestReconcileJobUnavailableMetrics(t *testing.T).
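
A rough sketch of that layout, assuming the three cases can share one helper. Only the three test function names come from the comment above. `knownScenarios`, `checkScenario`, and the scenario labels are hypothetical placeholders, and the helper body stands in for the real setup and assertions, which are not shown here. In a real package these functions would live in a `_test.go` file.

```go
package main

import (
	"fmt"
	"testing"
)

// knownScenarios lists the three cases. The labels are hypothetical and do
// not come from the Katib code base.
var knownScenarios = map[string]bool{
	"job-failed":              true,
	"job-available-metrics":   true,
	"job-unavailable-metrics": true,
}

// checkScenario is a placeholder for the shared setup and assertions.
// A real helper would create the resources and wait for the reconcile result.
func checkScenario(name string) error {
	if !knownScenarios[name] {
		return fmt.Errorf("unknown scenario %q", name)
	}
	return nil
}

// Each case gets its own top-level test, so a failure names the exact case.
func TestReconcileJobFailed(t *testing.T) {
	if err := checkScenario("job-failed"); err != nil {
		t.Fatal(err)
	}
}

func TestReconcileJobAvailableMetrics(t *testing.T) {
	if err := checkScenario("job-available-metrics"); err != nil {
		t.Fatal(err)
	}
}

func TestReconcileJobUnavailableMetrics(t *testing.T) {
	if err := checkScenario("job-unavailable-metrics"); err != nil {
		t.Fatal(err)
	}
}
```

Separate functions also make it possible to rerun a single case with `go test -run TestReconcileJobFailed`.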

Yes, your suggestion is right. But the reason I asked “we couldn’t call the test function as ReconcileTest anymore?” was that I felt handling those 3 cases at once makes sense.

However, splitting those 3 would be OK.