odo: Frequently hitting "waited 4m0s but couldn't find running pod matching selector"

I have made changes to scripts/openshiftci-presubmit-all-tests.sh and scripts/configure-installer-tests-cluster.sh for Power, along the lines of the Z changes. I have also done the required setup, such as creating the secret to pull Red Hat registry images. Currently I am running scripts/openshiftci-presubmit-all-tests.sh and seeing the failures below for the test-generic test suite:
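For anyone reproducing the setup, the registry pull-secret step looks roughly like the following. This is a hedged sketch, not the exact commands used here: the secret name `redhat-registry` and the credential placeholders are assumptions.

```sh
# Create a pull secret for registry.redhat.io
# (username/password below are placeholders, not real credentials)
oc create secret docker-registry redhat-registry \
  --docker-server=registry.redhat.io \
  --docker-username='<rh-account>' \
  --docker-password='<token>'

# Let the default and builder service accounts use it for image pulls
oc secrets link default redhat-registry --for=pull
oc secrets link builder redhat-registry --for=pull
```

These commands need to run against the live cluster in the test project, so new pods created by the tests pick up the secret.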

Summarizing 3 Failures:

[Fail] odo generic when running odo push with flag --show-log [It] should be able to push changes
/usr/local/go/src/github.com/openshift/odo/tests/helper/helper_run.go:34

[Fail] odo generic when component's deployment config is deleted with oc [It] should delete all OpenShift objects except the component's imagestream
/usr/local/go/src/github.com/openshift/odo/tests/helper/helper_run.go:34

[Fail] odo generic deploying a component with a specific image name [It] should deploy the component
/usr/local/go/src/github.com/openshift/odo/tests/helper/helper_run.go:34

Ran 17 of 177 Specs in 332.933 seconds
FAIL! -- 14 Passed | 3 Failed | 0 Pending | 160 Skipped


Ginkgo ran 1 suite in 5m38.279029757s
Test Suite Failed
make: *** [Makefile:149: test-generic] Error 1

I disabled the code that deletes the test project after every execution, since that deletion was also removing the test pods I needed to inspect. With the project preserved, I found the following for another test run and am investigating further:

[root@shivani2-bastion odo]# oc project rmrtyctmpa
Now using project "rmrtyctmpa" on server "https://api.shivani2.example.com:6443".
[root@shivani2-bastion odo]# oc get pods
NAME                  READY   STATUS   RESTARTS   AGE
vusmsb-app-1-deploy   0/1     Error    0          20m
[root@shivani2-bastion odo]# oc describe pod vusmsb-app-1-deploy
Name:         vusmsb-app-1-deploy
Namespace:    rmrtyctmpa
Priority:     0
Node:         shivani2-worker-1/192.166.25.14
Start Time:   Tue, 30 Jun 2020 08:49:53 -0400
Labels:       openshift.io/deployer-pod-for.name=vusmsb-app-1
Annotations:  k8s.v1.cni.cncf.io/networks-status:
              openshift.io/deployment-config.name: vusmsb-app
              openshift.io/deployment.name: vusmsb-app-1
              openshift.io/scc: restricted
Status:       Failed
IP:           10.128.2.23
IPs:
  IP:  10.128.2.23
Containers:
  deployment:
    Container ID:   cri-o://6cca90dbaec1d92ddcc86a985ed5a3a509da1993c7fd999cbcb3743c2d982a33
    Image:          quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7c23bc9185f21d95ec599eda8386b59ee6a9ea6a9e3a6ae34c025ac7909f5938
    Image ID:       quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7c23bc9185f21d95ec599eda8386b59ee6a9ea6a9e3a6ae34c025ac7909f5938
    Port:           <none>
    Host Port:      <none>
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 30 Jun 2020 08:49:56 -0400
      Finished:     Tue, 30 Jun 2020 08:59:58 -0400
    Ready:          False
    Restart Count:  0
    Environment:
      OPENSHIFT_DEPLOYMENT_NAME:       vusmsb-app-1
      OPENSHIFT_DEPLOYMENT_NAMESPACE:  rmrtyctmpa
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from deployer-token-tf66v (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  deployer-token-tf66v:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  deployer-token-tf66v
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age        From                        Message
  ----    ------     ----       ----                        -------
  Normal  Scheduled  <unknown>  default-scheduler           Successfully assigned rmrtyctmpa/vusmsb-app-1-deploy to shivani2-worker-1
  Normal  Pulled     20m        kubelet, shivani2-worker-1  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7c23bc9185f21d95ec599eda8386b59ee6a9ea6a9e3a6ae34c025ac7909f5938" already present on machine
  Normal  Created    20m        kubelet, shivani2-worker-1  Created container deployment
  Normal  Started    20m        kubelet, shivani2-worker-1  Started container deployment
[root@shivani2-bastion odo]# oc logs vusmsb-app-1-deploy
--> Scaling vusmsb-app-1 to 1
error: update acceptor rejected vusmsb-app-1: pods for rc 'rmrtyctmpa/vusmsb-app-1' took longer than 600 seconds to become available
[root@shivani2-bastion odo]#
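For reference, the 600-second limit in that deployer error comes from the DeploymentConfig rollout timeout (`spec.strategy.rollingParams.timeoutSeconds`, which defaults to 600 for the Rolling strategy). If the pods on Power are merely slow to start rather than genuinely failing, raising the timeout is one way to test that theory. A sketch, using the dc name from the log above:

```sh
# Raise the rollout timeout on the dc from the log above
# (rollingParams applies to the default Rolling strategy;
# use recreateParams instead if the dc uses Recreate)
oc patch dc/vusmsb-app -n rmrtyctmpa \
  -p '{"spec":{"strategy":{"rollingParams":{"timeoutSeconds":1200}}}}'

# Trigger a fresh rollout and watch it
oc rollout latest dc/vusmsb-app -n rmrtyctmpa
oc rollout status dc/vusmsb-app -n rmrtyctmpa
```

If the rollout still fails with a longer timeout, the pods are likely failing outright rather than starting slowly, which would point back at the image or the test environment.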

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 30 (26 by maintainers)

Most upvoted comments

@amitkrout is there any way we can bypass or disable the failing/flaky tests until they are fixed, and move on to the next set of tests?

@sarveshtamba This flake occurs when you push a component. In our test scripts you will find that most test scenarios include at least one push command, which is a key validation step.

We have a --skip=REGEXP option in ginkgo, but using it would completely defeat the purpose of testing odo, since it would skip almost 90% of the total tests.
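If someone does want to skip just the three failing specs, a candidate regexp can be sanity-checked offline before handing it to ginkgo. A small sketch, using (abbreviated) spec names from the failure summary above; the regexp itself is only an example:

```shell
# Candidate regexp for ginkgo's -skip flag (example only)
skip_re='odo push|deployment config is deleted|specific image name'

# Failing spec names from the summary above (abbreviated)
specs="odo generic when running odo push with flag --show-log should be able to push changes
odo generic when component's deployment config is deleted with oc should delete all OpenShift objects
odo generic deploying a component with a specific image name should deploy the component"

# Count how many of the failing specs the regexp would skip (expect all 3)
printf '%s\n' "$specs" | grep -cE "$skip_re"
```

A regexp that matches exactly the intended specs here could then be passed to ginkgo via -skip, without accidentally skipping the rest of the suite.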

I would recommend waiting until https://github.com/openshift/odo/pull/3232 is delivered. BTW, this PR only resolves the build timeout issue for the git component; for the local component build timeout issue, someone from the odo team may start working on it in the next sprint.