che: workspace fails and starts several times. not sure why

Summary

chectl version: chectl/0.0.20220809-next.c9f4ba1 darwin-x64 node-v16.13.2

Command used to install Eclipse Che:

chectl server:deploy --che-operator-cr-patch-yaml=/Users/divine/Documents/office_tasks/TAP-4540/terraformCheJune17/eclipseche_yaml/che-operator-cr-patch.yaml --platform=k8s --installer=operator --debug --k8spoderrorrechecktimeout=1500000 --domain=eclipseche-dchelladurai-chejune15.calatrava.vmware.com --k8spodreadytimeout=1500000

The project-clone container in pod “workspace0fbe074d8c144b0a-66b54c49d9-2gqkt” shows the error below: [image]

kubectl events in the user namespace divine-chelladurai-che-pmjg9u: [image]
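
The events can also be captured as text rather than a screenshot. A minimal sketch, assuming kubectl is configured for the cluster; the function name list_namespace_events is hypothetical and not part of Che or kubectl:

```shell
# Sketch only: list_namespace_events is a hypothetical helper name.
list_namespace_events() {
  ns="$1"   # user namespace, e.g. divine-chelladurai-che-pmjg9u
  # Sort by last-seen time so the most recent events end up at the bottom.
  kubectl get events -n "$ns" --sort-by=.lastTimestamp
}
```

Usage: list_namespace_events divine-chelladurai-che-pmjg9u > events.txt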

Below is my devfile.yaml:

schemaVersion: 2.1.0
metadata:
  name: cbfsel-repo1
projects:
  - name: cbfsel-project1
    git:
      checkoutFrom:
        revision: master
      remotes:
        origin: https://gitlab.eng.vmware.com/dchelladurai/cbf-sel.git
components:
  - container:
      image: 'quay.io/devfile/universal-developer-image:ubi8-b452131'
      memoryLimit: 4G
    name: javacontainer

project-clone container logs when the workspace fails to start (log1)

2022/08/30 04:04:30 Using temporary directory /projects/project-clone-2868426929
2022/08/30 04:04:30 Read DevWorkspace at /devworkspace-metadata/flattened.devworkspace.yaml
2022/08/30 04:04:30 Processing project cbfsel-project1
2022/08/30 04:04:30 Cloning project cbfsel-project1 to /projects/project-clone-2868426929/cbfsel-project1
Cloning into '/projects/project-clone-2868426929/cbfsel-project1'...
fatal: unable to get credential storage lock in 1000 ms: Permission denied
Updating files:  41% (1749/4191)
Updating files:  42% (1761/4191)
Updating files:  43% (1803/4191)
Updating files:  44% (1845/4191)
Updating files:  45% (1886/4191)
Updating files:  46% (1928/4191)
Updating files:  47% (1970/4191)
Updating files:  48% (2012/4191)
Updating files:  49% (2054/4191)
Updating files:  50% (2096/4191)
Updating files:  51% (2138/4191)
Updating files:  52% (2180/4191)
Updating files:  53% (2222/4191)
Updating files:  54% (2264/4191)
Updating files:  55% (2306/4191)
Updating files:  56% (2347/4191)
Updating files:  57% (2389/4191)
Updating files:  58% (2431/4191)
Updating files:  59% (2473/4191)
Updating files:  60% (2515/4191)
Updating files:  61% (2557/4191)
Updating files:  62% (2599/4191)
Updating files:  63% (2641/4191)
Updating files:  64% (2683/4191)
Updating files:  65% (2725/4191)
Updating files:  66% (2767/4191)
Updating files:  67% (2808/4191)
Updating files:  68% (2850/4191)
Updating files:  69% (2892/4191)
Updating files:  70% (2934/4191)
Updating files:  71% (2976/4191)
Updating files:  72% (3018/4191)
Updating files:  73% (3060/4191)
Updating files:  74% (3102/4191)
Updating files:  75% (3144/4191)
Updating files:  76% (3186/4191)
Updating files:  77% (3228/4191)
Updating files:  78% (3269/4191)
Updating files:  79% (3311/4191)
Updating files:  80% (3353/4191)
Updating files:  81% (3395/4191)
Updating files:  82% (3437/4191)
Updating files:  83% (3479/4191)
Updating files:  84% (3521/4191)
Updating files:  85% (3563/4191)
Updating files:  86% (3605/4191)
Updating files:  87% (3647/4191)
Updating files:  88% (3689/4191)
Updating files:  89% (3730/4191)
Updating files:  90% (3772/4191)
Updating files:  91% (3814/4191)
Updating files:  92% (3856/4191)
Updating files:  93% (3898/4191)
Updating files:  94% (3940/4191)
Updating files:  95% (3982/4191)
Updating files:  96% (4024/4191)
Updating files:  96% (4033/4191)
Updating files:  97% (4066/4191)
Updating files:  98% (4108/4191)
Updating files:  99% (4150/4191)
Updating files: 100% (4191/4191)
Updating files: 100% (4191/4191), done.
2022/08/30 04:05:02 Cloned project cbfsel-project1 to /projects/project-clone-2868426929/cbfsel-project1
2022/08/30 04:05:02 Setting up remotes for project cbfsel-project1
fatal: unable to get credential storage lock in 1000 ms: Permission denied
2022/08/30 04:05:02 Fetched remote origin at https://gitlab.eng.vmware.com/dchelladurai/cbf-sel.git
2022/08/30 04:05:03 Encountered error while setting up project cbfsel-project1: failed to checkout revision: failed to read remote origin: authentication required
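
Most of the log above is git progress output. The lines that matter are the two "fatal:" credential-storage messages and the final "Encountered error" line. A minimal sketch of a filter that keeps only those lines when the log is piped in; the function name clone_log_errors is hypothetical:

```shell
# Sketch only: clone_log_errors is a hypothetical helper name.
# Reads a project-clone log on stdin and prints only git "fatal:" lines
# and the project-clone "Encountered error" line.
clone_log_errors() {
  grep -E '^fatal:|Encountered error'
}
```

Usage, assuming the pod and namespace named earlier in this report: kubectl logs workspace0fbe074d8c144b0a-66b54c49d9-2gqkt -c project-clone -n divine-chelladurai-che-pmjg9u | clone_log_errors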

project-clone container logs when the workspace starts successfully (log2)

2022/08/30 04:36:03 Using temporary directory /projects/project-clone-42631352
2022/08/30 04:36:03 Read DevWorkspace at /devworkspace-metadata/flattened.devworkspace.yaml
2022/08/30 04:36:03 Processing project cbfsel-project1
2022/08/30 04:36:03 Project 'cbfsel-project1' is already cloned and has all remotes configured

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 16 (7 by maintainers)

Most upvoted comments

I’m not able to debug the cluster itself, but some useful information about the cluster and images would be:

  • What is the final status of the workspaces that are failing to start? That is, when a workspace fails, what is in the .status field of the DevWorkspace CR in your namespace?
    • You can get this information from the Kubernetes command line: kubectl get devworkspace <workspace-name> -n <namespace> --output yaml
    • <workspace-name> can be found through the Che dashboard (the name of the failing workspace) or through the command line: kubectl get devworkspaces -n <namespace>
    • <namespace> is the namespace your workspaces are created in; it depends on your username in Che or on the Che configuration
  • You can get logs from the DevWorkspace Operator with kubectl logs -n "$NAMESPACE" -f deploy/devworkspace-controller-manager -c devworkspace-controller, where $NAMESPACE is the namespace where you installed Che
  • How are you building the image divine6/customopenjdk8:v10? Is it available in the cluster? What is the entrypoint/args for the image?
  • Try removing the checkoutFrom from the devfile as suggested:
     schemaVersion: 2.1.0
     metadata:
       name: cbfsel-repo4
     projects:
       - name: cbfsel-project4
         git:
    -      checkoutFrom:
    -        revision: master
           remotes:
             origin: https://gitlab.eng.vmware.com/dchelladurai/cbf-sel.git
     components:
       - container:
           image: 'quay.io/devfile/universal-developer-image:ubi8-latest'
           memoryLimit: 4G
         name: javacontainer
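
The diagnostic commands from the bullets above can be gathered into a single script so that the output is easy to attach to the issue. This is only a sketch: collect_diagnostics is a hypothetical helper name, the three arguments are placeholders you need to fill in, and -f is dropped from kubectl logs so the command returns instead of streaming.

```shell
# Sketch only: collect_diagnostics is a hypothetical helper, not part of Che.
collect_diagnostics() {
  workspace="$1"   # name of the failing DevWorkspace
  user_ns="$2"     # namespace your workspaces are created in
  che_ns="$3"      # namespace where Che is installed

  echo "== DevWorkspaces in ${user_ns} =="
  kubectl get devworkspaces -n "$user_ns"

  echo "== DevWorkspace ${workspace} (check the .status field) =="
  kubectl get devworkspace "$workspace" -n "$user_ns" --output yaml

  echo "== DevWorkspace Operator logs =="
  # Same command as above, minus -f, so it prints the current logs and exits.
  kubectl logs -n "$che_ns" deploy/devworkspace-controller-manager -c devworkspace-controller
}
```

Usage: collect_diagnostics <workspace-name> <namespace> "$NAMESPACE" > diagnostics.txt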
    

You’re reporting what seems to be 3-4 disparate issues:

  1. Project clone is failing to clone your project – you should be able to work around this by removing checkoutFrom from the devfile as above. The underlying issue is https://github.com/devfile/devworkspace-operator/issues/913
  2. The workspace is failing to start when the devfile uses divine6/customopenjdk8:v10 instead of quay.io/devfile/universal-developer-image:ubi8-latest – for this you need to figure out what the issue with your image is (I can’t access it to look at it), but I suspect it’s a terminating image or something similar. The DevWorkspace CR status as described above would give us hints here
  3. There is (or was) also some cluster-wide issue – I’m not sure whether it’s resolved, but there’s a warning event reporting 0/4 nodes available. That’s beyond our scope for fixing.
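
For the second point, one way to check whether a container built from a custom image is exiting is to print each container's state from the workspace pod. This is a sketch, assuming kubectl access to the user namespace; show_container_states is a hypothetical helper name, and the pod name would come from kubectl get pods -n <namespace>.

```shell
# Sketch only: show_container_states is a hypothetical helper name.
show_container_states() {
  pod="$1"   # workspace pod name
  ns="$2"    # namespace your workspaces are created in
  # Print one line per container: its name, a tab, then its current state.
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.state}{"\n"}{end}'
}
```

A container whose state is terminated (with an exit code) rather than running would be consistent with an image whose entrypoint exits on its own.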

Does the devfile below work on your cluster?

schemaVersion: 2.1.0
metadata:
  name: cbfsel-repo4
projects:
  - name: cbfsel-project4
    git:
      remotes:
        origin: https://gitlab.eng.vmware.com/dchelladurai/cbf-sel.git
components:
  - container:
      image: 'quay.io/devfile/universal-developer-image:ubi8-latest'
      memoryLimit: 4G
    name: javacontainer

@amisevsk can you look at this issue, please? There seem to be some problems with the project-clone container.