topolvm: TopoLVM can't find a node to schedule the pod

(I can’t log in to your Slack, therefore I’m opening this issue.)

I’m on OpenShift 4 and have some issues scheduling a pod. OCP 4.8.2, Kubernetes 1.21.1.

❯ oc -n sfe-test describe pod nginx-1
Name:                 nginx-1
Namespace:            sfe-test
Priority:             1000000
Priority Class Name:  topolvm
Node:                 <none>
Labels:               app=nginx-1
Annotations:          capacity.topolvm.cybozu.com/00default: 3221225472
                      openshift.io/scc: anyuid
Status:               Pending
IP:
IPs:                  <none>
Containers:
  nginx:
    Image:      quay.io/bitnami/nginx:latest
    Port:       <none>
    Host Port:  <none>
    Limits:
      topolvm.cybozu.com/capacity:  1
    Requests:
      topolvm.cybozu.com/capacity:  1
    Environment:                    <none>
    Mounts:
      /data from data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-f6nvz (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  nginx-1
    ReadOnly:   false
  kube-api-access-f6nvz:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   BestEffort
Node-Selectors:              node-role.kubernetes.io/app=
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  4m30s  default-scheduler  0/9 nodes are available: 3 Insufficient topolvm.cybozu.com/capacity, 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.
  Warning  FailedScheduling  4m28s  default-scheduler  0/9 nodes are available: 3 Insufficient topolvm.cybozu.com/capacity, 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate.

Funny observation: before I deployed the topolvm-scheduler, I was able to schedule exactly one pod per active TopoLVM node, but no more, regardless of available capacity.

csi-provisioner log (topolvm-controller):

W0729 07:41:13.127859       1 feature_gate.go:235] Setting GA feature gate Topology=true. It will be removed in a future release.
I0729 07:41:13.127939       1 csi-provisioner.go:132] Version:
I0729 07:41:13.127961       1 csi-provisioner.go:155] Building kube configs for running in cluster...
I0729 07:41:13.136892       1 connection.go:153] Connecting to unix:///run/topolvm/csi-topolvm.sock
I0729 07:41:17.442322       1 common.go:111] Probing CSI driver for readiness
I0729 07:41:17.446994       1 csi-provisioner.go:244] CSI driver does not support PUBLISH_UNPUBLISH_VOLUME, not watching VolumeAttachments
I0729 07:41:17.453226       1 leaderelection.go:243] attempting to acquire leader lease syn-topolvm/topolvm-cybozu-com...
I0729 07:41:17.462838       1 leaderelection.go:253] successfully acquired lease syn-topolvm/topolvm-cybozu-com
I0729 07:41:17.563293       1 controller.go:838] Starting provisioner controller topolvm.cybozu.com_topolvm-controller-57955564c-f66gb_50795981-c847-4a54-8f74-c938016a4744!
I0729 07:41:17.563324       1 volume_store.go:97] Starting save volume queue
I0729 07:41:17.664039       1 controller.go:887] Started provisioner controller topolvm.cybozu.com_topolvm-controller-57955564c-f66gb_50795981-c847-4a54-8f74-c938016a4744!

Please let me know if you need more information.

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 16 (8 by maintainers)

Most upvoted comments

Sorry — if you use scheduler.enabled: false, you must also set webhook.podMutatingWebhook.enabled: false.
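
A minimal Helm values sketch of what the comment above suggests. The key paths scheduler.enabled and webhook.podMutatingWebhook.enabled are taken from the comment itself; how they nest in your values file may differ between chart versions, so treat this as an illustration, not the definitive chart layout:

```yaml
# values.yaml (sketch, assuming the TopoLVM Helm chart)
# If the scheduler extender is disabled, disable the pod-mutating
# webhook too: the webhook injects the topolvm.cybozu.com/capacity
# resource request that only the extender can account for, so leaving
# it on makes pods unschedulable.
scheduler:
  enabled: false
webhook:
  podMutatingWebhook:
    enabled: false
```

This matches the symptom in the events above: the injected topolvm.cybozu.com/capacity request shows up as "Insufficient topolvm.cybozu.com/capacity" when nothing is extending the scheduler's view of that resource.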

lvmd polls the free VG size every 10 minutes, so just wait up to 10 minutes after the VG has been extended.