kubernetes: CSI E2E tests fail with upcoming CSI release

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:

When testing k8s master with the “canary” CSI images (i.e. the images that will be tagged as the next release at the end of this week), the CSI volume tests fail.

PVCs remain pending because the external-attacher runs into a permission issue (from its log):

E0919 08:34:56.218106       1 leaderelection.go:224] error retrieving resource lock e2e-tests-csi-mock-plugin-l84wg/csi-hostpath: endpoints "csi-hostpath" is forbidden: User "system:serviceaccount:e2e-tests-csi-mock-plugin-l84wg:csi-hostpath-service-account" cannot get resource "endpoints" in API group "" in the namespace "e2e-tests-csi-mock-plugin-l84wg"
I0919 08:34:56.218136       1 leaderelection.go:180] failed to acquire lease e2e-tests-csi-mock-plugin-l84wg/csi-hostpath
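
For reference, one way to express the missing permission is an RBAC Role bound to the service account named in the log. This is a minimal, hypothetical sketch: the namespace and service-account names are copied from the error message above, the Role name and the exact verb list are illustrative, and the authoritative rules should come from the upstream CSI deployment manifests once they are updated.

```yaml
# Hypothetical Role granting the endpoints-based leader-election access
# that the external-attacher log above complains about. Namespace and
# service-account names are copied verbatim from the error message.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: csi-hostpath-leaderelection        # illustrative name
  namespace: e2e-tests-csi-mock-plugin-l84wg
rules:
- apiGroups: [""]
  resources: ["endpoints"]
  verbs: ["get", "watch", "list", "create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: csi-hostpath-leaderelection        # illustrative name
  namespace: e2e-tests-csi-mock-plugin-l84wg
subjects:
- kind: ServiceAccount
  name: csi-hostpath-service-account
  namespace: e2e-tests-csi-mock-plugin-l84wg
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: csi-hostpath-leaderelection
```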

What you expected to happen:

The tests should pass: the external-attacher's service account should have all the permissions it needs for leader election.

How to reproduce it (as minimally and precisely as possible):

  • build k8s master
  • start cluster: RUNTIME_CONFIG= ALLOW_PRIVILEGED=1 FEATURE_GATES="BlockVolume=true,MountPropagation=true,KubeletPluginsWatcher=true" hack/local-up-cluster.sh -O
  • run tests: make WHAT=test/e2e/e2e.test && go run hack/e2e.go -- --provider=local --test --test_args="--ginkgo.focus=CSI.plugin.test.using.CSI.driver:.hostPath -csiImageVersion=canary"

Anything else we need to know?:

It works with the latest released versions of the CSI containers (i.e. without -csiImageVersion=canary). That is how this test currently runs in the k8s CI.

If new permissions are needed, then https://kubernetes-csi.github.io/docs/Example.html also needs to be updated.

Most upvoted comments

Hi all, I noticed that the conversation has floated between this issue and two other PRs, so in an effort to understand what was going on and what was decided, I made a little summary that I will share here. Let me know if there is any misunderstanding or omission and I can edit this comment.

Assumptions:

  1. Each driver may require a different set of cluster roles to function
  2. Tests should be able to run in parallel
  3. Cluster roles should be easy(ish) to patch
  4. ClusterRoles in production deployment and test should not differ

Conclusions:

  1. We should use manifest YAMLs instead of bootstrapped roles or roles in code (1) (3) (https://github.com/kubernetes/kubernetes/pull/68821#issuecomment-423252660); see the sketch after this list
  2. We should deprecate the bootstrapped cluster roles ASAP (3) (1) (https://github.com/kubernetes/kubernetes/issues/68819#issuecomment-422834929)
  3. Cluster role names should be unique, potentially even appending UUIDs to the names when creating from manifests (2) (1)
  4. Ideally, there should be some sort of sync/import of the manifests between the external repo (the source of truth) and the test manifests (4)
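
To make conclusions 1 and 3 concrete, here is a hedged sketch of a role shipped as a manifest with a per-test unique name. The "<suffix>" placeholder and the rule set shown are illustrative, not the authoritative external-attacher rules:

```yaml
# Illustrative ClusterRole shipped as a manifest. "<suffix>" stands for
# a per-test unique string (e.g. a UUID) that the test harness would
# substitute before applying, so parallel tests do not collide.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: external-attacher-runner-<suffix>   # hypothetical naming scheme
rules:
- apiGroups: [""]
  resources: ["persistentvolumes"]
  verbs: ["get", "list", "watch", "update"]
- apiGroups: ["storage.k8s.io"]
  resources: ["volumeattachments"]
  verbs: ["get", "list", "watch", "update", "patch"]
```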

David Zhu notifications@github.com writes:

[quotes the assumptions and conclusions above]

I agree that this is a good direction. Both PRs address a subset of this (Jan’s starts to use manifests, mine drops usage of the bootstrapped cluster roles). I’m fine with merging PR #68887 first and then continuing the work based on that.

@wongma7, that means a new external-storage release and a rebase of external-provisioner, right? Do we still have time for that in 1.12, or shall we stick to endpoints there and move everything to Lease in 1.13? It’s ugly, but perhaps it’s better than rushed releases.

Yes, it means another release. Actually, I didn’t know about this Lease object, so if we are going to move to it in 1.13 anyway, let’s keep external-provisioner on endpoints for one more release and avoid rushing yet another release within the 1.12 timeframe. Waiting until 1.13 will be easier to communicate and easier for users to stomach: they can update external-attacher and external-provisioner in one go. So let’s stick to endpoints for now, just one release. 👍
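
For context, the Lease object discussed above is a dedicated leader-election resource in the coordination.k8s.io API group (v1beta1 around the 1.12/1.13 timeframe discussed here, coordination.k8s.io/v1 in later releases), so sidecars that switch to it need RBAC on leases instead of endpoints. A minimal sketch with illustrative names and values:

```yaml
# What an acquired leader-election Lease looks like (values illustrative).
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: csi-hostpath               # lock name, analogous to the endpoints lock above
  namespace: default
spec:
  holderIdentity: external-attacher-0
  leaseDurationSeconds: 15
---
# The matching RBAC rule targets leases instead of endpoints.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: csi-hostpath-leases        # hypothetical name
  namespace: default
rules:
- apiGroups: ["coordination.k8s.io"]
  resources: ["leases"]
  verbs: ["get", "watch", "list", "create", "update", "patch"]
```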