sigstore: Workload Identity Federation is not working with GCP KMS support
Description
Recently, we (with @dentrax and @erkanzileli) added support for other key management systems to Kyverno's image signature verification.^1 I then tried this feature on GKE with GCP KMS, using Workload Identity Federation^2 to grant access to the keys. I enabled it with the following commands:
🎗 Cross-ref: https://github.com/kyverno/website/pull/376
$ export PROJECT_ID=$(gcloud config get-value project)
$ export CLUSTER_NAME="gke-wif"
$ gcloud container clusters create $CLUSTER_NAME \
    --workload-pool=$PROJECT_ID.svc.id.goog --num-nodes=2
$ export GSA_NAME=kyverno-sa
$ gcloud iam service-accounts create $GSA_NAME
$ gcloud iam service-accounts add-iam-policy-binding \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:${PROJECT_ID}.svc.id.goog[kyverno/kyverno]" \
    ${GSA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com
$ gcloud projects add-iam-policy-binding ${PROJECT_ID} \
    --role roles/cloudkms.admin \
    --member serviceAccount:${GSA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com
$ kubectl annotate serviceaccount \
    --namespace kyverno \
    kyverno \
    iam.gke.io/gcp-service-account=${GSA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com
I then tried it with Kyverno, but it didn't work as expected. So I ran a small test: I started a Pod from the google/cloud-sdk:slim image under the same kyverno service account, and everything worked fine.
kubectl run -it --rm \
    --image google/cloud-sdk:slim \
    --serviceaccount kyverno \
    --namespace kyverno \
    workload-identity-test
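The post does not say what was run inside the test Pod, so the following is only a sketch of one way to check it. It assumes the standard GCE/GKE metadata server path for the default service account email; the helper names (expected_gsa, active_gsa, check_identity) are made up for illustration.

```shell
# Email of the Google service account (GSA) the Pod is expected to act as.
# usage: expected_gsa PROJECT_ID GSA_NAME
expected_gsa() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$2" "$1"
}

# Email the metadata server reports for the Pod's default credentials.
# Only meaningful when run inside a Pod on GKE (assumption: standard endpoint).
active_gsa() {
  curl -s --max-time 5 -H "Metadata-Flavor: Google" \
    "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email"
}

# Succeeds only if ACTUAL matches the expected GSA email.
# usage: check_identity PROJECT_ID GSA_NAME ACTUAL
check_identity() {
  want="$(expected_gsa "$1" "$2")"
  if [ "$3" = "$want" ]; then
    echo "ok: running as $want"
  else
    echo "mismatch: want $want, got ${3:-<empty>}"
    return 1
  fi
}
```

Inside the Pod, something like check_identity "$PROJECT_ID" "$GSA_NAME" "$(active_gsa)" would then confirm that the kyverno Kubernetes service account really maps to the intended GSA.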

About this issue
- State: closed
- Created 3 years ago
- Comments: 28 (19 by maintainers)
I found this, though I'm not sure it addresses our problem. The --k8s-keychain flag looks like it could help: I found these two blog posts, and both use this flag to enable the Workload Identity feature. Am I right, @mattmore @dlorenc?
WOOOOOOOOOOOOOOOOOWW, it worked @JimBugwadia @ribbybibby @dlorenc 🤩
All the problems were caused by the KMS roles I'd configured for the service account; thanks a ton to @ribbybibby for fixing my mistake. When I changed the role from roles/cloudkms.admin to roles/cloudkms.viewer and roles/cloudkms.verifier, everything worked properly.
Okay, if it's a public image then the keychain probably has nothing to do with it.
KMS definitely works for me with workload identity, so I suspect it might be a configuration issue in your environment.
What service account are the kyverno pods using? I notice from your original post that you were attaching workload identity to a service account called ‘kyverno-service-account’ but I think helm usually creates and uses a service account called just ‘kyverno’.
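For reference, the role fix reported in this thread (replacing roles/cloudkms.admin with roles/cloudkms.viewer and roles/cloudkms.verifier) could be applied roughly as follows. This is a sketch, not the exact commands anyone ran: the run and swap_kms_roles helpers and the my-project fallback are hypothetical, and the script only prints the gcloud commands unless DRY_RUN=0 is set.

```shell
# Print each command; execute it only when DRY_RUN=0 (dry-run is the default).
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Replace the admin role with the two roles needed for signature verification.
# usage: swap_kms_roles PROJECT_ID GSA_NAME
swap_kms_roles() {
  member="serviceAccount:$2@$1.iam.gserviceaccount.com"
  run gcloud projects remove-iam-policy-binding "$1" \
    --role roles/cloudkms.admin --member "$member"
  for role in roles/cloudkms.viewer roles/cloudkms.verifier; do
    run gcloud projects add-iam-policy-binding "$1" \
      --role "$role" --member "$member"
  done
}

# Falls back to placeholder values when the variables from the setup are unset.
swap_kms_roles "${PROJECT_ID:-my-project}" "${GSA_NAME:-kyverno-sa}"
```

Running it once in dry-run mode shows the three bindings that would change before anything is applied to the project.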
@developer-guy Yes, please 🙇
If it still doesn't work, then it would be nice to have some more details:
In 1.6, the cloud provider keychains are only set up when the 'kubernetes keychain' is initialized, and that only happens if there are image pull secrets specified by the flag -imagePullSecrets.
I think this PR needs to land in a Kyverno release before workload identity will work off the bat: https://github.com/kyverno/kyverno/pull/3116.
@developer-guy a workaround you can use in the meantime is to create an empty image pull secret and use that:
I've not been able to find the disable_gcp string in either cosign or sigstore, so I'm not sure what that tag is expected to do.
I looked into the cosign sources, while you are pointing to sigstore, so I'm not sure what I said makes sense. Indeed, it is most probably wrong. I'll dig into the sigstore GCP code later.
Hello @lukehinds, as I mentioned above, we recently added support for other key types to Kyverno's cosign-based verification. As you already know, KMS is one of them, so I wanted to test this by deploying Kyverno on GKE and storing the keys in GCP KMS. To access my keys in GCP KMS, Kyverno has to use Workload Identity. I did everything to make it work, but it didn't. So I suspected the reason is that Kyverno uses sigstore internally to authenticate to GCP services.
I took a look. My initial hypothesis was that the GCP SDK library used by cosign is not up to date and lacks that functionality.
I say cosign as it is the package used in the linked PR.
The google/cloud-sdk:slim image uses SDK version 364.0.0 (extracted from its Dockerfile).
cosign is using the google.golang.org/api package at version v0.60.0, which as of today is the latest version.
So I would not expect this to be a version-related issue.
Following the GKE documentation on using Workload Identity from code, I suspect the issue is in the authentication code: it may not be performing authentication in a way that supports metadata-server authentication, as per the GCP docs: