argo-workflows: artifactGC not working with IRSA
Pre-requisites
- I have double-checked my configuration
- I can confirm the issue exists when I tested with :latest
- I’d like to contribute the fix myself (see contributing guide)
What happened/what you expected to happen?
What happened? The artifactGC step failed, the object is still on S3, and the workflow did not get archived.
What did I expect? The artifactGC step succeeds, the object is removed from S3, and the workflow gets archived.
Version
v3.4.1
Paste a small workflow that reproduces the issue. We must be able to run the workflow; don’t enter a workflow that uses private images.
spec:
  serviceAccountName: argo-workflows-server
  artifactGC:
    strategy: OnWorkflowCompletion
    serviceAccountName: argo-workflows-workflow-controller
    podMetadata:
      annotations:
        eks.amazonaws.com/role-arn: "arn:aws:iam::redact:role/redact..approle"
logs from the artgc pod:
{"log":"time=\"2022-10-04T10:16:46.328Z\" level=info msg=\"S3 Delete artifact: key: redact/main.log\"\n","stream":"stderr","time":"2022-10-04T10:16:46.32863739Z"}
{"log":"time=\"2022-10-04T10:16:46.328Z\" level=info msg=\"Creating minio client using AWS SDK credentials\"\n","stream":"stderr","time":"2022-10-04T10:16:46.328647872Z"}
{"log":"time=\"2022-10-04T10:16:46.573Z\" level=warning msg=\"Non-transient error: NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\"\n","stream":"stderr","time":"2022-10-04T10:16:46.573355156Z"}
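In a longer artgc pod log the failure is easy to miss among the info lines. A minimal sketch for pulling out only the warning-or-worse entries, assuming the log was saved in the JSON-wrapped form shown above (the function name `artgc_problems` and the file name `artgc.log` are made up for illustration):

```shell
# Hypothetical helper: keep only log lines at warning level or above.
# Reads the JSON-wrapped lines on stdin and matches the "level=..."
# field embedded in each line.
artgc_problems() {
  grep -E 'level=(warning|error|fatal)'
}

# Usage (artgc.log is an assumed file holding the pod's log lines):
#   artgc_problems < artgc.log
```

On the three lines above, this would leave only the `NoCredentialProviders` warning.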
Note: to make other AWS access work, I set volumes/volumeMounts/env variables as in https://github.com/argoproj/argo-workflows/discussions/7461, but the artifactGC pod doesn’t seem to accept those?
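A quick way to narrow this down is to check whether IRSA was wired into the GC pod at all. The sketch below assumes the usual EKS behaviour, where the pod identity webhook reads the `eks.amazonaws.com/role-arn` annotation from the pod's ServiceAccount (not from pod annotations) and injects `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` into the containers. The function names and the default `argo` namespace are hypothetical; the pod name has to be supplied by you.

```shell
# Hypothetical helper: list the IRSA-related env var names set on a pod.
# $1 = pod name, $2 = namespace (assumed default: argo).
irsa_env_names() {
  kubectl get pod "$1" -n "${2:-argo}" \
    -o jsonpath='{range .spec.containers[*].env[*]}{.name}{"\n"}{end}' \
    | grep -E '^AWS_(ROLE_ARN|WEB_IDENTITY_TOKEN_FILE)$'
}

# Hypothetical helper: print the role ARN annotated on a ServiceAccount.
# $1 = ServiceAccount name, $2 = namespace (assumed default: argo).
sa_role_arn() {
  kubectl get serviceaccount "$1" -n "${2:-argo}" \
    -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
}

# Usage (substitute the real artgc pod name):
#   irsa_env_names <artgc-pod-name>
#   sa_role_arn argo-workflows-workflow-controller
```

If `irsa_env_names` prints nothing for the artgc pod, the AWS SDK has no web identity credentials to pick up, which would be consistent with the `NoCredentialProviders` error. If `sa_role_arn` is also empty, the role annotation may need to live on the ServiceAccount named in `artifactGC.serviceAccountName` rather than in `podMetadata`.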
### Logs from the workflow controller
kubectl logs -n argo deploy/workflow-controller | grep ${workflow}
### Logs from your workflow's wait container
kubectl logs -n argo -c wait -l workflows.argoproj.io/workflow=${workflow},workflows.argoproj.io/phase!=Succeeded
About this issue
- State: open
- Created 2 years ago
- Comments: 21 (18 by maintainers)
Unfortunately, there is this issue related to the ArtifactGC test. It seems to be an issue of artifacts often not getting saved to minio. I believe if each Workflow is run manually one at a time it doesn’t happen.
By the way, maybe you already found it but the current logic for grouping artifacts by pod access is here.