pipeline: TaskRun fails during initialization when disable-home-env-overwrite=true
This is closely related to the ongoing Tekton $HOME issue (https://github.com/tektoncd/pipeline/issues/2013#issuecomment-585908031). I am testing disable-home-env-overwrite before it gets flipped.
This comment says:
With this new flag Tekton will no longer interfere with HOME - it will be whatever you expect it to be when the container runs in a Pod.
Previously $HOME would have been set to /tekton/home but now it won’t be. So I would expect $HOME/.docker/config.json to be written to /root/.docker/config.json if the user is root and the image doesn’t specify its own HOME.
I don’t think this is the case. I am testing gcr.io/cloud-builders/gradle, but Tekton fails as it tries to create a directory /.docker.
"level":"fatal",
"ts":1583431818.4164164,
"caller":"creds-init/main.go:41",
"msg":"Error initializing credentials: mkdir /.docker: permission denied",
"stacktrace":
main.main
github.com/tektoncd/pipeline/cmd/creds-init/main.go:41
runtime.main
runtime/proc.go:203
Note the “permission denied” error is not the issue here. The issue is that it is /.docker instead of /root/.docker.
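To make the path difference concrete, here is a minimal Go sketch. It is not the creds-init code: resolveHome and dockerConfigDir are hypothetical helpers, and the fallback to / when HOME is empty is an assumption (a later comment in this thread describes that default).

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveHome is a hypothetical helper. It returns the given HOME value,
// or "/" when HOME is empty (assumed fallback, not confirmed Tekton logic).
func resolveHome(homeEnv string) string {
	if homeEnv == "" {
		return "/"
	}
	return homeEnv
}

// dockerConfigDir returns the directory where a .docker config would be
// created under the resolved home directory.
func dockerConfigDir(homeEnv string) string {
	return filepath.Join(resolveHome(homeEnv), ".docker")
}

func main() {
	fmt.Println(dockerConfigDir(""))             // HOME unset: /.docker
	fmt.Println(dockerConfigDir("/root"))        // what the issue expects for root
	fmt.Println(dockerConfigDir("/tekton/home")) // the previous overwritten HOME
}
```

With an empty HOME the sketch produces /.docker, which matches the path in the log; with HOME=/root it produces /root/.docker, the path the report expected.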
About this issue
- State: closed
- Created 4 years ago
- Comments: 35 (17 by maintainers)
I’ve gone with the approach of placing the creds in a fixed location (/tekton/home) and made a PR here: https://github.com/tektoncd/pipeline/pull/2180
The check-dirs Step always sees the correct $HOME value, /workspace. We set this explicitly on the Step. In some of our runs above, though, the Task dies before it gets to check-dirs. The log output we see in these cases is only for git-source-xxxx and create-dir-image-xxxx. They fail writing to /.
One small nit here: we don’t set the securityContext on the Task but on the Step. So check-dirs, the step container, receives the securityContext. But creds-init (an injected initContainer) and pipeline resource injected containers do not receive that securityContext. If you remove the securityContext from the check-dirs Step then the TaskRun’s securityContext is applied to all containers equally.
This is all immeasurably confusing. I’m going to try to illustrate the different scenarios here:
First scenario:
disable-home-override: “true”
No TaskRun securityContext: UID=root
No check-dirs securityContext: UID=root
Order of operations:
- [...] root ownership.
- [...] root, copy creds from /tekton/creds to /.
- [...] root, writes to /.gitconfig successfully.
- check-dirs container, runs as root, read creds from /tekton/creds, write creds to /workspace.
Second scenario:
disable-home-override: “true”
No TaskRun securityContext: UID=root
check-dirs has securityContext: UID=1234
Order of operations:
- [...] root ownership.
- [...] root, copy creds from /tekton/creds to /.
- [...] root, writes to /.gitconfig successfully.
- check-dirs container, runs as 1234, dies copying creds from /tekton/creds to /workspace because they’re owned by root. Error: [check-dirs] 2020/05/01 14:53:01 unsuccessful cred copy: ".ssh" from "/tekton/creds" to "/workspace": unable to open source: open /tekton/creds/.ssh/known_hosts: permission denied
Third scenario:
disable-home-override: “true”
TaskRun has securityContext: UID=1111
check-dirs has securityContext: UID=1234
Order of operations:
- [...] 1111 ownership.
- [...] 1111, fail to copy creds from /tekton/creds to /. Messages: unsuccessful cred copy: ".ssh" from "/tekton/creds" to "/": unable to create destination directory: mkdir /.ssh: permission denied
- [...] 1111, fatal error: dies writing to /.gitconfig. Error: {"level":"error","ts":1588180626.7862072,"caller":"git/git.go:41","msg":"Error running git [config --global http.sslVerify true]: exit status 255\nerror: could not lock config file //.gitconfig: Permission denied\n","stacktrace":"github.com/tektoncd/pipeline/pkg/git.run\n\tgithub.com/tektoncd/pipeline/pkg/git/git.go:41\ngithub.com/tektoncd/pipeline/pkg/git.Fetch\n\tgithub.com/tektoncd/pipeline/pkg/git/git.go:82\nmain.main\n\tgithub.com/tektoncd/pipeline/cmd/git-init/main.go:53\nruntime.main\n\truntime/proc.go:203"}
Here check-dirs never runs. Therefore no mention of /workspace.
This is a really confusing dance and there’s quite a bit of work to do to get all of Tekton’s movements in lock-step.
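The three scenarios above come down to ordinary Unix permission checks, which the following Go sketch models. It is a simplification under stated assumptions, not Tekton code: canAccess is a hypothetical helper that ignores group bits, the credential files are assumed to have mode 0600, and / is assumed to be owned by root with mode 0755.

```go
package main

import (
	"fmt"
	"os"
)

// Permission bits as they appear in a single rwx triplet.
const (
	permRead  os.FileMode = 4
	permWrite os.FileMode = 2
)

// canAccess is a simplified permission check: root bypasses the mode bits,
// the owner is checked against the owner triplet, and everyone else against
// the "other" triplet. Group membership is deliberately ignored.
func canAccess(uid, owner int, mode os.FileMode, want os.FileMode) bool {
	if uid == 0 {
		return true
	}
	perm := mode.Perm()
	if uid == owner {
		return (perm>>6)&want == want
	}
	return perm&want == want
}

func main() {
	// Scenario 1: creds owned by root (assumed 0600), Step runs as root.
	fmt.Println("scenario 1, read creds:", canAccess(0, 0, 0o600, permRead))
	// Scenario 2: creds owned by root, Step runs as 1234.
	fmt.Println("scenario 2, read creds:", canAccess(1234, 0, 0o600, permRead))
	// Scenario 3: injected container runs as 1111 and writes to "/"
	// (assumed root-owned, 0755).
	fmt.Println("scenario 3, write to /:", canAccess(1111, 0, 0o755, permWrite))
}
```

Under these assumptions only the first scenario succeeds, which lines up with the "permission denied" messages quoted for the second and third.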
Excellent, I’ve been able to reproduce the problem exactly.
create-dir-image-XXXXX is injected into a Task when either the GCS PipelineResource is used or Tekton decides it needs to create an extra directory during PipelineResource linking. It doesn’t have a HOME when the home override flag is “true” and it doesn’t run as root when securityContext sets a non-zero user ID. So it reports errors when the entrypoint tries to copy credentials out of /tekton/creds into /.
git-source-XXXXX is placed into a Task when the Git PipelineResource is used. It shares the same problems as above and also adds another wrinkle: it can’t lock the $HOME/.gitconfig file for setting configuration options. This is again because $HOME isn’t set, it defaults to /, and it’s running as a non-zero user ID. Unlike create-dir-image-, this is a fatal error for the Git PipelineResource and the Task dies here.
Ultimately the errors with these two Steps are happening because PipelineResources don’t have a HOME set and they’re trying to write to / as a non-root user due to the securityContext.
So summarizing the various problems that have been discovered here:
- The Git PipelineResource needs to be able to lock and write files in $HOME. Specifically $HOME/.gitconfig.
- Credentials need to be written to /tekton/creds using the UID of the currently running Step. creds-init can’t do this on its own because UID can differ from container to container.
And the likely solutions seem to me:
- PipelineResources need to have their HOME set somewhere they can always write regardless of UID. I’m thinking /tekton/home since it’s always mounted (even when the override flag is true) and it’s always world-writeable since it’s an emptyDir.
- creds-init probably needs to go away completely and have its logic moved into the entrypointer. This is the only solution I can think of that will allow UID to be random, creds copied out of secret volumes with the correct file permissions, and HOME to be discovered at runtime.
Ideally the entrypointer could copy credentials straight out of secret volumes and into wherever they think $HOME is. Unfortunately an annoying extra problem that I’ve brought upon myself is that I’ve introduced $(credentials.path), which I’ve documented as pointing to a single location. So the entrypointer is going to need to copy the creds to /tekton/creds as well as copying them to wherever $HOME is.
I’ll create issues for each of these problems and then start working on fixes for both.
Just to reiterate from the Pull Request that closed this Issue:
- [...] /tekton/creds when the disable-home-env-overwrite flag is “true”.
- [...] $(credentials.path) which points to the place where creds-init wrote the credentials.
- [...] $(credentials.path) to the Step’s HOME. We find the HOME directory using go-homedir rather than relying on just the $HOME env var.
@chanseokoh once v0.11.0-rc3 is released this fix will be available to try out. Very keen to hear your feedback / experience with the changes!
Design doc for this problem to be discussed in WG on Wednesday: https://docs.google.com/document/d/1SVuDt-SXPHymz41dveSXFSPrx5Z-Wzb9eHliJAyYg4o
Yeah that’s a fair point and I understand not wanting to take this path if this isn’t an approach that everyone uses. I’m wondering whether this should become a recommendation for catalog authors though - to expose (optional) workspaces for credentials to be mounted into. If everyone was doing it then it might not be bad?