pipeline: Solve chicken and egg problem of Tekton config-as-code
Expected Behavior
It should be possible to write Pipeline and Task definitions and commit them to the repo they act on (e.g. this is the intent of the tekton dir in this repo!).
- Changes to these definitions that are committed to the repo should be picked up and used as soon as they are merged
- Changes to these definitions in pull requests should be used when testing the pull request, but without affecting any other Runs (e.g. already executing Runs, Runs executing against other Pull Requests)
Actual Behavior
We aren’t yet fully dogfooding Tekton, but once we do we will run into a chicken-and-egg problem: the PipelineRuns that will be generated by Prow (see https://github.com/tektoncd/pipeline/issues/531 and https://github.com/kubernetes/test-infra/pull/11888) will be created assuming that the Pipeline they refer to (and the Tasks it refers to) already exist in the cluster.
Additionally, the PipelineRuns it generates for pull requests will all refer to the same Pipeline, which means that if a pull request changes a Pipeline or Task:
- Nothing would be applying the changed Pipeline or Task
- If we do add a mechanism to apply the changes before the PipelineRun is created, this change would apply to all PipelineRuns in the same cluster that refer to those Pipelines and Tasks
Additional information
This is being presented in the context of Prow b/c that is what we will be using in the near future for triggering PipelineRuns against this repo, but the solution to this issue should be applicable regardless of what the event triggering system is, whether it’s Prow or something completely different (see #315).
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 3
- Comments: 19 (6 by maintainers)
Thanks for the link! I really need to learn kustomize.
Rotten issues close after 30d of inactivity. Reopen the issue with
/reopen. Mark the issue as fresh with /remove-lifecycle rotten.
/close
Send feedback to tektoncd/plumbing.
I’ve been thinking a cool solution to this problem could be referring to Pipelines (and Tasks) stored in git repos.
In #1839 we’re adding support for storing Tasks (and Pipelines) in OCI image registries, and we’ve opened the possibility of storing them in git repos.
e.g. OCI image:
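A sketch of what such a reference might look like, using the OCI bundle syntax added in #1839 (the image name and tag here are purely illustrative):

```yaml
# Hypothetical TaskRun referencing a Task stored in an OCI image registry
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  name: example-run
spec:
  taskRef:
    name: my-task
    # "bundle" points at an image that contains the Task definition
    bundle: gcr.io/my-project/tekton-catalog:v0.1
```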
e.g. git:
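A sketch of what a git reference might look like; at the time of this issue it was only a possibility, though present-day Tekton exposes something similar via the git resolver (the URL and paths are illustrative):

```yaml
# Hypothetical PipelineRun referencing a Pipeline stored in a git repo
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: example-run
spec:
  pipelineRef:
    resolver: git
    params:
      - name: url
        value: https://github.com/example/repo.git
      - name: revision
        value: main
      - name: pathInRepo
        value: tekton/pipeline.yaml
```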
If you set up your triggering such that the PipelineRuns created for a project refer to a Pipeline in a git repo, you could have logic that adds the commitish of a pull request, so that whatever is in the repo at that commit gets run.
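One way to sketch that logic, assuming a Tekton Triggers TriggerTemplate that receives the pull request's head SHA as a parameter (all names and the git-resolver fields are illustrative):

```yaml
# Hypothetical TriggerTemplate that pins the Pipeline reference to the commit
# of the pull request being tested, so each PR runs its own definitions
apiVersion: triggers.tekton.dev/v1beta1
kind: TriggerTemplate
metadata:
  name: pr-pipeline-template
spec:
  params:
    - name: pr-sha    # head commit of the pull request, supplied by the trigger
  resourcetemplates:
    - apiVersion: tekton.dev/v1beta1
      kind: PipelineRun
      metadata:
        generateName: pr-run-
      spec:
        pipelineRef:
          resolver: git
          params:
            - name: url
              value: https://github.com/example/repo.git
            - name: revision
              value: $(tt.params.pr-sha)   # the PR's commitish
            - name: pathInRepo
              value: tekton/pipeline.yaml
```

Because the revision is the PR's own commit, each PipelineRun uses the definitions from that PR without touching the Pipelines and Tasks applied to the cluster.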
You could combine this with OCI images in whatever way makes sense to you, e.g. Pipelines are referred to by their location in a git repo (they might be more likely to change) and Tasks are usually referred to by their location in an image registry (assuming they change less).
This potentially would solve the problem for Pipelines and Tasks but not for the triggering logic itself. Could be an improvement tho?
Some thoughts about this. It is definitely an important issue to solve, but I’m not convinced it’s something that can fully be solved on the Tekton side. The ability to run pre-merge pipeline/task definitions is a functionality of the CI system that triggers the pipeline.
That said, we could provide functionality on the Tekton side that helps CI systems realize this. Something that might help is the ability for a taskRun/pipelineRun to reference a task/pipeline by some kind of URI (instead of by name only), e.g. git://repo/path/to/pipeline@gitref. A task/pipeline defined that way would be pulled from git and expanded in the taskRun/pipelineRun definition. That would allow combining the power of embedding with the reuse of pipelines.
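The proposed URI-style reference might look something like this (not an existing Tekton field; purely a sketch of the idea, with an illustrative repo and path):

```yaml
# Hypothetical: pipelineRef by URI; the controller would fetch the definition
# from git at the given ref and inline it into the run
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: uri-ref-example
spec:
  pipelineRef:
    name: git://github.com/example/repo/tekton/pipeline.yaml@main
```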
What happens when a PR that modifies the task/pipeline definition is merged? The desired behaviour would be that any running tasks/pipelines keep running on the old definition and any new task/pipeline uses the new definition. However if the new definition is applied to the cluster this would affect in-flight tasks/pipelines.
I think there are two ways we can solve this. One is to always embed tasks/pipelines (either explicitly or by URI); the other is to have the controller create a copy of the tasks/pipelines before starting a run, and update the run to use the copy instead of the original. Either way we would have a proliferation of copies of Tasks and Pipelines, and a single Task/Pipeline resource applied to the cluster would only be useful for things like UIs that visualize / help build the task/pipeline.