pipeline: task time exceeded `timeouts.tasks` when task retried
Expected Behavior
If `timeouts.tasks` is exceeded, the task should not be retried.
The `finally` tasks should always be executed.
Actual Behavior
If the task is retried, its total run time exceeds `timeouts.tasks`.
In addition, if this causes the pipeline execution time to exceed `timeouts.pipeline`, the `finally` tasks are forcibly timed out.
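The effect on `finally` can be sketched with some simple arithmetic. This is only an illustration, not Tekton's actual implementation: the function name `finally_budget` and its parameters are hypothetical, and it assumes that the `finally` tasks get at most `timeouts.finally`, capped by whatever is left of `timeouts.pipeline`.

```python
def finally_budget(pipeline_timeout, finally_timeout, elapsed):
    """Seconds available to the finally tasks (hypothetical model).

    The budget is capped both by timeouts.finally and by the time
    remaining before timeouts.pipeline is reached.
    """
    remaining = float(pipeline_timeout) - float(elapsed)
    return max(0.0, min(float(finally_timeout), remaining))


# Tasks section stops at timeouts.tasks (5s): finally keeps its full 5s.
on_time = finally_budget(pipeline_timeout=10, finally_timeout=5, elapsed=5)

# Tasks section overruns to an illustrative 9s: only 1s is left for finally.
overrun = finally_budget(pipeline_timeout=10, finally_timeout=5, elapsed=9)
```

Under this assumption, any time a retried task spends beyond `timeouts.tasks` comes directly out of the time left for `finally`.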
Steps to Reproduce the Problem
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: please-say-bye
spec:
  tasks:
    - name: hi
      retries: 2
      taskSpec:
        steps:
          - name: hi
            image: alpine:3.12
            script: |
              echo 'hi'
              sleep 10
  finally:
    - name: bye
      taskSpec:
        steps:
          - name: bye
            image: alpine:3.12
            script: |
              echo 'bye'
---
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  generateName: please-say-bye-
spec:
  timeouts:
    pipeline: 10s
    tasks: 5s
    finally: 5s
  pipelineRef:
    name: please-say-bye
❯ tkn -n pipelines tr ls --label tekton.dev/pipeline=please-say-bye
NAME                             STARTED          DURATION    STATUS
please-say-bye-tzjdt-bye-pft4x   25 seconds ago   1 second    Failed(TaskRunTimeout)
please-say-bye-tzjdt-hi-qqlgs    30 seconds ago   5 seconds   Failed(TaskRunTimeout)
❯ tkn -n pipelines pr desc please-say-bye-tzjdt
Name: please-say-bye-tzjdt
Namespace: pipelines
Pipeline Ref: please-say-bye
Service Account: default
Labels:
tekton.dev/pipeline=please-say-bye
🌡️ Status
STARTED        DURATION     STATUS
1 minute ago   14 seconds   Failed
💌 Message
Tasks Completed: 2 (Failed: 2, Cancelled 0), Skipped: 0 (TaskRun "please-say-bye-tzjdt-bye-pft4x" failed to finish within "1s")
📦 Resources
No resources
⚓ Params
No params
📝 Results
No results
📂 Workspaces
No workspaces
🗂 Taskruns
 NAME                               TASK NAME   STARTED        DURATION    STATUS
 ∙ please-say-bye-tzjdt-bye-pft4x   bye         1 minute ago   1 second    Failed(TaskRunTimeout)
 ∙ please-say-bye-tzjdt-hi-qqlgs    hi          1 minute ago   5 seconds   Failed(TaskRunTimeout)
⏭️ Skipped Tasks
No Skipped Tasks
Additional Info
- Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-12T14:18:45Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.4-eks-6b7464", GitCommit:"6b746440c04cb81db4426842b4ae65c3f7035e53", GitTreeState:"clean", BuildDate:"2021-03-19T19:33:03Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
- Tekton Pipeline version:
Client version: 0.19.0
Pipeline version: v0.25.0
Triggers version: v0.14.0
Dashboard version: v0.17.0
About this issue
- State: closed
- Created 3 years ago
- Comments: 17 (11 by maintainers)
Commits related to this issue
- Stop executing retried `Tasks` past their timeouts Today, when a `Task` is retried, it executes past its timeout. Even worse, the `Task` executing past its timeout causes the execution of `Finally Ta... — committed to jerop/pipeline by jerop 3 years ago
I’m returning to this issue to see if it has been resolved by #5134 (FYI @abayer).
@ornew, I’m curious why you say in your original comment that the finally task should not be timed out if `timeouts.pipeline` is exceeded. I think not running the finally task is intended behavior, as `timeouts.pipeline` should refer to the entire time the pipeline is running. I tried the example from your original comment, and it results in the pipelinerun being timed out after 10 seconds and the finally tasks not being run, which I believe is correct, as the pipelinerun should stop running after 10s.

If you’d like to allow the finally tasks to run indefinitely but have the tasks section time out after some time, I think you need to specify `timeouts.tasks` = 5s and `timeouts.pipeline` = 0 (no timeout). (Unfortunately, this doesn’t work; I filed https://github.com/tektoncd/pipeline/issues/5459, but it should be easily fixable.)

There’s still a bug, though, with the retried taskrun not being timed out when `timeouts.tasks` is reached. Here’s a reproducer:

In this example, each attempt (there are 3) sleeps for 10s and fails, the finally task is run, and the pipelinerun fails. I would expect that the first attempt fails and retries, the taskrun is canceled before the second attempt completes, the finally task runs, and the pipelinerun fails.
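
The expected sequence described above can be sketched as a small timeline simulation. This is a hedged illustration rather than Tekton code: `simulate_attempts` is a hypothetical helper, and the 15s value for `timeouts.tasks` is an assumption chosen only so that the budget runs out during the second attempt (the value used in the reproducer is not shown here).

```python
def simulate_attempts(tasks_timeout, attempt_duration, retries):
    """Return (start, end, outcome) per attempt when all attempts share
    a single timeouts.tasks budget (hypothetical model).

    Each attempt is assumed to fail on its own after attempt_duration seconds.
    """
    timeline = []
    now = 0.0
    for _ in range(retries + 1):
        remaining = float(tasks_timeout) - now
        if remaining <= 0:
            break  # budget spent: no further retry should start
        if attempt_duration <= remaining:
            end, outcome = now + attempt_duration, "failed"
        else:
            end, outcome = now + remaining, "timed out"
        timeline.append((now, end, outcome))
        now = end
        if outcome == "timed out":
            break  # a timed-out attempt is not retried
    return timeline


# Assumed 15s budget, 10s per attempt, 2 retries (3 attempts at most).
expected = simulate_attempts(tasks_timeout=15, attempt_duration=10, retries=2)

# Values from the original report: 5s budget, 10s sleep, 2 retries.
original = simulate_attempts(tasks_timeout=5, attempt_duration=10, retries=2)
```

With the assumed 15s budget, the model yields one failed attempt followed by a second attempt cut off at 15s, and the third attempt never starts. With the values from the original report, the only attempt is cut off at 5s and no retry starts.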
/assign @vsinghai