pipeline: Failed TaskRuns should still report results

Expected Behavior

Results produced by failed steps should still be included in the task results.

Actual Behavior

Results are present in the “message” field of the step, but not in the task results.
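As a rough illustration of that gap, the Python sketch below recovers results from a step's termination message when `taskResults` is missing. The sample status is hand-written and hypothetical. It assumes the message is a JSON array of objects with `key` and `value` fields, which appears to be how Tekton encoded results around v0.17. The helper name `results_from_step_messages` is invented for this example.

```python
import json

# Hypothetical excerpt of `kubectl get taskrun result-task-run -o json`.
# The field layout is an assumption based on the v1beta1 TaskRun status.
taskrun = {
    "status": {
        "conditions": [{"type": "Succeeded", "status": "False"}],
        "steps": [
            {
                "name": "write-result",
                "terminated": {
                    "exitCode": 1,
                    # Assumed encoding of the result inside the message field.
                    "message": '[{"key":"sampleResult","value":"Result expected\\n","type":"TaskRunResult"}]',
                },
            }
        ],
        # No "taskResults" key: the failed step's result was not promoted.
    }
}


def results_from_step_messages(taskrun):
    """Collect key/value pairs found in each terminated step's message."""
    found = {}
    for step in taskrun.get("status", {}).get("steps", []):
        message = step.get("terminated", {}).get("message", "")
        if not message:
            continue
        try:
            entries = json.loads(message)
        except json.JSONDecodeError:
            continue  # message is not JSON, so there is nothing to recover
        if not isinstance(entries, list):
            continue
        for entry in entries:
            if isinstance(entry, dict) and "key" in entry:
                found[entry["key"]] = str(entry.get("value", "")).strip()
    return found


recovered = results_from_step_messages(taskrun)
```

A consumer has to do this per step and per TaskRun, which is the workaround this issue is trying to avoid.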

Steps to Reproduce the Problem

Apply the following to a cluster with Tekton installed:

apiVersion: tekton.dev/v1beta1
kind: ClusterTask
metadata:
  name: result-task
spec:
  results:
  - name: sampleResult
    description: Result
  steps:
  - name: write-result
    image: gcr.io/google-containers/busybox:1.27
    script: |
      #!/bin/sh
      /bin/echo 'Result expected' > $(results.sampleResult.path)
      exit 1

---
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  name: result-task-run
spec:
  taskRef:
    kind: ClusterTask
    name: result-task

Additional Info

Kubernetes version:

Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:50:19Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.13-gke.401", GitCommit:"eb94c181eea5290e9da1238db02cfef263542f5f", GitTreeState:"clean", BuildDate:"2020-09-09T00:57:35Z", GoVersion:"go1.13.9b4", Compiler:"gc", Platform:"linux/amd64"}

Tekton Pipeline version: v0.17.1 (I first ran into this in v0.16.3)

About this issue

State: closed
Created 4 years ago
Comments: 19 (6 by maintainers)

Most upvoted comments

Is it possible that the user facing abstraction you are creating could access these logs? If the main problem is actually that it’s difficult for you to fetch the logs, maybe we should tackle that directly.

@bobcatfish This sums up the bulk of the issue: I was building an operator that, as part of reconciliation, is managing Tekton tasks. I could definitely add logic to read logs, but I was trying to avoid the work (and needing to add the corresponding permissions).

I do think that there is room for error details to be distinct from logs. Tools can have noisy outputs (e.g. prompts for surveys), and STDERR is generally more valuable. Additionally, error detection in a script (e.g. checking if an output file is empty) could use echo, but writing directly to an error log might be preferable to produce actionable output.

ekupershlak on Dec 1, 2020

Great discussion!!! Thanks @ekupershlak and @vincent-pli for going into detail about your use cases, this really helps.

I’m thinking 2 things:

1. Maybe we could nail down the problem statement a bit, e.g. in the form of a TEP? (https://github.com/tektoncd/community/tree/master/teps) The problem seems to be about understanding why a Task failed and we’ve generated a couple of alternatives in this issue already, e.g. leveraging Task results and also letting a Task emit a reason to provide more context. We’ve been trying to use an approach with TEPs where we nail down the problem statement before getting too deep into the solution.
2. I actually want to double check that using logs isn’t the best solution here? (I think a very clear problem statement could help explain why using logs doesn’t accomplish what you need as well) @ekupershlak you said:

Having the user do it breaks encapsulation and doing it programmatically requires access to the underlying Pods.

Is it possible that the user facing abstraction you are creating could access these logs? If the main problem is actually that it’s difficult for you to fetch the logs, maybe we should tackle that directly.

My thinking is that even if we DO add a failure reason or allow Tasks to emit results even when they fail, you’re still going to need to write ALL of your Tasks in such a way that they are aware of their own failure and can introspect. That’s going to be difficult to do since you’d need to handle the failure of ANY part of your Task.
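To make that concern concrete, here is a minimal Python sketch of the kind of wrapper every step would need in order to report its own failure. It is not a Tekton API. The function name `run_and_record` is hypothetical, and a temporary file stands in for a result path such as `$(results.errorMessage.path)`.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_record(cmd, error_result_path):
    """Run cmd; on a non-zero exit, write the exit code and the last
    stderr line to error_result_path. Returns the exit code."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        lines = proc.stderr.strip().splitlines()
        detail = lines[-1] if lines else "no stderr output"
        Path(error_result_path).write_text(f"exit {proc.returncode}: {detail}\n")
    return proc.returncode


# Usage: a deliberately failing child process, with a temp file as the result path.
with tempfile.TemporaryDirectory() as tmp:
    result_file = Path(tmp) / "errorMessage"
    failing = [sys.executable, "-c", "import sys; sys.stderr.write('boom\\n'); sys.exit(3)"]
    code = run_and_record(failing, result_file)
    recorded = result_file.read_text()
```

The wrapper only covers failures of the command it launches. If the wrapper process itself is killed, or the container never starts, nothing is written.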

bobcatfish on Dec 1, 2020

We’re essentially describing Results that are only optionally populated (i.e. they only populate when there’s an error). Elsewhere we are proposing that a Task which doesn’t emit a declared result should be an automatic failure: https://github.com/tektoncd/pipeline/issues/3497. We’ve proposed this because users should be able to depend on Task Results so that other Tasks in a Pipeline can use them reliably. A missing Result should be considered a bug in a successful Task.

So, let’s look at an example Task with a failure result:

spec:
  results:
  - name: errorMessage
    description: If this task fails, it will return an error message in this result.

When this Task succeeds it must return a value for errorMessage if we implement #3497. What’s the correct value of errorMessage when the Task succeeds? Empty string? "No errors" message? An exit code “0”? And then, thinking about this further, could a UI or other tool make use of that failure information and display it in some way to help users? How does the UI/tool discover the correct Result to read the error information from?
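The tension can be shown with a small Python model of the rule proposed in #3497. This is an assumed reading of that proposal, not Tekton's implementation, and `evaluate_task` is a hypothetical name.

```python
def evaluate_task(declared, emitted, steps_succeeded):
    """Return (outcome, missing_results) under the assumed #3497 rule:
    a Task whose steps succeed but which omits a declared result fails."""
    if not steps_succeeded:
        return "Failed", []
    missing = [name for name in declared if name not in emitted]
    if missing:
        return "Failed", missing
    return "Succeeded", []


declared = ["errorMessage"]

# Steps succeeded and there was no error to report, so nothing was written.
no_value = evaluate_task(declared, {}, steps_succeeded=True)

# The Task writes a placeholder just to satisfy the contract.
placeholder = evaluate_task(declared, {"errorMessage": ""}, steps_succeeded=True)
```

Under this model `no_value` is a failure caused only by the missing `errorMessage`, so every successful run would have to emit some placeholder value.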

So I guess I’m still not that convinced that we need to muddy the contract around Results to support this. I’d personally prefer to see Results only returned on success and a different field returned on failure. See my earlier comment https://github.com/tektoncd/pipeline/issues/3439#issuecomment-718073395 for a suggestion of a different possible way to tackle the same problem.

ghost on Dec 1, 2020

@ekupershlak Your case is exactly the same as mine. I need to deliver something like the error source, an error message, and some customized data to the outside world when a PipelineRun fails. I think TaskRunResult is a good candidate: just watch the PipelineRun and parse the TaskRunResult to learn the root cause and decide what to do next.
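A consumer of that kind might look roughly like the Python sketch below. It shows the requested behavior, not what Tekton does today: the sample PipelineRun status is hypothetical and includes `taskResults` on a failed TaskRun. The layout (`status.taskRuns` keyed by TaskRun name, each with `pipelineTaskName` and `status`) is assumed from the v1beta1 API, and `failed_task_results` is an invented helper.

```python
# Hypothetical PipelineRun status in which a failed TaskRun carries results.
pipelinerun = {
    "status": {
        "taskRuns": {
            "demo-run-build-abc12": {
                "pipelineTaskName": "build",
                "status": {
                    "conditions": [{"type": "Succeeded", "status": "True"}],
                    "taskResults": [{"name": "digest", "value": "sha256:0000"}],
                },
            },
            "demo-run-deploy-def34": {
                "pipelineTaskName": "deploy",
                "status": {
                    "conditions": [{"type": "Succeeded", "status": "False"}],
                    "taskResults": [
                        {"name": "errorSource", "value": "kubectl"},
                        {"name": "errorMessage", "value": "deployment timed out"},
                    ],
                },
            },
        }
    }
}


def failed_task_results(pipelinerun):
    """Map each failed pipeline task name to the results its TaskRun reported."""
    report = {}
    task_runs = pipelinerun.get("status", {}).get("taskRuns", {})
    for run_name, run in task_runs.items():
        status = run.get("status", {})
        failed = any(
            c.get("type") == "Succeeded" and c.get("status") == "False"
            for c in status.get("conditions", [])
        )
        if failed:
            task_name = run.get("pipelineTaskName", run_name)
            report[task_name] = {
                r["name"]: r["value"] for r in status.get("taskResults", [])
            }
    return report


failures = failed_task_results(pipelinerun)
```

With results missing from failed TaskRuns, the same loop would return an empty dictionary for `deploy`, and the caller would have to fall back to logs.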

@pritidesai I think consuming results in finally can help in my case: I could then launch my custom task in finally (since finally does not support task chaining) with the result as a param, with no need to retrieve the PipelineRun object. But it seems the PR is still not merged.

So, is there any reason we cannot let a failed task report its results? I am not aware of the previous discussion about that; could you clarify, @sbwsg?

vincent-pli on Dec 1, 2020