pipeline: Timed Out or Cancelled TaskRun Pods are Deleted

Expected Behavior

After a TaskRun times out or is cancelled, I should still be able to view the logs of the failed or cancelled TaskRun.

Actual Behavior

It appears that the pods for these TaskRuns are being deleted due to a change implemented in #2365. In the failTaskRun function in taskrun.go, the pod associated with the TaskRun is deleted, as noted here.

Assuming this is not intended behavior, I would suggest checking the failure reason before deleting the TaskRun's pod:

if reason != v1beta1.TaskRunReasonTimedOut && reason != v1beta1.TaskRunReasonCancelled {
    err := c.KubeClientSet.CoreV1().Pods(tr.Namespace).Delete(tr.Status.PodName, &metav1.DeleteOptions{})
    if err != nil && !k8serrors.IsNotFound(err) {
        logger.Infof("Failed to terminate pod: %v", err)
        return err
    }
}

If this is expected, it would help to document this behavior for TaskRuns as well as recommended best practices for preserving logs.

Steps to Reproduce the Problem

  1. Create a TaskRun with a timeout short enough to be exceeded, or cancel a running TaskRun (see the example manifest below)
  2. Run kubectl get pods to check whether the TaskRun's pod has been deleted
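
A minimal sketch of the timeout case, assuming the v1beta1 API; the name timeout-repro, the 10s timeout, and the busybox sleep step are illustrative choices, not taken from the original report:

apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  name: timeout-repro        # illustrative name
spec:
  timeout: 10s               # short enough to be exceeded by the step below
  taskSpec:
    steps:
      - name: sleep
        image: busybox
        command: ["sleep", "300"]   # outlives the 10s timeout

After applying this manifest and waiting for the timeout to elapse, kubectl get pods shows whether the TaskRun's pod is still around; per this report, it has been deleted. The cancellation path can be exercised the same way by patching spec.status of a running TaskRun to TaskRunCancelled instead of waiting for the timeout.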

Additional Info

  • Kubernetes version:

N/A

  • Tekton Pipeline version:

Using the latest release as of this report, v0.15.0.

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 15 (5 by maintainers)

Most upvoted comments

Yes, my misunderstanding, but now it makes sense. I now see that deleting the pod is what is required to actually stop the TaskRun, and I guess I never noticed this behavior before. So what would be nice here are two things:

  1. This behavior should be documented in the TaskRun documentation. My original thought when this came up was to check the docs to see whether it was expected, but there is currently nothing there.
  2. This should be turned into a feature request for a way to preserve logs, either via the entrypoint approach or something like what Jason mentioned on Slack:

In general we should have users expect to find logs in some log persistence solution (e.g., Stackdriver) and not rely on the Pod still being around at all