kubectl: kubectl logs --follow shouldn't time out, or should exit non-zero if it does

What happened?

kubectl logs -f pod

When the pod has been silent for a while (say, 10 minutes), the log stream gets disconnected.

What did you expect to happen?

The logs are streamed indefinitely.

Alternatively, if a timeout is the cause of the interruption, exit with non-zero status.
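Until kubectl reports this itself, a wrapper can approximate the requested behavior. The sketch below is an assumption-laden workaround, not part of kubectl: the function name follow_or_fail is hypothetical, and it treats "the stream ended while the pod phase is still Running" as a premature disconnect.

```shell
# Hypothetical helper: follow a pod's logs and return non-zero if the
# stream ends while the pod is still Running (assumed to mean a disconnect).
follow_or_fail() {
  pod="$1"

  # Stream logs; if kubectl itself reports an error, pass it through.
  kubectl logs --follow "$pod" || return $?

  # The stream ended with status 0. Ask the API server for the pod phase.
  phase="$(kubectl get pod "$pod" -o jsonpath='{.status.phase}')" || return $?

  if [ "$phase" = "Running" ]; then
    echo "log stream for $pod ended while the pod is still Running" >&2
    return 1
  fi
  return 0
}
```

Usage: follow_or_fail my-pod; echo "status: $?" (my-pod is a placeholder name). A pod that legitimately finishes reaches the Succeeded or Failed phase, so the wrapper returns 0 in that case.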

How can we reproduce it (as minimally and precisely as possible)?

Run a pod which does something like

date
sleep 3600
date

and watch its logs with kubectl logs -f.
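The reproduction can be sketched as two shell functions. The pod name silent-pod and the busybox image are assumptions chosen for illustration; any image with a shell, date, and sleep should behave the same way.

```shell
# Hypothetical pod name; override by exporting POD_NAME before sourcing.
POD_NAME="${POD_NAME:-silent-pod}"

# Start a pod that prints the date, stays silent for an hour, then prints again.
start_silent_pod() {
  kubectl run "$POD_NAME" --image=busybox --restart=Never -- \
    sh -c 'date; sleep 3600; date'
}

# Follow the logs and print the exit status kubectl returned.
follow_and_report() {
  kubectl logs --follow "$POD_NAME"
  status=$?
  echo "kubectl logs exited with status $status"
  return "$status"
}
```

Run start_silent_pod, then follow_and_report. If the stream is cut during the silent hour, the second date line never appears, and the printed status shows whether kubectl signalled the interruption.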

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.6", GitCommit:"ff2c119726cc1f8926fb0585c74b25921e866a28", GitTreeState:"archive", BuildDate:"2023-01-19T00:00:00Z", GoVersion:"go1.19.5", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.5", GitCommit:"804d6167111f6858541cef440ccc53887fbbc96a", GitTreeState:"clean", BuildDate:"2022-12-11T00:17:11Z", GoVersion:"go1.19.4", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

Azure AKS

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 18 (13 by maintainers)

Most upvoted comments

Thanks @mpuckett159 for letting me know! Opened a PR fixing the same.

@mpuckett159 sounds good. Maybe we can reword this issue to focus on the exit code and open a new one to focus on the timeout, which might or might not be fixed in the future.

Personally, yes. However, it will likely be very low on the priority list, and honestly the specific root cause will be very difficult to diagnose because of the sheer number of “things” that could be generating the disconnection.

@mpuckett159 sure, I’ll give it a try, thanks!

I’ll look into changing the default to false instead of true.

/assign

/triage accepted

We would definitely like to implement a non-zero exit code for a timeout/failure. We will also investigate the ability to ignore timeouts due to inactivity.