kubernetes: log rotation on node breaks kubectl logs --follow / docker logs --follow
Hello,
There is an upstream problem with docker’s json-file log driver. The command ‘docker logs --follow’ (and subsequently ‘kubectl logs --follow’) stops generating log output when the node that hosts the given container rotates the container logs.
Environment:
Running kubernetes 1.2.4 on AWS.
Impact
logspout or other tools that rely on ‘docker logs --follow’ / ‘kubectl logs --follow’ break during log rotation.
Steps to reproduce:
- Launch a pod that generates regular stdout.
- Launch
kubectl logs --tail POD
for the pod above. - Instead of waiting for log rotation, force it: SSH into the node that hosts that pod/container, and run
sudo /usr/sbin/logrotate --force /etc/logrotate.conf
- Observe output stop. Restarting the kubectl logs command will resume log streaming.
To reproduce at the docker level, repeat the procedure above using ‘docker logs’ on the node, e.g.:
sudo DOCKER_HOST=unix:///var/run/docker.sock docker logs --tail=0 --follow CONTAINER_ID
Environment details
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.4", GitCommit:"3eed1e3be6848b877ff80a93da3785d9034d0a4f", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.4", GitCommit:"3eed1e3be6848b877ff80a93da3785d9034d0a4f", GitTreeState:"clean"}
docker version:
Client:
Version: 1.9.1
API version: 1.21
Go version: go1.4.2
Git commit: a34a1d5
Built: Fri Nov 20 12:59:02 UTC 2015
OS/Arch: linux/amd64
Server:
Version: 1.9.1
API version: 1.21
Go version: go1.4.2
Git commit: a34a1d5
Built: Fri Nov 20 12:59:02 UTC 2015
OS/Arch: linux/amd64
docker info:
Containers: 17
Images: 249
Server Version: 1.9.1
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 283
Dirperm1 Supported: true
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.16.0-4-amd64
Operating System: Debian GNU/Linux 8 (jessie)
CPUs: 2
Total Memory: 7.815 GiB
Name: XXXXXXXXXXXXXXXXX
ID: 5MWW:M4LM:3UIM:5UOC:7AQD:DSWA:MIAA:WBFF:EMHR:2DH6:ITQC:QBVW
WARNING: No swap limit support
About this issue
- Original URL
- State: closed
- Created 8 years ago
- Reactions: 6
- Comments: 16 (2 by maintainers)
Commits related to this issue
- Run systests in Job (#3730) ## Motivation System tests in CI fail sporadically because `kubectl logs -f <systest pod>` exits prematurely (while tests are still running). An example of a CI run faile... — committed to spacemeshos/go-spacemesh by poszu 2 years ago
- kubelet: fix handle new log file in ReadLogs after log rotation related issues: https://github.com/kubernetes/kubernetes/issues/59902 https://github.com/kubernetes/kubernetes/issues/28369 — committed to icefed/kubernetes by icefed a year ago
Why is this issue closed? It still appears to be a problem for many people (including myself). Every time I try to run
kubectl logs -f
, the output stops after about 10 minutes.I’m using the latest version of
kubectl
that I installed today withgcloud components install kubectl
My environment looks like this (if it makes any difference):
Who do we need to bribe to get this reopened?
Why was this issue closed?
Rotten issues close after 30d of inactivity. Reopen the issue with
/reopen
. Mark the issue as fresh with/remove-lifecycle rotten
.Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /close
I came here looking for a fix for this problem, just to be disappointed as there’s no fix, and the issue was autoclosed by the bot. 😦
Here is my workaround in NodeJS for Kontinuous (as CI/CD tool we’re developping), the implementation is very specific (and relative to jobs tailing) but the principles can be decoupled for general usage, I share it if this can help someone https://github.com/SocialGouv/kontinuous/blob/fe7345c5e33bda5990b08e2c60e3c997fc1c7a7d/packages/common/utils/kubejob-tail.js
This is happening on GKE master version
1.13.6-gke.13
and a GKE node version of1.12.8-gke.7
which has a docker version of17.03.2-ce
. Based on https://github.com/moby/moby/issues/37646 it seems like this should work since based on the hostconfig.json, docker is running with--max-size=10m --max-files=5
, but it doesn’t seem to.Does the kubernetes API server tail through docker, or through the logs in
/var/lib/pods/
? If it’s the latter, since the log in that directory is just a symlink, and docker log rotate does a rename/create, the issue might be that the API server doesn’t listen to changes on the symlink to pick up the new file.EDIT: Based on https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_pods.go#L1218 it looks like it does end up hitting the Docker daemon directly for streaming logs 🤔