kubernetes: Pod logs: long lines are corrupted when using timestamps=true

What happened:

Docker has a hardcoded 16K limit on log line length, which splits lines longer than that into multiple messages in the log file. ReadLogs in logs.go, however, doesn’t account for this when timestamps is set to true and prints the timestamp in front of every entry in the log file instead of every logical line. This leads to random timestamps appearing in the middle of log lines, potentially breaking the applications that parse these logs.

What you expected to happen:

ReadLogs inserts timestamps once per logical line, so original log data is preserved. Alternatively, allow /logs API to request the logs in a sane format like JSON, which would allow merging the lines back and combining them with the timestamps more intelligently on the application side.

How to reproduce it (as minimally and precisely as possible):

  • Create foo.yaml:

    apiVersion: v1
    kind: Pod
    metadata:
      name: log-lines
    spec:
      containers:
      - name: main
        image: busybox
        command: ['sh', '-c', 'while sleep 1; do echo -n ___; for i in $(seq 1 18000); do echo -n A; done; echo ___; done']
    
  • Start the pod with kubectl apply -f foo.yaml

  • Fetch the logs with kubectl logs log-lines --timestamps, or using the API (curl https://…/api/v1/namespaces/default/pods/log-lines/log?timestamps=true&container=main)

  • Observe random timestamps inserted in the middle of the log lines.

Environment:

  • Kubernetes version (use kubectl version):
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5", GitCommit:"2166946f41b36dea2c4626f90a77706f426cdea2", GitTreeState:"clean", BuildDate:"2019-03-25T15:19:22Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Reactions: 6
  • Comments: 41 (19 by maintainers)

Commits related to this issue

Most upvoted comments

I received a bug report about this issue. The root cause is that the buffer reads up to a new line. If the line is greater than 4096 bytes and ‘–timestamps’ is turrned on the kubelet will write the timestamp and the partial log line. We will need to refactor the ReadLogs function to allow for a partial line read.

https://github.com/kubernetes/kubernetes/blob/f892ab1bd7fd97f1fcc2e296e85fdb8e3e8fb82d/pkg/kubelet/kuberuntime/logs/logs.go#L325

@aermakov-zalando I will work on it.

/sig cli /remove-sig node