kubernetes: TOB-K8S-007: Log rotation is not atomic

This issue was reported in the Kubernetes Security Audit Report

Description kubelets use a log to store metadata about the container system, such as readiness status. As is normal for logging, kubelets will rotate their logs under certain conditions:

// rotateLatestLog rotates latest log without compression, so that container can still write
// and fluentd can finish reading.
func (c *containerLogManager) rotateLatestLog(id, log string) error {
    timestamp := c.clock.Now().Format(timestampFormat)
    rotated := fmt.Sprintf("%s.%s", log, timestamp)
    if err := os.Rename(log, rotated); err != nil {
        return fmt.Errorf("failed to rotate log %q to %q: %v", log, rotated, err)
    }
    if err := c.runtimeService.ReopenContainerLog(id); err != nil {
        // Rename the rotated log back, so that we can try rotating it again
        // next round.
        // If kubelet gets restarted at this point, we'll lose original log.
        if renameErr := os.Rename(rotated, log); renameErr != nil {
            // This shouldn't happen.
            // Report an error if this happens, because we will lose original
            // log.
            klog.Errorf("Failed to rename rotated log %q back to %q: %v, reopen container log error: %v", rotated, log, renameErr, err)
        }
        return fmt.Errorf("failed to reopen container log %q: %v", id, err)
    }
    return nil
}

Figure 22.1: One of the log rotation mechanisms within kubelet

However, if the kubelet were restarted during the rotation, the logs and their contents would be lost. This could have a wide range of impacts to the end user, from missing threat-hunting data to simple error discovery.

Exploit Scenario Alice is running a Kubernetes cluster for her organization. Eve has position sufficient to watch the logs, and understands when log rotation will occur. Even then faults a kubelet when rotation occurs, ensuring that the logs are removed.

Recommendation Short term, move to a copy-then-rename approach. This will ensure that logs aren’t lost from simple rename mishaps, and that at worst they are named under a transient name.

Long term, shift away from log rotation and move towards persistent logs regardless of location. This would mean that logs would be written to in linear order, and a new log would be created whenever rotation is required.

Anything else we need to know?:

See #81146 for current status of all issues created from these findings.

The vendor gave this issue an ID of TOB-K8S-007 and it was finding 24 of the report.

The vendor considers this issue Low Severity.

To view the original finding, begin on page 65 of the Kubernetes Security Review Report

Environment:

  • Kubernetes version: 1.13.4

About this issue

  • Original URL
  • State: open
  • Created 5 years ago
  • Comments: 15 (12 by maintainers)

Most upvoted comments

Was thinking about the following where log is copied to rotated :

diff --git a/pkg/kubelet/logs/container_log_manager.go b/pkg/kubelet/logs/container_log_manager.go
index 0d37b123fc..5057542468 100644
--- a/pkg/kubelet/logs/container_log_manager.go
+++ b/pkg/kubelet/logs/container_log_manager.go
@@ -368,13 +368,23 @@ func (c *containerLogManager) compressLog(log string) error {
 func (c *containerLogManager) rotateLatestLog(id, log string) error {
        timestamp := c.clock.Now().Format(timestampFormat)
        rotated := fmt.Sprintf("%s.%s", log, timestamp)
-       if err := os.Rename(log, rotated); err != nil {
+       source, err := os.Open(log)
+       if err != nil {
+               return fmt.Errorf("failed to rotate log %q to %q: %v", log, rotated, err)
+       }
+       defer source.Close()
+       destination, err := os.Create(rotated)
+       if err != nil {
+               return fmt.Errorf("failed to rotate log %q to %q: %v", log, rotated, err)
+       }
+       defer destination.Close()
+       _, err = io.Copy(destination, source)
+       if err != nil {
                return fmt.Errorf("failed to rotate log %q to %q: %v", log, rotated, err)
        }
        if err := c.runtimeService.ReopenContainerLog(id); err != nil {
                // Rename the rotated log back, so that we can try rotating it again
                // next round.
-               // If kubelet gets restarted at this point, we'll lose original log.
                if renameErr := os.Rename(rotated, log); renameErr != nil {
                        // This shouldn't happen.
                        // Report an error if this happens, because we will lose original

Since container_log_manager_test.go needs to be modified, I post the change to get some feedback before creating PR.