kubernetes: Possible Bug: Garbage Collection Not Triggered

what

I’m struggling to trigger image garbage collection (GC) in the kubelet by simulating high disk usage. In short, I can’t get it to work and can’t figure out whether it’s a bug or a misunderstanding on my part.

expectations

  • Upon disk usage crossing the 90% threshold (the default for image-gc-high-threshold), the kubelet will output messages to stdout or stderr indicating that it’s vacuuming images
  • A reduction of disk usage to below the 90% high watermark

related

One of our problems was probably a result of using a symlinked docker-root (see: https://github.com/kubernetes/kubernetes/issues/17994). We’ve since deployed the kubelet with --docker-root=/vol/docker and explicitly passed --graph=/vol/docker to the docker daemon. Despite these changes, images are not being gc’d.
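For reference, a minimal sketch of how the two flags line up (the paths are ours; the GC threshold flags are spelled out with their defaults only for clarity, and the docker daemon invocation style is the one from this era):

# docker daemon: put the image/layer store on the large volume
docker daemon --graph=/vol/docker

# kubelet: point it at the same directory so cadvisor measures the right filesystem
kubelet --docker-root=/vol/docker \
  --image-gc-high-threshold=90 \
  --image-gc-low-threshold=80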

reproduction

To simulate being nearly out of space, I’ve created a tmpfile that brings disk utilization up to 89%.

dd if=/dev/zero of=/vol/docker/tmpfile bs=1G count=75

(I should have used a smaller volume!)

Disk space is now at 89% utilization as reported by df
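For the record, the utilization check here is plain df against the docker volume:

# report utilization of the filesystem backing /vol/docker
df -h /vol/docker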

Then I’ve downloaded 11G of docker images with docker pull, iterating over every release of an internal project (roughly like the loop below), to take me over the line to 91% as reported by df.
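The loop itself is nothing special; the registry path and tag range here are hypothetical stand-ins for our internal project:

# pull every release tag to accumulate image layers on /vol/docker
for tag in $(seq 1 40); do
  docker pull registry.example.com/project:v0.${tag}
done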

From what I can see in journalctl, the kubelet is not performing any GC, and df output does not change. I’ve also tried restarting the kubelet, to no avail.
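For completeness, this is how I was watching for GC activity (assuming the kubelet runs as a systemd unit named kubelet; the grep pattern is just a guess at the relevant log text):

# follow kubelet logs and filter for anything image-GC related
journalctl -u kubelet -f | grep -i -E 'image|garbage|threshold'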

kubernetes version

Client Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.4", GitCommit:"3eed1e3be6848b877ff80a93da3785d9034d0a4f", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.4", GitCommit:"3eed1e3be6848b877ff80a93da3785d9034d0a4f", GitTreeState:"clean"}

About this issue

  • State: closed
  • Created 8 years ago
  • Comments: 30 (19 by maintainers)

Most upvoted comments

@osterman please do

Not able to reproduce after upgrading to 1.3.0. I think this fixed it!

Disk usage on "/dev/xvdf" (/vol/docker) is at 90% which is over the high threshold (90%). Trying to free 9836734054 bytes

This happened within a minute or so of reaching the threshold.

Yeah. Changing the logic to use available makes sense.

On Thu, Jun 23, 2016 at 4:15 PM, ronnielai notifications@github.com wrote:

The df output is:

Filesystem     1K-blocks     Used  Available Use% Mounted on
/dev/xvdf       98166048 88238072    4668712  95% /vol/docker

The df numbers match cadvisor’s (88238072/98166048 ≈ .899). Also, there’s a discrepancy between Used + Available and Capacity (my guess is that the missing blocks are the reserved blocks).

@vishh Do you think that the image GC threshold should be changed to 1 - (available/capacity) to account for reserved blocks?
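To make the two formulas concrete with the df numbers above (straight arithmetic, shown here with bc): the current used/capacity ratio sits just under the 90% threshold, while 1 - available/capacity lands at about 95% because of the reserved blocks:

# current logic: used / capacity
echo "scale=3; 88238072 / 98166048" | bc
.898

# proposed logic: 1 - available/capacity, i.e. (capacity - available) / capacity
echo "scale=3; (98166048 - 4668712) / 98166048" | bc
.952

That .952 also matches the 95% Use% that df itself reports.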
