moby: overlay2 infinitely eats server disk space

I have two Docker containers running, postgresql and redis, with an uptime of about 3 weeks. I noticed that my free disk space keeps shrinking every day, even though the PostgreSQL database is only about 18 MB and the Redis .rdb file is 550 kB.

pydf output:

Filesystem Size Used Avail Use% Mounted on
/dev/xvda1 7928M 6161M 1342M 77.7 /
/dev/xvda1 7928M 6161M 1342M 77.7 /var/lib/docker/overlay2
overlay 7928M 6161M 1342M 77.7 /var/lib/docker/overlay2/31d9d8d30b544fb32e4d8eb13a2acf697eb27fd151cb67c4ed1e8f38c3dc87ee/merged
overlay 7928M 6161M 1342M 77.7 /var/lib/docker/overlay2/37968963e030769905a880bb5410b6b124a2077394950760fa8d9191b5c934f2/merged

How do I clean up the overlay2 merged directory? Or is this a bug in overlay2?

After many hours of googling, I still haven’t found any information on how to avoid this issue. The containers are started with docker-compose.

Docker version: Docker version 17.03.1-ce, build c6d412e

Host: Ubuntu 16.04.1 LTS

docker-compose.yml content:

postgres:
  restart: always
  image: postgres:9.6.3
  ports:
    - "5450:5432"
  volumes:
    - /var/opt/containerfiles/pgdata:/var/lib/postgresql/data
  environment:
    POSTGRES_DB: mydb
    POSTGRES_USER: mydb
    POSTGRES_PASSWORD: password

redis:
  restart: always
  image: redis:3.2.9
  ports:
    - "6385:6379"
  volumes:
    - /var/opt/containerfiles/redisdata/data:/data
    - /var/opt/containerfiles/redisdata/redis.conf:/redis.conf
  command: redis-server /redis.conf --appendonly yes

docker-compose restart didn’t help.

docker ps -a output:

CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
51aba7ac4ca6 postgres:9.6.3 "docker-entrypoint…" 3 weeks ago Up 34 minutes 0.0.0.0:5450->5432/tcp docker_postgres_1
876938679ced redis:3.2.9 "docker-entrypoint…" 3 weeks ago Up 34 minutes 0.0.0.0:6385->6379/tcp docker_redis_1

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 56
  • Comments: 74 (19 by maintainers)

Most upvoted comments

@gustawdaniel There is no issue here.

Overlay itself doesn’t use disk space. The size reported in df is the amount of space used by the underlying filesystem. Why do you think it’s overlay causing the issue?

What does docker ps --size say the size of the containers are?
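For context, a quick way to compare what df reports against what Docker itself accounts for (a sketch; the paths assume the default /var/lib/docker data root):

docker ps --size                          # per-container size of the writable layer
docker system df -v                       # images, containers, local volumes, build cache
sudo du -shx /var/lib/docker/overlay2     # actual on-disk usage of the overlay2 layer store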

I’m seeing this. I have deleted almost all my containers, and /var/lib/docker/overlay is still full of stuff from old containers. None of the pruning commands make a dent in it.

@thaJeztah @cpuguy83 Thank you all, I figured it out! With the ncdu util I found directories that were growing without bound and weren’t related to Docker/overlay2.

Docker team, could we re-open this issue? Clearly, there is something here.

@Droow good catch. In my case it doesn’t help much, but it’s a great tweak.

The overlay2 directory is still the main issue.

See how much space per directory

cd /var/lib/docker
echo; pwd; echo; ls -AlhF; echo; du -h --max-depth=1; echo; du -sh;

/var/lib/docker

total 184K
drwx------   2 root root 4.0K Nov 18 19:26 builder/
drwx--x--x   4 root root 4.0K Nov 18 19:26 buildkit/
drwx------ 301 root root  44K Dec 12 20:46 containers/
drwx------   3 root root 4.0K Nov 18 19:26 image/
drwxr-x---   3 root root 4.0K Nov 18 19:26 network/
drwx------ 754 root root  92K Dec 12 20:46 overlay2/
drwx------   4 root root 4.0K Nov 18 19:26 plugins/
drwx------   2 root root 4.0K Nov 18 21:23 runtimes/
drwx------   5 root root 4.0K Nov 18 21:23 swarm/
drwx------   2 root root 4.0K Dec 12 16:27 tmp/
drwx------   2 root root 4.0K Nov 18 19:26 trust/
drwx------   3 root root 4.0K Dec 12 20:46 volumes/

4.0K	./tmp
868M	./containers
44M	./image
4.0K	./runtimes
127M	./swarm
20K	./builder
20K	./plugins
60K	./volumes
292K	./network
72K	./buildkit
4.0K	./trust
33G	./overlay2
34G	.

34G	.

See log files

find /var/lib/docker/containers -type f -name "*.log" | xargs du -sh

...
88K	/var/lib/docker/containers/3a863e7846cae23c2ee00ebcba493692809e552ec6a156ad4eb698cdcfe5dd2e/3a863e7846cae23c2ee00ebcba493692809e552ec6a156ad4eb698cdcfe5dd2e-json.log
7.3M	/var/lib/docker/containers/8e30cad7d6ffcd76fdf016c54ca0979ec395b359774687f5fd739a8f88711153/8e30cad7d6ffcd76fdf016c54ca0979ec395b359774687f5fd739a8f88711153-json.log
3.3M	/var/lib/docker/containers/0135172c73407feb3247c8efb196609f6f8cdd21ff983efb9fc345e71a019e1f/0135172c73407feb3247c8efb196609f6f8cdd21ff983efb9fc345e71a019e1f-json.log
4.0K	/var/lib/docker/containers/83e47b1a9624379c2f7e63ea85eea3ea26ebe30fd0309405e595785122641dea/83e47b1a9624379c2f7e63ea85eea3ea26ebe30fd0309405e595785122641dea-json.log
1.2M	/var/lib/docker/containers/b7a858d59575336b853f7044f0e07a7e6497c8982b4479b60623e7ddb7a3ea23/b7a858d59575336b853f7044f0e07a7e6497c8982b4479b60623e7ddb7a3ea23-json.log
308K	/var/lib/docker/containers/f32bf33b4c7882a1ff8ce863e0c2ef261ce6e42c41397f052504316a331a6e9c/f32bf33b4c7882a1ff8ce863e0c2ef261ce6e42c41397f052504316a331a6e9c-json.log
84K	/var/lib/docker/containers/55b562c946f7d56d1e866986289734020efe5091bc11cb5e7b20afb1b888ceea/55b562c946f7d56d1e866986289734020efe5091bc11cb5e7b20afb1b888ceea-json.log
...

See how much space logs take

find /var/lib/docker/containers -type f -name "*.log" -print0 | du -shc --files0-from - | tail -n1

855M	total

Truncate logs

truncate -s 0 /var/lib/docker/containers/*/*-json.log

This is now smaller :-p

13M	./containers

Hi @vartagg,

Can you share how you used the ncdu util to determine the disk space used by the Linux fs and by Docker?

Thanks.
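A minimal ncdu invocation for this kind of hunt, assuming ncdu is installed (-x keeps it from crossing filesystem boundaries, so merged overlay mounts aren’t double-counted):

sudo ncdu -x /                   # interactive: drill down into the largest directories
sudo ncdu -x /var/lib/docker     # focus on Docker’s data root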

@thaJeztah I run these commands via crontab once a week on each node.

docker system prune -af && \
docker image prune -af && \
docker system prune -af --volumes && \
docker system df

I do not stop active containers/services. But still, having 53 GB of stuff here does not make sense.

root@jill2:/var/lib/docker/overlay2# du -shc /var/lib/docker/overlay2/*/diff | grep total
53G	total
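A hedged sketch of such a weekly crontab entry (the schedule, log path, and use of /etc/cron.d are assumptions, not necessarily what is run above):

# /etc/cron.d/docker-prune: prune unused Docker data every Sunday at 03:00
0 3 * * 0 root docker system prune -af --volumes >> /var/log/docker-prune.log 2>&1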

It’s not a bug; generally:

  • remove containers, volumes, images that you no longer use (that’s what’s occupying the disk space)
  • make sure you’re on a current version of docker (if you encounter “failed to remove root filesystem” errors)

For anyone who might still be interested in this: for me the issue was log files! Apparently, by default Docker appends all the logs for each container into a single file, so two of my container log files were 20 GB in size (the app had been running for almost 2 years). I solved it by running truncate -s 0 /var/lib/docker/containers/*/*-json.log and updating the Docker config accordingly (https://stackoverflow.com/questions/42510002/how-to-clear-the-logs-properly-for-a-docker-container#4251031 helped a lot). Hope it helps 😉
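For reference, the usual way to cap json-file logs globally is /etc/docker/daemon.json; the sizes below are illustrative, the daemon needs a restart, and existing containers must be recreated to pick the settings up:

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}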

Can you run docker system df?

I am having this same issue. Nothing in this thread has resolved it, and it’s not log files for me. I’m running an Ubuntu 20.04 host machine with a 10 GB volume. I do a docker pull to download a single image, launch it once (during the startup script), and let it run. After about 3 weeks the host disk completely fills up. The Docker image is also running a minimal Ubuntu 20.04.

It’s my /var/lib/docker/overlay2 directory that keeps growing. This SO post covers all the diagnosis I’ve done so far. I’m still stuck:

https://stackoverflow.com/questions/67523890/why-does-a-long-running-docker-instance-fill-up-my-disk-space
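When a specific overlay2 directory keeps growing, it can help to map it back to the container that owns it as its writable layer (a sketch; the grep pattern is just the layer-id prefix from the df output earlier in this thread):

# print each container's name and its overlay2 upper (writable) directory
for c in $(docker ps -aq); do
  docker inspect --format '{{.Name}} {{.GraphDriver.Data.UpperDir}}' "$c"
done | grep 31d9d8d30b54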

For some reason, using Selenium with Docker generates a massive amount of data in /var/lib/docker/overlay2, and I couldn’t find any way to remedy that.

Hi there,

I got the same issue, where the overlay kept growing, and I was thinking it was consumed by Docker. However, after debugging with ncdu I figured out the issue was caused by a growing /tmp directory, which contained lots of rust_mozprofile.XXXXX folders.

After a bit of research, it turns out that rust_mozprofile is generated by geckodriver, which I use for my crawler app. Guess I will need to change the way I handle the geckodriver code.

Hopefully, this would help. Ian
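One hedged workaround for that case, assuming the profiles land in the container’s /tmp (and so in its writable layer): mount a tmpfs over /tmp, or periodically remove stale profiles (the one-day cutoff is an assumption):

# inside the affected container: delete geckodriver profile directories older than a day
find /tmp -maxdepth 1 -name 'rust_mozprofile*' -mtime +1 -exec rm -rf {} +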

df -h commands point to /var/lib/docker/overlay2/<id>/merged - 6.1G usage…

@vartagg That’s the total space used on your root partition, not the amount of space taken by overlay. Overlay does not actually take space itself; it is a mount that gives you a merged view of things from the underlying filesystem.

You can double-check all the space being used by Docker by running du -h /var/lib/docker.

du -h /var/lib/docker | sort -h (last line):

829M	/var/lib/docker

Yes, du says that /var/lib/docker occupies 829M, but pydf and df -h commands point to /var/lib/docker/overlay2/<id>/merged - 6.1G usage…
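To make the df/du mismatch above concrete: df against a merged overlay mount reports the whole underlying root filesystem, while du against the diff directories measures the layer data itself (a sketch):

df -h /var/lib/docker/overlay2/*/merged                   # repeats the usage of the underlying / filesystem
sudo du -shx /var/lib/docker/overlay2/*/diff | sort -h    # actual per-layer data on disk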

@krschacht it “resolved” itself, as I scrap & start a new cluster every month. But I’m not sure the core issue is resolved.

I found this explanation of how overlay2 works very helpful: https://forums.docker.com/t/some-way-to-clean-up-identify-contents-of-var-lib-docker-overlay/30604 (scroll to the end). TL;DR: use this to assess the storage taken by Docker images:

du -shc /var/lib/docker/overlay2/*/diff | grep total

It actually matches the output of docker system df.

In my case, restarting docker service freed space.

Just to give some feedback, because someone else may be following the wrong path as I was. I have been facing this issue since last weekend and could not find any huge log files, because I was looking in /var/lib/docker/overlay2/ instead of /var/lib/docker/containers/, as pointed out by @Droow and @pascalandy. By running the find command in the containers folder I managed to locate the culprits, and after running truncate on them everything is fine now.

Thank you both, guys, you’ve saved my life. 😃

Why is this issue closed? Having force-removed all images, containers, volumes, etc., I still have 6 GB sitting in /var/lib/docker/overlay2 and no clue how to safely remove it. Here’s a nuclear option, but that doesn’t feel OK.

Hi @krschacht. Since this is about CI, I assume we ran into a similar issue. The comments here helped me a lot to identify the problem, so I tweaked a script posted earlier:

echo; pwd; echo; sudo ls; sudo du -x --max-depth=1 | sort -n -r; echo; sudo du -shx;

and started running it from the root and toward the largest directories one by one. I found that:

  1. The -x option in du -shx resolves the issue of counting files twice. For me overlay2 seemed to be 125G, but with -x it was just 65G, which also agreed with docker system df.
  2. Most of the disk space in overlay2 was used by the Jenkins agents, holding data that is not cleaned up properly by the pipelines. In your case I would check where the projects you pull and the test results are stored, and whether more than one CI agent stores them.

Hope it helps. 😃

I have a situation where I’ve removed my stack, verified there are no running containers/images, did a docker system prune -a, restarted Docker, did another docker system prune -a, and yet /var/lib/docker/overlay2 is still consuming several GB of data.

# docker system df
TYPE                TOTAL               ACTIVE              SIZE                RECLAIMABLE
Images              0                   0                   0B                  0B
Containers          0                   0                   0B                  0B
Local Volumes       6                   0                   1.328GB             1.328GB (100%)
Build Cache         0                   0                   0B                  0B

# du -shc /var/lib/docker/overlay2/*/diff | grep total
2.7G	total

What am I missing?