cli: volumes won't be deleted since update to Docker 23

Since the update to Docker 23, unused volumes are no longer deleted by either docker volume prune or docker system prune --volumes.

The answer is always Total reclaimed space: 0B. When I delete the volumes manually, I even get this error when running my docker-compose.yml: Error response from daemon: open /var/lib/docker/volumes/docker_volume1/_data: no such file or directory

What's happening?

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 31
  • Comments: 50 (11 by maintainers)

Most upvoted comments

Sorry, I gave the wrong command in the first line there; it should be docker volume prune --filter all=1

The default change allows us to:

1. Differentiate an anonymous volume from a named one

2. Make `docker volume prune` safer to execute, given that named volumes are typically named precisely so they can be referenced again later (i.e. they are usually intentional and meant to be kept)

I guess that’s fine reasoning, but the execution of this change was very strange for multiple reasons:

  • The CLI help output doesn’t explain anything about these distinctions. (Still says “Remove all unused local volumes”, nothing about anonymous vs named)
  • The documentation does not explain anything about these distinctions. (Same as above). We had to look in a specific change log to find the information.
  • As mentioned (and acknowledged by you), this change was not actually propagated to other parts of the CLI.
  • The previous default behavior is not restored by a flag or anything obvious, but by something non-obvious (set env vars? set a particular filter?). Seems like when you change something like this, the "principle of least surprise" should apply. It would have been more conservative to at least use deprecation warnings over multiple releases, or to not change the default at all and add the 'anonymous' distinction as an opt-in flag.
  • The behavior simply isn’t obvious. I don’t care about anonymous volumes vs named volumes, for one. A novice Docker user is certainly not going to care or understand why the prune command isn’t deleting their volume. I think trying to save the user from themselves using non-explicit behavior is awfully confusing. I think at the very least, the previous behavior either needs to be explicitly mentioned in the help output or provided as a standalone flag. That way, if a Docker user sees a volume isn’t getting cleaned up, they can run -h, and hopefully notice mention of named vs anonymous volumes.

I have spent *days* trying to figure out why all of our GitLab runners are suddenly out of disk space, resorting to crude methods of stopping all containers manually in the middle of the night so I can loop through volumes and delete them. I expect a command like docker volume prune -f to "Remove all unused local volumes", as the docs say and have said for years (the example usage on the docs page even explicitly shows a my-named-vol being removed).

Regardless of doc updates being missed (we’re humans, it happens), a change with this big of an impact should have had deprecation/compatibility warnings for at least one version. “My logs are being polluted, what’s going on? Oh, I need to update a command, cool.” is a much easier problem to deal with than “Why is the entire fleet of servers all running out of disk at the same time?!”

The changelog has this listed under "bug fixes and enhancements" with the prefix API: and no mention whatsoever of this affecting the CLI. Even the most vigilant changelog readers would have missed this without intimate knowledge of the CLI/API interaction.

@Emporea docker volume prune --all should give the old behavior. I understand docker system prune doesn’t support this yet (not intentionally).

Do you suggest deleting everything and recreating every volume?

No, I don’t think that should be necessary, it would likely be simpler to use docker volume prune --filter all=1 in addition to docker system prune until your older volumes are no longer in use.
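A minimal sketch of that two-step workaround as I read it (the -f flags are just my addition to skip the confirmation prompts):

$ docker system prune -f
$ docker volume prune -f --filter all=1   # also considers named / pre-23.0 volumes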

You can also use DOCKER_API_VERSION=1.40 docker system prune --volumes to get the old behavior.

Yes, this sounds like it is related to the mentioned change. The default change allows us to:

  1. Differentiate an anonymous volume from a named one
  2. Make docker volume prune safer to execute, given that named volumes are typically named precisely so they can be referenced again later

It does mean that, after upgrading, volumes created prior to 23.0.0 will not be considered for pruning unless --all is specified. However, "anonymous" volumes created after 23.0.0 will be considered for pruning… and of course, again, --all gives the old behavior.

Also, if your client uses an older API version, it will get the old behavior as well.
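If you want to verify which API version your client is actually speaking, something like the following should work; the .Client.APIVersion template field is my assumption, so cross-check against the plain docker version output:

$ docker version --format '{{.Client.APIVersion}}'
$ DOCKER_API_VERSION=1.40 docker version --format '{{.Client.APIVersion}}'   # should print 1.40 if the override is picked up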

Also seeing this issue. We had to revert to 20.10 because it was filling up disks with no way to recover and causing outages.

It seems like it's this change in 23 (mentioned in the 23.0 changelog). When I run the all=true filter on docker volume prune it works, but docker system prune does not accept this filter, so that now seems broken.

It's not clear why this default needed to change.

Thanks for the reports. I found the bug and will post a patch momentarily.

If it's a hex string, then it is most likely not a named volume (unless you have something generating such strings when you create the volume). You should be able to inspect the volume and check for the label com.docker.volume.anonymous. If this key exists (the value is unused), then it may be collected by docker volume prune.
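A quick way to check that from the CLI, assuming the standard inspect template and ls label filter (the volume name below is just a placeholder):

$ docker volume inspect --format '{{ json .Labels }}' <volume-name>
$ docker volume ls --filter label=com.docker.volume.anonymous   # list only volumes carrying the label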

It would be nice if the change were cascaded into system prune and system df, as raised by others.
system prune still claims it removes all volumes not linked to any containers when passed --volumes.
system df similarly lists those volumes as reclaimable space, which should now be taken with a grain of salt.
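For reference, the verbose output at least shows the per-volume sizes behind that reclaimable number, so you can see which entries it is counting (the -v flag is the existing verbose switch on docker system df):

$ docker system df -v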

I spent half a day today on this; I expected that my volumes were being deleted… thank you for changing the behavior and not printing any message to the console that it no longer works as before (no negativity intended).

@neersighted just to be precise though, as far as I know the -a, --all flag in system prune affects only images. For volumes there is a separate flag, --volumes.
Thinking about it, I could definitely see --volumes defaulting to --volumes anonymous, with a --volumes all that cascades the deletion of named volumes down to the docker volume prune command.
But I also see where this "better safe than sorry" change is coming from.

https://github.com/docker/cli/pull/4497 was accepted and cherry-picked, which addresses the docs/--help issue. It is intentional that system prune -a no longer affects named volumes; I think a --really-prune-everything flag is out of scope for this issue, but feel free to open a feature request if you think that it is useful in the 90% case. My earlier comments are still quite relevant, I think:

system prune -a is a command often fired indiscriminately, and has led to much data loss and gnashing of teeth. While having to run two commands is a mild pain, it helps prevent frustration and loss of data for new users copying commands out of tutorials. We can certainly explore a system prune --all=with-named-volumes or something in the future for users who understand exactly what they are doing, but currently the need to run a separate command is by design.

I’m going to close this for now, as the last set of sharp edges identified here are solved (and will be in the next patch release), but please feel free to continue the discussion or open that follow-up feature request.

As stated in this thread, system prune -a no longer prunes named volumes, by design. If you need to prune named volumes, the method to use is currently volume prune -a.

system prune -a is a command often fired indiscriminately, and has led to much data loss and gnashing of teeth. While having to run two commands is a mild pain, it helps prevent frustration and loss of data for new users copying commands out of tutorials.

We can certainly explore a system prune --all=with-named-volumes or something in the future for users who understand exactly what they are doing, but currently the need to run a separate command is by design.

We have not changed the behavior of system prune -a to imply volume prune -a.

I just use docker volume rm $(docker volume ls -q) since I can’t remember the filter options.
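That removes (or errors on) every volume, including ones still attached to containers; if you only want the unused ones, the long-standing dangling filter on docker volume ls is a slightly narrower variant:

$ docker volume rm $(docker volume ls -q --filter dangling=true)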

@sudo-bmitch I’m getting the same behavior as described by @Re4zOon with docker compose.

Edit: With plain docker too. I can reproduce with docker run -e MARIADB_ROOT_PASSWORD=root mariadb for example. If I stop and remove the container, the volume stays and prune won’t delete it, unless I specify --filter all=1.
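For completeness, the steps leading up to the output below look roughly like this (-d and the container name are only illustrative; the mariadb image declares an anonymous volume for its data directory):

$ docker run -d --name repro -e MARIADB_ROOT_PASSWORD=root mariadb
$ docker stop repro
$ docker rm repro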

$ docker volume inspect b8ef83d3ed925b7d92cfac80a095b047f20e8487749d3a30cd6700c17318ff95
[
    {
        "CreatedAt": "2023-03-11T13:54:58-08:00",
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/var/lib/docker/volumes/b8ef83d3ed925b7d92cfac80a095b047f20e8487749d3a30cd6700c17318ff95/_data",
        "Name": "b8ef83d3ed925b7d92cfac80a095b047f20e8487749d3a30cd6700c17318ff95",
        "Options": null,
        "Scope": "local"
    }
]

$ docker volume prune -f
Total reclaimed space: 0B

$ docker volume prune -f --filter all=1
Deleted Volumes:
b8ef83d3ed925b7d92cfac80a095b047f20e8487749d3a30cd6700c17318ff95

Total reclaimed space: 156.5MB

Output of docker system info:

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  0.10.3
    Path:     /usr/lib/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  2.16.0
    Path:     /usr/lib/docker/cli-plugins/docker-compose

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 49
 Server Version: 23.0.1
 Storage Driver: btrfs
  Btrfs:
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 1e1ea6e986c6c86565bc33d52e34b81b3e2bc71f.m
 runc version:
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.2.2-zen1-1-zen
 Operating System: Arch Linux
 OSType: linux
 Architecture: x86_64
 CPUs: 20
 Total Memory: 31.03GiB
 Name: xenomorph
 ID: M6ZE:N2VB:3P6Z:7V55:H6W7:KQJA:QESD:MUPC:T762:4ENW:KUUX:2WU3
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

The CLI help output doesn’t explain anything about these distinctions. (Still says “Remove all unused local volumes”, nothing about anonymous vs named)

Yeah, the help needs to change to say ‘anonymous.’

The documentation does not explain anything about these distinctions. (Same as above). We had to look in a specific change log to find the information.

Same as the help output, PRs welcome.

Seems like when you change something like this, "principle of least surprise" should apply. I don’t care about anonymous volumes vs named volumes, for one.

I’d like to point out you’re a tiny minority there – for the vast majority of users, “Docker deleted my data after I ran system prune -a” has been a sharp edge for years. Most users expect prune to ‘clean up garbage,’ not ‘clean up the things I wanted Docker to keep.’

As mentioned (and acknowledged by you), this change was not actually propagated to other parts of the CLI.

The only part where this possibly needs to propagate is docker system prune -a and we’re still not sure what the better behavior is.

That way, if a Docker user sees a volume isn’t getting cleaned up, they can run -h, and hopefully notice mention of named vs anonymous volumes.

Agreed, there should be an example of using the all filter in the help text.


Please keep in mind that this has been a persistent pain for educators, commercial support, and undercaffeinated experts for years. People are here in this thread because they find the behavior change surprising, and yeah, it looks like review missed the docs updates needed (and this is because of the historically incorrect split of client/server logic we are still cleaning up) – however, please keep in mind that this thread represents the minority of users who find this behavior unexpected or problematic.

We certainly can improve here, and there are a lot of valid points, but the happy path for the majority of users is to stop pruning named volumes by default.

root@server ~ [125]# docker volume prune --all
unknown flag: --all
See 'docker volume prune --help'.

root@server ~ [125]# docker --version
Docker version 23.0.0, build e92dd87