gsutil: Calling `gsutil mv` on a directory/prefix will incorrectly construct destination object path(s) if the destination prefix is a substring of the source prefix

For example, `gsutil -m mv gs://bucket/dir/subdir1 gs://bucket/dir/subdir2` sometimes correctly renames to gs://bucket/dir/subdir2, but other times to gs://bucket/dir/subdir2/subdir1.

Couldn’t pinpoint the exact conditions, but it seems to have nothing to do with trailing slashes. In any case, there is nothing about such behaviour in the docs.

About this issue

  • State: closed
  • Created 8 years ago
  • Reactions: 3
  • Comments: 19 (8 by maintainers)

Most upvoted comments

If there’s an existing subdirectory called gs://bucket/dir/subdir2 before you run that mv command, it will put subdir1 under subdir2. That is correct behavior - it emulates similar behavior of Unix directory renames. Please see https://cloud.google.com/storage/docs/gsutil/commands/cp#how-names-are-constructed for more details.
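To make the rule above concrete, here is a minimal shell sketch of how the destination object name differs depending on whether the destination is believed to exist. `mv_dest_path` is a hypothetical helper written for illustration only; it is not part of gsutil and does not contact Cloud Storage. It only mimics the naming rule as described in this thread.

```shell
# Hypothetical helper, not part of gsutil: prints the destination object
# name for one object moved by a directory-style mv.
#   $1 = source prefix      (e.g. gs://bucket/dir/subdir1)
#   $2 = destination prefix (e.g. gs://bucket/dir/subdir2)
#   $3 = object URL under the source prefix
#   $4 = "yes" if the destination is believed to exist as a directory
# The body runs in a subshell so its variables do not leak to the caller.
mv_dest_path() (
  src=${1%/}            # drop one trailing slash, if any
  dst=${2%/}
  obj=$3
  dst_exists=$4
  rel=${obj#"$src"/}    # object path relative to the source prefix
  if [ "$dst_exists" = "yes" ]; then
    # Existing destination: the source's last component is nested under it.
    printf '%s/%s/%s\n' "$dst" "${src##*/}" "$rel"
  else
    # Missing destination: the source prefix is renamed to the destination.
    printf '%s/%s\n' "$dst" "$rel"
  fi
)

mv_dest_path gs://bucket/dir/subdir1 gs://bucket/dir/subdir2 \
  gs://bucket/dir/subdir1/file no
mv_dest_path gs://bucket/dir/subdir1 gs://bucket/dir/subdir2 \
  gs://bucket/dir/subdir1/file yes
```

The first call prints gs://bucket/dir/subdir2/file and the second prints gs://bucket/dir/subdir2/subdir1/file. If the existence check were to return a stale "yes" for a prefix that was just deleted, the second form is what one would expect to see.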

@mfschwartz I understand but this was not the case. I tried it again just to make sure.

$ mkdir dir
$ touch dir/file
$ gsutil cp -r dir gs://bucket/
$ gsutil mv gs://bucket/dir gs://bucket/dir2
$ gsutil ls gs://bucket/dir2
gs://bucket/dir2/dir/

Unfortunately, it seems to be hard to reproduce: I ran this once and it misbehaved, but then it worked as expected 3 times (with different names every time).

gsutil version: 4.19

This is fixed in gsutil v4.43, which is now available in the pypi repo (https://pypi.org/project/gsutil/). We missed the cutoff for this week’s Cloud SDK, but it should be in 265.0.0, scheduled for Tues, Oct 1.

Changed the name of this issue to clarify under which conditions it happens.

There was also a duplicate report of this where a user arrived at the same conclusion in https://issuetracker.google.com/issues/112817360

@thobrla Thanks for acknowledging this. I think the way to solve this is to not use listing at all. Note that my gsutil mv command is completely within the GS cloud, no inter-cloud or local.

When renaming test2 to test, instead of listing the bucket or parent directory, it can check the presence of test directly, which I believe is consistent (rather than the eventually consistent listing operation). Done completely in-cloud, this should be as efficient as the listing approach.

One might ask why would anyone want to rename to a just-deleted directory. We do it for backup rotation:

  1. Upload the backup to daily_next
  2. Delete the daily_prev
  3. Rename daily to daily_prev (which was just deleted)
  4. Rename daily_next to daily (which, again, was just removed by the rename in step 3)

While the alternative is to use dated directories, we prefer this approach.
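The rotation steps above can be sketched as a small shell function. The function name `rotate_backups`, the `GSUTIL` variable, and the example paths are assumptions made for illustration; only the `cp -r`, `rm -r`, `mv`, and top-level `-m` options are standard gsutil usage.

```shell
# GSUTIL is an illustrative override hook; it defaults to the real CLI.
GSUTIL=${GSUTIL:-gsutil}

# rotate_backups BASE LOCAL_DIR
#   BASE      = bucket prefix holding the backups (hypothetical example:
#               gs://my-bucket/backups)
#   LOCAL_DIR = local directory containing the new backup
# The body runs in a subshell, so "set -e" does not affect the caller.
rotate_backups() (
  set -e
  base=${1%/}
  local_dir=$2
  # 1. Upload the backup to daily_next.
  $GSUTIL -m cp -r "$local_dir" "$base/daily_next"
  # 2. Delete daily_prev.
  $GSUTIL -m rm -r "$base/daily_prev"
  # 3. Rename daily to daily_prev (which was just deleted).
  $GSUTIL mv "$base/daily" "$base/daily_prev"
  # 4. Rename daily_next to daily (which no longer exists after step 3).
  $GSUTIL mv "$base/daily_next" "$base/daily"
)
```

Setting `GSUTIL=echo` before calling the function prints the commands instead of running them. Steps 3 and 4 both rename into a prefix that was removed moments earlier, which is the situation described in this issue, so it may be worth checking the result with `gsutil ls` after a real run. Note that step 2 will stop the script on the very first rotation, when daily_prev does not exist yet.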

@mfschwartz Please reopen, here’s how to reproduce 100% of the time.

Basically, when you rename a folder to something else and then rename it back to the original name (which should no longer exist), gsutil treats the original name as already existing and therefore creates a subdirectory inside the incorrectly “existing” folder.

⟫ gsutil --version
gsutil version: 4.19

ceefour@cron:~⟫ mkdir test
ceefour@cron:~⟫ echo hello > test/hello
ceefour@cron:~⟫ gsutil cp -r test gs://snapshot.bippo.co.id/
Copying file://test/hello [Content-Type=application/octet-stream]...
Uploading   gs://snapshot.bippo.co.id/test/hello:                6 B/6 B

ceefour@cron:~⟫ gsutil ls -r gs://snapshot.bippo.co.id/test
gs://snapshot.bippo.co.id/test/:
gs://snapshot.bippo.co.id/test/hello

ceefour@cron:~⟫ gsutil mv gs://snapshot.bippo.co.id/test gs://snapshot.bippo.co.id/test2
Copying gs://snapshot.bippo.co.id/test/hello [Content-Type=application/octet-stream]...
Copying     gs://snapshot.bippo.co.id/test2/hello:               6 B/6 B
Removing gs://snapshot.bippo.co.id/test/hello...

ceefour@cron:~⟫ gsutil ls -r gs://snapshot.bippo.co.id/test2
gs://snapshot.bippo.co.id/test2/:
gs://snapshot.bippo.co.id/test2/hello

ceefour@cron:~⟫ gsutil mv gs://snapshot.bippo.co.id/test2 gs://snapshot.bippo.co.id/test
Copying gs://snapshot.bippo.co.id/test2/hello [Content-Type=application/octet-stream]...
Copying     gs://snapshot.bippo.co.id/test/test2/hello:          6 B/6 B
Removing gs://snapshot.bippo.co.id/test2/hello...

ceefour@cron:~⟫ gsutil ls -r gs://snapshot.bippo.co.id/test
gs://snapshot.bippo.co.id/test/:

gs://snapshot.bippo.co.id/test/test2/:
gs://snapshot.bippo.co.id/test/test2/hello