dvc: renaming folders in a 132 GB dataset dvc push waits for 5 days then crashes

Please provide information about your setup: DVC version (i.e. dvc --version), platform, and method of installation (pip, homebrew, pkg (Mac), exe (Windows), DEB (Linux), RPM (Linux))

jmollevi@vr-desktop:~$ dvc --version
0.59.2

Installed from pip on Debian stretch amd64.


When renaming folders in a 132 GB dataset, dvc push waits for 5 days and then crashes.

The entire folder data/cloud-mask-training-set-1 is added as a single DVC file.

After renaming some of the folders directly below it, dvc push fails after 5 days.

The folders are listed in the output below.

jmollevi@vr-desktop:~/projects/dvctest$ ls data/cloud-mask-training-set-1
final-33UUB-2018-07-04_1  final-34VCM-2018-08-12_1  final-34WDS-2018-06-24_1
final-33VUE-2018-05-18_1  final-34VDM-2018-10-01_1  final-34WDT-2018-07-01_1
final-33VUF-2018-11-09_1  final-34WDS-2018-02-08_1  metadata.csv
final-33VWC-2018-09-26_1  final-34WDS-2018-04-27_1
final-33WXR-2018-03-12_1  final-34WDS-2018-06-01_1


-----

jmollevi@vr-desktop:~/projects/dvctest/data$ time dvc push cloud-mask-training-set-1.dvc 
  1%|          |azure://jmollevid21225/2363465 [34:34<137911:06:44,   212s/file]No handlers could be found for logger "XXX"
ERROR: unexpected error - Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature. ErrorCode: AuthenticationFailed
<?xml version="1.0" encoding="utf-8"?><Error><Code>AuthenticationFailed</Code><Message>Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
RequestId:4c2ab6b1-c01e-0050-77ae-6de43f000000
Time:2019-09-17T23:21:19.0002290Z</Message><AuthenticationErrorDetail>Request date header too old: 'Tue, 17 Sep 2019 23:01:05 GMT'</AuthenticationErrorDetail></Error>

Having any troubles?. Hit us up at https://dvc.org/support, we are always happy to help!

real	8093m13.659s
user	8018m2.480s
sys	8m56.291s
jmollevi@vr-desktop:~/projects/dvctest/data$

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 3
  • Comments: 19 (12 by maintainers)

Most upvoted comments

I have been testing this some more and the problem seems to be gone; the push now completes in 4 hours.

@Suor I have now aborted the second push after letting it run over the weekend with no output and one core at 100% CPU. When I created a new DVC repo and pushed a 60 KB file, it completed in 2 seconds. Pulling the file back after removing the cache took 1.5 seconds.

@Suor Still no progress bar 2 hours later

@Suor I have not tried that. I only set up that Azure data store for DVC and have not used it for anything else. It did manage to push the data initially, though, in about 4 days.

I will try to make time to test the Azure CLI upload speed and get back to you on this.

Also, a 137911:06:44 ETA? That’s insane. Even if it hadn’t crashed after 5 days, it would have taken over 15 years to complete.

  1%|          |azure://jmollevid21225/2363465 [34:34<137911:06:44,   212s/file]

Looks like the total is 2363465.
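
The figures in that progress bar can be cross-checked with a bit of arithmetic. The sketch below assumes the remaining time is printed as hours:minutes:seconds and that 2363465 is the file count; the helper name `hms_to_seconds` is made up for illustration.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Values copied from the progress bar in the report.
eta_seconds = hms_to_seconds("137911:06:44")
total_files = 2363465
seconds_per_file = 212

# Remaining time expressed in years (365.25-day years).
eta_years = eta_seconds / (365.25 * 24 * 3600)
# Independent estimate: number of files times the reported rate.
estimate_hours = total_files * seconds_per_file / 3600

print(f"remaining: {eta_years:.1f} years")
print(f"files x rate: {estimate_hours:.0f} hours")
```

The remaining time works out to a little under 16 years, and files times rate gives roughly 139,000 hours, within about 1% of the 137,911 hours shown. That agreement supports reading 2363465 as the total number of files.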

@efiop judging by the progress bar it’s 100 billion files :)

It looks like collecting and pushing files took so long that the authentication headers expired. The Azure client failed to handle that situation, so it’s their bug; we can most probably work around it by recreating the BlockBlobService instance.
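
The workaround suggested here might look roughly like the sketch below. It is not DVC's actual fix and does not call the Azure SDK: `AuthExpiredError`, `make_client` and `operation` are hypothetical stand-ins for the SDK's authentication error, a factory that builds a new BlockBlobService, and a single upload call. Whether a fresh client really avoids the stale date header is the commenter's hypothesis, not something this thread confirms.

```python
class AuthExpiredError(Exception):
    """Stand-in for whatever error the storage SDK raises on stale auth."""


def call_with_fresh_client(make_client, operation, max_attempts=3):
    """Run operation(client), rebuilding the client after an auth failure.

    make_client: zero-argument factory returning a new client object.
    operation:   callable taking the client and performing one upload.
    """
    client = make_client()
    for attempt in range(1, max_attempts + 1):
        try:
            return operation(client)
        except AuthExpiredError:
            if attempt == max_attempts:
                # Out of attempts: let the caller see the failure.
                raise
            # Throw away the old client and try again with a new one.
            client = make_client()
```

Passing the factory rather than a ready-made client keeps the retry logic independent of how the client is configured, so the same wrapper could be reused for any remote.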

P.S. There is no progress bar around dir cache collection in ._collect_used_dir_cache(). I wonder how long that takes.
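
One way to make a long, silent collection step visible is to wrap whatever it iterates over in a small reporting generator. This is only a sketch: how ._collect_used_dir_cache() actually iterates is not shown in this thread, and `with_progress`, its parameters and the message format are invented for illustration.

```python
import sys


def with_progress(items, label, every=1000, stream=sys.stderr):
    """Yield items unchanged, printing a running count every `every` items."""
    count = 0
    for count, item in enumerate(items, start=1):
        if count % every == 0:
            stream.write(f"{label}: {count} entries\n")
        yield item
    # Final line so the user can tell the step finished.
    stream.write(f"{label}: done, {count} entries\n")
```

Because the wrapper only counts items as they pass through, it works even when the total is not known in advance, which is the case while the directory cache is still being collected.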