dvc: dvc pull not fetching all data (cache file not found)
Please provide information about your setup
dvc --version 0.23.2
uname -a Linux arachne-postgres 4.15.0-36-generic #39-Ubuntu SMP Mon Sep 24 16:19:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
On server-1, the data pushed to the remote is present both in the local cache and in S3:
$ ls -l .dvc/cache/9c/
total 63360
-rw-rw-r-- 1 ubuntu ubuntu 783548 Jan 4 20:53 01b58cb0faab4ee28a9228552ffd8d
-rw-rw-r-- 1 ubuntu ubuntu 3719779 Jan 4 20:53 2861308e6110dc7f4850cbe331e63a
-rw-rw-r-- 2 ubuntu ubuntu 14722 Jan 4 20:51 2fa17d3b0c9486c5af435329f62151
-rw-rw-r-- 1 ubuntu ubuntu 849013 Jan 4 20:52 416b598605a8fcd6fc04c2edab4edc
-rw-rw-r-- 2 ubuntu ubuntu 22852 Jan 4 20:52 55bb368600627e7e14ad7648d8f26b
-rw-rw-r-- 2 ubuntu ubuntu 39899 Jan 4 20:52 5ffd6a38f12b14c6a5e6aafacf133c
-rw-rw-r-- 1 ubuntu ubuntu 614053 Jan 4 20:51 711b4315305a21dabfa44e74740ff7
-rw-rw-r-- 2 ubuntu ubuntu 23825 Jan 4 20:52 765b898179e0c88af48a638dfe6586
-rw-rw-r-- 1 ubuntu ubuntu 148555 Jan 7 07:36 7b5eca364544bf04f63c390dde7f6e.dir
-rw-rw-r-- 1 ubuntu ubuntu 865287 Jan 4 20:53 7f3444015a68ee039d84986d5f9a98
-rw-rw-r-- 2 ubuntu ubuntu 18086 Jan 4 20:52 8bc9b783a8e0568e71d350e4a1fc37
-rw-rw-r-- 2 ubuntu ubuntu 9823 Jan 4 20:52 95e0f6ac3455b66b69974c3936eba7
-rw-rw-r-- 1 ubuntu ubuntu 673028 Jan 4 20:52 997353cce6f0997dc97e311893f3fb
-rw-rw-r-- 1 ubuntu ubuntu 54562699 Jan 4 22:43 9d7b83b536edc6666c76d16e9bfc6b
-rw-rw-r-- 1 ubuntu ubuntu 436909 Jan 4 20:52 a485af08514391eaf58f7a607a1aaa
-rw-rw-r-- 1 ubuntu ubuntu 365795 Jan 4 20:51 a76a77aa1343aba087211e42f6d2b7
-rw-rw-r-- 1 ubuntu ubuntu 1232517 Jan 4 20:53 ae610c9a0246025b573888e84d766e
-rw-rw-r-- 1 ubuntu ubuntu 364246 Jan 4 20:51 bec0f4fc35ecf1377e62c3859362c4
-rw-rw-r-- 1 ubuntu ubuntu 59057 Nov 9 02:47 cf83393276d56191a88c3d54ef6a5d
-rw-rw-r-- 2 ubuntu ubuntu 33169 Jan 4 20:52 e00d88888f810720aee5b46c3f0772
$ aws --endpoint=https://ceph.acc.ohsu.edu s3 ls s3://bmeg/dvc/9c/
2019-01-08 04:21:07 783548 01b58cb0faab4ee28a9228552ffd8d
2019-01-08 04:20:50 3719779 2861308e6110dc7f4850cbe331e63a
2019-01-08 04:01:19 14722 2fa17d3b0c9486c5af435329f62151
2019-01-08 04:20:47 849013 416b598605a8fcd6fc04c2edab4edc
2019-01-08 04:02:18 22852 55bb368600627e7e14ad7648d8f26b
2018-12-18 21:41:50 62535 5b6377d120103a5a8e841a7b94ff4c
2019-01-08 04:02:14 39899 5ffd6a38f12b14c6a5e6aafacf133c
2019-01-08 04:19:52 614053 711b4315305a21dabfa44e74740ff7
2019-01-08 04:01:46 23825 765b898179e0c88af48a638dfe6586
2019-01-08 04:01:17 148555 7b5eca364544bf04f63c390dde7f6e.dir
2019-01-08 04:20:18 865287 7f3444015a68ee039d84986d5f9a98
2019-01-08 04:01:38 18086 8bc9b783a8e0568e71d350e4a1fc37
2019-01-08 04:01:44 9823 95e0f6ac3455b66b69974c3936eba7
2019-01-08 04:19:13 673028 997353cce6f0997dc97e311893f3fb
2019-01-04 22:41:27 54562699 9d7b83b536edc6666c76d16e9bfc6b
2019-01-08 04:20:23 436909 a485af08514391eaf58f7a607a1aaa
2019-01-08 04:19:44 365795 a76a77aa1343aba087211e42f6d2b7
2019-01-08 04:20:40 1232517 ae610c9a0246025b573888e84d766e
2018-12-18 21:41:50 799324 b818767f237d4c9647c1208ce8c28b
2019-01-08 04:20:53 364246 bec0f4fc35ecf1377e62c3859362c4
2019-01-07 07:07:32 59057 cf83393276d56191a88c3d54ef6a5d
2019-01-08 04:02:17 33169 e00d88888f810720aee5b46c3f0772
On server-2, dvc never fetches all of the files: the S3 listing matches server-1, but the local cache contains only three files.
$ aws --endpoint=https://ceph.acc.ohsu.edu s3 ls s3://bmeg/dvc/9c/
2019-01-07 20:21:07 783548 01b58cb0faab4ee28a9228552ffd8d
2019-01-07 20:20:50 3719779 2861308e6110dc7f4850cbe331e63a
2019-01-07 20:01:19 14722 2fa17d3b0c9486c5af435329f62151
2019-01-07 20:20:47 849013 416b598605a8fcd6fc04c2edab4edc
2019-01-07 20:02:18 22852 55bb368600627e7e14ad7648d8f26b
2018-12-18 13:41:50 62535 5b6377d120103a5a8e841a7b94ff4c
2019-01-07 20:02:14 39899 5ffd6a38f12b14c6a5e6aafacf133c
2019-01-07 20:19:52 614053 711b4315305a21dabfa44e74740ff7
2019-01-07 20:01:46 23825 765b898179e0c88af48a638dfe6586
2019-01-07 20:01:17 148555 7b5eca364544bf04f63c390dde7f6e.dir
2019-01-07 20:20:18 865287 7f3444015a68ee039d84986d5f9a98
2019-01-07 20:01:38 18086 8bc9b783a8e0568e71d350e4a1fc37
2019-01-07 20:01:44 9823 95e0f6ac3455b66b69974c3936eba7
2019-01-07 20:19:13 673028 997353cce6f0997dc97e311893f3fb
2019-01-04 14:41:27 54562699 9d7b83b536edc6666c76d16e9bfc6b
2019-01-07 20:20:23 436909 a485af08514391eaf58f7a607a1aaa
2019-01-07 20:19:44 365795 a76a77aa1343aba087211e42f6d2b7
2019-01-07 20:20:40 1232517 ae610c9a0246025b573888e84d766e
2018-12-18 13:41:50 799324 b818767f237d4c9647c1208ce8c28b
2019-01-07 20:20:53 364246 bec0f4fc35ecf1377e62c3859362c4
2019-01-06 23:07:32 59057 cf83393276d56191a88c3d54ef6a5d
2019-01-07 20:02:17 33169 e00d88888f810720aee5b46c3f0772
$ ls -l .dvc/cache/9c/
total 908
-rw-rw-r-- 1 ubuntu ubuntu 62535 Nov 9 16:09 5b6377d120103a5a8e841a7b94ff4c
-rw-rw-r-- 1 ubuntu ubuntu 799324 Nov 11 16:54 b818767f237d4c9647c1208ce8c28b
-rw-rw-r-- 1 ubuntu ubuntu 59057 Nov 9 16:22 cf83393276d56191a88c3d54ef6a5d
dvc fetch runs without errors (although it constantly recalculates md5 checksums), but dvc pull always returns the following warning:
Warning: Cache '9c7b5eca364544bf04f63c390dde7f6e.dir' not found. File '{'path': '/mnt/bmeg/bmeg-etl/source/ccle/vcfs', 'scheme': 'local'}' won't be created.
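The cache name in the warning lines up with the listings above: the first two characters of the checksum (9c) are the subdirectory and the rest is the file name. A minimal sketch of that mapping, assuming the layout seen in this issue; `cache_path` and `missing_from_cache` are hypothetical helper names, not DVC functions, and the demo uses a throwaway directory rather than a real .dvc/cache:

```python
import os
import tempfile


def cache_path(cache_dir, checksum):
    """Map a checksum to its cache file: the first two characters are the subdirectory."""
    return os.path.join(cache_dir, checksum[:2], checksum[2:])


def missing_from_cache(cache_dir, checksums):
    """Return the checksums whose cache file does not exist locally."""
    return [c for c in checksums if not os.path.exists(cache_path(cache_dir, c))]


# Demo against a temporary directory (a stand-in for .dvc/cache on server-2).
with tempfile.TemporaryDirectory() as cache_dir:
    present = "9ccf83393276d56191a88c3d54ef6a5d"      # exists in the server-2 listing
    absent = "9c7b5eca364544bf04f63c390dde7f6e.dir"   # the entry named in the warning
    os.makedirs(os.path.join(cache_dir, "9c"))
    open(cache_path(cache_dir, present), "w").close()
    demo_missing = missing_from_cache(cache_dir, [present, absent])
    print(demo_missing)
```

Pointing `missing_from_cache` at the real cache directory with the checksums from the S3 listing would show which objects were never downloaded.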
Both servers are on the same git branch and commit.
About this issue
- State: closed
- Created 5 years ago
- Comments: 28 (28 by maintainers)
Commits related to this issue
- remote: adds support for s3.list_objects. Optionally use list_objects, useful for ceph and other s3 emulators. Fixes: #1476 (committed to bwalsh/dvc by bwalsh 5 years ago)
Wrote a quick test to confirm the hypothesis that only the first page of the S3 object listing is being fetched:
https://github.com/iterative/dvc/blob/9528ad6a1dbe205644431bfb0e02b1e2ae8449bb/dvc/remote/s3.py#L221
Confirmed
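To illustrate the first-page hypothesis, here is a minimal sketch and not DVC's actual code. `FakeS3Client` is a hypothetical in-memory stand-in whose `list_objects` method imitates the marker-based paging of the S3 ListObjects call (`Marker` in the request, `Contents` and `IsTruncated` in the response); the page size of 2 is an assumption chosen to keep the demo small.

```python
class FakeS3Client:
    """Hypothetical in-memory stand-in for an S3 client (not boto3, not DVC code)."""

    def __init__(self, keys, page_size=2):
        self._keys = sorted(keys)
        self._page_size = page_size

    def list_objects(self, Bucket, Prefix="", Marker=""):
        # Return the keys that sort after Marker, one page at a time.
        matching = [k for k in self._keys if k.startswith(Prefix) and k > Marker]
        page = matching[: self._page_size]
        return {
            "Contents": [{"Key": k} for k in page],
            "IsTruncated": len(matching) > len(page),
        }


def first_page_keys(client, bucket, prefix):
    """The suspected bug: read one response and stop."""
    resp = client.list_objects(Bucket=bucket, Prefix=prefix)
    return [obj["Key"] for obj in resp.get("Contents", [])]


def all_keys(client, bucket, prefix):
    """Keep requesting pages until the listing is no longer truncated."""
    keys = []
    marker = ""
    while True:
        resp = client.list_objects(Bucket=bucket, Prefix=prefix, Marker=marker)
        page = [obj["Key"] for obj in resp.get("Contents", [])]
        keys.extend(page)
        if not resp.get("IsTruncated") or not page:
            break
        marker = page[-1]  # with no delimiter, the last key returned is the next marker
    return keys


client = FakeS3Client(
    [
        "dvc/9c/01b58cb0faab4ee28a9228552ffd8d",
        "dvc/9c/2861308e6110dc7f4850cbe331e63a",
        "dvc/9c/7b5eca364544bf04f63c390dde7f6e.dir",
    ]
)
print(len(first_page_keys(client, "bmeg", "dvc/9c/")))  # 2: the .dir entry is never seen
print(len(all_keys(client, "bmeg", "dvc/9c/")))         # 3
```

If the remote listing stops after the first page, any object on a later page looks absent from the remote, which would be consistent with fetch reporting no errors while pull still cannot find the .dir cache entry.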
Could we use list_objects instead?
Thank you, we will try it out next week.
@efiop Ruslan, thank you. I think (haven’t processed all the files yet) that may have solved the issue. I’ll update the issue tomorrow.
My process to downgrade: