dvc: Consistently getting broken pipe when syncing (uploading) large file
Traceback (most recent call last):
  File "/Users/ophir/anaconda3/envs/p2/bin/dvc", line 11, in <module>
    sys.exit(main())
  File "/Users/ophir/anaconda3/envs/p2/lib/python2.7/site-packages/dvc/main.py", line 63, in main
    Runtime.run(CmdDataSync)
  File "/Users/ophir/anaconda3/envs/p2/lib/python2.7/site-packages/dvc/runtime.py", line 41, in run
    sys.exit(instance.run())
  File "/Users/ophir/anaconda3/envs/p2/lib/python2.7/site-packages/dvc/command/data_sync.py", line 47, in run
    pool.map(cloud.sync, targets)
  File "/Users/ophir/anaconda3/envs/p2/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/Users/ophir/anaconda3/envs/p2/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
socket.error: [Errno 32] Broken pipe
P.S. Is syncing the file manually with the AWS CLI a viable workaround, or does the sync do other things as well (updating a status file or something similar)?
About this issue
- State: closed
- Created 7 years ago
- Comments: 24 (24 by maintainers)
Commits related to this issue
- aws: use multipart to push changes Fixes #100. — committed to efiop/dvc by efiop 7 years ago
- aws: use multipart to push changes, v2 Fixes #100 Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com> — committed to efiop/dvc by efiop 7 years ago
Sorry again for such a delay. I managed to reproduce this issue (only on Mac; other platforms work fine) for files that are >5G (tested on 8G and 4G: the former reproduced the issue and the latter uploaded just fine). That makes sense, as AWS actually mentions this limitation in their docs, but the strange part is that on Linux this limit doesn’t seem to cause any problems. So this issue should be fixed with https://github.com/dataversioncontrol/dvc/issues/163. I’m working on implementing it right now and hope to deliver it within 24h =). Thank you for your patience.
Oh, those Macs… Thanks for the info.
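
For readers wondering what "use multipart" means in practice, here is a minimal sketch of the planning step: splitting a file larger than the ~5G single-upload limit mentioned above into numbered parts. It is not DVC's actual code. The function name `plan_parts`, the default 64 MiB part size, and the constants (5 MiB minimum part, 5 GiB maximum part, 10,000 parts, as described in the S3 documentation) are assumptions for illustration only.

```python
# Hypothetical helper, not taken from DVC: plans the byte ranges for a
# multipart upload. All limits below are assumptions based on the S3 docs.
MIB = 1024 ** 2
GIB = 1024 ** 3

MIN_PART_SIZE = 5 * MIB   # assumed minimum size of every part except the last
MAX_PART_SIZE = 5 * GIB   # assumed maximum size of a single part
MAX_PARTS = 10000         # assumed maximum number of parts per upload


def plan_parts(file_size, part_size=64 * MIB):
    """Return a list of (part_number, offset, length) covering file_size bytes."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")

    # Never go below the minimum part size.
    part_size = max(part_size, MIN_PART_SIZE)
    # Grow the part size if the file would otherwise need too many parts
    # (integer ceiling division, works on Python 2 and 3).
    part_size = max(part_size, (file_size + MAX_PARTS - 1) // MAX_PARTS)
    if part_size > MAX_PART_SIZE:
        raise ValueError("file is too large for the assumed part limits")

    parts = []
    offset = 0
    number = 1  # part numbers start at 1
    while offset < file_size:
        length = min(part_size, file_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


if __name__ == "__main__":
    # The 8G file from the report above, split into 64 MiB parts.
    parts = plan_parts(8 * GIB)
    print(len(parts), parts[0], parts[-1])
```

Each tuple would then be read from disk and sent as its own request, so no single request has to carry more than one part's worth of data.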