dvc: Pushing artifacts via WebDAV results in a 411 Length Required response

Bug Report

I am trying to connect to a remote via WebDAV. I can correctly set up the user and password along with the URL, but when I try to push the artifacts I get a 411 Length Required response. How can I solve the missing header problem?

Please provide information about your setup

DVC version: 1.9.0 (brew)

Platform: Python 3.9.0 on macOS-10.15.7-x86_64-i386-64bit
Supports: azure, gdrive, gs, http, https, s3, ssh, oss, webdav, webdavs
Cache types: reflink, hardlink, symlink
Repo: dvc, git

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 40 (34 by maintainers)

Most upvoted comments

Any info about the server? At first glance, it seems like the server does not understand the chunked upload. Might be missing something though. CC @iksnagreb

@efiop @LucaButera Can we try to figure out whether it is really (only) the chunked upload and not something else?

@LucaButera If you have a copy of the dvc repository and some time to try something: It should be quite easy to change the _upload method of the WebDAVTree to use the upload_file method, which iirc does no chunking of the file.

https://github.com/iterative/dvc/blob/master/dvc/tree/webdav.py#L243

You would have to change the last line self._client.upload_to(buff=chunks(), remote_path=to_info.path) to self._client.upload_file(local_path=from_file, remote_path=to_info.path)

If this modification lets you upload files, we can be pretty sure it is the chunking or a bug in the webdavclient upload_to method. Note that this will disable the progress bar, so it might seem as if it is hanging…
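The swap described above can be sketched as a small dispatch function. This is only an illustration, not the actual DVC code: `upload`, `iter_chunks`, and the `chunked` flag are hypothetical names, and `client` is assumed to expose the two calls quoted in this thread with the same keyword arguments.

```python
from typing import Iterator


def iter_chunks(path: str, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the file at `path` in fixed-size pieces."""
    with open(path, "rb") as fd:
        while True:
            chunk = fd.read(chunk_size)
            if not chunk:
                break
            yield chunk


def upload(client, from_file: str, remote_path: str, chunked: bool = True) -> None:
    """Hypothetical helper: pick one of the two client calls from the thread."""
    if chunked:
        # Generator body: the total size is not known up front, so the
        # HTTP layer typically sends it with Transfer-Encoding: chunked.
        client.upload_to(buff=iter_chunks(from_file), remote_path=remote_path)
    else:
        # Whole-file upload: the client can determine the length itself.
        client.upload_file(local_path=from_file, remote_path=remote_path)
```

Calling `upload(client, path, remote_path, chunked=False)` corresponds to the one-line change suggested above.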

I assume you have no valid dvc cache at the remote yet (as uploading does not work at all)? So you cannot check whether downloading is working?

Before the file is uploaded, the parent directories (e.g. datasets/a7) should be created. Could you please check whether this was successful?

@LucaButera, to see if the chunking upload is the issue, you could also try sending a curl request with chunking upload:

$ curl --upload-file test.txt https://<user>@drive.switch.ch/remote.php/dav/files/<user>/test.txt -vv --http1.1 --header "Transfer-Encoding: chunked"

Also, check without that header. If the file is uploaded successfully in both cases, something’s wrong with the library. If only the chunked request fails, chunked upload might have been forbidden on the server entirely.
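The same comparison can be scripted with Python's standard library. In this sketch, `put_bytes` is a hypothetical helper, and host, port, and path are placeholders. Authentication and TLS are left out; a real server would need `http.client.HTTPSConnection` and an `Authorization` header. `http.client` sets `Content-Length` for a bytes body and switches to `Transfer-Encoding: chunked` for an iterable body.

```python
import http.client


def put_bytes(host: str, port: int, path: str, data: bytes,
              chunked: bool, chunk_size: int = 4096) -> int:
    """PUT `data` to http://host:port/path and return the HTTP status code."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        if chunked:
            # An iterable body has no known length, so http.client sends
            # "Transfer-Encoding: chunked" and omits Content-Length.
            body = (data[i:i + chunk_size] for i in range(0, len(data), chunk_size))
        else:
            # A bytes body gets an automatic Content-Length header.
            body = data
        conn.request("PUT", path, body=body)
        response = conn.getresponse()
        response.read()  # drain the response before closing the connection
        return response.status
    finally:
        conn.close()
```

A server that rejects chunked uploads would be expected to return 411 for `chunked=True` and a success status for `chunked=False`.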

@LucaButera, It’d be great if you could make a PR. Thanks. Check contributing-guide for setup.

the relative config needed to use it?

Maybe, no need of the config, but we can decide that on the PR discussion.

@skshetry it would be wonderful to have a simple solution like that.

On the other hand a more reliable solution like the one of the “assembly on pull” seems also a nice feature in the long run.

I have never contributed to open source projects but I am willing to help if needed, as I think DVC is really a much needed tool.

I’m also facing a similar but slightly different issue with “Nextcloud + mod_fcgi” (which is a bug in httpd2), in which files are uploaded empty.

The original issue might be due to that bug (not fixed yet) or, this bug which was only fixed 2 years ago (OP’s server is 2.4.18, whereas recent one is 2.4.46).

Sabredav’s wiki has a good insight into these bugs:

Finder (On OS X) uses Transfer-Encoding: Chunked in PUT request bodies. This is a little-used HTTP feature, and therefore not implemented in a bunch of web servers. The only server I’ve seen so far that handles this reasonably well is Apache + mod_php. Nginx and Lighttpd respond with 411 Length Required, which is completely ignored by Finder. This was seen on Nginx 0.7.63. It was recently reported that a development release (1.3.8) no longer had this issue.

When using this with Apache + FastCGI PHP completely drops the request body, so it will seem as if the PUT request was successful, but the file will end up empty.

So, the best thing to do is either drop “chunked” requests on PUT or introduce config to disable it.

Not having a progress bar is a very serious thing

@efiop, as the webdavclient3 uses streaming upload, we can still support progress bars:

with open(file, "rb") as fd:
    with Tqdm.wrapattr(fd, "read", ...) as wrapped:
        self._client.upload_to(buff=wrapped, remote_path=to_info.path)

Look here for the change: https://github.com/iterative/dvc/blob/f827d641d5c2f58944e49d2f6537a9ff09e447e1/dvc/tree/webdav.py#L224
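Conceptually, wrapping `read` means intercepting each call on the file object and reporting how many bytes passed through, so progress can be tracked without a chunk generator. A stdlib-only sketch of that idea follows; `CountingReader` and `demo` are hypothetical names and are not part of tqdm, webdavclient3, or DVC.

```python
import io
from typing import BinaryIO, Callable, List


class CountingReader:
    """Wrap a binary file object and report bytes read through a callback."""

    def __init__(self, fd: BinaryIO, on_read: Callable[[int], None]) -> None:
        self._fd = fd
        self._on_read = on_read

    def read(self, size: int = -1) -> bytes:
        data = self._fd.read(size)
        self._on_read(len(data))  # a progress bar would advance here
        return data

    def __getattr__(self, name: str):
        # Delegate everything else (seek, tell, ...) to the wrapped file.
        return getattr(self._fd, name)


def demo() -> int:
    """Read a 10,000-byte buffer in 4 KiB steps and return the bytes counted."""
    seen: List[int] = []
    wrapped = CountingReader(io.BytesIO(b"a" * 10_000), seen.append)
    while wrapped.read(4096):
        pass
    return sum(seen)
```

Because the wrapper still looks like a regular file, a client that streams from it can report progress on every `read` call.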

but that seems like a feature request for our WebDAV library and not for DVC, right? Or am I missing something?

The Owncloud Chunking (NG) might be too slow for our use case, as it needs a separate request for each chunk (and then a “MOVE” that joins all the chunks, which is again expensive). So, unless we change our upload strategy to parallelize chunk uploads rather than file uploads, we will make it 3-4x slower, just for the sake of having a progress bar. And it seems possible to have a progress bar without it. Not to mention that it’s not a WebDAV standard and is unsupported outside of Nextcloud and Owncloud.

it will also result in people running into timeout errors

I don’t think there is any way around timeout errors, especially if we talk about PHP-based WebDAV servers (they have a set max_execution_time). The Owncloud Chunking NG exists for this very reason.

Though, we could just split the file into chunks, upload them, and then assemble them during pull. I think this is what rclone chunker does.
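A rough sketch of the "split on push, assemble on pull" idea, assuming in-memory data and hypothetical helper names. It does not reflect how rclone's chunker or DVC actually store parts; the SHA-256 check is an assumption added here to catch a missing or reordered part.

```python
import hashlib
from typing import List


def split_into_parts(data: bytes, part_size: int) -> List[bytes]:
    """Split `data` into consecutive parts of at most `part_size` bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def assemble(parts: List[bytes], expected_sha256: str) -> bytes:
    """Join the parts in order and verify the result against a checksum."""
    data = b"".join(parts)
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        raise ValueError("checksum mismatch after assembly")
    return data
```

Each part would be uploaded as its own ordinary PUT with a known length, so no single request has to carry the whole file.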

For closing this issue, we could just disable chunking upload via a config or by default.

[…] it will also result in people running into timeout errors for big files […]

Uff, yes, I did not even think about this yet… You probably do not want to adjust the timeout config depending on your expected file size, so chunked transmission is the only solution to avoid timeouts per request.

Then let’s think about implementing something like dvc remote modify <remote> chunked_upload false (I think true should be the default). Maybe chunked_transfer or just chunked would be a better name, as this might apply to download as well?
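How such a flag might be read, with `true` as the default, can be sketched as follows. This is an assumption-heavy illustration: the `chunked_upload` option is only a proposal in this thread, `chunked_upload_enabled` is a hypothetical helper, and the `[remote.<name>]` INI layout is a simplification rather than DVC's actual config format.

```python
import configparser


def chunked_upload_enabled(config_text: str, remote: str) -> bool:
    """Return the proposed `chunked_upload` flag for a remote (default: True)."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # fallback=True keeps chunked upload on when the option (or the whole
    # section) is absent, matching the default suggested in the thread.
    return parser.getboolean(f"remote.{remote}", "chunked_upload", fallback=True)
```

With this default, existing setups keep their current behaviour, and only users whose server answers 411 need to set the option to `false`.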

@LucaButera, did you try @iksnagreb’s suggestion? If that works, we could provide a config for disabling it.

If that didn’t work, I am afraid there’s no other easy solution than to contact the provider. Nextcloud/Owncloud do support a non-standard WebDAV extension for chunked upload for this kind of situation, but it’s unlikely we are going to support it.

Do I have any way to perform non-chunked upload in DVC?

Hm, I do not think this is possible right now, at least for the WebDAV remote. It should be possible to implement an option to enable non-chunked upload. The problem I see is that this would also disable the progress bar (without chunking, we cannot count progress…), which is not obvious and might confuse users. @efiop Are there options for disabling chunking for other remotes? If yes, how do these handle that problem?

Or I have no choice but to contact the provider and hope they can somehow enable the chunked upload?

I think selecting chunked/non-chunked upload could be a configuration option (if we can find a way to handle this conveniently); there are probably other cloud providers disallowing chunked upload as well…

The server is a Switch Drive, which is a cloud storage provider based on ownCloud. I would assume the WebDAV server is the same as ownCloud’s, but I don’t have further info.