transformers: Upload models using Git fails

Environment info

  • transformers version: 3.5.0
  • Platform: Linux-4.15.0-112-generic-x86_64-with-debian-buster-sid
  • Python version: 3.6.12
  • PyTorch version (GPU?): 1.7.0 (False)
  • Tensorflow version (GPU?): 2.3.1 (False)
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

Who can help

Model Cards: @julien-c
T5: @patrickvonplaten

Information

Model I am using (Bert, XLNet …): T5

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

The task I am working on is:

  • an official GLUE/SQUaD task: (give the name)
  • my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

git clone https://huggingface.co/Rostlab/prot_t5_xl_bfd
Cloning into 'prot_t5_xl_bfd'...                                                                                                  
remote: Enumerating objects: 31, done.                     
remote: Counting objects: 100% (31/31), done.                                       
remote: Compressing objects: 100% (29/29), done.                         
remote: Total 31 (delta 13), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (31/31), done.

cp config.json pytorch_model.bin prot_t5_xl_bfd

git add --all

git status
On branch main
Your branch is up to date with 'origin/main'.

Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        modified:   config.json
        modified:   pytorch_model.bin

git commit -m "[T5] Fix load weights function #8528"

git push
Username for 'https://huggingface.co': xxxxxx
Password for 'https://agemagician@huggingface.co':
Counting objects: 4, done.
Delta compression using up to 80 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (4/4), 4.92 GiB | 23.09 MiB/s, done.
Total 4 (delta 2), reused 1 (delta 0)
error: RPC failed; HTTP 504 curl 22 The requested URL returned error: 504 Gateway Time-out
fatal: The remote end hung up unexpectedly
fatal: The remote end hung up unexpectedly
Everything up-to-date

OR

GIT_CURL_VERBOSE=1 git push
* Couldn't find host huggingface.co in the .netrc file; using defaults
*   Trying 192.99.39.165...                                 
* TCP_NODELAY set           
* Connected to huggingface.co (192.99.39.165) port 443 (#0)
* found 140 certificates in /etc/ssl/certs/ca-certificates.crt
* found 421 certificates in /etc/ssl/certs
* ALPN, offering http/1.1                         
* SSL connection using TLS1.2 / ECDHE_RSA_AES_256_GCM_SHA384
*        server certificate verification OK
*        server certificate status verification SKIPPED
*        common name: huggingface.co (matched)                        
*        server certificate expiration date OK                       
*        server certificate activation date OK               
*        certificate public key: RSA                       
*        certificate version: #3                 
*        subject: CN=huggingface.co                     
*        start date: Tue, 10 Nov 2020 08:05:46 GMT
*        expire date: Mon, 08 Feb 2021 08:05:46 GMT
*        issuer: C=US,O=Let's Encrypt,CN=Let's Encrypt Authority X3
*        compression: NULL                          
* ALPN, server accepted to use http/1.1      
> GET /Rostlab/prot_t5_xl_bfd/info/refs?service=git-receive-pack HTTP/1.1
Host: huggingface.co
User-Agent: git/2.17.1                        
Accept: */*      
Accept-Encoding: gzip 
Accept-Language: C, *;q=0.9          
Pragma: no-cache                                     
                   
< HTTP/1.1 401 Unauthorized
< Server: nginx/1.14.2          
< Date: Sat, 14 Nov 2020 23:43:52 GMT
< Content-Type: text/plain; charset=utf-8         
< Content-Length: 12                                                  
< Connection: keep-alive                                             
< X-Powered-By: huggingface-moon                             
< WWW-Authenticate: Basic realm="Authentication required", charset="UTF-8"
< ETag: W/"c-dAuDFQrdjS3hezqxDTNgW7AOlYk"        
<                                                       
* Connection #0 to host huggingface.co left intact
Username for 'https://huggingface.co': agemagician
Password for 'https://agemagician@huggingface.co':
* Couldn't find host huggingface.co in the .netrc file; using defaults
* Found bundle for host huggingface.co: 0x55acdab63f80 [can pipeline]
* Re-using existing connection! (#0) with host huggingface.co
* Connected to huggingface.co (192.99.39.165) port 443 (#0)
* Server auth using Basic with user 'agemagician'
> GET /Rostlab/prot_t5_xl_bfd/info/refs?service=git-receive-pack HTTP/1.1
Host: huggingface.co                 
Authorization: Basic <redacted>
User-Agent: git/2.17.1
Accept: */*
Accept-Encoding: gzip
Accept-Language: C, *;q=0.9
Pragma: no-cache

< HTTP/1.1 200 OK
< Server: nginx/1.14.2
< Date: Sat, 14 Nov 2020 23:43:59 GMT
< Content-Type: application/x-git-receive-pack-advertisement
< Transfer-Encoding: chunked
< Connection: keep-alive
< X-Powered-By: huggingface-moon
<
* Connection #0 to host huggingface.co left intact
Counting objects: 4, done.
Delta compression using up to 80 threads.
Compressing objects: 100% (3/3), done.
* Couldn't find host huggingface.co in the .netrc file; using defaults
* Found bundle for host huggingface.co: 0x55acdab63f80 [can pipeline]
* Re-using existing connection! (#0) with host huggingface.co
* Connected to huggingface.co (192.99.39.165) port 443 (#0)
* Server auth using Basic with user 'agemagician'
> POST /Rostlab/prot_t5_xl_bfd/git-receive-pack HTTP/1.1
Host: huggingface.co
Authorization: Basic <redacted>
User-Agent: git/2.17.1
Content-Type: application/x-git-receive-pack-request
Accept: application/x-git-receive-pack-result
Content-Length: 4

* upload completely sent off: 4 out of 4 bytes
< HTTP/1.1 200 OK
< Server: nginx/1.14.2
< Date: Sat, 14 Nov 2020 23:44:02 GMT
< Content-Type: application/x-git-receive-pack-result
< Content-Length: 0
< Connection: keep-alive
< X-Powered-By: huggingface-moon
<
* Connection #0 to host huggingface.co left intact
* Couldn't find host huggingface.co in the .netrc file; using defaults
* Found bundle for host huggingface.co: 0x55acdab63f80 [can pipeline]
* Re-using existing connection! (#0) with host huggingface.co
* Connected to huggingface.co (192.99.39.165) port 443 (#0)
* Server auth using Basic with user 'xxxxxx'
> POST /Rostlab/prot_t5_xl_bfd/git-receive-pack HTTP/1.1
Host: huggingface.co
Authorization: Basic <redacted>
User-Agent: git/2.17.1
Accept-Encoding: gzip
Content-Type: application/x-git-receive-pack-request
Accept: application/x-git-receive-pack-result
Transfer-Encoding: chunked

Writing objects: 100% (4/4), 4.92 GiB | 23.17 MiB/s, done.
Total 4 (delta 2), reused 1 (delta 0)
* Signaling end of chunked upload via terminating chunk.
* Signaling end of chunked upload via terminating chunk.
* The requested URL returned error: 504 Gateway Time-out
* stopped the pause stream!
* Closing connection 0
error: RPC failed; HTTP 504 curl 22 The requested URL returned error: 504 Gateway Time-out
fatal: The remote end hung up unexpectedly
fatal: The remote end hung up unexpectedly
Everything up-to-date
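
Note that both runs end with "Everything up-to-date" even though the push failed, so the last line of the output is not a reliable success signal. A minimal Python sketch for pulling the HTTP status out of captured git push output follows. The function name rpc_failures and the sample text are illustrative, and the regular expression assumes the error-line format seen in the logs above.

```python
import re

# Matches lines such as:
#   error: RPC failed; HTTP 504 curl 22 The requested URL returned error: 504 Gateway Time-out
RPC_FAILURE = re.compile(r"^error: RPC failed; HTTP (\d{3}) curl (\d+)", re.MULTILINE)


def rpc_failures(push_output):
    """Return (http_status, curl_exit_code) pairs found in captured `git push` output."""
    return [(int(status), int(code)) for status, code in RPC_FAILURE.findall(push_output)]


# Sample text copied from the tail of the failing push shown above.
sample = """\
Writing objects: 100% (4/4), 4.92 GiB | 23.09 MiB/s, done.
Total 4 (delta 2), reused 1 (delta 0)
error: RPC failed; HTTP 504 curl 22 The requested URL returned error: 504 Gateway Time-out
fatal: The remote end hung up unexpectedly
Everything up-to-date
"""

print(rpc_failures(sample))  # [(504, 22)]
```

An empty list means no RPC failure line was found; it does not by itself prove that the push succeeded.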

Expected behavior

I previously had an issue uploading big models like T5 (#7480). It was marked as solved after the move to model versioning: https://github.com/huggingface/transformers/pull/8324 However, @patrickvonplaten fixed some problems in the T5 weights: https://github.com/huggingface/transformers/pull/8528 I tried to update the model, but it still doesn’t work, as shown above.

I tried several tricks, such as those in https://stackoverflow.com/questions/54061758/error-rpc-failed-http-504-curl-22-the-requested-url-returned-error-504-gatewa but could not solve it.

Any ideas?

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 20 (8 by maintainers)

Most upvoted comments

Running into the same issue here trying to upload a 1.4 GB dataset:

Enumerating objects: 23985, done.
Counting objects: 100% (23985/23985), done.
Delta compression using up to 4 threads
Compressing objects: 100% (23948/23948), done.
error: RPC failed; HTTP 408 curl 22 The requested URL returned error: 408
send-pack: unexpected disconnect while reading sideband packet
Writing objects: 100% (23949/23949), 1.43 GiB | 4.74 MiB/s, done.
Total 23949 (delta 2), reused 23948 (delta 1), pack-reused 0
fatal: the remote end hung up unexpectedly
Everything up-to-date

Having the same issue when uploading a ~6 GB GPT-J model via the command line:

git push
Enumerating objects: 4, done.
Counting objects: 100% (4/4), done.
Delta compression using up to 10 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 5.11 GiB | 624.00 KiB/s, done.
Total 3 (delta 1), reused 1 (delta 0), pack-reused 0
error: RPC failed; HTTP 504 curl 22 The requested URL returned error: 504
send-pack: unexpected disconnect while reading sideband packet
fatal: the remote end hung up unexpectedly

The failure always seems to occur at around 5.11 GB (total file size is 6.32 GB). When uploading the same file via the HF website interface, I get an error 400. I tried it from various locations (different Wi-Fi networks), always with the same behavior.

Any advice on how to proceed?

For anyone who still can’t figure it out: the push fails because the big file is not being tracked by Git LFS. Possible reasons:

  1. You don’t have a .gitattributes file
  2. You have a .gitattributes file, but the file extension is not in its list

You can simply ask Git LFS to track your file, for example: git lfs track "*.gguf"
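
To check both reasons at once before pushing, something like the following Python sketch could help. It reads the patterns marked filter=lfs in .gitattributes and lists large files that match none of them. The function names and the 10 MiB default threshold are illustrative assumptions, and the matching is a simplification: it compares file names only, using fnmatch, and does not implement the full gitattributes pattern rules.

```python
import fnmatch
import os


def lfs_patterns(gitattributes_text):
    """Collect path patterns whose attributes include filter=lfs."""
    patterns = []
    for line in gitattributes_text.splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        if "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns


def large_files_outside_lfs(repo_dir, min_bytes=10 * 1024 * 1024):
    """List (relative_path, size) for files >= min_bytes that match no LFS pattern.

    The default threshold is an arbitrary illustrative value.
    """
    attrs_path = os.path.join(repo_dir, ".gitattributes")
    patterns = []
    if os.path.isfile(attrs_path):
        with open(attrs_path, encoding="utf-8") as handle:
            patterns = lfs_patterns(handle.read())

    found = []
    for root, dirs, files in os.walk(repo_dir):
        dirs[:] = [d for d in dirs if d != ".git"]  # skip Git's own object store
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path):
                continue  # ignore symlinks, which may be dangling
            size = os.path.getsize(path)
            # Simplification: match on the file name only, not the full path.
            if size >= min_bytes and not any(fnmatch.fnmatch(name, p) for p in patterns):
                found.append((os.path.relpath(path, repo_dir), size))
    return sorted(found)
```

Calling large_files_outside_lfs on the clone directory returns the files that would be pushed as ordinary Git objects despite their size; an empty result means every large file matched an LFS pattern under these simplified rules.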

@sgugger @thomwolf can you reopen this issue?

This happened to me too:

❯ git push origin main
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3/3), done.
Writing objects:  75% (3/4), 2.00 GiB | 2.75 MiB/s
Writing objects: 100% (4/4), 2.22 GiB | 2.86 MiB/s, done.
Total 4 (delta 0), reused 1 (delta 0), pack-reused 0
error: RPC failed; HTTP 504 curl 22 The requested URL returned error: 504
send-pack: unexpected disconnect while reading sideband packet
fatal: the remote end hung up unexpectedly
Everything up-to-date
Counting objects: 100% (13/13), done.
Delta compression using up to 16 threads
Compressing objects: 100% (10/10), done.
Writing objects: 100% (12/12), 2.55 GiB | 1.03 MiB/s, done.
Total 12 (delta 1), reused 1 (delta 0), pack-reused 0
error: RPC failed; HTTP 504 curl 22 The requested URL returned error: 504
send-pack: unexpected disconnect while reading sideband packet
fatal: the remote end hung up unexpectedly
Everything up-to-date

Me too

/content/russianpoetrymany# git push
Counting objects: 11, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (9/9), done.
Writing objects: 100% (11/11), 6.26 GiB | 91.43 MiB/s, done.
Total 11 (delta 0), reused 2 (delta 0)
error: RPC failed; HTTP 504 curl 22 The requested URL returned error: 504 Gateway Time-out
fatal: The remote end hung up unexpectedly
fatal: The remote end hung up unexpectedly
Everything up-to-date


@julien-c @patrickvonplaten I was uploading an mBART model from Google Colab and got the same error. Any pointers?

Perfect, thanks a lot Patrick. We will test it right away.

Hopefully, the issue will be fixed soon, so I don’t have to waste your time again with the 11B model.

No worries at all! Feel free to open a new issue -> doesn’t take me long at all to upload it 😃