transformers: push_to_hub returns "OSError: error: RPC failed; HTTP 408 curl 22 The requested URL returned error: 408"

System Info

- `transformers` version: 4.18.0
- Platform: Linux-5.4.0-96-generic-x86_64-with-glibc2.17
- Python version: 3.8.12
- Huggingface_hub version: 0.5.1
- PyTorch version (GPU?): 1.11.0+cu102 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, …)
  • My own task or dataset (give details below)

Reproduction

I am following the docs at https://huggingface.co/course/chapter5/5#uploading-the-dataset-to-the-hugging-face-hub. In my case, I am trying to upload 413299 small jsonl files. Each file is a separate task, and the tasks are joined later in the dataset setup depending on the data subset, which is why I would like to keep them separate.

After running for some time, repo.push_to_hub() fails with the output below, and nothing shows up on the dataset page on the HF hub (currently private until everything is finished):

Several commits (5) will be pushed upstream.
The progress bars may be unreliable.
error: RPC failed; HTTP 408 curl 22 The requested URL returned error: 408
fatal: the remote end hung up unexpectedly
fatal: the remote end hung up unexpectedly
Everything up-to-date

---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
File ~/miniconda3/envs/lmproj2/lib/python3.8/site-packages/huggingface_hub/repository.py:1201, in Repository.git_push(self, upstream, blocking, auto_lfs_prune)
   1200             if return_code:
-> 1201                 raise subprocess.CalledProcessError(
   1202                     return_code, process.args, output=stdout, stderr=stderr
   1203                 )
   1205 except subprocess.CalledProcessError as exc:

CalledProcessError: Command '['git', 'push', '--set-upstream', 'origin', 'main']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
Input In [100], in <cell line: 1>()
----> 1 repo.push_to_hub()

File ~/miniconda3/envs/lmproj2/lib/python3.8/site-packages/huggingface_hub/repository.py:1475, in Repository.push_to_hub(self, commit_message, blocking, clean_ok, auto_lfs_prune)
   1473 self.git_add(auto_lfs_track=True)
   1474 self.git_commit(commit_message)
-> 1475 return self.git_push(
   1476     upstream=f"origin {self.current_branch}",
   1477     blocking=blocking,
   1478     auto_lfs_prune=auto_lfs_prune,
   1479 )

File ~/miniconda3/envs/lmproj2/lib/python3.8/site-packages/huggingface_hub/repository.py:1206, in Repository.git_push(self, upstream, blocking, auto_lfs_prune)
   1201                 raise subprocess.CalledProcessError(
   1202                     return_code, process.args, output=stdout, stderr=stderr
   1203                 )
   1205 except subprocess.CalledProcessError as exc:
-> 1206     raise EnvironmentError(exc.stderr)
   1208 if not blocking:
   1210     def status_method():

OSError: error: RPC failed; HTTP 408 curl 22 The requested URL returned error: 408
fatal: the remote end hung up unexpectedly
fatal: the remote end hung up unexpectedly
Everything up-to-date

Expected behavior

The data is uploaded completely and shows up on the dataset page on the HF hub.

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 21 (1 by maintainers)

Most upvoted comments

Increase the buffer size by typing

git config http.postBuffer 9999999999999999

and use the SSH link instead of HTTP as SSH is more stable

git remote set-url origin git@github.com:username/repository.git

Thank you, it worked for me. However, 9999999999999999 was so large that it caused an error in my case.

The solution for me was to run: git config http.postBuffer 99999999
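
The buffer change from the comments above can also be applied from Python, which may be convenient when the push itself is driven from a notebook. The sketch below is only an illustration: `post_buffer_command` and `set_post_buffer` are hypothetical helper names, not part of `huggingface_hub`, and the default of 99999999 bytes is simply the value the commenter reported as working. It assumes `git` is on the PATH and that `repo_dir` points at the local clone.

```python
import subprocess
from typing import Callable, List


def post_buffer_command(size_bytes: int) -> List[str]:
    """Build the `git config http.postBuffer <bytes>` command quoted in the thread."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    return ["git", "config", "http.postBuffer", str(size_bytes)]


def set_post_buffer(
    repo_dir: str,
    size_bytes: int = 99_999_999,  # value reported to work above; not a recommendation
    run: Callable[..., object] = subprocess.run,
) -> List[str]:
    """Apply the setting inside the local clone and return the command that was run."""
    cmd = post_buffer_command(size_bytes)
    # check=True raises CalledProcessError if git exits with a non-zero status.
    run(cmd, cwd=repo_dir, check=True)
    return cmd
```

Calling `set_post_buffer("path/to/local/clone")` before `repo.push_to_hub()` changes only that repository's configuration, which keeps the experiment easy to undo.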

I also had the same problem; switching to a Wi-Fi network worked for me, but I don't know why. If anyone knows the answer, please tell me.
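
Since one commenter saw the push succeed after changing networks, the 408 may be transient in some setups. The traceback shows that `push_to_hub()` surfaces a failed `git push` as `OSError`, so retrying with exponential backoff is one thing to try. This is a sketch under that assumption, not a fix suggested in the thread: `push_with_retries` and its parameters are hypothetical names, and retrying will likely not help if the push is simply too large to finish before the server times out.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def push_with_retries(
    push: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `push` until it succeeds, retrying on OSError with doubling delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return push()
        except OSError:
            # Re-raise the last failure so the original git error stays visible.
            if attempt == attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

With the `repo` object from the issue, usage would look like `push_with_retries(repo.push_to_hub)`.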
