transformers: [model weights caching] model upload doesn't check model weights hash

I re-uploaded model weights via transformers-cli upload and noticed that when I then tried to use the model, the weights didn’t get re-downloaded; the cached version was used instead.

The problem seems to come from the fact that the other uploaded files haven’t changed, only the model weights.

I double checked that the md5sum of the old weights file is different from the new one.
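For anyone who wants to repeat this check without the md5sum tool, here is a minimal Python sketch of the same comparison. The helper names `file_md5` and `weights_changed` are made up for illustration, and the paths in the usage comment are placeholders, not the real locations.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use bounded for multi-gigabyte weights files.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def weights_changed(old_path, new_path):
    """Return True when the two files have different MD5 digests."""
    return file_md5(old_path) != file_md5(new_path)


# Hypothetical usage (placeholder paths):
# weights_changed("old/pytorch_model.bin", "new/pytorch_model.bin")
```

A digest mismatch only shows that the two local files differ; it says nothing about which copy the server or the cache will hand out.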

I re-uploaded the whole folder using:

transformers-cli upload fsmt-wmt19-en-de

If I hunt down the cached files (not an easy task), and delete those, it does re-download the new version.

If I move the original cached weights file away so that a fresh copy gets re-downloaded, and then diff the old cached file against the new one, they aren’t the same:

Binary files before/d97352d9f1f96ee4c6055f203812035b4597258a837db1f4f0803a2932cc3071.53ce64c7097bfcd85418af04a21b4a897c78c8440de3af078e577727ad9de3a0 and after/d97352d9f1f96ee4c6055f203812035b4597258a837db1f4f0803a2932cc3071.53ce64c7097bfcd85418af04a21b4a897c78c8440de3af078e577727ad9de3a0 differ
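The cached filename in the diff output looks like two SHA-256 hex digests joined by a dot. The sketch below shows how such a name could be built, assuming the first digest is derived from the download URL and the second from the ETag the server returns. That is an assumption about the caching scheme, not a statement of the library's exact code; `cache_filename` is a hypothetical helper, and the URL and ETag values are invented.

```python
import hashlib


def cache_filename(url, etag=None):
    """Build a cache filename from a URL and an optional ETag.

    Assumed scheme: sha256(url), then "." + sha256(etag) when an ETag is given.
    """
    name = hashlib.sha256(url.encode("utf-8")).hexdigest()
    if etag:
        name += "." + hashlib.sha256(etag.encode("utf-8")).hexdigest()
    return name


# Invented values, for illustration only
url = "https://example.com/models/fsmt-wmt19-en-de/pytorch_model.bin"
old_name = cache_filename(url, etag='"etag-before-upload"')
new_name = cache_filename(url, etag='"etag-after-upload"')

# Same URL, different ETag: the part before the dot matches, the part after differs
print(old_name.split(".")[0] == new_name.split(".")[0])
print(old_name == new_name)
```

Under this assumption the file contents never enter the name directly. A changed weights file gets a new cache entry only if the ETag seen by the client changes, so identical names in before/ and after/ would mean the client saw the same URL and ETag both times.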

Could we please include the model weights file in the hash calculation?

Thank you.

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 15 (14 by maintainers)

Most upvoted comments

No, just S3 links!

This is due to the CDN caching files, with a 24-hour delay. After 24 hours it should download your new file, but if you want it right away you can set the use_cdn flag to False. You can see the documentation for this here.

I can confirm it was previously checking the model weights and re-downloading if the weights had been changed. Investigating.