huggingface_hub: scan-cache experiences extreme slowdowns with large models

Describe the bug

When using either `scan_cache` via Python or simply `huggingface-cli scan-cache` to enumerate already-downloaded models, it works fine regardless of the number of models, as long as the models are small.

But once the models get larger, it starts slowing down, to roughly 0.6s per 10GB of models. So with 100GB of models in `cache_dir` (which is not that much nowadays), it takes about 6 seconds to do anything.

Given that I need to enumerate the available models to show which ones are available in the application UI (SD.Next in my case), this is pretty bad.

Reproduction

Download 10 large models and run `huggingface-cli scan-cache`.

Logs

> huggingface-cli scan-cache --dir models/Diffusers

      REPO ID                                        REPO TYPE SIZE ON DISK NB FILES LAST_ACCESSED LAST_MODIFIED REFS LOCAL PATH
      ---------------------------------------------- --------- ------------ -------- ------------- ------------- ---- -------------------------------------------------------------------------------
      DucHaiten/DucHaitenAIart                       model             5.5G       16 1 week ago    2 weeks ago   main /mnt/d/Models/Diffusers/models--DucHaiten--DucHaitenAIart
      SG161222/Realistic_Vision_V1.4                 model             5.5G       16 1 week ago    1 week ago    main /mnt/d/Models/Diffusers/models--SG161222--Realistic_Vision_V1.4
      SG161222/Realistic_Vision_V3.0_VAE             model             1.6M        9 1 week ago    1 week ago    main /mnt/d/Models/Diffusers/models--SG161222--Realistic_Vision_V3.0_VAE
      kandinsky-community/kandinsky-2-1              model             7.5G       13 5 days ago    5 days ago    main /mnt/d/Models/Diffusers/models--kandinsky-community--kandinsky-2-1
      kandinsky-community/kandinsky-2-1-prior        model             5.8G       14 5 days ago    5 days ago    main /mnt/d/Models/Diffusers/models--kandinsky-community--kandinsky-2-1-prior
      kandinsky-community/kandinsky-2-2-decoder      model             5.3G        7 4 days ago    4 days ago    main /mnt/d/Models/Diffusers/models--kandinsky-community--kandinsky-2-2-decoder
      kandinsky-community/kandinsky-2-2-prior        model            10.6G       14 4 days ago    4 days ago    main /mnt/d/Models/Diffusers/models--kandinsky-community--kandinsky-2-2-prior
      runwayml/stable-diffusion-v1-5                 model             5.5G       16 1 week ago    2 weeks ago   main /mnt/d/Models/Diffusers/models--runwayml--stable-diffusion-v1-5
      stabilityai/stable-diffusion-xl-base-0.9       model            20.8G       22 1 week ago    1 week ago    main /mnt/d/Models/Diffusers/models--stabilityai--stable-diffusion-xl-base-0.9
      stabilityai/stable-diffusion-xl-refiner-0.9    model            12.2G       13 1 week ago    1 week ago    main /mnt/d/Models/Diffusers/models--stabilityai--stable-diffusion-xl-refiner-0.9
      thu-ml/unidiffuser-v1                          model             5.5G       23 1 day ago     1 day ago     main /mnt/d/Models/Diffusers/models--thu-ml--unidiffuser-v1

  Done in 5.1s. Scanned 13 repo(s) for a total of 84.0G.

System info

- huggingface_hub version: 0.16.4
- Platform: Linux-6.1.21.2-microsoft-standard-WSL2-x86_64-with-glibc2.35
- Python version: 3.10.6
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Token path ?: /home/vlado/.cache/huggingface/token
- Has saved token ?: True
- Configured git credential helpers: 2592000
- FastAI: N/A
- Tensorflow: 2.12.0
- Torch: 2.1.0.dev20230701+cu121
- Jinja2: 3.1.2
- Graphviz: N/A
- Pydot: N/A
- Pillow: 9.5.0
- hf_transfer: N/A
- gradio: 3.32.0
- tensorboard: 2.6.1
- numpy: 1.23.5
- pydantic: 1.10.11
- aiohttp: 3.8.4
- ENDPOINT: https://huggingface.co
- HUGGINGFACE_HUB_CACHE: /home/vlado/.cache/huggingface/hub
- HUGGINGFACE_ASSETS_CACHE: /home/vlado/.cache/huggingface/assets
- HF_TOKEN_PATH: /home/vlado/.cache/huggingface/token
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False

About this issue

  • Original URL
  • State: open
  • Created a year ago
  • Comments: 18 (9 by maintainers)

Most upvoted comments

Created 2 new issues to be addressed separately:

The simplest approach here would be to only move the blob files and let `hf_hub_download` recreate the symlinks at runtime if needed. I added a comment about that to the issue.
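That approach could be sketched roughly as below. This is an illustration only: `copy_cache_blobs` is a hypothetical helper, not an existing huggingface_hub command; it copies rather than moves to keep the sketch non-destructive, and it assumes the standard `models--*/{blobs,refs,snapshots}` cache layout.

```python
import shutil
from pathlib import Path


def copy_cache_blobs(src_cache, dst_cache):
    """Copy only blobs/ and refs/ for each cached repo.

    The snapshots/ symlink trees are intentionally skipped; on next use,
    hf_hub_download can recreate the snapshot symlinks from the blobs,
    which sidesteps tools (e.g. Windows Explorer) that cannot copy
    symlinks across drives.
    """
    dst_cache = Path(dst_cache)
    for repo in Path(src_cache).iterdir():
        if not repo.is_dir() or not repo.name.startswith("models--"):
            continue
        for sub in ("blobs", "refs"):
            src = repo / sub
            if src.is_dir():
                # copytree creates intermediate directories as needed
                shutil.copytree(src, dst_cache / repo.name / sub)
```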

If I understand correctly, the main feature request of this issue is now: “Can we have a helper to scan the cache, returning for each model a list of its snapshots and, for each snapshot, a list of the downloaded files, but without resolving the symlinks, for performance reasons? (meaning only file presence is returned, no information about size/last_accessed/…)”. Am I right? The conversation has started to deviate quite a lot, so I’m trying to understand what the right addition here would be.

fair question. and yes, that is correct.
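A symlink-free listing along those lines could be sketched as below. This is a sketch only, assuming the standard `models--{org}--{name}/snapshots/{revision}` cache layout; `fast_list_cache` is a hypothetical helper, not an existing huggingface_hub API.

```python
import os
from pathlib import Path


def _walk_no_follow(path):
    """Yield file paths under `path` without resolving symlinks.

    is_dir(follow_symlinks=False) uses the entry type reported by
    scandir() where available, so the symlinked blob targets are
    never stat()ed.
    """
    for entry in os.scandir(path):
        if entry.is_dir(follow_symlinks=False):
            yield from _walk_no_follow(entry.path)
        else:
            yield entry.path


def fast_list_cache(cache_dir):
    """Map repo id -> {snapshot revision -> sorted relative file paths}.

    Only file presence is reported: no sizes, no last-accessed times,
    so nothing forces the blob symlinks to be resolved.
    """
    result = {}
    for repo in Path(cache_dir).iterdir():
        if not repo.name.startswith("models--"):
            continue
        repo_id = "/".join(repo.name.split("--")[1:])
        snapshots_dir = repo / "snapshots"
        snapshots = {}
        if snapshots_dir.is_dir():
            for snap in snapshots_dir.iterdir():
                snapshots[snap.name] = sorted(
                    os.path.relpath(p, snap) for p in _walk_no_follow(snap)
                )
        result[repo_id] = snapshots
    return result
```

On a cache like the one in the logs above, this avoids the per-blob `stat()` calls, at exactly the trade-off described: no size or timestamp columns can be produced.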

If I understand correctly, a simple CLI command to move the cache from one path to another would be great for some users?

correct.

Do you know why the symlinks were resolved into actual files instead of copying only the reference? And do you know their platform and what they used to transfer them?

A simple copy in Windows Explorer will fail to copy the diffusers folder to a different drive, as symlinks on Windows only work within a drive and cannot be moved across drives.

FYI, I’ve implemented my own “quick model list” and completely removed any dependency on `hf.scan_cache`.

code is at https://github.com/vladmandic/automatic/blob/21877351872aef2bb402ac83501fe3f383e8ef0e/modules/modelloader.py#L195-L210