
OOM error occurs when downloading model using recent versions of Huggingface Hub Python library (worked fine in older versions) #3300

@Jooho

Description

Describe the bug

I’m encountering an Out Of Memory (OOM) error when downloading models using the latest versions of the Huggingface Hub Python library.

Previously, with older versions, the same code and models downloaded without any memory issues. However, after upgrading to a recent release, the process consistently runs out of memory during model download.

Is this a known issue, or is it related to internal changes in how files are cached or loaded? Any guidance on how to mitigate the memory usage, or a workaround, would be highly appreciated.

I used an init container to pull the model, but that has nothing to do with the test itself.
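
If the extra memory comes from the newer xet-based download backend (hf_xet appears only in the newer environment info below), one possible workaround to test could be disabling it via HF_HUB_DISABLE_XET and lowering the download concurrency. A minimal, untested sketch (the cause is only an assumption on my side):

# Untested workaround sketch: disable the hf_xet backend and reduce concurrency
export HF_HOME=/mnt/models
export HF_HUB_DISABLE_XET=1
python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='deepseek-ai/DeepSeek-V2-Lite-Chat', local_dir='/mnt/models', max_workers=1)"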

Reproduction

In Kubernetes, you can create the following pod.

apiVersion: v1
kind: Pod
metadata:
  name: hf-model-downloader
spec:
  containers:
    - name: main-app
      image: quay.io/jooholee/busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: model-cache
          mountPath: /mnt/models
  initContainers:
    - name: model-downloader
      image: quay.io/jooholee/python:3.10-slim
      command:
        - sh
        - -c
        - |
          #pip install --no-cache-dir huggingface_hub==0.13.4 && \
          pip install --no-cache-dir huggingface_hub==0.34.4 && \
          export HF_HOME=/mnt/models && \
          python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='deepseek-ai/DeepSeek-V2-Lite-Chat', local_dir='/mnt/models', local_dir_use_symlinks=False, max_workers=8, resume_download=True)"
      volumeMounts:
        - name: model-cache
          mountPath: /mnt/models
      resources:
        requests:
          memory: 512Mi
        limits:
          memory: 1Gi
  volumes:
    - name: model-cache
      emptyDir: {}

This pod uses the latest version of the library, and it failed.
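
To reproduce, apply the manifest and watch the init container; with the recent library version it ends up in Init:OOMKilled, as shown in the logs below. For example (the manifest file name here is just an example):

kubectl apply -f hf-model-downloader.yaml
kubectl get pod hf-model-downloader -w
kubectl logs hf-model-downloader -c model-downloader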

The following pod uses a very old version, 0.13.4:

apiVersion: v1
kind: Pod
metadata:
  name: hf-model-downloader
spec:
  containers:
    - name: main-app
      image: quay.io/jooholee/busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: model-cache
          mountPath: /mnt/models
  initContainers:
    - name: model-downloader
      image: quay.io/jooholee/python:3.10-slim
      command:
        - sh
        - -c
        - |
          pip install --no-cache-dir huggingface_hub==0.13.4 && \
          export HF_HOME=/mnt/models && \
          python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='deepseek-ai/DeepSeek-V2-Lite-Chat', local_dir='/mnt/models', local_dir_use_symlinks=False, max_workers=8, resume_download=True)"
      volumeMounts:
        - name: model-cache
          mountPath: /mnt/models
      resources:
        requests:
          memory: 512Mi
        limits:
          memory: 1Gi
  volumes:
    - name: model-cache
      emptyDir: {}

This one successfully downloaded.

Looking at memory usage, the first (recent) one needed around 10 GB to pull the model properly, while the second (old) one used only around 200 MB.
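
(The per-container memory numbers above can be checked with something like the following, assuming metrics-server is available in the cluster:)

kubectl top pod hf-model-downloader --containers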

Logs

Not many logs were left.

First one (the recent version):

[notice] A new release of pip is available: 23.0.1 -> 25.2
[notice] To update, run: pip install --upgrade pip
/usr/local/lib/python3.10/site-packages/huggingface_hub/file_download.py:945: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
/usr/local/lib/python3.10/site-packages/huggingface_hub/file_download.py:982: UserWarning: `local_dir_use_symlinks` parameter is deprecated and will be ignored. The process to download files to a local folder has been updated and do not rely on symlinks anymore. You only need to pass a destination folder as`local_dir`.
For more details, check out https://huggingface.co/docs/huggingface_hub/main/en/guides/download#download-files-to-local-folder.
  warnings.warn(
Fetching 15 files:  20%|██        | 3/15 [00:00<00:00, 20.44it/s]%     

...
hf-model-downloader                                               0/1     Init:OOMKilled   1 (18s ago)   25s
..

For the second (old) one:

Collecting idna<4,>=2.5
  Downloading idna-3.10-py3-none-any.whl (70 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 70.4/70.4 kB 234.9 MB/s eta 0:00:00
Installing collected packages: urllib3, typing-extensions, tqdm, pyyaml, packaging, idna, filelock, charset_normalizer, certifi, requests, huggingface_hub
Successfully installed certifi-2025.8.3 charset_normalizer-3.4.3 filelock-3.18.0 huggingface_hub-0.13.4 idna-3.10 packaging-25.0 pyyaml-6.0.2 requests-2.32.4 tqdm-4.67.1 typing-extensions-4.14.1 urllib3-2.5.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

[notice] A new release of pip is available: 23.0.1 -> 25.2
[notice] To update, run: pip install --upgrade pip
Downloading .gitattributes: 1.52kB [00:00, 5.29MB/s]it/s]
Downloading LICENSE: 13.8kB [00:00, 4.93MB/s]       | 0.00/8.59G [00:00<?, ?B/s]
Downloading README.md: 15.5kB [00:00, 6.39MB/s]     | 0.00/8.59G [00:00<?, ?B/s]
Downloading config.json: 1.52kB [00:00, 937kB/s]
Downloading generation_config.json: 100%|██████████| 181/181 [00:00<00:00, 1.17MB/s]
Downloading (…)guration_deepseek.py: 10.3kB [00:00, 25.2MB/s]
Downloading (…)fetensors.index.json: 480kB [00:00, 149MB/s]
Downloading modeling_deepseek.py: 78.7kB [00:00, 31.5MB/s]181 [00:00<?, ?B/s]
Downloading (…)ion_deepseek_fast.py: 1.37kB [00:00, 2.55MB/s]
Downloading tokenizer.json: 4.61MB [00:00, 110MB/s] | 0.00/8.59G [00:00<?, ?B/s]
Downloading tokenizer_config.json: 1.28kB [00:00, 4.64MB/s]
Downloading (…)f-000004.safetensors: 100%|██████████| 5.64G/5.64G [02:49<00:00, 33.3MB/s]
Downloading (…)f-000004.safetensors: 100%|██████████| 8.59G/8.59G [03:55<00:00, 36.5MB/s]
Downloading (…)f-000004.safetensors: 100%|██████████| 8.59G/8.59G [03:57<00:00, 36.2MB/s]
Downloading (…)f-000004.safetensors: 100%|██████████| 8.59G/8.59G [03:57<00:00, 36.2MB/s]
Fetching 15 files: 100%|██████████| 15/15 [03:57<00:00, 15.82s/it][03:56<00:00, 63.5MB/s]

System info


Copy-and-paste the text below in your GitHub issue.

- huggingface_hub version: 0.34.4
- Platform: Linux-5.14.0-427.70.1.el9_4.x86_64-x86_64-with-glibc2.36
- Python version: 3.10.18
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Running in Google Colab Enterprise ?: No
- Token path ?: /mnt/models/token
- Has saved token ?: False
- FastAI: N/A
- Tensorflow: N/A
- Torch: N/A
- Jinja2: N/A
- Graphviz: N/A
- keras: N/A
- Pydot: N/A
- Pillow: N/A
- hf_transfer: N/A
- gradio: N/A
- tensorboard: N/A
- numpy: N/A
- pydantic: N/A
- aiohttp: N/A
- hf_xet: 1.1.7
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: /mnt/models/hub
- HF_ASSETS_CACHE: /mnt/models/assets
- HF_TOKEN_PATH: /mnt/models/token
- HF_STORED_TOKENS_PATH: /mnt/models/stored_tokens
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_DISABLE_XET: False
- HF_HUB_ENABLE_HF_TRANSFER: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10


---
For the second (old) version:

- huggingface_hub version: 0.13.4
- Platform: Linux-5.14.0-427.70.1.el9_4.x86_64-x86_64-with-glibc2.36
- Python version: 3.10.18
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Token path ?: /mnt/models/token
- Has saved token ?: False
- FastAI: N/A
- Tensorflow: N/A
- Torch: N/A
- Jinja2: N/A
- Graphviz: N/A
- Pydot: N/A
- Pillow: N/A
- hf_transfer: N/A
- ENDPOINT: https://huggingface.co
- HUGGINGFACE_HUB_CACHE: /mnt/models/hub
- HUGGINGFACE_ASSETS_CACHE: /mnt/models/assets
- HF_TOKEN_PATH: /mnt/models/token
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False

FYI, I also tried changing all of the cache directories, but the result was the same.

    - name: model-downloader
      image: quay.io/jooholee/python:3.10-slim
      env:
      - name: HF_ACCELERATE_CACHE
        value: /models
      - name: HF_TOKENIZERS_CACHE
        value: /models
      - name: HF_MODULES_CACHE
        value: /models
      - name: HF_METRICS_CACHE
        value: /models
      - name: HF_DATASETS_CACHE
        value: /models
      - name: TRANSFORMERS_CACHE
        value: /models
      - name: HF_HUB_CACHE
        value: /models
      - name: HF_HOME
        value: /models
      - name: HOME
        value: /tmp
      command:
        - sh
        - -c
        - |
          #pip install --no-cache-dir huggingface_hub==0.13.4 && \
          pip install --no-cache-dir huggingface_hub==0.34.4 && \
          export HF_DEBUG=1 && \
          export HF_HOME=/mnt/models && \
          python -c "from huggingface_hub import dump_environment_info;dump_environment_info()"
          python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='deepseek-ai/DeepSeek-V2-Lite-Chat', local_dir='/mnt/models', local_dir_use_symlinks=False, max_workers=8, resume_download=True)"
      volumeMounts:
        - name: home
          mountPath: /tmp
        - name: models
          mountPath: /models
        - name: model-cache
          mountPath: /mnt/models
      resources:
        requests:
          memory: 512Mi
        limits:
          memory: 1Gi
  volumes:
    - name: home
      emptyDir: {}
    - name: model-cache
      emptyDir: {}
    - name: models
      emptyDir: {}
