Description
Describe the bug
I'm encountering an Out Of Memory (OOM) error when downloading models with the latest version of the huggingface_hub Python library.
Previously, with older versions, the same code downloaded the same models without any memory issues. After upgrading to a recent release, however, the process consistently runs out of memory during the download.
Is this a known issue, or is it related to internal changes in how files are cached or loaded? Any guidance on reducing the memory usage, or a workaround, would be highly appreciated.
I used an init container to pull the model, but that has nothing to do with the issue itself.
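Not part of the original report, but one mitigation that may be worth testing: the system info section below shows `hf_xet: 1.1.7` and a `HF_HUB_DISABLE_XET` setting, so one hypothesis is that the Xet-backed download path in recent releases buffers more aggressively. A minimal sketch of opting out of it and lowering concurrency; the env var names are taken from the `dump_environment_info` output in this report, everything else (values, worker count) is an assumption to experiment with:

```python
import os

def xet_disabled_env() -> dict:
    """Env vars that opt out of the Xet download backend
    (hypothetical mitigation, not a confirmed fix)."""
    return {
        "HF_HUB_DISABLE_XET": "1",   # fall back to the plain HTTP download path
        "HF_HOME": "/mnt/models",    # same cache location as in the Pod spec
    }

# Must be applied before importing huggingface_hub.
os.environ.update(xet_disabled_env())

# Then download with a low worker count to limit concurrent buffers:
# from huggingface_hub import snapshot_download
# snapshot_download(repo_id="deepseek-ai/DeepSeek-V2-Lite-Chat",
#                   local_dir="/mnt/models", max_workers=1)
```

If the OOM disappears with `HF_HUB_DISABLE_XET=1`, that would narrow the regression down to the Xet path rather than `snapshot_download` itself.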
Reproduction
In Kubernetes, create the following Pod:
apiVersion: v1
kind: Pod
metadata:
  name: hf-model-downloader
spec:
  containers:
  - name: main-app
    image: quay.io/jooholee/busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: model-cache
      mountPath: /mnt/models
  initContainers:
  - name: model-downloader
    image: quay.io/jooholee/python:3.10-slim
    command:
    - sh
    - -c
    - |
      #pip install --no-cache-dir huggingface_hub==0.13.4 && \
      pip install --no-cache-dir huggingface_hub==0.34.4 && \
      export HF_HOME=/mnt/models && \
      python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='deepseek-ai/DeepSeek-V2-Lite-Chat', local_dir='/mnt/models', local_dir_use_symlinks=False, max_workers=8, resume_download=True)"
    volumeMounts:
    - name: model-cache
      mountPath: /mnt/models
    resources:
      requests:
        memory: 512Mi
      limits:
        memory: 1Gi
  volumes:
  - name: model-ca
This one uses the latest version of the library (0.34.4) and it failed with OOM.
The following uses the very old version 0.13.4:
apiVersion: v1
kind: Pod
metadata:
  name: hf-model-downloader
spec:
  containers:
  - name: main-app
    image: quay.io/jooholee/busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: model-cache
      mountPath: /mnt/models
  initContainers:
  - name: model-downloader
    image: quay.io/jooholee/python:3.10-slim
    command:
    - sh
    - -c
    - |
      pip install --no-cache-dir huggingface_hub==0.13.4 && \
      export HF_HOME=/mnt/models && \
      python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='deepseek-ai/DeepSeek-V2-Lite-Chat', local_dir='/mnt/models', local_dir_use_symlinks=False, max_workers=8, resume_download=True)"
    volumeMounts:
    - name: model-cache
      mountPath: /mnt/models
    resources:
      requests:
        memory: 512Mi
      limits:
        memory: 1Gi
  volumes:
  - name: model-ca
This one downloaded successfully.
Looking at memory usage, the first (recent) version needed around 10 GB to pull the model properly, while the second (old) one used only around 200 MB.
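To quantify the difference outside Kubernetes, the peak resident set size of the downloader process can be sampled before and after the call. A sketch, assuming a Linux host (where `ru_maxrss` is reported in KiB); the download call itself is commented out since it fetches ~16 GB of weights:

```python
import resource

def peak_rss_mib() -> float:
    """Peak resident set size of this process, in MiB (Linux semantics)."""
    # On Linux ru_maxrss is in KiB; on macOS it is in bytes.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

print(f"baseline peak RSS: {peak_rss_mib():.0f} MiB")
# from huggingface_hub import snapshot_download
# snapshot_download(repo_id="deepseek-ai/DeepSeek-V2-Lite-Chat",
#                   local_dir="/mnt/models", max_workers=8)
# print(f"after download:    {peak_rss_mib():.0f} MiB")
```

Running this once per huggingface_hub version (0.13.4 vs 0.34.4) would confirm the ~200 MB vs ~10 GB gap without relying on the OOM killer.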
Logs
Not much log output is left.
First one (the recent version):
[notice] A new release of pip is available: 23.0.1 -> 25.2
[notice] To update, run: pip install --upgrade pip
/usr/local/lib/python3.10/site-packages/huggingface_hub/file_download.py:945: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/usr/local/lib/python3.10/site-packages/huggingface_hub/file_download.py:982: UserWarning: `local_dir_use_symlinks` parameter is deprecated and will be ignored. The process to download files to a local folder has been updated and do not rely on symlinks anymore. You only need to pass a destination folder as`local_dir`.
For more details, check out https://huggingface.co/docs/huggingface_hub/main/en/guides/download#download-files-to-local-folder.
warnings.warn(
Fetching 15 files: 20%|██ | 3/15 [00:00<00:00, 20.44it/s]%
...
hf-model-downloader 0/1 Init:OOMKilled 1 (18s ago) 25s
..
For the second (old) one:
Collecting idna<4,>=2.5
Downloading idna-3.10-py3-none-any.whl (70 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 70.4/70.4 kB 234.9 MB/s eta 0:00:00
Installing collected packages: urllib3, typing-extensions, tqdm, pyyaml, packaging, idna, filelock, charset_normalizer, certifi, requests, huggingface_hub
Successfully installed certifi-2025.8.3 charset_normalizer-3.4.3 filelock-3.18.0 huggingface_hub-0.13.4 idna-3.10 packaging-25.0 pyyaml-6.0.2 requests-2.32.4 tqdm-4.67.1 typing-extensions-4.14.1 urllib3-2.5.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
[notice] A new release of pip is available: 23.0.1 -> 25.2
[notice] To update, run: pip install --upgrade pip
Downloading .gitattributes: 1.52kB [00:00, 5.29MB/s]it/s]
Downloading LICENSE: 13.8kB [00:00, 4.93MB/s] | 0.00/8.59G [00:00<?, ?B/s]
Downloading README.md: 15.5kB [00:00, 6.39MB/s] | 0.00/8.59G [00:00<?, ?B/s]
Downloading config.json: 1.52kB [00:00, 937kB/s]
Downloading generation_config.json: 100%|██████████| 181/181 [00:00<00:00, 1.17MB/s]
Downloading (…)guration_deepseek.py: 10.3kB [00:00, 25.2MB/s]
Downloading (…)fetensors.index.json: 480kB [00:00, 149MB/s]
Downloading modeling_deepseek.py: 78.7kB [00:00, 31.5MB/s]181 [00:00<?, ?B/s]
Downloading (…)ion_deepseek_fast.py: 1.37kB [00:00, 2.55MB/s]
Downloading tokenizer.json: 4.61MB [00:00, 110MB/s] | 0.00/8.59G [00:00<?, ?B/s]
Downloading tokenizer_config.json: 1.28kB [00:00, 4.64MB/s]
Downloading (…)f-000004.safetensors: 100%|██████████| 5.64G/5.64G [02:49<00:00, 33.3MB/s]
Downloading (…)f-000004.safetensors: 100%|██████████| 8.59G/8.59G [03:55<00:00, 36.5MB/s]
Downloading (…)f-000004.safetensors: 100%|██████████| 8.59G/8.59G [03:57<00:00, 36.2MB/s]
Downloading (…)f-000004.safetensors: 100%|██████████| 8.59G/8.59G [03:57<00:00, 36.2MB/s]
Fetching 15 files: 100%|██████████| 15/15 [03:57<00:00, 15.82s/it][03:56<00:00, 63.5MB/s]
System info
Copy-and-paste the text below in your GitHub issue.
- huggingface_hub version: 0.34.4
- Platform: Linux-5.14.0-427.70.1.el9_4.x86_64-x86_64-with-glibc2.36
- Python version: 3.10.18
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Running in Google Colab Enterprise ?: No
- Token path ?: /mnt/models/token
- Has saved token ?: False
- FastAI: N/A
- Tensorflow: N/A
- Torch: N/A
- Jinja2: N/A
- Graphviz: N/A
- keras: N/A
- Pydot: N/A
- Pillow: N/A
- hf_transfer: N/A
- gradio: N/A
- tensorboard: N/A
- numpy: N/A
- pydantic: N/A
- aiohttp: N/A
- hf_xet: 1.1.7
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: /mnt/models/hub
- HF_ASSETS_CACHE: /mnt/models/assets
- HF_TOKEN_PATH: /mnt/models/token
- HF_STORED_TOKENS_PATH: /mnt/models/stored_tokens
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_DISABLE_XET: False
- HF_HUB_ENABLE_HF_TRANSFER: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10
---
For the second (old) version:
- huggingface_hub version: 0.13.4
- Platform: Linux-5.14.0-427.70.1.el9_4.x86_64-x86_64-with-glibc2.36
- Python version: 3.10.18
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Token path ?: /mnt/models/token
- Has saved token ?: False
- FastAI: N/A
- Tensorflow: N/A
- Torch: N/A
- Jinja2: N/A
- Graphviz: N/A
- Pydot: N/A
- Pillow: N/A
- hf_transfer: N/A
- ENDPOINT: https://huggingface.co
- HUGGINGFACE_HUB_CACHE: /mnt/models/hub
- HUGGINGFACE_ASSETS_CACHE: /mnt/models/assets
- HF_TOKEN_PATH: /mnt/models/token
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False
FYI, I also tried changing all the cache directories, but the result was the same:
  - name: model-downloader
    image: quay.io/jooholee/python:3.10-slim
    env:
    - name: HF_ACCELERATE_CACHE
      value: /models
    - name: HF_TOKENIZERS_CACHE
      value: /models
    - name: HF_MODULES_CACHE
      value: /models
    - name: HF_METRICS_CACHE
      value: /models
    - name: HF_DATASETS_CACHE
      value: /models
    - name: TRANSFORMERS_CACHE
      value: /models
    - name: HF_HUB_CACHE
      value: /models
    - name: HF_HOME
      value: /models
    - name: HOME
      value: /tmp
    command:
    - sh
    - -c
    - |
      #pip install --no-cache-dir huggingface_hub==0.13.4 && \
      pip install --no-cache-dir huggingface_hub==0.34.4 && \
      HF_DEBUG=1 \
      export HF_HOME=/mnt/models && \
      python -c "from huggingface_hub import dump_environment_info;dump_environment_info()"
      python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='deepseek-ai/DeepSeek-V2-Lite-Chat', local_dir='/mnt/models', local_dir_use_symlinks=False, max_workers=8, resume_download=True)"
    volumeMounts:
    - name: home
      mountPath: /tmp
    - name: models
      mountPath: /models
    - name: model-cache
      mountPath: /mnt/models
    resources:
      requests:
        memory: 512Mi
      limits:
        memory: 1Gi
  volumes:
  - name: home
    emptyDir: {}
  - name: model-cache
    emptyDir: {}
  - name: models
    emptyDir: {}