docs/source/en/guides/upload.md
+1 -34 (1 addition & 34 deletions)
@@ -234,41 +234,8 @@ Future(...)
 ### Upload a folder by chunks
 
 [`upload_folder`] makes it easy to upload an entire folder to the Hub. However, for large folders (thousands of files or
-hundreds of GB), it can still be challenging. If you have a folder with a lot of files, you might want to upload
-it in several commits. If you experience an error or a connection issue during the upload, you would not have to resume
-the process from the beginning.
-
-To upload a folder in multiple commits, just pass `multi_commits=True` as argument. Under the hood, `huggingface_hub`
-will list the files to upload/delete and split them in several commits. The "strategy" (i.e. how to split the commits)
-is based on the number and size of the files to upload. A PR is open on the Hub to push all the commits. Once the PR is
-ready, the commits are squashed into a single commit. If the process is interrupted before completing, you can rerun
-your script to resume the upload. The created PR will be automatically detected and the upload will resume from where
-it stopped. It is recommended to pass `multi_commits_verbose=True` to get a better understanding of the upload and its
-progress.
-
-The example below will upload the checkpoints folder to a dataset in multiple commits. A PR will be created on the Hub
-and merged automatically once the upload is complete. If you prefer the PR to stay open and review it manually, you can
-pass `create_pr=True`.
+hundreds of GB), we recommend using [`upload_large_folder`], which splits the upload into multiple commits. See the [Upload a large folder](#upload-a-large-folder) section for more details.
 
-```py
->>> upload_folder(
-...     folder_path="local/checkpoints",
-...     repo_id="username/my-dataset",
-...     repo_type="dataset",
-...     multi_commits=True,
-...     multi_commits_verbose=True,
-... )
-```
-
-If you want a better control on the upload strategy (i.e. the commits that are created), you can have a look at the
-low-level [`plan_multi_commits`] and [`create_commits_on_pr`] methods.
-
-<Tip warning={true}>
-
-`multi_commits` is still an experimental feature. Its API and behavior is subject to change in the future without prior
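For readers of this diff, the call recommended by the added line could look roughly like the sketch below. It is not part of the change itself. It assumes an installed `huggingface_hub` version that provides `HfApi.upload_large_folder`; the wrapper name `upload_checkpoints` and its `api` parameter are hypothetical, introduced only so the call can be exercised without network access. The folder path and repo id are reused from the removed example.

```python
from pathlib import Path


def upload_checkpoints(folder_path, repo_id, repo_type="dataset", api=None):
    """Upload a large local folder through `upload_large_folder`.

    `upload_checkpoints` is a hypothetical helper, not a library function.
    `api` may be any object exposing `upload_large_folder`; when omitted,
    an `HfApi` instance is created.
    """
    folder = Path(folder_path)
    if not folder.is_dir():
        raise ValueError(f"{folder} is not a directory")
    if api is None:
        # Imported lazily so the helper can be tested without the library.
        from huggingface_hub import HfApi

        api = HfApi()
    # The repo type is passed explicitly rather than relying on a default.
    api.upload_large_folder(
        repo_id=repo_id,
        folder_path=str(folder),
        repo_type=repo_type,
    )
```

A call such as `upload_checkpoints("local/checkpoints", "username/my-dataset")` would then stand in for the removed `upload_folder(..., multi_commits=True)` example.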