Description
When setting up an instance on vast.ai, the default.sh provisioning script does not download CivitAI.com models. Troubleshooting revealed several problems, all stemming from the "provisioning_download" function at the bottom of the script. I was able to fix everything, so I'll take some time to lay out the problems and my solution.
It should be noted that I haven't used this provisioning script on any other platforms, just vast.ai - these problems may or may not be restricted to the vast.ai platform.
Source Reference
It's worth mentioning that the "provisioning_download" function within default.sh is an expanded version of the "build_extra_download" function from "/build/COPY_ROOT_99/opt/ai-dock/bin/build/layer99/init.sh". Both are pasted below for reference.
init.sh (build_extra_download)
# Download from $1 URL to $2 file path
function build_extra_download() {
wget -qnc --content-disposition --show-progress -e dotbytes="${3:-4M}" -P "$2" "$1"
}
default.sh (provisioning_download)
# Download from $1 URL to $2 file path
function provisioning_download() {
if [[ -n $HF_TOKEN && $1 =~ ^https://([a-zA-Z0-9_-]+\.)?huggingface\.co(/|$|\?) ]]; then
auth_token="$HF_TOKEN"
elif
[[ -n $CIVITAI_TOKEN && $1 =~ ^https://([a-zA-Z0-9_-]+\.)?civitai\.com(/|$|\?) ]]; then
auth_token="$CIVITAI_TOKEN"
fi
if [[ -n $auth_token ]];then
wget --header="Authorization: Bearer $auth_token" -qnc --content-disposition --show-progress -e dotbytes="${3:-4M}" -P "$2" "$1"
else
wget -qnc --content-disposition --show-progress -e dotbytes="${3:-4M}" -P "$2" "$1"
fi
}
Issues & Changes Explained
The wget commands are at odds with themselves as written. The "-q" parameter suppresses wget's normal console and log output, including error messages, so the only thing still displayed is the progress indicator forced by "--show-progress". I want wget to be verbose anyway, to see that my models are downloading and to identify any errors.
Removed all "-q" parameters from wget commands.
Occasional "invalid value" error related to the -e dotbytes parameter. As written, the dotbytes value should default to 4M unless a third argument is passed to the function to override it. No third argument is passed during this process anyway, so I simplified the value to 4M, which eliminated the error for me going forward.
Changed all -e dotbytes="${3:-4M}" parameters to -e dotbytes=4M.
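For reference, the ${3:-4M} form is standard shell default-value expansion: it yields the third positional parameter, or 4M when that parameter is unset or empty. A minimal sketch of that behaviour follows; the dotbytes_for helper and the example.com URL are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of default.sh): prints the value that
# "${3:-4M}" produces for a given argument list.
dotbytes_for() {
    printf '%s\n' "${3:-4M}"
}

dotbytes_for "https://example.com/model.safetensors" "/tmp/models"       # prints 4M (no third argument)
dotbytes_for "https://example.com/model.safetensors" "/tmp/models" ""    # prints 4M (empty third argument)
dotbytes_for "https://example.com/model.safetensors" "/tmp/models" "8M"  # prints 8M
```

Since the expansion itself always produces a usable value, this sketch does not explain the occasional "invalid value" error; it only shows what the original parameter was meant to do.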
Regular expressions are missing escape characters for forward slashes. They may work in some environments, but for maximum compatibility and the sake of "correct-ness", I added escape characters where needed.
Added \ in front of all / in regular expressions.
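A quick way to check how a given environment treats the escaped and unescaped forms is to run both against the same URL. This is a minimal sketch, assuming bash with its =~ operator (POSIX extended regular expressions); the URL is a made-up example, and the patterns are simplified versions of the ones discussed here. Keeping each pattern in a single-quoted variable avoids shell quoting surprises inside [[ ]].

```shell
#!/usr/bin/env bash
# Sketch: compare an unescaped and an escaped pattern against one URL.
# The URL is a placeholder, not a real model link.
url="https://civitai.com/api/download/models/12345"

re_plain='^https://civitai\.com/api/download/models/[0-9]+$'
re_escaped='^https:\/\/civitai\.com\/api\/download\/models\/[0-9]+$'

# Each line prints only when its pattern matches the URL.
[[ $url =~ $re_plain ]]   && echo "plain pattern matches"
[[ $url =~ $re_escaped ]] && echo "escaped pattern matches"
```

If both lines print, the two forms are interchangeable in that environment and the escapes are purely a matter of style.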
CivitAI: "filename too long" error when using header authorization. CivitAI seems to prefer authorization by appending the API token to the end of the download URL, rather than receiving it in the header. Using the header always resulted in a filename error for me, whereas using a URL parameter always succeeded. Since Huggingface.co only accepts header authorization, fixing this required additional URL matching, new variables and additions to the if statements. I needed to ensure that Huggingface.co links continue to use header auth, while CivitAI links append the token to the end of the URL.
Modified regular expressions to be more specific in matching URLs. This is primarily to distinguish between a CivitAI URL that ends after the model id and a URL that has parameters at the end, because the auth token must be appended differently in each case. (They are actually far more specific than they need to be at the moment, as I plan to add some error correction that would notify the user if they pasted an invalid URL into the provisioning script.)
Added additional elif statements and a new variable, url_type, to distinguish between a Huggingface.co URL (url_type=hf), a basic CivitAI URL (url_type=civit1), and a complex CivitAI URL (url_type=civit2).
Removed header parameter from wget for CivitAI statements - appended ?token=$auth_token to the $1 parameter for basic CivitAI URLs and &token=$auth_token for complex CivitAI URLs.
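The two CivitAI cases differ only in the separator placed before the token: "?" when the URL has no query string yet, "&" when it already has one. The sketch below illustrates that rule by checking for a "?" in the URL instead of matching two separate URL shapes. The append_token helper, the model id and the token value are all hypothetical; this is a possible simplification, not what the function in this post does.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the URL with a token parameter appended.
# Uses "?" when the URL has no query string and "&" when it already has one.
append_token() {
    local url=$1 token=$2
    if [[ $url == *\?* ]]; then
        printf '%s&token=%s\n' "$url" "$token"
    else
        printf '%s?token=%s\n' "$url" "$token"
    fi
}

append_token "https://civitai.com/api/download/models/12345" "abc123"
# prints https://civitai.com/api/download/models/12345?token=abc123
append_token "https://civitai.com/api/download/models/12345?type=Model&format=SafeTensor" "abc123"
# prints https://civitai.com/api/download/models/12345?type=Model&format=SafeTensor&token=abc123
```

Note that a token placed in the URL can show up in wget's verbose output and in logs, which may matter on shared instances.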
Modified Function
# Download from $1 URL to $2 file path
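function provisioning_download() {
# Keep these local so values from a previous call cannot leak into this one
local auth_token="" url_type=""
# Bash's =~ uses POSIX extended regex, which has no (?: ) non-capturing groups, so a plain group is used
if [[ -n $HF_TOKEN && $1 =~ ^https:\/\/huggingface\.co\/.*\/resolve\/.*\.(safetensors|bin|ckpt|onnx|pt|pkl|yaml|yml|zip)+$ ]]; then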
auth_token="$HF_TOKEN"
url_type=hf
elif
[[ -n $CIVITAI_TOKEN && $1 =~ ^https:\/\/civitai\.com\/api\/download\/models\/[0-9]{1,6}$ ]]; then
auth_token="$CIVITAI_TOKEN"
url_type=civit1
elif
[[ -n $CIVITAI_TOKEN && $1 =~ ^https:\/\/civitai\.com\/api\/download\/models\/[0-9]{1,6}\?(type=.*|&format=.*|&size=(full|pruned)|&fp=fp(16|32))+$ ]]; then
auth_token="$CIVITAI_TOKEN"
url_type=civit2
fi
# Header auth for Huggingface.co; token as a URL parameter for CivitAI.
# Each branch requires both a token and the matching url_type.
if [[ -n $auth_token && $url_type == hf ]]; then
wget --header="Authorization: Bearer $auth_token" -nc --content-disposition --show-progress -e dotbytes=4M -P "$2" "$1"
elif [[ -n $auth_token && $url_type == civit1 ]]; then
wget -nc --content-disposition --show-progress -e dotbytes=4M -P "$2" "$1?token=$auth_token"
elif [[ -n $auth_token && $url_type == civit2 ]]; then
wget -nc --content-disposition --show-progress -e dotbytes=4M -P "$2" "$1&token=$auth_token"
else
wget -nc --content-disposition --show-progress -e dotbytes=4M -P "$2" "$1"
fi
}
If desired, please reference the GPT-4o output linked below - it explains the line-by-line operation of the above function in great detail. It helped me verify that my code should work the way I intended before I even moved on to testing. GPT is very, very helpful and impressive when used this way, IMO.
/DaveStuff/stable-diffusion-webui/blob/main/config/provisioning/provisioning_download-README.md
Summary
So far, this version of the function has worked perfectly to download all my models and supporting files from both Huggingface.co and CivitAI. Feel free to use all or some of my work to enhance the project. This function should work in the majority of environments but, as stated, I have only tested it on vast.ai Jupyter instances.
Below is a link to my full, custom provisioning script, which includes this modified download function, as a reference.
/DaveStuff/stable-diffusion-webui/blob/main/config/provisioning/custom.sh
WARNING: Most of the models listed in my provisioning script are NSFW - please don't use my custom script to configure your instance. I'm linking it here as a reference to a working example, ONLY.
Cheers and happy diffusing...