-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Description
$ python scripts/inference.py --source_image examples/reference_images/1.jpg --driving_audio examples/driving_audios/1.wav
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}, 'CUDAExecutionProvider': {'prefer_nhwc': '0', 'enable_skip_layer_norm_strict_mode': '0', 'tunable_op_enable': '0', 'enable_cuda_graph': '0', 'tunable_op_max_tuning_duration_ms': '0', 'tunable_op_tuning_enable': '0', 'cudnn_conv_use_max_workspace': '1', 'use_tf32': '1', 'cudnn_conv1d_pad_to_nc1d': '0', 'do_copy_in_default_stream': '1', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'gpu_external_empty_cache': '0', 'gpu_external_free':
'0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'arena_extend_strategy': 'kNextPowerOfTwo', 'user_compute_stream': '0', 'has_user_compute_stream': '0', 'use_ep_level_unified_stream': '0', 'device_id': '0'}}
find model: ./pretrained_models/face_analysis/models/1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}, 'CUDAExecutionProvider': {'prefer_nhwc': '0', 'enable_skip_layer_norm_strict_mode': '0', 'tunable_op_enable': '0', 'enable_cuda_graph': '0', 'tunable_op_max_tuning_duration_ms': '0', 'tunable_op_tuning_enable': '0', 'cudnn_conv_use_max_workspace': '1', 'use_tf32': '1', 'cudnn_conv1d_pad_to_nc1d': '0', 'do_copy_in_default_stream': '1', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'gpu_external_empty_cache': '0', 'gpu_external_free':
'0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'arena_extend_strategy': 'kNextPowerOfTwo', 'user_compute_stream': '0', 'has_user_compute_stream': '0', 'use_ep_level_unified_stream': '0', 'device_id': '0'}}
find model: ./pretrained_models/face_analysis/models/2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}, 'CUDAExecutionProvider': {'prefer_nhwc': '0', 'enable_skip_layer_norm_strict_mode': '0', 'tunable_op_enable': '0', 'enable_cuda_graph': '0', 'tunable_op_max_tuning_duration_ms': '0', 'tunable_op_tuning_enable': '0', 'cudnn_conv_use_max_workspace': '1', 'use_tf32': '1', 'cudnn_conv1d_pad_to_nc1d': '0', 'do_copy_in_default_stream': '1', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'gpu_external_empty_cache': '0', 'gpu_external_free':
'0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'arena_extend_strategy': 'kNextPowerOfTwo', 'user_compute_stream': '0', 'has_user_compute_stream': '0', 'use_ep_level_unified_stream': '0', 'device_id': '0'}}
find model: ./pretrained_models/face_analysis/models/genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}, 'CUDAExecutionProvider': {'prefer_nhwc': '0', 'enable_skip_layer_norm_strict_mode': '0', 'tunable_op_enable': '0', 'enable_cuda_graph': '0', 'tunable_op_max_tuning_duration_ms': '0', 'tunable_op_tuning_enable': '0', 'cudnn_conv_use_max_workspace': '1', 'use_tf32': '1', 'cudnn_conv1d_pad_to_nc1d': '0', 'do_copy_in_default_stream': '1', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'gpu_external_empty_cache': '0', 'gpu_external_free':
'0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'arena_extend_strategy': 'kNextPowerOfTwo', 'user_compute_stream': '0', 'has_user_compute_stream': '0', 'use_ep_level_unified_stream': '0', 'device_id': '0'}}
find model: ./pretrained_models/face_analysis/models/glintr100.onnx recognition ['None', 3, 112, 112] 127.5 127.5
Applied providers: ['CUDAExecutionProvider', 'CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}, 'CUDAExecutionProvider': {'prefer_nhwc': '0', 'enable_skip_layer_norm_strict_mode': '0', 'tunable_op_enable': '0', 'enable_cuda_graph': '0', 'tunable_op_max_tuning_duration_ms': '0', 'tunable_op_tuning_enable': '0', 'cudnn_conv_use_max_workspace': '1', 'use_tf32': '1', 'cudnn_conv1d_pad_to_nc1d': '0', 'do_copy_in_default_stream': '1', 'cudnn_conv_algo_search': 'EXHAUSTIVE', 'gpu_external_empty_cache': '0', 'gpu_external_free':
'0', 'gpu_external_alloc': '0', 'gpu_mem_limit': '18446744073709551615', 'arena_extend_strategy': 'kNextPowerOfTwo', 'user_compute_stream': '0', 'has_user_compute_stream': '0', 'use_ep_level_unified_stream': '0', 'device_id': '0'}}
find model: ./pretrained_models/face_analysis/models/scrfd_10g_bnkps.onnx detection [1, 3, '?', '?'] 127.5 128.0
set det-size: (640, 640)
libEGL warning: MESA-LOADER: failed to open swrast: /usr/lib/dri/swrast_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/x86_64-linux-gnu/dri:$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1735477071.154004 3553 task_runner.cc:85] GPU suport is not available: INTERNAL: ; RET_CHECK failure (mediapipe/gpu/gl_context_egl.cc:84) egl_initializedUnable to initialize EGL
W0000 00:00:1735477071.178249 3553 face_landmarker_graph.cc:174] Sets FaceBlendshapesGraph acceleration to xnnpack by default.
libEGL warning: MESA-LOADER: failed to open swrast: /usr/lib/dri/swrast_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/x86_64-linux-gnu/dri:$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
libEGL warning: MESA-LOADER: failed to open swrast: /usr/lib/dri/swrast_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/x86_64-linux-gnu/dri:$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
libEGL warning: MESA-LOADER: failed to open swrast: /usr/lib/dri/swrast_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/x86_64-linux-gnu/dri:$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
libEGL warning: MESA-LOADER: failed to open swrast: /usr/lib/dri/swrast_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/x86_64-linux-gnu/dri:$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
libEGL warning: MESA-LOADER: failed to open swrast: /usr/lib/dri/swrast_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/x86_64-linux-gnu/dri:$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
W0000 00:00:1735477071.230974 3690 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
W0000 00:00:1735477071.246707 3700 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
Processed and saved: ./.cache/1_sep_background.png
Processed and saved: ./.cache/1_sep_face.png
Some weights of Wav2VecModel were not initialized from the model checkpoint at ./pretrained_models/wav2vec/wav2vec2-base-960h and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1', 'wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
2024-12-29 20:57:51,829 - INFO - separator - Separator version 0.17.2 instantiating with output_dir: ./.cache/audio_preprocess, output_format: WAV
2024-12-29 20:57:51,829 - INFO - separator - Operating System: Linux #1 SMP Tue Nov 5 00:21:55 UTC 2024
2024-12-29 20:57:51,833 - INFO - separator - System: Linux Node: dpc Release: 5.15.167.4-microsoft-standard-WSL2 Machine: x86_64 Proc: x86_64
2024-12-29 20:57:51,833 - INFO - separator - Python Version: 3.10.16
2024-12-29 20:57:51,834 - INFO - separator - PyTorch Version: 2.2.2+cu121
2024-12-29 20:57:52,253 - INFO - separator - FFmpeg installed: ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg
developers
2024-12-29 20:57:52,254 - INFO - separator - ONNX Runtime GPU package installed with version: 1.18.0
2024-12-29 20:57:52,254 - INFO - separator - CUDA is available in Torch, setting Torch device to CUDA
2024-12-29 20:57:52,254 - INFO - separator - ONNXruntime has CUDAExecutionProvider available, enabling acceleration
2024-12-29 20:57:52,254 - INFO - separator - Loading model Kim_Vocal_2.onnx...
2024-12-29 20:57:50,521 - INFO - separator - Load model duration: 00:00:00
2024-12-29 20:57:50,522 - INFO - separator - Starting separation process for audio_file_path: examples/driving_audios/1.wav
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:08<00:00, 2.89s/it]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 10.25it/s]
2024-12-29 20:58:01,237 - INFO - mdx_separator - Saving Vocals stem to 1_(Vocals)_Kim_Vocal_2.wav...
2024-12-29 20:58:01,403 - INFO - common_separator - Clearing input audio file paths, sources and stems...
2024-12-29 20:58:01,403 - INFO - separator - Separation duration: 00:00:10
The config attributes {'center_input_sample': False, 'out_channels': 4} were passed to UNet2DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.
Some weights of the model checkpoint were not used when initializing UNet2DConditionModel:
['conv_norm_out.bias, conv_norm_out.weight, conv_out.bias, conv_out.weight']
The config attributes {'center_input_sample': False} were passed to UNet3DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.
Load motion module params from pretrained_models/motion_module/mm_sd_v15_v2.ckpt
Killed