Result on v1 has a much higher pitch than the reference #444

@teidenzero

https://drive.google.com/drive/folders/1gn1UZsvuhvxI6CAIh-gL3Hz0qEOoTlx0?usp=sharing

Please see the attached example.
The pitch of output_terrified.wav is noticeably higher than that of the reference. Could this be related to some property of the .mp3 files?

The same thing happens when I use one of the provided demo speakers (demo_speaker0.mp3).

source_se = torch.load(f'{ckpt_base}/en_style_se.pth').to(device)
save_path = f'{output_dir}/output_angry.wav'

# Run the base speaker tts
text = "ahhh! don't shoot!"
src_path = f'{output_dir}/tmp.wav'
base_speaker_tts.tts(text, src_path, speaker='angry', language='English', speed=1.0)

# Run the tone color converter
encode_message = "@MyShell"
tone_color_converter.convert(
    audio_src_path=src_path, 
    src_se=source_se, 
    tgt_se=target_se, 
    output_path=save_path,
    message=encode_message)
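
To put a number on "noticeably higher", the pitch of a voiced frame from the reference and from the output could be compared. The following is only a rough sketch, not part of OpenVoice: `estimate_f0` and `semitone_shift` are hypothetical helper names, the estimator is a plain autocorrelation peak search, and it assumes you have already decoded a short mono voiced frame from each file into a list of floats with an audio reader of your choice.

```python
import math


def estimate_f0(samples, sample_rate, fmin=60.0, fmax=500.0):
    """Estimate the fundamental frequency (Hz) of a mono frame.

    Hypothetical helper: picks the lag with the largest autocorrelation
    inside the [fmin, fmax] search range. Returns 0.0 if no positive
    peak is found (for example on silence).
    """
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]  # remove DC offset

    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), n - 1)

    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    return sample_rate / best_lag if best_lag else 0.0


def semitone_shift(f0_ref, f0_out):
    """Pitch difference in semitones; positive means the output is higher."""
    return 12.0 * math.log2(f0_out / f0_ref)


def sine_frame(freq, sample_rate=16000, length=2048):
    """Synthetic test tone, used here only to sanity-check the estimator."""
    return [math.sin(2.0 * math.pi * freq * i / sample_rate) for i in range(length)]
```

With real audio, you would call `estimate_f0` on a voiced frame from the reference and on one from the converted output, then pass both values to `semitone_shift`. The lag resolution limits accuracy to a few Hz, so this only gives a coarse estimate of the shift.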

Labels: bug