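The output storage location has been changed to better suit the OpenI (Qizhi) platform; results can be downloaded directly from Training Task > Result Download.
According to incomplete statistics, training with multiple speakers seems to worsen timbre leakage. Training a model with more than 10 speakers is not recommended. The current advice is that if you want the result to sound closer to the target timbre, train a single-speaker model whenever possible.
To work around the high VRAM usage of the sovits3.0 48khz model during inference, you can switch to the 32khz branch and train a 32khz model.
A significant problem has been found: inference with 3.0 uses a very large amount of VRAM. With 6 GB of VRAM, only about 30 s of audio can be inferred at a time.
The broken-sound (audio dropout) problem has been solved, and audio quality has improved considerably.
Version 2.0 has been moved to the sovits_2.0 branch.
Version 3.0 uses the code structure of FreeVC and is not compatible with older versions.
Compared with DiffSVC: when the training data is of very high quality, DiffSVC performs better; for lower-quality datasets, this repository may perform better. In addition, inference in this repository is much faster than in DiffSVC.
This is a singing voice conversion model. The SoftVC content encoder extracts speech features from the source audio, and these are fed into VITS together with F0 in place of the original text input to achieve singing voice conversion. The vocoder has also been replaced with NSF HiFiGAN to solve the broken-sound problem.
The current branch is the 48khz version; before using it you need to run git checkout main. Inference uses a lot of VRAM and frequently runs out of memory. If that happens, the audio has to be sliced manually and converted segment by segment. Switching to the 32khz branch and training a 32khz model is recommended.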
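The manual slicing mentioned above can be planned with a few lines of Python. This is only a sketch, not part of the repository: `segment_bounds` is a hypothetical helper, and the 30-second default is taken from the 6 GB VRAM observation above and will likely need tuning for other GPUs. It cuts at fixed positions, so segments may start or end mid-note; cutting at silent passages would probably give cleaner joins.

```python
def segment_bounds(total_samples, sample_rate, max_seconds=30.0):
    """Return (start, end) sample-index pairs that cover the audio in order.

    Hypothetical helper: each pair describes one slice of at most
    max_seconds, which can be converted on its own and concatenated later.
    """
    if total_samples < 0:
        raise ValueError("total_samples must be non-negative")
    step = int(sample_rate * max_seconds)
    if step <= 0:
        raise ValueError("sample_rate * max_seconds must be at least 1 sample")
    return [
        (start, min(start + step, total_samples))
        for start in range(0, total_samples, step)
    ]


if __name__ == "__main__":
    # A 70-second clip at 48 kHz is split into 30 s + 30 s + 10 s.
    for start, end in segment_bounds(70 * 48000, 48000):
        print(start, end)
```

Each pair can be used to slice a loaded waveform (`audio[start:end]`) before passing the segment to inference.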
# One-step download
# hubert
wget -P hubert/ https://github.com/bshall/hubert/releases/download/v0.1/hubert-soft-0d54a1f4.pt
# Pretrained G and D models
wget -P logs/48k/ https://huggingface.co/innnky/sovits_pretrained/resolve/main/G_0.pth
wget -P logs/48k/ https://huggingface.co/innnky/sovits_pretrained/resolve/main/D_0.pth
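After running the downloads, it can help to confirm that the three files ended up where the commands above put them. The following is a minimal sketch under that assumption; `missing_pretrained` is a hypothetical helper, and the paths are derived from the `wget -P` target directories and the URL file names, not from any check the repository itself performs.

```python
from pathlib import Path

# Paths implied by the wget commands above (target directory + file name).
EXPECTED_FILES = (
    "hubert/hubert-soft-0d54a1f4.pt",
    "logs/48k/G_0.pth",
    "logs/48k/D_0.pth",
)


def missing_pretrained(repo_root="."):
    """Return the expected files that are not present under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_pretrained()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All pretrained files are in place.")
```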
Simply place the dataset in the dataset_raw directory using the following file structure:
dataset_raw
├───speaker0
│   ├───xxx1-xxx1.wav
│   ├───...
│   └───Lxx-0xx8.wav
└───speaker1
    ├───xx2-0xxx2.wav
    ├───...
    └───xxx7-xxx007.wav
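Before preprocessing, the layout above can be sanity-checked with a short script. This is a sketch only: `scan_dataset` is a hypothetical helper, not something the repository provides, and it assumes exactly the structure shown (one subdirectory per speaker, `.wav` files directly inside it).

```python
from pathlib import Path


def scan_dataset(raw_dir="dataset_raw"):
    """Map each speaker directory name to the number of .wav files in it."""
    root = Path(raw_dir)
    counts = {}
    for speaker_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        wavs = [
            f for f in speaker_dir.iterdir()
            if f.is_file() and f.suffix.lower() == ".wav"
        ]
        counts[speaker_dir.name] = len(wavs)
    return counts


if __name__ == "__main__":
    for speaker, n_wavs in scan_dataset().items():
        note = "" if n_wavs else "  <- no .wav files found"
        print(f"{speaker}: {n_wavs} file(s){note}")
```

A speaker directory that reports zero files usually means the audio is nested one level too deep or is not in WAV format.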
python resample.py
python preprocess_flist_config.py
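# Note
# In the automatically generated config file, the number of speakers n_speakers is set according to the number of speakers in the dataset
# To leave room for adding speakers later, n_speakers is automatically set to twice the number of speakers currently in the dataset
# If you want to reserve more slots, you can edit n_speakers in the generated config.json yourself after this step
# Once the model has started training, this value can no longer be changed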
python preprocess_hubert_f0.py
After the steps above have been run, the dataset directory contains the preprocessed data, and the dataset_raw folder can be deleted.
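Because n_speakers can no longer be changed once training starts, this is the last point at which to reserve extra speaker slots. The sketch below illustrates the "twice the current speaker count" default and one way to raise the value. The helper names are hypothetical, and since the exact position of `n_speakers` inside the generated config is not shown here, the code searches for the key instead of assuming where it sits. The path `configs/config.json` is taken from the training command.

```python
import json
from pathlib import Path


def default_n_speakers(speaker_count):
    """The default described in the notes: twice the current speaker count."""
    return speaker_count * 2


def set_n_speakers(node, value):
    """Set every "n_speakers" key found in a nested config; return how many."""
    updated = 0
    if isinstance(node, dict):
        for key in node:
            if key == "n_speakers":
                node[key] = value
                updated += 1
            else:
                updated += set_n_speakers(node[key], value)
    elif isinstance(node, list):
        for item in node:
            updated += set_n_speakers(item, value)
    return updated


def update_config(path, value):
    """Rewrite the config file with a new n_speakers value."""
    path = Path(path)
    config = json.loads(path.read_text(encoding="utf-8"))
    if set_n_speakers(config, value) == 0:
        raise KeyError("n_speakers not found in config")
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8"
    )


if __name__ == "__main__":
    # Example: 3 speakers today, but reserve 10 slots instead of the default 6.
    print("default:", default_n_speakers(3))
    update_config("configs/config.json", 10)
```

Editing config.json by hand achieves the same result; the script is only convenient when the change needs to be repeated.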
python train.py -c configs/config.json -m 48k