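This project is based on PaddlePaddle's DeepSpeech project and modifies it substantially to make it easier to train on custom Chinese datasets, and easier to test and use. DeepSpeech2 is an end-to-end automatic speech recognition (ASR) engine implemented on PaddlePaddle; it is described in Baidu's Deep Speech 2 paper. The project also supports a range of data augmentation methods to suit different use cases. Training and inference are supported on Windows and Linux, and inference is supported on development boards such as Nvidia Jetson. This branch is the new version; if you need the old version, see the release/1.0 branch.
Environment used by this project: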
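Dataset | Conv layers | RNN layers | RNN size | Test-set character error rate | Download |
---|---|---|---|---|---|
aishell (179 hours) | 2 | 3 | 1024 | 0.084532 | Click to download |
free_st_chinese_mandarin_corpus (109 hours) | 2 | 3 | 1024 | 0.170260 | Click to download |
thchs_30 (34 hours) | 2 | 3 | 1024 | 0.026838 | Click to download |
Very large dataset (1,600+ hours of real data) + (1,300+ hours of synthetic data) | 2 | 3 | 1024 | Training in progress | Training in progress |
Note: the downloads above are training parameters. To use them for inference, you also need to export the model. The decoding method used is beam search.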
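For readers unfamiliar with the metric in the table, the following is a minimal sketch of how a character error rate is commonly computed: the edit distance between the reference and recognized text, divided by the reference length. The function names `edit_distance` and `char_error_rate` are hypothetical and are not taken from this project's code, and the project's own evaluation script may differ in details such as text normalization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: insertions, deletions, substitutions each cost 1."""
    prev = list(range(len(hyp) + 1))  # distances from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference character
                            curr[j - 1] + 1,      # insert a hypothesis character
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_error_rate(reference, hypothesis):
    """Edit distance normalized by the number of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # One substituted character out of three reference characters.
    print(char_error_rate("压岁钱", "压岁书"))
```

Under this definition, a value such as 0.084532 means roughly 8.5 character edits per 100 reference characters.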
If you have questions, feel free to open an issue.
python infer_path.py --wav_path=./dataset/test.wav
Output:
----------- Configuration Arguments -----------
alpha: 1.2
beam_size: 10
beta: 0.35
cutoff_prob: 1.0
cutoff_top_n: 40
decoding_method: ctc_greedy
enable_mkldnn: False
is_long_audio: False
lang_model_path: ./lm/zh_giga.no_cna_cmn.prune01244.klm
mean_std_path: ./dataset/mean_std.npz
model_dir: ./models/infer/
to_an: True
use_gpu: True
use_tensorrt: False
vocab_path: ./dataset/zh_vocab.txt
wav_path: ./dataset/test.wav
------------------------------------------------
Elapsed time: 132, recognition result: 近几年不但我用书给女儿儿压岁也劝说亲朋不要给女儿压岁钱而改送压岁书, score: 94
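The configuration printed above uses `decoding_method: ctc_greedy`. As a rough illustration of what greedy CTC decoding generally does, the sketch below picks the most probable token in each frame, collapses consecutive repeats, and drops the blank token. The function name `ctc_greedy_decode`, the toy vocabulary, and the position of the blank index are all assumptions made for the example; this is not the project's decoder.

```python
def ctc_greedy_decode(frame_probs, vocab, blank_index):
    """Greedy CTC decoding over per-frame probability lists.

    frame_probs: list of frames, each a list of probabilities over
                 len(vocab) tokens plus one blank token.
    vocab:       list mapping non-blank indices to characters.
    blank_index: index of the CTC blank (assumed here; depends on the model).
    """
    decoded = []
    previous = None
    for probs in frame_probs:
        # Most probable token index for this frame.
        best = max(range(len(probs)), key=probs.__getitem__)
        # Emit only when the token changes and is not the blank.
        if best != previous and best != blank_index:
            decoded.append(vocab[best])
        previous = best
    return "".join(decoded)


if __name__ == "__main__":
    vocab = ["a", "b"]  # toy vocabulary; blank is index 2 in this example
    frames = [
        [0.6, 0.3, 0.1],  # a
        [0.7, 0.2, 0.1],  # a again, collapsed with the previous frame
        [0.1, 0.1, 0.8],  # blank, separates repeated characters
        [0.8, 0.1, 0.1],  # a
        [0.1, 0.8, 0.1],  # b
    ]
    print(ctc_greedy_decode(frames, vocab, blank_index=2))
```

Greedy decoding needs no language model, which is why it is typically faster than beam search, at some cost in accuracy.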
python infer_path.py --wav_path=./dataset/test_vad.wav --is_long_audio=True