PCL-陶恒韬 2767ccf78d Update 'datasets/pre_process_bc.py' 10 months ago
bpe_4w_pcl update 1 year ago
spm_13w update 1 year ago
spm_25w update 1 year ago
README.md update 1 year ago
dataset_download.py update 1 year ago
dataset_sample.py update 1 year ago
mindrecord_shuffle.py update 1 year ago
pre_process_bc.py Update 'datasets/pre_process_bc.py' 10 months ago

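mPanGu-α-53 is the first Chinese-centric multilingual and machine translation model. It was pre-trained and then incrementally trained on mixed monolingual and bilingual data covering 53 languages from 66 countries along the Belt and Road. A single model supports translation between any two of these 53 languages. Compared with the No. 1 system in the WMT2021 multilingual task track, it improves the average BLEU score by 0.354 across 100 translation directions between Chinese and foreign languages. It supports MindSpore-based distributed training (at least 8 cards), inference (full precision/FP16, 1 card), and transfer learning for multilingual tasks on NPU/GPU.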

Text Python

Apache-2.0

Contributors (3)