Are you sure you want to delete this task? Once this task is deleted, it cannot be recovered.
shibing624 8c345e9661 | 6 months ago | |
---|---|---|
.. | ||
README.md | 6 months ago | |
demo.py | 6 months ago | |
predict.py | 6 months ago | |
train.py | 6 months ago |
torch>=1.4.0
transformers>=4.4.2
example: examples/seq2seq/demo.py
from pycorrector import ConvSeq2SeqCorrector
m = ConvSeq2SeqCorrector()
print(m.correct_batch(['今天新情很好', '你找到你最喜欢的工作,我也很高心。']))
output:
[{'source': '今天新情很好', 'target': '今天心情很好', 'errors': [('新', '心', 2)]},
{'source': '你找到你最喜欢的工作,我也很高心。', 'target': '你找到你最喜欢的工作,我也很高兴。', 'errors': [('心', '兴', 15)]}]
sighan 2015中文拼写纠错数据(2k条):examples/data/sighan_2015/train.tsv
data format:
你说的是对,跟那些失业的人比起来你也算是辛运的。 你说的是对,跟那些失业的人比起来你也算是幸运的。
nlpcc2018+hsk dataset, download from https://pan.baidu.com/s/1BkDru60nQXaDVLRSr7ktfA 密码:m6fg [130W sentence pair,215MB]
run train:
python train.py --do_train --do_predict
python predict.py
output:
[{'source': '今天新情很好', 'target': '今天心情很好', 'errors': [('新', '心', 2)]},
{'source': '你找到你最喜欢的工作,我也很高心。', 'target': '你找到你最喜欢的工作,我也很高兴。', 'errors': [('心', '兴', 15)]}]
基于SIGHAN2015数据集训练的convseq2seq模型,已经release到github:
Text Python other
Dear OpenI User
Thank you for your continuous support to the Openl Qizhi Community AI Collaboration Platform. In order to protect your usage rights and ensure network security, we updated the Openl Qizhi Community AI Collaboration Platform Usage Agreement in January 2024. The updated agreement specifies that users are prohibited from using intranet penetration tools. After you click "Agree and continue", you can continue to use our services. Thank you for your cooperation and understanding.
For more agreement content, please refer to the《Openl Qizhi Community AI Collaboration Platform Usage Agreement》