This is an open-source project from the Joint Lab of BAAI and JDAI.
We release four models for building dialogue systems: Intent Classification (NLU), Slot Tagging (NER), Dialogue State Tracking (DST) and Question Answering (QA). The NLU and QA models are BERT-based. The NER model is a Bi-LSTM + CRF enhanced with linguistic features. The DST model is the TRADE model. The NLU, NER and QA models are trained on our real Customer Service Dialogue Data (CSDD), and the DST model is trained on the public CrossWOZ dataset.
Download links for these models are listed below.
Model | Data Source | Link |
---|---|---|
BAAI-JD-Dialogue-Intent, Chinese | CSDD | BAAI-JD-Dialogue-Intent for Tensorflow |
BAAI-JD-Dialogue-Ner, Chinese | CSDD | BAAI-JD-Dialogue-Ner for Tensorflow |
BAAI-JD-Dialogue-Dst, Chinese | CrossWOZ | BAAI-JD-Dialogue-Dst for Pytorch |
BAAI-JD-Dialogue-Sim, Chinese | CSDD | BAAI-JD-Dialogue-Sim for Tensorflow |
1. BAAI-JD Dialogue Intent
This model is used for intent classification in dialogue systems.
We define 10 intents for this task, which are the most common intents in the e-commerce customer service domain. The model is BAAI-JDAI-BERT fine-tuned on an in-house annotated intent classification corpus, with max_seq_length set to 50.
Intent | Precision | Recall | F1 | Support |
---|---|---|---|---|
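配送周期 (delivery lead time) | 93.11 | 96.83 | 94.93 | 600 |
什么时间出库 (when the order ships) | 97.44 | 95.00 | 96.20 | 600 |
售后商品使用问题 (after-sales product usage issues) | 96.48 | 95.83 | 96.15 | 600 |
商品区别 (product differences) | 97.70 | 99.17 | 98.43 | 600 |
修改订单 (modify order) | 98.16 | 98.00 | 98.08 | 600 |
商品推荐 (product recommendation) | 96.69 | 97.33 | 97.01 | 600 |
赠品 (free gifts) | 96.70 | 97.67 | 97.18 | 600 |
能否便宜优惠 (discount request) | 95.84 | 96.00 | 95.92 | 600 |
家电安装 (home appliance installation) | 96.89 | 98.50 | 97.69 | 600 |
配送方式 (delivery method) | 98.33 | 98.33 | 98.33 | 600 |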
other | 86.22 | 81.33 | 83.70 | 600 |
Test Set | Macro Precision | Macro Recall | Macro F1 | Accuracy |
---|---|---|---|---|
Full test set | 95.78 | 95.82 | 95.78 | 95.82 |
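The summary row can be reproduced from the per-intent rows. The sketch below is only an illustration, not the project's evaluation script: it copies the precision, recall and F1 values from the table above and takes their unweighted mean over all 11 classes (the 10 intents plus other). The function name macro_average is a hypothetical helper. Because every class has the same support (600), accuracy equals macro recall.

```python
# Per-intent (precision, recall, F1) percentages copied from the table above.
per_intent = {
    "配送周期": (93.11, 96.83, 94.93),
    "什么时间出库": (97.44, 95.00, 96.20),
    "售后商品使用问题": (96.48, 95.83, 96.15),
    "商品区别": (97.70, 99.17, 98.43),
    "修改订单": (98.16, 98.00, 98.08),
    "商品推荐": (96.69, 97.33, 97.01),
    "赠品": (96.70, 97.67, 97.18),
    "能否便宜优惠": (95.84, 96.00, 95.92),
    "家电安装": (96.89, 98.50, 97.69),
    "配送方式": (98.33, 98.33, 98.33),
    "other": (86.22, 81.33, 83.70),
}


def macro_average(rows):
    """Unweighted mean of each metric column across all classes (hypothetical helper)."""
    n = len(rows)
    columns = zip(*rows.values())  # one tuple per metric: precision, recall, F1
    return tuple(round(sum(column) / n, 2) for column in columns)


macro_p, macro_r, macro_f1 = macro_average(per_intent)
print(macro_p, macro_r, macro_f1)  # 95.78 95.82 95.78
```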
2. BAAI-JD Dialogue Tagging
This model is used for named entity recognition in dialogue systems.
We define 6 types of entities for this task, which are also very common in the e-commerce customer service domain. The model is a Bi-LSTM with a CRF layer, enhanced with linguistic features.
Entity | Precision | Recall | F1 | Support |
---|---|---|---|---|
brand | 83.33 | 81.58 | 82.45 | 186 |
date | 89.91 | 92.82 | 91.34 | 446 |
location | 89.94 | 89.94 | 89.94 | 467 |
price | 85.67 | 89.67 | 87.62 | 314 |
product | 80.44 | 80.34 | 80.39 | 869 |
time | 88.15 | 87.67 | 87.91 | 363 |
Test Set | Macro Precision | Macro Recall | Macro F1 |
---|---|---|---|
Full test set | 85.60 | 86.28 | 85.94 |
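To make the output of a slot tagger concrete, here is a minimal sketch of turning per-token tags into entity spans. It assumes a BIO tagging scheme (B-date, I-date, O, and so on), which this README does not confirm. The function name bio_to_spans and the sample sentence are hypothetical; only the entity type names come from the table above.

```python
def bio_to_spans(tokens, tags):
    """Collect (entity_type, text) spans from parallel token and BIO tag lists.

    Assumes BIO tags such as "B-date", "I-date" and "O"; the released model's
    actual tag scheme may differ.
    """
    spans = []
    current_type, current_tokens = None, []

    def flush():
        # Close the span being built, if any, and reset the buffers.
        nonlocal current_type, current_tokens
        if current_type is not None:
            spans.append((current_type, "".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O" tags and I- tags that do not continue the open span end it.
            flush()
    flush()
    return spans


tokens = list("明天送到北京")
tags = ["B-date", "I-date", "O", "O", "B-location", "I-location"]
print(bio_to_spans(tokens, tags))  # [('date', '明天'), ('location', '北京')]
```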
3. BAAI-JD Dialogue DST
This model is used for multi-domain dialogue state tracking.
The task is to identify the slot-value pairs in each query of a multi-turn dialogue while maintaining the dialogue state across turns. The model is based on the TRADE model, enhanced by the BAAI-JDAI-WordEmbedding model.
Dataset | Joint Accuracy | Turn Accuracy | Joint F1 |
---|---|---|---|
CrossWOZ | 24.40 | 97.75 | 77.90 |
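The gap between joint accuracy and turn accuracy follows from how such metrics are usually defined. The sketch below uses definitions commonly applied to TRADE-style trackers, and it is an assumption that the numbers above were computed this way. Joint accuracy counts a turn only when the whole predicted state matches the gold state. The slot-level accuracy scores every slot of every turn separately, including slots correctly left empty. F1 is computed per turn over slot-value pairs and then averaged. The slot names, values and the function dst_metrics are hypothetical.

```python
def dst_metrics(turns, all_slots):
    """Return (joint accuracy, slot-level accuracy, turn-averaged F1).

    turns: list of (gold_state, predicted_state) dicts mapping slot -> value.
    all_slots: every slot the tracker can fill.
    """
    joint_hits, slot_acc_sum, f1_sum = 0, 0.0, 0.0
    for gold, pred in turns:
        # Joint accuracy: the complete state must match exactly.
        if gold == pred:
            joint_hits += 1
        # Slot-level accuracy: a slot is right when both states agree on it,
        # which includes both leaving it unfilled.
        agree = sum(1 for slot in all_slots if gold.get(slot) == pred.get(slot))
        slot_acc_sum += agree / len(all_slots)
        # F1 over slot-value pairs for this turn.
        tp = sum(1 for slot, value in pred.items() if gold.get(slot) == value)
        fp = len(pred) - tp
        fn = sum(1 for slot, value in gold.items() if pred.get(slot) != value)
        if not gold and not pred:
            f1 = 1.0  # nothing to predict and nothing predicted
        else:
            precision = tp / (tp + fp) if tp + fp else 0.0
            recall = tp / (tp + fn) if tp + fn else 0.0
            total = precision + recall
            f1 = 2 * precision * recall / total if total else 0.0
        f1_sum += f1
    n = len(turns)
    return joint_hits / n, slot_acc_sum / n, f1_sum / n


# Hypothetical slot inventory and two-turn dialogue.
slots = ["hotel-name", "hotel-price", "restaurant-name", "restaurant-rating"]
dialogue = [
    ({"hotel-name": "A"}, {"hotel-name": "A"}),
    ({"hotel-name": "A", "hotel-price": "cheap"}, {"hotel-name": "A"}),
]
joint, slot_acc, f1 = dst_metrics(dialogue, slots)
print(joint, slot_acc, round(f1, 4))  # 0.5 0.875 0.8333
```

A single missed slot makes the second turn a joint-accuracy miss while still scoring 3 of 4 slots correct, which is why a slot-level score can sit far above the joint score.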
4. BAAI-JD Dialogue Sim
This model is used for calculating the semantic similarity between a question and an answer, and it can be used for building retrieval-based dialogue systems.
The task has 2 labels, where "0" means "not match" and "1" means "match". The model is BAAI-JDAI-BERT fine-tuned on an in-house annotated QA matching corpus.
After loading the model, you can concatenate the question and the answer into a single input ("QA"), and the model will predict the similarity score. The max_seq_length is set to 128.
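As an illustration of the concatenation step, the sketch below builds a BERT-style sentence-pair input. It assumes the standard layout [CLS] question [SEP] answer [SEP] with segment ids 0 and 1, and trims the longer side until the pair fits max_seq_length. The function name build_pair_input is hypothetical, and splitting text into characters is a stand-in for the real tokenizer shipped in the model package.

```python
def build_pair_input(question, answer, max_seq_length=128):
    """Build tokens and segment ids for a question-answer pair (sketch only)."""
    # Character-level split is a stand-in for the model's actual tokenizer.
    q_tokens, a_tokens = list(question), list(answer)

    # Reserve three positions for [CLS] and the two [SEP] markers.
    budget = max_seq_length - 3
    while len(q_tokens) + len(a_tokens) > budget:
        # Trim one token at a time from whichever side is currently longer.
        longer = q_tokens if len(q_tokens) > len(a_tokens) else a_tokens
        longer.pop()

    tokens = ["[CLS]"] + q_tokens + ["[SEP]"] + a_tokens + ["[SEP]"]
    # Segment 0 covers [CLS], the question and the first [SEP];
    # segment 1 covers the answer and the final [SEP].
    segment_ids = [0] * (len(q_tokens) + 2) + [1] * (len(a_tokens) + 1)
    return tokens, segment_ids


tokens, segment_ids = build_pair_input("什么时候发货", "明天发货")
print(tokens)
print(segment_ids)
```

The token ids produced from this layout would then be fed to the fine-tuned classifier, whose score for label "1" serves as the similarity score.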
Test Set | Macro Precision | Macro Recall | Macro F1 | Accuracy |
---|---|---|---|---|
Full test set | 82.16 | 82.15 | 82.15 | 82.15 |
For more details, download the model package and refer to the code and README inside it.
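This project open-sources several pretrained models for natural language processing. It focuses on foundational models for dialogue systems, especially in the e-commerce domain. The project uses 42 GB of customer service dialogue data (about 1.2 billion sentences) for training, and releases the trained BERT model and word embedding model.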