This repository is the implementation of 'NeuCrowd: Neural Sampling Network for Representation Learning with Crowdsourced Labels'.
To develop locally, follow the instructions below:
git clone git@github.com:tal-ai/NeuCrowd_KAIS2021.git
cd NeuCrowd_KAIS2021
Please convert your data into three CSV files (train.csv, valid.csv, test.csv) that share the same format. Each file must contain a label column that holds the binary category (0/1); the remaining columns are the feature columns.
The two example data sets used in the paper are stored in the data folder.
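The CSV layout described above can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: the helper name write_splits, the header name "label", placing the label in the first column, and the 80/10/10 split are all assumptions made for the example. Check the files in the data folder for the exact layout the training script expects.

```python
import csv
import random
from pathlib import Path


def write_splits(rows, feature_names, out_dir, valid_frac=0.1, test_frac=0.1, seed=0):
    """Write train.csv, valid.csv and test.csv from (features, label) pairs.

    Hypothetical helper: the "label" header, its position as the first
    column, and the split fractions are assumptions, not part of NeuCrowd.
    """
    rows = list(rows)
    for _, label in rows:
        if label not in (0, 1):
            raise ValueError(f"label must be 0 or 1, got {label!r}")

    # Shuffle reproducibly before splitting.
    random.Random(seed).shuffle(rows)
    n_valid = int(len(rows) * valid_frac)
    n_test = int(len(rows) * test_frac)
    splits = {
        "valid.csv": rows[:n_valid],
        "test.csv": rows[n_valid:n_valid + n_test],
        "train.csv": rows[n_valid + n_test:],
    }

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, split in splits.items():
        with open(out_dir / name, "w", newline="") as f:
            writer = csv.writer(f)
            # One header row: the label column, then the feature columns.
            writer.writerow(["label"] + list(feature_names))
            for features, label in split:
                writer.writerow([label] + list(features))

    # Return the number of data rows written to each file.
    return {name: len(split) for name, split in splits.items()}
```

For example, calling write_splits(rows, ["f1", "f2"], "my_data") with 20 rows would write 16 rows to train.csv and 2 rows each to valid.csv and test.csv in the my_data directory.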
You can train the model as follows:
cd src/
python trainRLL.py your_data_set_path
You can test the model as follows:
cd src/
python inference_neucrowd.py your_data_set_path
If you have any problems with the project, please feel free to report them as issues.
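The paper proposes NeuCrowd, a unified framework for supervised representation learning (SRL) from crowdsourced labels. It mitigates the problem of limited crowdsourced labels caused by data privacy, budget constraints, and a shortage of domain-specific annotators. The framework (1) creates a large number of high-quality n-tuple training samples through safety-aware sampling and robust anchor generation, and (2) learns a sampling neural network that adaptively selects effective samples for the SRL network. Accuracy reaches 87.1% on the hotel review data set and 86.7% on the Pre-K Children Speech data set.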