git clone https://github.com/Ydkwim/CTAL.git
cd CTAL
pip install -r requirements.txt
Semantic features: please refer to the Jupyter notebook notebook/preprocess_text.ipynb
Acoustic features: please refer to the Jupyter notebook notebook/preprocess_audio.ipynb
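The exact feature extraction lives in the two notebooks above. As a rough illustration of one common acoustic preprocessing step (not necessarily the one used in this repo), consecutive filterbank frames are often stacked and downsampled before being fed to the transformer; the function below is a hypothetical sketch of that idea:

```python
def stack_frames(frames, stack=3, stride=3):
    """Downsample a frame sequence by concatenating `stack` consecutive
    frames and advancing `stride` frames at a time.

    `frames` is a list of per-frame feature vectors (lists of floats);
    the result has len(frames)//stride entries, each `stack` times wider.
    """
    out = []
    for i in range(0, len(frames) - stack + 1, stride):
        merged = []
        for frame in frames[i:i + stack]:
            merged.extend(frame)
        out.append(merged)
    return out
```

For example, stacking a six-frame sequence with `stack=3, stride=3` yields two frames, each three times the original feature dimension.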
After you have prepared both the acoustic and semantic features, you can start pre-training the model by executing the following shell command:
python run_m2pretrain.py --run transformer \
--config path/to/your/config.yaml --name model_name
The pre-trained model will be saved to result/transformer/model_name. For convenience, we also make our pre-trained upstream models available for download:
CTAL-Base: https://drive.google.com/file/d/1erCQplU9it9XBNrWDyLutekKthsZIi0q/view?usp=sharing
CTAL-Large: https://drive.google.com/file/d/1L_QIZVRybJiiG2NywcX5xQQw8Y-3Vq5I/view?usp=sharing
It is very convenient to use our pre-trained upstream model for different audio-and-language downstream tasks, including sentiment analysis, emotion recognition, and speaker verification. We provide a sample fine-tuning script, m2p_finetune.py. To start fine-tuning, run one of the following commands:
python m2p_finetune.py --config your/config/path \
--task_name sentiment --epochs 10 --save_path your/save/path
python m2p_finetune.py --config your/config/path \
--task_name emotion --epochs 10 --save_path your/save/path
python m2p_finetune.py --config your/config/path \
--task_name verification --epochs 10 --save_path your/save/path
If you run into any problems with the project, please feel free to report them as issues.
This work proposes CTAL, a new cross-modal pre-trained model for audio and text. It learns intra-modality and inter-modality connections between audio and text through two proxy tasks over a large number of audio-text pairs: masked language modeling and masked cross-modal acoustic modeling. After fine-tuning on multiple downstream audio-and-text tasks, CTAL shows clear improvements across tasks, including emotion classification, sentiment analysis, and speaker verification. In particular, it reaches 73.95% WA on IEMOCAP (emotion classification) and 81.01% F1 on MOSEI (sentiment analysis).
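Both proxy tasks follow the familiar masked-prediction pattern: a fraction of positions in the input sequence (text tokens or acoustic frames) is selected and replaced, and the model is trained to reconstruct them. The sketch below is a minimal, hypothetical illustration of that selection-and-replacement step; the function names, mask ratio, and mask id are assumptions for illustration, not the repo's actual implementation:

```python
import random


def mask_positions(length, ratio=0.15, rng=None):
    """Pick which positions of a length-`length` sequence to mask.

    Returns a sorted list of at least one index, covering roughly
    `ratio` of the sequence, drawn without replacement.
    """
    rng = rng or random.Random(0)
    n = max(1, int(length * ratio))
    return sorted(rng.sample(range(length), n))


def apply_mask(sequence, positions, mask_value):
    """Replace the chosen positions with `mask_value` (e.g. a [MASK]
    token id for text, or a zero frame for acoustic features)."""
    chosen = set(positions)
    return [mask_value if i in chosen else x for i, x in enumerate(sequence)]
```

For masked language modeling the sequence would hold token ids and `mask_value` a [MASK] id; for masked cross-modal acoustic modeling it would hold acoustic frames, with the model reconstructing the masked frames conditioned on both modalities.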