Code for IJCAI 2022 paper: Enhancing Entity Representations with Prompt Learning for Biomedical Entity Linking.
We propose a two-stage entity linking algorithm that enhances entity representations through prompt learning. The first stage performs a coarse-grained retrieval in a representation space defined by a bi-encoder that independently embeds the surface forms of mentions and entities. Unlike previous one-model-fits-all systems, each candidate is then re-ranked with a finer-grained encoder based on prompt-tuning that concatenates the mention context and entity information. Extensive experiments show that our model achieves promising performance improvements compared with several state-of-the-art techniques on MedMentions, the largest public biomedical dataset, and on the NCBI disease corpus.
Case studies also show that the proposed prompt-tuning strategy is effective in addressing both the variety and ambiguity challenges in the linking task.
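To make the re-ranking stage more concrete, here is a minimal sketch of how a mention context and a candidate's entity information might be concatenated into one cloze-style input and then scored. The template wording, the `[SEP]` and `[MASK]` placement, the names `build_prompt`, `rerank` and `toy_score`, and the example entities are all assumptions for illustration; the templates actually used live in `prompt_ranking` and may differ. `toy_score` is only a lexical stand-in for the prompt-tuned model.

```python
import re
from typing import Callable, List, Tuple

SEP = "[SEP]"    # BERT-style separator between context and entity information
MASK = "[MASK]"  # slot a prompt-tuned model would fill (e.g. with "yes" / "no")


def build_prompt(context: str, mention: str, entity_name: str, entity_type: str) -> str:
    """Concatenate mention context and entity information (hypothetical template)."""
    return f'{context} {SEP} "{mention}" refers to {entity_name} ({entity_type}): {MASK}'


def rerank(
    context: str,
    mention: str,
    candidates: List[Tuple[str, str]],
    score_prompt: Callable[[str], float],
) -> List[Tuple[str, float]]:
    """Score every (entity_name, entity_type) candidate and sort best-first."""
    scored = [
        (name, score_prompt(build_prompt(context, mention, name, etype)))
        for name, etype in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_score(prompt: str) -> float:
    """Stand-in for the model: share of post-[SEP] words that occur in the context."""
    head, tail = prompt.split(SEP)
    context_words = set(re.findall(r"\w+", head.lower()))
    tail_words = re.findall(r"\w+", tail.lower())
    return sum(word in context_words for word in tail_words) / len(tail_words)


if __name__ == "__main__":
    ranked = rerank(
        context="Mutations in BRCA1 raise the risk of breast cancer.",
        mention="breast cancer",
        candidates=[("Ovarian Neoplasms", "Disease"), ("Breast Neoplasms", "Disease")],
        score_prompt=toy_score,
    )
    for name, score in ranked:
        print(f"{score:.3f}  {name}")
```

In a real run, `score_prompt` would be replaced by a call that returns the trained model's probability for the positive label word at the mask position.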
python: 3.8
PyTorch: 1.9.0
transformers: 4.10.0
openprompt: 0.1.1
See the OpenPrompt project page to learn more about OpenPrompt.
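A quick way to confirm that an environment matches the versions listed above is sketched below. The distribution names (`torch`, `transformers`, `openprompt`) are assumptions about how each package is published, and `check_versions` is a hypothetical helper, not part of this repository.

```python
from importlib import metadata

# Versions listed in this README; distribution names are assumed.
REQUIRED = {
    "torch": "1.9.0",
    "transformers": "4.10.0",
    "openprompt": "0.1.1",
}


def check_versions(required=REQUIRED, get_version=metadata.version):
    """Return {package: (expected, found)} for each missing or mismatched package."""
    problems = {}
    for name, expected in required.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Prefix comparison so build tags such as "1.9.0+cu111" still match.
        if found is None or not found.startswith(expected):
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    issues = check_versions()
    for name, (expected, found) in issues.items():
        print(f"{name}: expected {expected}, found {found}")
    if not issues:
        print("All listed package versions match.")
```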
1. Download the PyTorch version of the PubMedBERT pretrained model and put it in the "pretrain" folder.
2. Generate the training and test samples with data_process/data_process.py. As described in the code, provide an entity dictionary file, an entity type information file, and the corresponding sample files of mentions and gold entities.
3. Run prompt_ranking/prompt_medicine_train.py to train the model.
4. Run prompt_ranking/prompt_medicine_predict.py to predict the results.
5. Run prompt_retrieval/prompt_entity_vector.py to generate mention and entity vectors with the prompt model.
6. Run prompt_retrieval/vector_search.py to search for the top N candidates with the prompt model.
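The top-N candidate search in the last step can be illustrated with a brute-force cosine-similarity search over precomputed vectors. This is a minimal sketch under assumptions: `cosine` and `top_n` are hypothetical names, the entity IDs and vectors are toy values, and `prompt_retrieval/vector_search.py` may use a different similarity measure or an approximate index.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_n(
    mention_vec: Sequence[float],
    entity_vecs: Dict[str, Sequence[float]],
    n: int,
) -> List[Tuple[str, float]]:
    """Return the n entity IDs most similar to the mention vector, best first."""
    scored = [(eid, cosine(mention_vec, vec)) for eid, vec in entity_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


if __name__ == "__main__":
    # Toy 2-dimensional vectors; real vectors would come from the prompt model.
    entities = {"E1": [1.0, 0.0], "E2": [0.0, 1.0], "E3": [1.0, 1.0]}
    for eid, score in top_n([1.0, 0.1], entities, n=2):
        print(f"{eid}\t{score:.3f}")
```

The returned candidates are what a re-ranking model would then score one by one.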
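A medical natural language processing algorithm library, including named entities, semantic relation extraction, time series prediction, pre-trained models, and more.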