Are you sure you want to delete this task? Once this task is deleted, it cannot be recovered.
shenjiajun a33774f867 | 1 year ago | |
---|---|---|
ascend310_infer | 1 year ago | |
scripts | 1 year ago | |
src | 1 year ago | |
KGNN-12_44.ckpt | 1 year ago | |
README.md | 1 year ago | |
default_config.yaml | 1 year ago | |
eval.py | 1 year ago | |
export.py | 1 year ago | |
postprocess.py | 1 year ago | |
preprocess.py | 1 year ago | |
requirements.txt | 1 year ago | |
train.py | 1 year ago |
KGNN can effectively capture drug and its potential neighborhoods by mining their associated relations in KG. To extract both high-order structures and semantic relations of the KG, KGNN learn from the neighborhoods for each entity in KG as their local receptive, and then integrate neighborhood information with bias from representation of the current entity. This way, the receptive field can be naturally extended to multiple hops away to model highorder topological information and to obtain drugs potential long-distance correlations. This idea was proposed in the paper "KGNN: Knowledge Graph Neural Network for Drug-Drug Interaction Prediction.", published in 2020.
Paper Xuan Lin, Zhe Quan, Zhi-Jie Wang, Tengfei Ma, Xiangxiang Zeng College of Information Science and Engineering, Hunan University, College of Computer Science, Chongqing University, Published in IJCAI-2020.
The network structure can be decomposed into three parts: DDI Extraction and KG Construction, KGNN Layer and Drug-drug Interaction Prediction.It takes the parsed DDI matrix and knowledge graph obtained from preprocessing (i.e., DDI extraction and KG construction) of dataset as the input. It outputs the interaction value for the drug-drug pair. Remind that the central idea of KGNN is to consider both high-order structure and semantic relation, by using graph neural network to encode the drug and its topological neighborhood information to a distributed representation.
Dataset used KEGG-drug
In this project, the file organization is recommended as below:
.
└─raw_data
├─kegg
├─approved_example.txt
├─entity2id.txt
├─relation2id.txt
└─train2id.txt
.
└─kgnn
├─README.md # descriptions about kgcn
├─scripts
├─run_eval_ascend.sh # launch standalone training with GPU platform(1p)
├─run_eval_gpu.sh # launch evaluating with GPU platform
├─run_infer_310.sh # launch evaluating with ascend platform
├─run_standalone_train_ascend.sh # launch standalone training with ascend platform(1p)
└─run_train_gpu.sh # launch standalone training with GPU platform
├─src
├─model_utils
├─config.py # parameter configuration
├─device_adapter.py # device adapter
├─local_adapter.py # local adapter
└─moxing_adapter.py # moxing adapter
├─aggregator.py # aggregator function
├─data_loader.py # data loader function
├─dataset.py # dataset function
├─kgcn.py # kgcn model
├─loss.py # loss
└─utils.py # General components
├─default_config.yaml # parameter configuration
├─eval.py # evaluation script
├─export.py # export scripts
├─postprocess.py # postprocess script
├─preprocess.py # preprocess scripts
└─train.py # training script
# standalone training
bash run_standalone_train_ascend.sh [RAW_DATA_DIR] [DATASET] [DEVICE_ID]
# example: bash run_standalone_train_ascend.sh ./raw_data/ kegg 0
1. Upload dataset.
2. Create a training task and set the following options:
Computing Resources: Ascend NPU
Startup file: train.py
Dataset: kegg.zip
Running parameters: enable_modelarts = True
3. Download the model at the "Results Download" after completing the training.
# standalone training
bash run_train_gpu.sh [RAW_DATA_DIR] [DATASET] [DEVICE_ID]
# example: bash run_train_gpu.sh ./raw_data/ kegg 0
# training example
Ascend:
# standalone training
bash run_standalone_train_ascend.sh [RAW_DATA_DIR] [DATASET] [DEVICE_ID]
# example: bash run_standalone_train_ascend.sh ./raw_data/ kegg 0
GPU:
# standalone training
bash run_train_gpu.sh [RAW_DATA_DIR] [DATASET] [DEVICE_ID]
# example: bash run_train_gpu.sh ./raw_data/ kegg 0
Training result will be stored in the example path. Checkpoints will be stored at ckpt_url
by default, and training log will be redirected to ./log
(1p)
...
Train epoch time: 21830.078 ms, per step time: 496.138 ms
epoch: 19 step: 1, loss is 0.09186707437038422
epoch: 19 step: 2, loss is 0.09054221957921982
epoch: 19 step: 3, loss is 0.0800129845738411
epoch: 19 step: 4, loss is 0.07796357572078705
epoch: 19 step: 5, loss is 0.07740001380443573
epoch: 19 step: 6, loss is 0.08665880560874939
epoch: 19 step: 7, loss is 0.09129385650157928
epoch: 19 step: 8, loss is 0.08729872107505798
epoch: 19 step: 9, loss is 0.08645150810480118
epoch: 19 step: 10, loss is 0.09781090915203094
epoch: 19 step: 11, loss is 0.10565177351236343
epoch: 19 step: 12, loss is 0.08685533702373505
epoch: 19 step: 13, loss is 0.08123263716697693
epoch: 19 step: 14, loss is 0.08262551575899124
epoch: 19 step: 15, loss is 0.08548043668270111
...
You can start training using python or shell scripts. The usage of shell scripts as follows:
# eval example
Ascend:
bash run_eval_ascend.sh [CKPT_PATH] [DEVICE_ID]
# example: bash run_eval_ascend.sh "./KGNN2.0/ckpt/ckpt_0/KGNN-34_44.ckpt" 0
1. Upload ckpt file.
2. Create a inference task and set the following options:
Computing Resources: Ascend NPU
Select model: KGNN-12_44.ckpt
Startup file: eval.py
Dataset: kegg.zip
Running parameters: enable_modelarts = True
3. Find the inference results in log.
eval.py
and run:# eval example
GPU:
bash run_eval_gpu.sh [CKPT_PATH] [DEVICE_ID]
# example: bash run_eval_gpu.sh "./KGNN2.0/ckpt/ckpt_0/KGNN-34_44.ckpt" 0
checkpoint can be produced in training process.
Evaluation result will be stored in the output file of evaluation script, you can find result like the followings in log
.
Logging Info - test_auc: 0.9422428483164229, test_acc: 0.8876509232954546, test_f1: 0.8932487731478626, test_aupr: 0.9188073906877906
python export.py --checkpoint_path=[checkpoint_path]
The checkpoint_path
parameter is required,
# Ascend310 inference
bash run_infer_310.sh [MINDIR_PATH] [DATASET_NAME] [DATASET_PATH] [NEED_PREPROCESS] [DEVICE_ID]
DATASET_NAME
must be in ['kegg', 'drugbank'].NEED_PREPROCESS
means weather need preprocess or not, it's value is 'y' or 'n'.DEVICE_ID
is optional, default value is 0.Inference result is saved in current path, you can find result like this in acc.log file.
Parameters | Ascend | GPU |
---|---|---|
Model Version | KGNN | KGNN |
Resource | Ascend910 | GPU3090 |
uploaded Date | 10/28/2022 | 10/28/2022 |
MindSpore Version | 1.8.0 | 1.8.0 |
Dataset | KEGG-drug | KEGG-drug |
Batch_size | 2048 | 2048 |
Training Parameters | epoch: 20, batch_size: 2048, lr=0.0005 | epoch: 20, batch_size: 2048, lr=0.0005 |
Optimizer | Adam | Adam |
Loss Function | binary cross-entropy | binary cross-entropy |
Scripts | kgnn scripts | kgnn scripts |
Parameters | Ascend | GPU |
---|---|---|
Model Version | KGNN | KGNN |
Resource | Ascend910 | GPU3090 |
uploaded Date | 10/28/2022 | 10/28/2022 |
MindSpore Version | 1.8.0 | 1.8.0 |
Dataset | KEGG-drug | KEGG-drug |
Batch_size | 2048 | 2048 |
Accuracy | 88.32% | 88.90% |
Model for inference | 127M (.ckpt file) | 127M (.ckpt file) |
There are two random situations:
Please check the official homepage.
KGNN的mindspore实现
Python C++ Shell Text
Dear OpenI User
Thank you for your continuous support to the Openl Qizhi Community AI Collaboration Platform. In order to protect your usage rights and ensure network security, we updated the Openl Qizhi Community AI Collaboration Platform Usage Agreement in January 2024. The updated agreement specifies that users are prohibited from using intranet penetration tools. After you click "Agree and continue", you can continue to use our services. Thank you for your cooperation and understanding.
For more agreement content, please refer to the《Openl Qizhi Community AI Collaboration Platform Usage Agreement》