RoBERTa iterates on BERT's pretraining procedure by training the model
longer, with bigger batches over more data; removing the next sentence prediction
objective; training on longer sequences; and dynamically changing the masking
pattern applied to the training data.
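Dynamic masking means the masked positions are re-sampled every time a sequence is fed to the model, rather than being fixed once during preprocessing as in the original BERT. Below is a toy sketch of the idea in plain Python; it is not the Fairseq data pipeline, it omits the 80/10/10 mask/random/keep replacement rule, and the token id and masking ratio are illustrative:

```python
import random

MASK_ID = 50264    # <mask> token id in the RoBERTa vocabulary (illustrative)
MASK_PROB = 0.15   # masking ratio used by BERT/RoBERTa

def dynamic_mask(token_ids):
    """Re-sample the masked positions on every call, so the same
    sentence sees a different mask pattern each epoch."""
    masked, targets = list(token_ids), [-1] * len(token_ids)
    for i, tok in enumerate(token_ids):
        if random.random() < MASK_PROB:
            targets[i] = tok    # the model is asked to recover this token
            masked[i] = MASK_ID
    return masked, targets

ids = [713, 16, 10, 7728, 3645, 4]   # arbitrary token ids
print(dynamic_mask(ids))             # a different mask pattern on each call
print(dynamic_mask(ids))
```

Because the mask is redrawn each time a sequence is sampled, the same sentence contributes different prediction targets across epochs.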
The RoBERTa model in this repository uses the Fairseq toolbox. Before running the model,
you need to set up Fairseq first.
# Go to "toolbox/Fairseq" directory in root path
cd ../../../../toolbox/Fairseq/
bash install_toolbox_fairseq.sh
```bash
# Download dataset
cd fairseq/
mkdir -p glue_data
cd glue_data/
wget https://dl.fbaipublicfiles.com/glue/data/RTE.zip
unzip RTE.zip
rm -rf RTE.zip

# Preprocess dataset
cd ..
./examples/roberta/preprocess_GLUE_tasks.sh glue_data RTE
```
```bash
# Download the pretrained weights
wget https://dl.fbaipublicfiles.com/fairseq/models/roberta.large.tar.gz
tar -xzvf roberta.large.tar.gz
```
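Optionally, you can sanity-check the downloaded checkpoint from Python before fine-tuning. This sketch assumes the tarball extracted into `./roberta.large/` in the current directory; `fill_mask` is part of fairseq's RoBERTa hub interface:

```python
from fairseq.models.roberta import RobertaModel

# Load the extracted checkpoint (the tarball unpacks to ./roberta.large/).
roberta = RobertaModel.from_pretrained('roberta.large', checkpoint_file='model.pt')
roberta.eval()

# Masked-token fill as a quick smoke test of the pretrained weights.
print(roberta.fill_mask('The first Star Wars movie came out in <mask>.', topk=3))
```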
```bash
# Finetune on GLUE RTE task
bash roberta.sh
```
```bash
# Inference on GLUE RTE task
python3 roberta.py
```
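The inference step presumably follows fairseq's standard GLUE evaluation recipe for RoBERTa, sketched below for reference. The paths (`checkpoints/`, `RTE-bin`, `glue_data/RTE/dev.tsv`) are the defaults from fairseq's RoBERTa GLUE example and may differ from what `roberta.sh`/`roberta.py` actually use in this repository:

```python
from fairseq.models.roberta import RobertaModel

# Load the fine-tuned checkpoint together with the binarized RTE data.
roberta = RobertaModel.from_pretrained(
    'checkpoints/',
    checkpoint_file='checkpoint_best.pt',
    data_name_or_path='RTE-bin'
)

# Map a class index back to its label string (e.g. "entailment").
label_fn = lambda label: roberta.task.label_dictionary.string(
    [label + roberta.task.label_dictionary.nspecial]
)

ncorrect, nsamples = 0, 0
roberta.cuda()   # move the model to GPU
roberta.eval()
with open('glue_data/RTE/dev.tsv') as fin:
    fin.readline()  # skip the TSV header
    for index, line in enumerate(fin):
        tokens = line.strip().split('\t')
        sent1, sent2, target = tokens[1], tokens[2], tokens[3]
        tokens = roberta.encode(sent1, sent2)
        prediction = roberta.predict('sentence_classification_head', tokens).argmax().item()
        prediction_label = label_fn(prediction)
        ncorrect += int(prediction_label == target)
        nsamples += 1
print('| Accuracy: ', float(ncorrect) / float(nsamples))
```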
| GPUs       | QPS   | Train Epochs | Accuracy (%) |
|------------|-------|--------------|--------------|
| BI-v100 x8 | 207.5 | 10           | 86.3         |