CPM-Ant+ is an open-source bilingual pre-trained language model (PLM) with 10B parameters, which is the second milestone of the live training process of CPM-Live. CPM-Ant+ is an enhanced version of CPM-Ant. For more details on CPM-Ant, please check here. The code, log files, and checkpoints of CPM-Ant+ are available under an open license.
Compared to CPM-Ant, CPM-Ant+ has several new features:
First, you need to clone the cpm-ant-plus branch of this repository.
$ git clone -b cpm-ant-plus --single-branch https://github.com/OpenBMB/CPM-Live.git
Then, please make sure that your environment meets the following requirements:
We recommend using Anaconda to manage the environment and installing additional dependencies from PyPI:
$ cd CPM-Live/cpm-live
$ pip install -r requirements.txt
We release the checkpoint of CPM-Ant+ (10B), and you can download it from here.
If you want to compress CPM-Ant+ into smaller models, please check our guidelines in BMCook!
If you want to adapt CPM-Ant+ to your own tasks, we recommend using parameter-efficient tuning (a.k.a., delta tuning). With the help of OpenDelta, we can conduct delta tuning without modifying the code of the original model.
We install OpenDelta from source. Note that we use the with_bmtrain branch, which enables us to conduct distributed delta tuning on multiple computing nodes.
$ git clone -b with_bmtrain --single-branch https://github.com/thunlp/OpenDelta.git
$ cd OpenDelta
$ python setup.py install
We need to download a checkpoint of CPM-Ant+ and load it.
from cpm_live.models import CPMAntPlus, CPMAntConfig
import bmtrain as bmt
bmt.init_distributed(seed=0)
config = CPMAntConfig.from_json_file("YOUR_PATH/cpm-ant-plus-10b.json")
ckpt_path = "YOUR_PATH/cpm-ant-plus-10b.pt"
model = CPMAntPlus(config=config)
bmt.load(model, ckpt_path)
Using OpenDelta, we can insert a delta model (e.g., LoRA) into CPM-Ant+ with three lines of code:
from opendelta import LoraModel
delta_model = LoraModel(backbone_model=model, modified_modules=["project_q", "project_v"])
delta_model.freeze_module(exclude=["deltas"], set_state_dict=True)
delta_model.log()
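For intuition, LoRA leaves each frozen weight matrix W untouched and adds a trainable low-rank update scaled by alpha/r; only the two small factors receive gradients. The following is a conceptual NumPy sketch of that idea, not the OpenDelta implementation (all names here are illustrative):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16):
    """Forward pass of a LoRA-augmented linear layer.

    W: frozen (out, in) weight. A: (r, in) and B: (out, r) are the
    trainable low-rank factors; r is the LoRA rank.
    """
    r = A.shape[0]
    scale = alpha / r  # standard LoRA scaling
    return x @ W.T + (x @ A.T) @ B.T * scale

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8))
W = rng.standard_normal((4, 8))
A = rng.standard_normal((2, 8)) * 0.01  # rank r = 2
B = np.zeros((4, 2))  # B starts at zero, so training begins from the frozen model

out = lora_forward(x, W, A, B)
assert np.allclose(out, x @ W.T)  # with B = 0, the output equals the frozen layer's
```

Because B is initialized to zero, inserting the delta model does not change the backbone's behavior until tuning begins.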
We provide a sample of the data used for pre-training.
If you want to know how we convert the data to binary files, run the following command:
$ bash scripts/preprocess_dataset.sh
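In general terms, converting text data to binary files means tokenizing each document and writing length-prefixed records that can be memory-mapped or streamed during training. A generic sketch of that pattern follows; the actual format produced by scripts/preprocess_dataset.sh may differ:

```python
import struct

def write_records(token_id_lists, path):
    # Each record: a little-endian uint32 length, then that many uint32 token ids.
    with open(path, "wb") as f:
        for ids in token_id_lists:
            f.write(struct.pack("<I", len(ids)))
            f.write(struct.pack(f"<{len(ids)}I", *ids))

def read_records(path):
    records = []
    with open(path, "rb") as f:
        while (header := f.read(4)):
            (n,) = struct.unpack("<I", header)
            records.append(list(struct.unpack(f"<{n}I", f.read(4 * n))))
    return records

write_records([[1, 2, 3], [42]], "sample.bin")
assert read_records("sample.bin") == [[1, 2, 3], [42]]
```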
If you want to use CPM-Ant+ on your own tasks, we provide several examples of adapting CPM-Ant+ to tasks from the CUGE benchmark, including summarization, dialogue, classification, and re-ranking. Please check the examples folder.
You can use CPM-Ant+ directly for various NLP tasks.
You can use CPM-Ant+ for text generation in either Chinese or English. Currently, we implement two decoding strategies: beam search and top-k/top-p sampling. Here is an example:
$ python text_generation.py
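Top-k/top-p (nucleus) sampling restricts sampling to the k highest-scoring tokens and to the smallest set whose cumulative probability exceeds p. A standalone NumPy sketch of the filtering step (illustrative only; the script's actual decoding code may differ):

```python
import numpy as np

def top_k_top_p_filter(logits, k=0, p=1.0):
    """Mask logits outside the top-k set and outside the top-p nucleus."""
    logits = np.asarray(logits, dtype=float).copy()
    if k > 0:
        kth = np.sort(logits)[-k]  # k-th largest value
        logits[logits < kth] = -np.inf
    if p < 1.0:
        order = np.argsort(logits)[::-1]  # indices from highest to lowest
        probs = np.exp(logits[order] - np.max(logits))
        probs /= probs.sum()
        # keep the smallest prefix whose cumulative probability reaches p
        cutoff = np.searchsorted(np.cumsum(probs), p) + 1
        logits[order[cutoff:]] = -np.inf
    return logits  # sample from softmax over the surviving logits

filtered = top_k_top_p_filter([2.0, 1.0, 0.5, -1.0], k=2)
assert np.isneginf(filtered[2]) and np.isneginf(filtered[3])  # only top 2 survive
```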
CPM-Ant+ can answer your questions based on a provided document. Try it!
$ python question_answering.py
With the help of CPM-Ant+, you can extract key sentences from a document.
$ python summarization.py
You can also use CPM-Ant+ for Chinese-English and English-Chinese translation!
$ python translation.py
If you want to experience our big models but don't have enough GPU memory, we recommend using BMInf, which can help you use our models for inference on most consumer-level GPUs. Let's try it!
Install BMInf:
$ pip install bminf
Assuming that you have a GPU with 8 GB of memory, you can run the text generation script with the following command:
$ python text_generation.py --use-bminf --memory-limit 4
Note that memory-limit should be less than the total GPU memory, as some intermediate computation results also need to be stored.
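As a back-of-envelope check of why offloading is needed at all: the 10B-parameter model's weights alone, in half precision, already exceed an 8 GB card, before accounting for activations:

```python
params = 10 * 10**9  # 10B parameters
bytes_per_param = 2  # fp16 / half precision
weights_gib = params * bytes_per_param / 2**30
assert weights_gib > 18  # roughly 18.6 GiB of weights, well above 8 GB
```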