Lijingtian e612ebb99c | 10 months ago | |
---|---|---|
README.md | 11 months ago | |
cli_demo.py | 11 months ago | |
cover_alpaca2jsonl.py | 11 months ago | |
finetune.py | 11 months ago | |
finetune.sh | 10 months ago | |
infer.ipynb | 11 months ago | |
med_webui.py | 10 months ago | |
requirements.txt | 11 months ago | |
tokenize_dataset_rows.py | 11 months ago | |
tokenized_data.sh | 10 months ago | |
train.py | 11 months ago | |
web_demo.py | 11 months ago |
An affordable ChatGPT-style implementation, fine-tuned from Tsinghua's ChatGLM-6B with LoRA.
Dataset: alpaca
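The tokenization step reads a JSONL file in which every line carries a `context` field and a `target` field. As a minimal sketch of producing such a file from Alpaca-style records: the prompt template and the helper names `to_row` and `write_jsonl` are illustrative assumptions, and may differ from what cover_alpaca2jsonl.py actually does.

```python
import json


def to_row(example):
    """Map one Alpaca-style record to a {'context', 'target'} row.

    The prompt template used here is an assumption for illustration only.
    """
    context = f"Instruction: {example['instruction']}\n"
    if example.get("input"):
        context += f"Input: {example['input']}\n"
    context += "Answer: "
    return {"context": context, "target": example["output"]}


def write_jsonl(examples, path):
    """Write one JSON object per line, keeping non-ASCII text readable."""
    with open(path, "w", encoding="utf-8") as f:
        for example in examples:
            f.write(json.dumps(to_row(example), ensure_ascii=False) + "\n")
```

Calling `write_jsonl(records, "data/alpaca_data.jsonl")` on a list of such records would yield a file in the shape the `--jsonl_path` argument expects.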
tokenization
python tokenize_dataset_rows.py \
--jsonl_path data/alpaca_data.jsonl \
--save_path data/alpaca \
--max_seq_length 200 \
--skip_overlength
--jsonl_path: path to the fine-tuning data in JSONL format; the ['context'] and ['target'] fields of each line are encoded.
--save_path: output path.
--max_seq_length: maximum length of a sample.

python finetune.py \
--dataset_path data/alpaca \
--lora_rank 8 \
--per_device_train_batch_size 6 \
--gradient_accumulation_steps 1 \
--max_steps 52000 \
--save_steps 1000 \
--save_total_limit 2 \
--learning_rate 1e-4 \
--fp16 \
--remove_unused_columns false \
--logging_steps 50 \
--output_dir output
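As a rough sanity check on these settings, the sketch below works out how many examples one device processes over the run and how many extra weights a rank-8 LoRA adapter adds to a single linear layer. The 52,000-example dataset size and the 4096 x 4096 layer shape are assumptions for illustration; check your tokenized dataset and the model config for the real values.

```python
def examples_seen(per_device_batch_size, grad_accum_steps, max_steps):
    """Number of examples one device processes over the whole run."""
    return per_device_batch_size * grad_accum_steps * max_steps


def lora_extra_params(d_in, d_out, rank):
    """LoRA adds two low-rank factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


seen = examples_seen(per_device_batch_size=6, grad_accum_steps=1, max_steps=52000)
print(seen)  # 312000
print(seen / 52000)  # 6.0 passes over an assumed 52,000-example dataset
print(lora_extra_params(4096, 4096, rank=8))  # 65536, versus 16777216 in the full matrix
```

If `--skip_overlength` drops many samples, the real number of passes over the data will be higher than this estimate.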
For inference, see infer.ipynb.
Fine-tuning ChatGLM with LoRA