| Weights version | Link | FastChat version compatibility | Base Model | Release Date | Fine-tuning Data |
| --- | --- | --- | --- | --- | --- |
| v1.5 | 7B, 7B-16k, 13B, 13B-16k | >=0.2.21 | Llama 2 | Aug. 1, 2023 | 370M tokens |
| v1.3 | 7B, 13B, 33B | >=0.2.1 | Llama 1 | Jun. 22, 2023 | 370M tokens |
| v1.1 | 7B, 13B | >=0.2.1 | Llama 1 | Apr. 12, 2023 | - |
| v0 | 7B-delta, 13B-delta | <=0.1.10 | Llama 1 | Mar. 30, 2023 | - |
### Major updates of weights v1.5

### Major updates of weights v1.3

### Major updates of weights v1.1
The separator has been changed from `###` to the EOS token `</s>`. This change makes it easier to determine the generation stop criteria and enables better compatibility with other libraries.

An example conversation in the new format:

```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.

USER: Hello!
ASSISTANT: Hello!</s>
USER: How are you?
ASSISTANT: I am good.</s>
```
See a full prompt template here and example output here.
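The format above can also be assembled programmatically. Below is a minimal sketch under the stated template; `build_prompt` is a hypothetical helper for illustration, not part of FastChat's API (FastChat ships its own conversation templates):

```python
# Illustrative sketch of building a Vicuna v1.1-style prompt string.
# Not FastChat's implementation; roles and the </s> EOS separator
# follow the template shown above.
SYSTEM = ("A chat between a curious user and an artificial intelligence "
          "assistant. The assistant gives helpful, detailed, and polite "
          "answers to the user's questions.")
EOS = "</s>"

def build_prompt(turns):
    """turns: list of (user_msg, assistant_msg_or_None) pairs.
    A None assistant message leaves an open slot for generation."""
    parts = [SYSTEM]
    for user_msg, assistant_msg in turns:
        parts.append(f"USER: {user_msg}")
        if assistant_msg is None:
            parts.append("ASSISTANT:")
        else:
            parts.append(f"ASSISTANT: {assistant_msg}{EOS}")
    return " ".join(parts)

print(build_prompt([("Hello!", "Hello!"), ("How are you?", None)]))
```

The trailing `ASSISTANT:` with no message is where the model's completion begins; generation stops when the model emits `</s>`.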
The older v0 format uses `### Human` / `### Assistant` separators:

```
A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions.

### Human: Hello!
### Assistant: Hello!
### Human: How are you?
### Assistant: I am good.
```

See the full prompt template here.
We release Vicuna weights v0 as delta weights to comply with the LLaMA model license.
You can add our delta to the original LLaMA weights to obtain the Vicuna weights. Instructions:
NOTE: Weights v1.1 are only compatible with `transformers>=4.28.0` and `fschat>=0.2.0`. Please update your local packages accordingly. If you follow the above commands to do a fresh install, you should get all the correct versions.
This conversion command needs around 30 GB of CPU RAM. See the "Low CPU Memory Conversion" section below if you do not have enough memory. Replace `/path/to/*` with the real paths.

```
python3 -m fastchat.model.apply_delta \
    --base-model-path /path/to/llama-7b \
    --target-model-path /path/to/output/vicuna-7b \
    --delta-path lmsys/vicuna-7b-delta-v1.1
```
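Conceptually, applying a delta adds each released delta tensor to the matching base LLaMA tensor. The toy sketch below illustrates the idea with plain Python lists; it is not the real `apply_delta`, which operates on full Hugging Face checkpoints:

```python
# Toy illustration of "applying a delta": target = base + delta,
# parameter by parameter. The real command streams Hugging Face
# checkpoint shards instead of in-memory lists.
def apply_delta(base, delta):
    assert base.keys() == delta.keys(), "parameter names must match"
    return {name: [b + d for b, d in zip(base[name], delta[name])]
            for name in base}

base = {"layer0.weight": [1.0, -2.0], "layer0.bias": [0.5, 0.5]}
delta = {"layer0.weight": [0.5, 0.25], "layer0.bias": [-0.5, 0.0]}
print(apply_delta(base, delta))
# → {'layer0.weight': [1.5, -1.75], 'layer0.bias': [0.0, 0.5]}
```

This is also why the delta release complies with the LLaMA license: the delta alone is not a usable model until added to weights you obtained yourself.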
This conversion command needs around 60 GB of CPU RAM. See the "Low CPU Memory Conversion" section below if you do not have enough memory. Replace `/path/to/*` with the real paths.

```
python3 -m fastchat.model.apply_delta \
    --base-model-path /path/to/llama-13b \
    --target-model-path /path/to/output/vicuna-13b \
    --delta-path lmsys/vicuna-13b-delta-v1.1
```
You can try the following method to reduce the CPU RAM requirement of weight conversion: append `--low-cpu-mem` to the commands above, which will split large weight files into smaller ones and use the disk as temporary storage. This can keep the peak memory at less than 16 GB.

There are some frequently asked tokenizer issues (https://github.com/lm-sys/FastChat/issues/408). Some of them are related not only to FastChat or the Vicuna weights but also to how you convert the base LLaMA model. We suggest that you use `transformers>=4.28.0` and redo the weight conversion for the base LLaMA model.
After applying the delta, you should have a file named `special_tokens_map.json` in your converted weight folder for either v0 or v1.1. The contents of this file should be the same as this file: https://huggingface.co/lmsys/vicuna-13b-delta-v0/blob/main/special_tokens_map.json.

If the file is not present, please copy the `special_tokens_map.json` and `tokenizer_config.json` files from https://huggingface.co/lmsys/vicuna-13b-delta-v0/tree/main to your converted weight folder. This works for both v0 and v1.1.
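A quick sanity check for the converted folder can be sketched as follows; `missing_tokenizer_files` is a hypothetical helper, not a FastChat utility:

```python
# Hypothetical helper: report which of the expected tokenizer config
# files are absent from a converted Vicuna weight folder.
from pathlib import Path

def missing_tokenizer_files(weight_dir):
    """Return the names of expected tokenizer files not found in weight_dir."""
    required = ["special_tokens_map.json", "tokenizer_config.json"]
    return [name for name in required
            if not (Path(weight_dir) / name).is_file()]
```

If the returned list is non-empty, copy the listed files from the Hugging Face repository above into the converted weight folder.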