This README contains instructions to run and train Vicuna, an open-source LLM chatbot with quality comparable to ChatGPT. The Vicuna release was trained using SkyPilot on cloud spot instances, with a cost of ~$300.
Install the latest SkyPilot and check your setup of the cloud credentials:
pip install git+https://github.com/skypilot-org/skypilot.git
sky check
See the Vicuna SkyPilot YAMLs: train.yaml for training and serve.yaml for serving.
sky launch -c vicuna-serve -s serve.yaml
(task, pid=20933) 2023-04-12 22:08:49 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=10, model_list_mode='once', share=True, moderate=False)
(task, pid=20933) 2023-04-12 22:08:49 | INFO | stdout | Running on local URL: http://0.0.0.0:7860
(task, pid=20933) 2023-04-12 22:08:51 | INFO | stdout | Running on public URL: https://<random-hash>.gradio.live
sky launch -c vicuna-serve-v100 -s serve.yaml --gpus V100
sky launch -c vicuna-serve -s serve.yaml --env MODEL_SIZE=13
sky launch -c vicuna-openai-api -s serve-openai-api-endpoint.yaml
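The endpoint launched by serve-openai-api-endpoint.yaml speaks the OpenAI chat-completions protocol, so any OpenAI-style client can talk to it. The sketch below only builds the request body; the model name `vicuna-7b-v1.1` and the `/v1/chat/completions` route are assumptions based on the OpenAI API convention, not values taken from the YAML (query the endpoint's `/v1/models` route for the name actually served):

```python
import json

def build_chat_request(model: str, user_message: str,
                       temperature: float = 0.7) -> str:
    """Build an OpenAI-style chat-completions JSON payload.

    NOTE: the model name passed in is a placeholder -- check the
    endpoint's /v1/models route for the model actually served.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
    }
    return json.dumps(payload)

body = build_chat_request("vicuna-7b-v1.1", "Hello!")
```

POST the resulting body to `http://<endpoint-ip>:<port>/v1/chat/completions` with `Content-Type: application/json`; the IP comes from `sky status`, and the port should be checked against serve-openai-api-endpoint.yaml.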
Currently, training requires GPUs with 80 GB of memory. See sky show-gpus --all for supported GPUs.
We can start training the Vicuna model on the dummy data dummy.json with a single command. SkyPilot will automatically find the cheapest available VM on any cloud.
To train on your own data, replace dummy.json with your own file, or change the line /data/mydata.json: ./dummy.json in train.yaml to point to your own data.
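The expected shape of the training data is an assumption here, based on FastChat's released conversation format (alternating "human" and "gpt" turns) rather than on this repo's dummy.json itself. A minimal sketch of one record, plus a validator you could run over your own file before pointing train.yaml at it:

```python
import json

# One training record in the assumed FastChat conversation format:
# an "id" plus alternating "human"/"gpt" turns under "conversations".
record = {
    "id": "example-0",
    "conversations": [
        {"from": "human", "value": "What is SkyPilot?"},
        {"from": "gpt", "value": "A framework for running jobs on any cloud."},
    ],
}

def validate_records(records):
    """Check each record has an id and non-empty human/gpt turns."""
    for r in records:
        assert "id" in r and r["conversations"], r
        for turn in r["conversations"]:
            assert turn["from"] in ("human", "gpt"), turn
            assert isinstance(turn["value"], str) and turn["value"], turn
    return len(records)

n = validate_records([record])
```

To check a real file, load it first: `validate_records(json.load(open("mydata.json")))`. Verify the format against dummy.json before relying on this sketch.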
Steps for training on your cloud(s):
Replace the bucket name in train.yaml with a unique name of your own, so that SkyPilot can create a bucket to store the model weights. See the # Change to your own bucket comment in the YAML file.
Training the Vicuna-7B model on 8 A100 GPUs (80GB memory) using spot instances:
# Launch it on managed spot to save 3x cost
sky spot launch -n vicuna train.yaml
Note: if you would like to see the training curve on W&B, add --env WANDB_API_KEY to the above command; this propagates your local W&B API key to the job via the environment variable.
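`--env WANDB_API_KEY` copies the value from your local shell into the job's environment. Inside the job, training code typically decides whether to log to W&B the way this sketch does; the fallback behavior shown is an illustration of the pattern, not SkyPilot's or FastChat's exact logic:

```python
import os

def wandb_enabled() -> bool:
    """Return True iff a W&B API key is visible in the environment.

    Mirrors how a training script would gate W&B logging:
    `sky spot launch ... --env WANDB_API_KEY` copies the local value
    into the job's environment; without it, logging is skipped.
    """
    return bool(os.environ.get("WANDB_API_KEY"))

os.environ.pop("WANDB_API_KEY", None)
assert not wandb_enabled()  # no key in the environment: logging is off
os.environ["WANDB_API_KEY"] = "dummy-key"  # placeholder, not a real key
```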
[Optional] Train a larger 13B model
# Train a 13B model instead of the default 7B
sky spot launch -n vicuna-13b train.yaml --env MODEL_SIZE=13
# Use *unmanaged* spot instances (i.e., preemptions won't get auto-recovered).
# Unmanaged spot provides a better interactive development experience but is vulnerable to spot preemptions.
# We recommend using managed spot as above.
sky launch -c vicuna train.yaml
Currently, such A100-80GB:8 spot instances are only available on AWS and GCP.
[Optional] To use on-demand A100-80GB:8 instances, which are currently available on Lambda Cloud, Azure, and GCP:
sky launch -c vicuna -s train.yaml --no-use-spot
Q: I see bucket permission errors (sky.exceptions.StorageBucketGetError) when running the above:
...
sky.exceptions.StorageBucketGetError: Failed to connect to an existing bucket 'YOUR_OWN_BUCKET_NAME'.
Please check if:
1. the bucket name is taken and/or
2. the bucket permissions are not setup correctly. To debug, consider using gsutil ls gs://YOUR_OWN_BUCKET_NAME.
A: You need to replace the bucket name with your own globally unique name, and rerun the commands. New private buckets will be automatically created under your cloud account.
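Bucket names are unique across the entire cloud, not just your account, so a plain name like "vicuna-weights" is usually taken. A sketch of one way to generate a likely-unique candidate name (the prefix is a placeholder; the length and character rules follow the general GCS/S3 bucket naming conventions, and true uniqueness is only confirmed when the cloud accepts the creation request):

```python
import re
import uuid

def make_bucket_name(prefix: str) -> str:
    """Generate a candidate globally-unique bucket name.

    GCS/S3 bucket names must be 3-63 characters of lowercase
    letters, digits, and hyphens; a random UUID suffix makes a
    collision with an existing bucket unlikely.
    """
    name = f"{prefix.lower()}-{uuid.uuid4().hex[:12]}"
    assert re.fullmatch(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]", name), name
    return name

bucket = make_bucket_name("vicuna-weights")  # prefix is a placeholder
```

Put the generated name in train.yaml at the # Change to your own bucket line, then rerun the launch command.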