This README contains instructions to run and train Vicuna, an open-source LLM chatbot with quality comparable to ChatGPT. The Vicuna release was trained using SkyPilot on cloud spot instances, with a cost of ~$300.
Install the latest SkyPilot and check your setup of the cloud credentials:
pip install git+https://github.com/skypilot-org/skypilot.git
sky check
See the Vicuna SkyPilot YAMLs: train.yaml for training and serve.yaml for serving.
sky launch -c vicuna-serve -s serve.yaml
(task, pid=20933) 2023-04-12 22:08:49 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=10, model_list_mode='once', share=True, moderate=False)
(task, pid=20933) 2023-04-12 22:08:49 | INFO | stdout | Running on local URL: http://0.0.0.0:7860
(task, pid=20933) 2023-04-12 22:08:51 | INFO | stdout | Running on public URL: https://<random-hash>.gradio.live
sky launch -c vicuna-serve-v100 -s serve.yaml --gpus V100
sky launch -c vicuna-serve -s serve.yaml --env MODEL_SIZE=13
sky launch -c vicuna-openai-api -s serve-openai-api-endpoint.yaml
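The endpoint launched by serve-openai-api-endpoint.yaml speaks the OpenAI chat-completions protocol, so any OpenAI-style client can talk to it. The sketch below only builds the request body; the model name `vicuna-7b-v1.1` and the `/v1/chat/completions` route are assumptions based on the OpenAI API convention, not values taken from the YAML (query the endpoint's `/v1/models` route for the name actually served):

```python
import json

def build_chat_request(model: str, user_message: str,
                       temperature: float = 0.7) -> str:
    """Build an OpenAI-style chat-completions JSON payload.

    NOTE: the model name passed in is a placeholder -- check the
    endpoint's /v1/models route for the model actually served.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
    }
    return json.dumps(payload)

body = build_chat_request("vicuna-7b-v1.1", "Hello!")
```

POST the resulting body to `http://<endpoint-ip>:<port>/v1/chat/completions` with `Content-Type: application/json`; the IP comes from `sky status`, and the port should be checked against serve-openai-api-endpoint.yaml.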
Currently, training requires GPUs with 80 GB of memory. See sky show-gpus --all for supported GPUs.
We can start training the Vicuna model on the dummy data dummy.json with a single command. SkyPilot will automatically find the cheapest available VM on any cloud.
To train on your own data, replace dummy.json with your own file, or change the line /data/mydata.json: ./dummy.json in train.yaml to point to your own data.
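The expected shape of the training data is an assumption here, based on FastChat's released conversation format (alternating "human" and "gpt" turns) rather than on this repo's dummy.json itself. A minimal sketch of one record, plus a validator you could run over your own file before pointing train.yaml at it:

```python
import json

# One training record in the assumed FastChat conversation format:
# an "id" plus alternating "human"/"gpt" turns under "conversations".
record = {
    "id": "example-0",
    "conversations": [
        {"from": "human", "value": "What is SkyPilot?"},
        {"from": "gpt", "value": "A framework for running jobs on any cloud."},
    ],
}

def validate_records(records):
    """Check each record has an id and non-empty human/gpt turns."""
    for r in records:
        assert "id" in r and r["conversations"], r
        for turn in r["conversations"]:
            assert turn["from"] in ("human", "gpt"), turn
            assert isinstance(turn["value"], str) and turn["value"], turn
    return len(records)

n = validate_records([record])
```

To check a real file, load it first: `validate_records(json.load(open("mydata.json")))`. Verify the format against dummy.json before relying on this sketch.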
Steps for training on your cloud(s):
Replace the bucket name in train.yaml with a unique name of your own, so that SkyPilot can create a bucket to store the model weights. See the # Change to your own bucket comment in the YAML file.
Training the Vicuna-7B model on 8 A100 GPUs (80GB memory) using spot instances:
# Launch it on managed spot to save 3x cost
sky spot launch -n vicuna train.yaml
Note: if you would like to see the training curve on W&B, add --env WANDB_API_KEY to the above command; this propagates your local W&B API key to the job via the environment variable.
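`--env WANDB_API_KEY` copies the value from your local shell into the job's environment. Inside the job, training code typically decides whether to log to W&B the way this sketch does; the fallback behavior shown is an illustration of the pattern, not SkyPilot's or FastChat's exact logic:

```python
import os

def wandb_enabled() -> bool:
    """Return True iff a W&B API key is visible in the environment.

    Mirrors how a training script would gate W&B logging:
    `sky spot launch ... --env WANDB_API_KEY` copies the local value
    into the job's environment; without it, logging is skipped.
    """
    return bool(os.environ.get("WANDB_API_KEY"))

os.environ.pop("WANDB_API_KEY", None)
assert not wandb_enabled()  # no key in the environment: logging is off
os.environ["WANDB_API_KEY"] = "dummy-key"  # placeholder, not a real key
```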
[Optional] Train a larger 13B model
# Train a 13B model instead of the default 7B
sky spot launch -n vicuna-13b train.yaml --env MODEL_SIZE=13
# Use *unmanaged* spot instances (i.e., preemptions won't get auto-recovered).
# Unmanaged spot provides a better interactive development experience but is vulnerable to spot preemptions.
# We recommend using managed spot as above.
sky launch -c vicuna train.yaml
Currently, such A100-80GB:8 spot instances are only available on AWS and GCP.
[Optional] To use on-demand A100-80GB:8 instances, which are currently available on Lambda Cloud, Azure, and GCP:
sky launch -c vicuna -s train.yaml --no-use-spot
Q: I see bucket permission errors (sky.exceptions.StorageBucketGetError) when running the above:
...
sky.exceptions.StorageBucketGetError: Failed to connect to an existing bucket 'YOUR_OWN_BUCKET_NAME'.
Please check if:
1. the bucket name is taken and/or
2. the bucket permissions are not setup correctly. To debug, consider using gsutil ls gs://YOUR_OWN_BUCKET_NAME.
A: You need to replace the bucket name with your own globally unique name, and rerun the commands. New private buckets will be automatically created under your cloud account.
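Bucket names are unique across the entire cloud, not just your account, so a plain name like "vicuna-weights" is usually taken. A sketch of one way to generate a likely-unique candidate name (the prefix is a placeholder; the length and character rules follow the general GCS/S3 bucket naming conventions, and true uniqueness is only confirmed when the cloud accepts the creation request):

```python
import re
import uuid

def make_bucket_name(prefix: str) -> str:
    """Generate a candidate globally-unique bucket name.

    GCS/S3 bucket names must be 3-63 characters of lowercase
    letters, digits, and hyphens; a random UUID suffix makes a
    collision with an existing bucket unlikely.
    """
    name = f"{prefix.lower()}-{uuid.uuid4().hex[:12]}"
    assert re.fullmatch(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]", name), name
    return name

bucket = make_bucket_name("vicuna-weights")  # prefix is a placeholder
```

Put the generated name in train.yaml at the # Change to your own bucket line, then rerun the launch command.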