Here are some minimal commands to run the whole pipeline on the collected data.

Make sure you are using Python >= 3.10; otherwise you may run into this
[issue].

```bash
mkdir -p .cache
mkdir -p .saved_models
export DATA_PATH=$PWD/.cache
export MODEL_PATH=$PWD/.saved_models
```
Create a new configuration section, or modify an existing one, in the `config.yaml` (SFT), `config_rm.yaml` (RM) or `config_rl.yaml` (RL) YAML configuration files located in the `model_training/configs/` directory, and specify the OA JSONL data file or HuggingFace dataset to use.

To use a local file (`.jsonl` or `.jsonl.gz`), specify the `input_file_path` configuration option. Place the file in the `cache_dir` (`DATA_PATH`) or specify an absolute path:

```bash
cp /path/to/<oasst.trees.jsonl> $DATA_PATH
```
Example:

```yaml
my_data_config:
  datasets:
    - oasst_export:
        input_file_path: oasst_export.trees.jsonl.gz
```
To use a dataset hosted on the HuggingFace hub, specify the `hf_dataset_name` configuration option. Example:

```yaml
my_data_config:
  datasets:
    - oasst_export:
        hf_dataset_name: OpenAssistant/oasst1
```
Note: If both `hf_dataset_name` and `input_file_path` are specified, `input_file_path` will take precedence. See the OpenAssistant/oasst1 dataset card on the HuggingFace hub for more information.
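To illustrate the precedence rule, here is a hypothetical config section that (perhaps accidentally) sets both options; under the rule above, the local file would be used and the hub dataset ignored:

```yaml
my_data_config:
  datasets:
    - oasst_export:
        # input_file_path takes precedence when both are set
        input_file_path: oasst_export.trees.jsonl.gz
        hf_dataset_name: OpenAssistant/oasst1
```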
```bash
cd model_training

# export shared modules
export PYTHONPATH=$PYTHONPATH:../../oasst-shared

python trainer_sft.py --configs defaults oa_dataset_only pythia --cache_dir $DATA_PATH --output_dir $MODEL_PATH/sft_model

# if you want to use wandb, add --wandb_entity your_username/team_name
```
To change the model used, e.g. to a larger Pythia version, create a new config in `model_training/configs/config.yaml` or set the flag `--model_name` to `EleutherAI/pythia-{size}-deduped`. Larger models will probably also need the `--learning_rate` and `--per_device_train_batch_size` flags adjusted.
```bash
# choose a specific checkpoint
export SFT_MODEL=$MODEL_PATH/sft_model/<checkpoint-X>

# or get latest checkpoint
export SFT_MODEL=$MODEL_PATH/sft_model/$(ls -t $MODEL_PATH/sft_model/ | head -n 1)
```
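The `ls -t ... | head -n 1` pattern above simply picks the most recently modified entry in the model output directory. A minimal Python equivalent (the demo directory and checkpoint names below are made up for illustration) might look like:

```python
import os
from pathlib import Path

def latest_checkpoint(model_dir: str) -> str:
    """Return the most recently modified entry in model_dir,
    mirroring `ls -t model_dir | head -n 1`."""
    entries = sorted(
        Path(model_dir).iterdir(),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    return entries[0].name

# Demo with two fake checkpoint directories and explicit mtimes
demo = Path("/tmp/sft_model_demo")
for name, mtime in [("checkpoint-100", 1000), ("checkpoint-200", 2000)]:
    (demo / name).mkdir(parents=True, exist_ok=True)
    os.utime(demo / name, (mtime, mtime))

print(latest_checkpoint(str(demo)))  # checkpoint-200
```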
```bash
cd model_training

python trainer_rm.py --configs defaults_rm oasst-rm-1-pythia-1b
```
```bash
# choose a specific checkpoint
export REWARD_MODEL=$MODEL_PATH/reward_model/<checkpoint-X>

# or get latest checkpoint
export REWARD_MODEL=$MODEL_PATH/reward_model/$(ls -t $MODEL_PATH/reward_model/ | head -n 1)
```
```bash
cd model_training

python trainer_rl.py --configs defaults_rlhf --cache_dir $DATA_PATH --rank_model $REWARD_MODEL --sft_model $SFT_MODEL --output_dir $MODEL_PATH/rl_model
```
See the `MESSAGE_AND_TOKEN_FORMAT.md` file for information about the message and token pattern we are using.
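As a rough illustration only (the exact special tokens are defined in `MESSAGE_AND_TOKEN_FORMAT.md`, which is authoritative; the tokens below are an assumption based on common Open-Assistant usage), a prompt builder following such a pattern could be sketched as:

```python
# Assumed special tokens -- check MESSAGE_AND_TOKEN_FORMAT.md for the real ones.
PROMPTER = "<|prompter|>"
ASSISTANT = "<|assistant|>"
EOS = "<|endoftext|>"

def build_prompt(user_message: str) -> str:
    """Wrap a user message so the model continues with an assistant reply."""
    return f"{PROMPTER}{user_message}{EOS}{ASSISTANT}"

print(build_prompt("Hello!"))
```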