The number of Ascend accelerators is allocated automatically based on the device_num set in the hccl config file, so you do not need to specify it.
For example, to generate the launch command for distributed training of the CenterNet model on Ascend accelerators, run the following command in the /centernet_det/ directory:
python ./scripts/ascend_distributed_launcher/get_distribute_train_cmd.py --run_script_dir ./train.py --hyper_parameter_config_dir ./scripts/ascend_distributed_launcher/hyper_parameter_config.ini --data_dir /path/dataset/ --mindrecord_dir /path/mindrecord_dataset/ --hccl_config_dir model_zoo/utils/hccl_tools/hccl_2p_56_x.x.x.x.json
output:
hccl_config_dir: model_zoo/utils/hccl_tools/hccl_2p_56_x.x.x.x.json
the number of logical core: 192
avg_core_per_rank: 96
rank_size: 2
start training for rank 0, device 5:
rank_id: 0
device_id: 5
core nums: 0-95
epoch_size: 350
data_dir: /path/dataset/
mindrecord_dir: /path/mindrecord_dataset/
log file dir: ./LOG5/training_log.txt
start training for rank 1, device 6:
rank_id: 1
device_id: 6
core nums: 96-191
epoch_size: 350
data_dir: /path/dataset/
mindrecord_dir: /path/mindrecord_dataset/
log file dir: ./LOG6/training_log.txt
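The core ranges in the output above are consistent with an even split of the logical cores across ranks (192 cores / 2 ranks = 96 cores per rank). The following is a minimal sketch of that arithmetic only; the function name `core_ranges` is hypothetical and is not taken from the launcher script, which may compute the ranges differently.

```python
def core_ranges(total_cores, rank_size):
    """Split logical cores evenly across ranks.

    Returns one 'start-end' string per rank (hypothetical helper,
    written to match the numbers shown in the sample output).
    """
    avg_core_per_rank = total_cores // rank_size
    return [
        f"{rank * avg_core_per_rank}-{(rank + 1) * avg_core_per_rank - 1}"
        for rank in range(rank_size)
    ]


# Values taken from the sample output: 192 logical cores, rank_size 2.
ranges = core_ranges(192, 2)
for rank, cores in enumerate(ranges):
    print(f"rank_id: {rank}, core nums: {cores}")
```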
Note that hccl_2p_56_x.x.x.x.json can be generated with hccl_tools.py.
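As an illustration of how the rank and device assignment can be derived from the hccl config file, here is a rough sketch. It assumes the JSON contains a `server_list` whose entries each hold a `device` list with `device_id` and `rank_id` fields; that layout and the trimmed sample below are assumptions, so check them against the file that hccl_tools.py actually generates.

```python
import json

# Hypothetical, trimmed-down hccl config for devices 5 and 6.
# A real generated file is expected to contain additional fields.
SAMPLE_HCCL_JSON = """
{
  "server_list": [
    {
      "device": [
        {"device_id": "5", "rank_id": "0"},
        {"device_id": "6", "rank_id": "1"}
      ]
    }
  ]
}
"""


def rank_to_device(hccl_config):
    """Return (rank_id, device_id) pairs sorted by rank (assumed layout)."""
    pairs = []
    for server in hccl_config["server_list"]:
        for dev in server["device"]:
            pairs.append((int(dev["rank_id"]), int(dev["device_id"])))
    return sorted(pairs)


pairs = rank_to_device(json.loads(SAMPLE_HCCL_JSON))
rank_size = len(pairs)  # number of devices, so no separate count is passed
for rank_id, device_id in pairs:
    print(f"start training for rank {rank_id}, device {device_id}")
```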
For hyper parameters, customize hyper_parameter_config.ini. Please note that these two hyper parameters are not allowed to be configured here:
For other models, customize the run_script option and the corresponding hyper_parameter_config.ini.
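To make the role of hyper_parameter_config.ini more concrete, the sketch below reads an ini file with Python's standard `configparser` and turns each entry into a command-line option for the training script. The section name `config` and the inline sample text are assumptions for illustration; only `epoch_size` appears in the sample output above, and the real launcher may parse the file differently.

```python
import configparser

# Hypothetical ini content; the section name "config" is an assumption.
INI_TEXT = """
[config]
epoch_size = 350
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# Collect the key/value pairs of the assumed section.
options = dict(parser.items("config"))

# Render each pair as a --key=value option for the run script.
cli_options = " ".join(f"--{key}={value}" for key, value in options.items())
print(cli_options)
```

With a real file, `parser.read(path)` would replace `parser.read_string(INI_TEXT)`.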
centernet_hourglass: object detection.