This project was named an Outstanding Open Source Project of the 2022 OpenI (启智) community.
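A notebook-friendly Ascend distributed training SDK for Jupyter-like interactive environments. We recommend using it to run Python training scripts on AI靶场.
- `davincirun` command: supports ModelArts Ascend training jobs, so the davinci folder no longer needs to be packaged.
- `init_rank_table`: converts a v0.1 hccl json to a v1.0 hccl json.
- `start_distributed_train`, `wait_distributed_train`: start distributed training from the v1.0 hccl json and wait for it to finish.
- `output_notebook=True`: prints the distributed training logs in the notebook.

See the SDK documentation for more.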
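To make the v0.1 to v1.0 conversion concrete, here is a minimal sketch of what such a rank table conversion could look like. It is not the SDK's implementation: the helper name `convert_rank_table_v01_to_v10` is hypothetical, and the field names (`group_list`, `instance_list`, `devices`, `server_list`, `device`, `rank_id`) are assumptions based on commonly seen hccl rank table layouts, so check them against your own files.

```python
import json


def convert_rank_table_v01_to_v10(v01: dict) -> dict:
    """Convert an assumed v0.1-style rank table into an assumed v1.0-style one.

    Hypothetical helper for illustration; field names are assumptions.
    """
    server_list = []
    rank_id = 0  # ranks are numbered consecutively across all servers
    for group in v01.get("group_list", []):
        for instance in group.get("instance_list", []):
            devices = []
            for dev in instance.get("devices", []):
                devices.append({
                    "device_id": dev["device_id"],
                    "device_ip": dev["device_ip"],
                    "rank_id": str(rank_id),
                })
                rank_id += 1
            server_list.append({
                "server_id": instance["server_id"],
                "device": devices,
            })
    return {
        "version": "1.0",
        "server_count": str(len(server_list)),
        "server_list": server_list,
        "status": v01.get("status", "completed"),
    }


# Made-up example input: two servers with two devices each.
example_v01 = {
    "status": "completed",
    "group_list": [{
        "instance_list": [
            {"server_id": "10.0.0.1", "devices": [
                {"device_id": "0", "device_ip": "192.168.1.1"},
                {"device_id": "1", "device_ip": "192.168.1.2"},
            ]},
            {"server_id": "10.0.0.2", "devices": [
                {"device_id": "0", "device_ip": "192.168.2.1"},
                {"device_id": "1", "device_ip": "192.168.2.2"},
            ]},
        ],
    }],
}

converted = convert_rank_table_v01_to_v10(example_v01)
print(json.dumps(converted, indent=2))
```

In practice you would call `init_rank_table()` rather than writing this yourself; the sketch only illustrates the kind of restructuring involved.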
$pip install davincirunsdk
Taking the MindSpore 1.5 distributed training tutorial as an example, the training can be adapted to use this SDK as follows:
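import os
# Set the dataset path in the DATA_PATH environment variable
os.environ['DATA_PATH'] = '/cache/cifar-10-batches-bin'
from davincirunsdk import start_and_wait_distributed_train
# Command to run for the distributed training job
cmd = ['python', 'resnet50_distributed_training.py']
# output_notebook=True prints the distributed training logs in the notebook
start_and_wait_distributed_train(cmd, output_notebook=True)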
The following command is equivalent to `python davincirun.py train.py`:
$davincirun train.py
Or use it from a Python file:
from davincirunsdk import init_rank_table, start_and_wait_distributed_train
init_rank_table()
start_and_wait_distributed_train(['python', 'train.py'])
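For readers curious about the start-then-wait pattern, the following is a rough sketch of launching one process per rank and waiting for all of them, using only the Python standard library. It is an illustration under stated assumptions, not how davincirunsdk is implemented: `launch_ranks` and `wait_ranks` are hypothetical helpers, and passing the rank through `RANK_ID` and `RANK_SIZE` environment variables is an assumption modeled on typical Ascend training scripts.

```python
import os
import subprocess
import sys


def launch_ranks(cmd, rank_size):
    """Start one subprocess per rank and return the Popen handles (hypothetical helper)."""
    procs = []
    for rank in range(rank_size):
        # Each worker gets its own rank via environment variables (assumed convention).
        env = dict(os.environ, RANK_ID=str(rank), RANK_SIZE=str(rank_size))
        procs.append(
            subprocess.Popen(cmd, env=env, stdout=subprocess.PIPE, text=True)
        )
    return procs


def wait_ranks(procs):
    """Wait for every rank to exit; return a list of (returncode, stdout) tuples."""
    results = []
    for proc in procs:
        out, _ = proc.communicate()
        results.append((proc.returncode, out))
    return results


# Stand-in worker: a tiny Python command that prints the rank it was given.
worker = [sys.executable, "-c", "import os; print('rank', os.environ['RANK_ID'])"]
results = wait_ranks(launch_ranks(worker, rank_size=2))
for code, out in results:
    print(code, out.strip())
```

Splitting launch and wait into two steps mirrors the separate `start_distributed_train` and `wait_distributed_train` entry points, which lets a notebook do other work between starting the job and blocking on its completion.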
Same as in the debugging (development) environment; no additional changes are needed.
See the examples for more details.
MIT License
$git clone https://git.openi.org.cn/Wh1isper/davincirunsdk.git
$cd davincirunsdk
$pip install -e ./
$pip install pytest
$pytest .
$pip install Sphinx sphinx-rtd-theme
$make html
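The `notebook` folder contains the davincirun files modified for the notebook runtime environment, as well as the SDK entry point.
The other files in the `davincirunsdk` directory, outside `notebook`, are the original davincirun code, repackaged as a Python package, with moxing support for OBS files enabled on demand.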
The documentation and API features are still being improved; feedback through issues is welcome.
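Thanks to Huawei Cloud, Peng Cheng Laboratory, and AI靶场 for their strong support and help with this project. The project has been contributed to AI靶场.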
Why not encourage us with a Star 🌟 here!
🌟🌟🌟Github 🌟🌟🌟
🌟🌟🌟OpenI 🌟🌟🌟