#4237 GPU and GCU training tasks: support multiple model files, to be changed together with the new interface from the training task refactor

Closed
created 10 months ago by liuzx · 10 comments
liuzx self-assigned this 10 months ago
liuzx added this to the V20230628 milestone 9 months ago
chenshihai was assigned by liuzx 9 months ago
liuzx modified the milestone from V20230628 to V20230718 9 months ago
liuzx added the test label 8 months ago
liuzx added this to the ai_task_refactor_v3_train branch 8 months ago
wangj commented 8 months ago
Owner
Waiting for the example code repository to be updated with the pretrain code.
wangj added the wait label 8 months ago
liuzx commented 8 months ago
Poster
For the test script, see https://openi.pcl.ac.cn/OpenIOSSG/MNIST_PytorchExample_GPU/src/branch/test . Note that the name of the model file you select must match the model file used in the code.
liuzx commented 8 months ago
Poster
![image]() For example, for this code the model file to select in the UI is mnist_epoch5.pkl
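The name-matching requirement above can be checked defensively in a training script. The following is a minimal sketch, not the example repository's actual code: the helper name `resolve_model_file` and the idea that the selected model files land in a single local directory are assumptions made for illustration; only the file name mnist_epoch5.pkl comes from this thread.

```python
import os


def resolve_model_file(model_dir, expected_name="mnist_epoch5.pkl"):
    """Return the path of expected_name inside model_dir.

    Hypothetical helper: model_dir stands for wherever the selected
    model files are placed for the task (an assumption, not a platform API).
    """
    available = sorted(os.listdir(model_dir))
    if expected_name not in available:
        # Fail early and list what was actually selected, so a mismatch
        # between the UI selection and the code is easy to spot.
        raise FileNotFoundError(
            f"{expected_name!r} not found in {model_dir!r}; "
            f"available files: {available}"
        )
    return os.path.join(model_dir, expected_name)
```

Failing early with the list of available files makes it clear whether the wrong file was selected in the UI or the code expects a different name.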
wangj commented 8 months ago
Owner
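The GPU example code repository runs successfully. One observation: when multiple tasks are run with the same model file selected, the number printed in the log line “加载 epoch {} 权重成功!” ("Successfully loaded epoch {} weights!") is different each time. The GCU example code fails to run; the task name is wjtes20230713-c2-gcu-pretrain. The NPU example code runs successfully.
liuzx removed the wait label 8 months ago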
liuzx commented 8 months ago
Poster
What needs to be updated in the NPU example code? The GCU and GPU ones have already been updated.
wangj commented 8 months ago
Owner
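> What needs to be updated in the NPU example code? The GCU and GPU ones have already been updated.

The NPU example code runs successfully; I have updated my comment.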
wangj commented 8 months ago
Owner
Found an issue: #4504
wangj added the enhancement label 8 months ago
wangj removed the test label 8 months ago
wangj commented 8 months ago
Owner
Moved to the next milestone; this will go live later together with #4271, once the design has been optimized.
wangj modified the milestone from V20230718 to V20230808 8 months ago
tanglj commented 8 months ago
Collaborator
#4553 (support selecting an entire model when creating a Cloud Brain task): make the changes with reference to that issue.
wangj modified the milestone from V20230808 to V20230828 7 months ago
tanglj modified the milestone from V20230828 to V20230912 7 months ago
wangj modified the milestone from V20230912 to V20231018 6 months ago
wangj modified the milestone from V20231018 to V20231102 5 months ago
tanglj commented 5 months ago
Collaborator
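The plan has been changed to selecting an entire model, so selecting multiple files within a model is no longer needed. Closing this task.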
tanglj closed this issue 5 months ago
wangj added the invalid label 4 months ago