#4552 Move the OpenI (启智) cluster's A100s into Cloudbrain 1 (云脑1) of the intelligent computing (智算) cluster

Closed
created 9 months ago by tanglj · 7 comments
tanglj commented 9 months ago
Support GPU debugging, training, and online inference in the intelligent computing cluster by moving the OpenI cluster's A100s into it.
tanglj added the enhancement label 9 months ago
tanglj added this to the V20230808 milestone 9 months ago
wangj was assigned by tanglj 9 months ago
lewis was assigned by tanglj 9 months ago
avadesian commented 8 months ago
Owner
No run summary information is currently available.
avadesian modified the milestone from V20230808 to V20230828 8 months ago
tanglj modified the milestone from V20230828 to V20230912 8 months ago
wangj commented 7 months ago
Owner
Issues found: #4683, #4684
wangj modified the milestone from V20230912 to V20231018 7 months ago
wangj modified the milestone from V20231018 to V20231102 6 months ago
wangj modified the milestone from V20231102 to V20231120 5 months ago
wangj commented 5 months ago
Owner
Issues found: #4869, log latency, whose fix depends on Cloudbrain 1; #4879, sub-center names are displayed inconsistently.
wangj commented 5 months ago
Owner
> Issues found: #4683, #4684

Fixed.
wangj commented 5 months ago
Owner
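> Issues found:
> #4869, log latency, whose fix depends on Cloudbrain 1.
> #4879, sub-center names are displayed inconsistently.

The second issue can be resolved by changing the app.ini configuration.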
wangj added the test label 5 months ago
wangj commented 5 months ago
Owner
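The feature passed testing. Outstanding issue: #4869

Verified scenarios:
1. Create a debug task.
2. Debug again.
3. Commit an image. (The IP and port in the address of an image committed from the new Cloudbrain 1 differ from those of an image committed from the Pengcheng Cloud Computing Institute (鹏城云计算所). This does not affect use: the image works in both intelligent computing GPU sub-centers.)
4. Create a training task.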
wangj commented 5 months ago
Owner
@tanglj The resource needs to be listed from the admin console, and the app.ini configuration needs to be updated.
tanglj was assigned by wangj 5 months ago
wangj closed this issue 5 months ago