OPENI-OCTOPUS is a cluster management tool and resource scheduling platform jointly designed and developed by Peking University, Xi'an Jiaotong University, Zhejiang University, and the University of Science and Technology of China, and maintained by Pengcheng Laboratory, Peking University, the University of Science and Technology of China, and AITISA. The platform incorporates mature designs that have performed well in large-scale production environments, and is mainly intended to make academic research more efficient and to help reproduce academic research results.
OPENI is completely open: it is released under the Open-Intelligence license. OPENI is architected in a modular way: different modules can be plugged in as appropriate. This makes OPENI particularly attractive for evaluating various research ideas, which include but are not limited to the following components:
OPENI operates in an open model: contributions from academia and industry are all highly welcome.
The system runs on a cluster of machines, each equipped with one or more GPUs.
Each machine in the cluster runs Ubuntu 18.04 LTS and has a statically assigned IP address.
To deploy services, the system further relies on a Docker registry service (e.g., Docker Hub)
to store the Docker images for the services to be deployed.
The system also requires a dev machine that runs in the same environment and has full access to the cluster.
Finally, the system needs an NTP service for clock synchronization.
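As a rough illustration of the requirements above, the following Python sketch checks two of them before deployment: that a node reports Ubuntu 18.04 in its `/etc/os-release` content, and that the addresses planned for the cluster are well-formed IP addresses. The function names and the sample addresses are hypothetical and are not part of the OPENI tooling; checking GPU, Docker registry, and NTP availability is left out.

```python
import ipaddress


def parse_os_release(text):
    """Parse KEY=value lines (the /etc/os-release format) into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"')
    return info


def is_ubuntu_1804(os_release_text):
    """Return True if the os-release content describes Ubuntu 18.04."""
    info = parse_os_release(os_release_text)
    return info.get("ID") == "ubuntu" and info.get("VERSION_ID") == "18.04"


def invalid_addresses(addresses):
    """Return the entries that are not syntactically valid IP addresses."""
    bad = []
    for addr in addresses:
        try:
            ipaddress.ip_address(addr)
        except ValueError:
            bad.append(addr)
    return bad


if __name__ == "__main__":
    # Hypothetical sample data; on a real node, read /etc/os-release instead.
    sample = 'NAME="Ubuntu"\nVERSION_ID="18.04"\nID=ubuntu\n'
    print("Ubuntu 18.04:", is_ubuntu_1804(sample))
    print("Invalid addresses:", invalid_addresses(["10.0.0.11", "node-3"]))
```

This only validates address syntax; it cannot tell whether an address is actually statically assigned, which still has to be confirmed in each machine's network configuration.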
Deploying and using the system consists of the following steps.
After the system services have been deployed, users can access the web portal, a Web UI for cluster management and job management.
Please refer to this tutorial for details about job submission.
The web portal also provides a Web UI for cluster management.
The system architecture is illustrated above.
Users submit jobs or monitor cluster status through the Web Portal,
which calls APIs provided by the REST server.
Third-party tools can also call the REST server directly for job management.
Upon receiving API calls, the REST server coordinates with the k8s ApiServer, and the k8s Scheduler schedules the job onto a k8s node with the required CPU, GPU, and other resources.
The TaskSetController monitors the job life cycle in the k8s cluster.
The REST server retrieves the status of jobs from the k8s ApiServer, and that status is displayed on the Web Portal.
Other types of CPU-based AI workloads or traditional big data jobs
can also run on the platform, coexisting with the GPU-based jobs.
The storage of training data and results can be customized according to platform/equipment requirements.
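To make the role of the REST server more concrete, here is a minimal Python sketch of how a third-party tool might prepare job-management requests against it. Everything specific in it is an assumption for illustration only: the server address, the `/api/v1/jobs` route, the bearer-token header, and the shape of the job specification are hypothetical and are not taken from the REST server's documentation. The sketch only builds the requests; it does not send them.

```python
import json
import urllib.parse
import urllib.request

# Assumption: placeholder address of the REST server, not a documented default.
REST_SERVER = "http://rest-server.example"
# Assumption: hypothetical route; consult the REST server docs for the real one.
JOBS_PATH = "/api/v1/jobs"


def _auth_headers(token):
    # Assumption: bearer-token authentication, chosen only for illustration.
    return {"Authorization": "Bearer " + token}


def build_submit_request(job_spec, token):
    """Build (but do not send) a POST request that submits a job."""
    headers = _auth_headers(token)
    headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        REST_SERVER + JOBS_PATH,
        data=json.dumps(job_spec).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def build_status_request(job_name, token):
    """Build (but do not send) a GET request for one job's status."""
    quoted = urllib.parse.quote(job_name, safe="")
    return urllib.request.Request(
        REST_SERVER + JOBS_PATH + "/" + quoted,
        headers=_auth_headers(token),
        method="GET",
    )


if __name__ == "__main__":
    # Hypothetical job specification; the real schema may differ.
    spec = {"name": "mnist-demo", "gpu": 1}
    req = build_submit_request(spec, token="example-token")
    print(req.get_method(), req.full_url)
```

A real client would pass such a request to `urllib.request.urlopen` and decode the JSON response, which mirrors what the Web Portal does on the user's behalf.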
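@Deprecated This repository is deprecated; please move to https://git.openi.org.cn/OpenI/octopus.
The OPENI-OCTOPUS project is a cluster management and resource scheduling system that supports running AI jobs (such as deep learning jobs) on GPU clusters. The platform provides a set of interfaces that support mainstream deep learning frameworks.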