Pengcheng CloudBrain I Datasets
This repository contains datasets from Pengcheng CloudBrain I. For privacy and data security reasons, we have open-sourced about six months of data, covering basic job information and node-related information.
Detail Information
Job Information
The file "jobs_info.csv" contains basic information about jobs. It has the following fields.
- Task Name: Consists of userid, jobid, and taskid. Tasks with the same jobid are submitted together. The scheduler treats a job with more than one task as a gang-scheduled job, so all tasks in the job are scheduled at the same time. To protect user privacy, the user information is hashed.
- Request GPU: Integer.
- Request CPU: Integer.
- Request Host Memory: Floating number in MB.
- Submit Time: Task submit timestamp. To protect user privacy, an offset is added to all timestamps, including the Start Time and End Time below.
- Start Time: Task start timestamp.
- End Time: Task finish timestamp.
- Job Type: The resource queue where the task is located.
- Scheduled Node: Resource nodes on which the tasks are scheduled.
- GPU Utilization: Floating Number in percentage. The average GPU utilization during the task lifecycle.
- GPU Memory: Floating number in MB. The average GPU memory usage during the task lifecycle.
- CPU Usage: Floating number in percentage.
- Host Memory Average: Floating number in MB.
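As a minimal sketch of how these fields can be combined, the Python snippet below groups tasks into jobs, flags gang-scheduled jobs, and computes queueing delay. It rests on several assumptions that should be checked against the real file: the separator inside Task Name (`-` here), the dictionary keys (`task_name`, `request_gpu`, and so on), numeric timestamps, and a single offset shared by all timestamps. If the offset is indeed shared, differences such as Start Time minus Submit Time are unaffected by it. The sample rows are made up for illustration.

```python
from collections import defaultdict


def split_task_name(task_name, sep="-"):
    """Split a task name into (userid, jobid, taskid).

    The separator is an assumption; inspect jobs_info.csv to confirm it.
    """
    userid, jobid, taskid = task_name.split(sep, 2)
    return userid, jobid, taskid


def summarize_jobs(rows, sep="-"):
    """Group task rows by (userid, jobid) and summarize each job."""
    jobs = defaultdict(list)
    for row in rows:
        userid, jobid, _ = split_task_name(row["task_name"], sep)
        jobs[(userid, jobid)].append(row)

    summary = {}
    for key, tasks in jobs.items():
        summary[key] = {
            "num_tasks": len(tasks),
            # More than one task in a job means gang scheduling.
            "is_gang": len(tasks) > 1,
            "total_gpu": sum(t["request_gpu"] for t in tasks),
            # A shared timestamp offset cancels out in this difference.
            "wait": min(t["start_time"] for t in tasks)
            - min(t["submit_time"] for t in tasks),
        }
    return summary


# Made-up rows with hypothetical keys, standing in for parsed CSV records.
rows = [
    {"task_name": "u1-j1-t0", "request_gpu": 4, "submit_time": 100, "start_time": 130},
    {"task_name": "u1-j1-t1", "request_gpu": 4, "submit_time": 100, "start_time": 130},
    {"task_name": "u2-j7-t0", "request_gpu": 1, "submit_time": 200, "start_time": 205},
]
summary = summarize_jobs(rows)
```

Keying jobs by both userid and jobid is a defensive choice in case jobid values are not unique across users.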
Node Information
The file "nodes.csv" contains node information, mainly about the GPUs. It has the following fields.
- NodeID: Identifier of the node.
- GPU Number: Number of GPUs equipped in this node.
- GPU Memory Capacity: Floating number in MB. The GPU memory capacity of a GPU card.
Notes:
Node information was collected on a specific day, so "nodes.csv" may not cover every node on which a job or task was scheduled. However, node prefixes were kept consistent when the node information was hashed, so information about a missing node can be roughly inferred from nodes that share the same prefix.
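The prefix-based inference described above could be sketched as follows. This is only an illustration: the node IDs, the shape of the prefix, and the `min_prefix` threshold are all hypothetical, since the exact prefix format is not specified here. The function looks up a node directly when it is known and otherwise falls back to the known node sharing the longest leading substring.

```python
import os


def common_prefix_len(a, b):
    """Length of the longest common leading substring of two strings."""
    return len(os.path.commonprefix([a, b]))


def infer_node_info(node_id, known_nodes, min_prefix=1):
    """Return (gpu_number, gpu_memory_mb) for node_id.

    known_nodes maps NodeID -> (gpu_number, gpu_memory_mb), e.g. as
    loaded from nodes.csv. If node_id is missing, fall back to the
    known node with the longest shared prefix, provided that prefix is
    at least min_prefix characters long (an assumed threshold).
    """
    if node_id in known_nodes:
        return known_nodes[node_id]

    best_id, best_len = None, 0
    for known_id in known_nodes:
        n = common_prefix_len(node_id, known_id)
        if n > best_len:
            best_id, best_len = known_id, n

    if best_id is None or best_len < min_prefix:
        return None  # No sufficiently similar node; do not guess.
    return known_nodes[best_id]


# Hypothetical node IDs, for illustration only.
known = {
    "gpuA-3f9": (8, 32768.0),
    "gpuB-71c": (4, 16384.0),
}
guess = infer_node_info("gpuA-a02", known, min_prefix=4)
no_match = infer_node_info("cpu-001", known, min_prefix=4)
```

Returning `None` when no prefix is long enough keeps the inference conservative; a suitable threshold depends on how the real node IDs are structured.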
If you want to use the detailed resource usage information, please read DATA_INFORMATION.MD first.
More data will be released once collation is complete.