cftang created GCU type debugging task cftan202401042072650(deleted)
3 months ago
cftang created repository cftang/GCU_PaddlePaddle_ModelZoo
3 months ago
cftang created CPU/GPU type debugging task cftan202310281784281(deleted)
5 months ago
cftang created repository cftang/openi-notebook
5 months ago
cftang commented on issue zeizei/OpenI_Learning#1015
GCU leaderboard warm-up event (4.10~4.16): complete tasks to earn 20-40 points

Task 1 (PyTorch): bert_base

| Scenario | Cards | epochs | train_batch_size | predict_batch_size | train_fps_mean | exact_match | f1 | Linearity |
|---|---|---|---|---|---|---|---|---|
| a | 1 | 1 | 48 | 48 | 67.94706955 | 79.61210974 | 87.08313104 | |
| b | 1 | 10 | 48 | 48 | 68.29758169 | 79.02554399 | 87.23861865 | |
| c | 8 | 1 | 48 | 48 | 64.43498743 | 78.94985809 | 86.47563451 | 0.9483115 |
| d | 8 | 10 | 48 | 48 | 63.775345 | 78.11731315 | 86.64708563 | 0.933786284 |
| e | 8 | 100 | 24 | 24 | 23.92116933 | 77.61589404 | 85.68731233 | |

Linearity from scenarios c/a = 0.9483115 (64.43498743/67.94706955)
Linearity from scenarios d/b = 0.933786284 (63.775345/68.29758169)

Issue: on a single card with 100 epochs and train_batch_size=96, the run fails immediately with "line 46: 42 Illegal instruction (core dumped)". The card may have run out of memory; it would be better if a meaningful error message were returned.

./TopsRider_t2x_2.1.52_samples/samples/model/torch/single_card/run_pytorch_bert_base_convergence_test.sh: line 46: 42 Illegal instruction (core dumped) python3 -u ./run_squad.py --device=dtu --do_train --do_predict --do_eval --train_batch_size=96 --predict_batch_size=96 --learning_rate=3e-5 --num_train_epochs=100 --max_steps=-1 --max_seq_length=384 --doc_stride=128 --do_lower_case --bert_model=bert-base-uncased --print_freq=20 --skip_steps=5 --init_checkpoint=${DATASET_DIR}/pytorch_bert_base/bert_base_init/bert_base.pt --train_file=${DATASET_DIR}/pytorch_bert_base/squad/v1.1/train-v1.1.json --predict_file=${DATASET_DIR}/pytorch_bert_base/squad/v1.1/dev-v1.1.json --vocab_file=${DATASET_DIR}/pytorch_bert_base/bert_base_init/vocab.txt --config_file=${DATASET_DIR}/pytorch_bert_base/bert_base_init/bert_config.json --eval_script=${DATASET_DIR}/pytorch_bert_base/squad/v1.1/evaluate-v1.1.py --output_dir=./output > ${LOG_FILE} 2>&1
1 year ago
cftang commented on issue zeizei/OpenI_Learning#1015
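GCU leaderboard warm-up event (4.10~4.16): complete tasks to earn 20-40 points

Task 2: run a model end to end on PaddlePaddle + GCU and measure GCU performance (resnet50)

1. At least one model supported on a single GCU card or on 8 cards
1.1 Single card, epoch=1 ![image](/attachments/5e747962-eead-422f-825b-283bc1297fda)
1.2 8 cards, epoch=1 ![image](/attachments/c80f4aec-945d-43d6-a10e-07a22a90ec6c)
2. Single-card vs. 8-card GCU linearity: 8-card FPS/(single-card FPS*8)
8 cards: "train_fps_mean": 140.81122839865833
Single card: "train_fps_mean": 174.87976037517075
Linearity = 0.80518882286077608658489404384979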
1 year ago
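The linearity figures quoted in the two comments above can be reproduced with a few lines of Python. This is only a sketch: the helper name `linearity` is hypothetical, and it assumes that `train_fps_mean` is already a per-card mean, so that dividing the 8-card value by the single-card value is equivalent to 8-card FPS/(single-card FPS*8). The comments do not state this explicitly, but it is the only reading under which the reported numbers match.

```python
def linearity(multi_card_fps_mean: float, single_card_fps_mean: float) -> float:
    """Scaling efficiency, assuming both inputs are per-card mean FPS values."""
    if single_card_fps_mean <= 0:
        raise ValueError("single-card FPS must be positive")
    return multi_card_fps_mean / single_card_fps_mean


# train_fps_mean values copied from the comments above.
bert_base_fps = {
    "a": 67.94706955,  # 1 card, 1 epoch
    "b": 68.29758169,  # 1 card, 10 epochs
    "c": 64.43498743,  # 8 cards, 1 epoch
    "d": 63.775345,    # 8 cards, 10 epochs
}
resnet50_fps = {"single": 174.87976037517075, "eight": 140.81122839865833}

bert_c_over_a = linearity(bert_base_fps["c"], bert_base_fps["a"])
bert_d_over_b = linearity(bert_base_fps["d"], bert_base_fps["b"])
resnet50_linearity = linearity(resnet50_fps["eight"], resnet50_fps["single"])

print(f"bert_base c/a: {bert_c_over_a:.7f}")
print(f"bert_base d/b: {bert_d_over_b:.9f}")
print(f"resnet50 8-card/single-card: {resnet50_linearity:.6f}")
```

If `train_fps_mean` were instead the aggregate throughput across all cards, the 8-card value would have to be divided by 8 before taking the ratio.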
cftang created CPU/GPU type debugging task cftan202208141045914(deleted)
1 year ago
cftang created CPU/GPU type debugging task cftan202208112022761(deleted)
1 year ago
cftang created CPU/GPU type debugging task cftan202208112022609(deleted)
1 year ago
cftang created repository cftang/Paddle
1 year ago
cftang created CPU/GPU type debugging task cftan202208071351994(deleted)
1 year ago
cftang created repository cftang/modelbox
1 year ago
cftang created reasoning task mnist_inference
1 year ago
cftang created NPU training task cftan202207021132873
1 year ago
cftang created NPU training task cftan202207021132242
1 year ago
cftang created repository cftang/MNIST_Example_Mindspore_NPU
1 year ago