|
- ***********************data_url**************************
- 2022-09-23 00:16:23,234:INFO:Args:
- 2022-09-23 00:16:23,234:INFO:--> backbone: yolox_darknet53
- 2022-09-23 00:16:23,234:INFO:--> data_aug: True
- 2022-09-23 00:16:23,234:INFO:--> device_target: GPU
- 2022-09-23 00:16:23,234:INFO:--> outputs_dir: ./2022-09-23_time_00_16_23
- 2022-09-23 00:16:23,234:INFO:--> save_graphs: False
- 2022-09-23 00:16:23,234:INFO:--> lr_scheduler: yolox_warm_cos_lr
- 2022-09-23 00:16:23,234:INFO:--> max_epoch: 15
- 2022-09-23 00:16:23,234:INFO:--> total_epoch: 30
- 2022-09-23 00:16:23,234:INFO:--> data_dir: /dataset/coco2017
- 2022-09-23 00:16:23,234:INFO:--> yolox_no_aug_ckpt:
- 2022-09-23 00:16:23,235:INFO:--> need_profiler: 0
- 2022-09-23 00:16:23,235:INFO:--> pretrained:
- 2022-09-23 00:16:23,235:INFO:--> resume_yolox:
- 2022-09-23 00:16:23,235:INFO:--> flip_prob: 0.5
- 2022-09-23 00:16:23,235:INFO:--> hsv_prob: 1.0
- 2022-09-23 00:16:23,235:INFO:--> per_batch_size: 8
- 2022-09-23 00:16:23,235:INFO:--> depth_wise: False
- 2022-09-23 00:16:23,235:INFO:--> max_gt: 48
- 2022-09-23 00:16:23,235:INFO:--> num_classes: 80
- 2022-09-23 00:16:23,235:INFO:--> input_size: [640, 640]
- 2022-09-23 00:16:23,235:INFO:--> fpn_strides: [8, 16, 32]
- 2022-09-23 00:16:23,235:INFO:--> use_l1: False
- 2022-09-23 00:16:23,235:INFO:--> use_syc_bn: True
- 2022-09-23 00:16:23,235:INFO:--> n_candidate_k: 10
- 2022-09-23 00:16:23,235:INFO:--> lr: 0.01
- 2022-09-23 00:16:23,235:INFO:--> min_lr_ratio: 0.01
- 2022-09-23 00:16:23,235:INFO:--> warmup_epochs: 5
- 2022-09-23 00:16:23,235:INFO:--> weight_decay: 0.0005
- 2022-09-23 00:16:23,236:INFO:--> momentum: 0.9
- 2022-09-23 00:16:23,236:INFO:--> no_aug_epochs: 15
- 2022-09-23 00:16:23,236:INFO:--> log_interval: 56
- 2022-09-23 00:16:23,236:INFO:--> ckpt_interval: -1
- 2022-09-23 00:16:23,236:INFO:--> is_save_on_master: 1
- 2022-09-23 00:16:23,236:INFO:--> is_distributed: 0
- 2022-09-23 00:16:23,236:INFO:--> rank: 0
- 2022-09-23 00:16:23,236:INFO:--> group_size: 1
- 2022-09-23 00:16:23,236:INFO:--> bind_cpu: True
- 2022-09-23 00:16:23,236:INFO:--> device_num: 8
- 2022-09-23 00:16:23,236:INFO:--> enable_modelarts: False
- 2022-09-23 00:16:23,236:INFO:--> need_modelarts_dataset_unzip: True
- 2022-09-23 00:16:23,236:INFO:--> modelarts_dataset_unzip_name: coco2017
- 2022-09-23 00:16:23,236:INFO:--> data_url:
- 2022-09-23 00:16:23,236:INFO:--> train_url:
- 2022-09-23 00:16:23,236:INFO:--> checkpoint_url:
- 2022-09-23 00:16:23,236:INFO:--> data_path: /home/work/user-job-dir/inputs/data/
- 2022-09-23 00:16:23,236:INFO:--> output_path: /home/work/user-job-dir/outputs/model/
- 2022-09-23 00:16:23,237:INFO:--> load_path: /cache/checkpoint_path
- 2022-09-23 00:16:23,237:INFO:--> ckpt_path: ./
- 2022-09-23 00:16:23,237:INFO:--> log_path: val/outputs/
- 2022-09-23 00:16:23,237:INFO:--> val_ckpt: 0-2755_64.ckpt
- 2022-09-23 00:16:23,237:INFO:--> conf_thre: 0.001
- 2022-09-23 00:16:23,237:INFO:--> nms_thre: 0.65
- 2022-09-23 00:16:23,237:INFO:--> result_path:
- 2022-09-23 00:16:23,237:INFO:--> file_format: MINDIR
- 2022-09-23 00:16:23,237:INFO:--> export_bs: 1
- 2022-09-23 00:16:23,237:INFO:--> interval: 5
- 2022-09-23 00:16:23,237:INFO:--> run_eval: True
- 2022-09-23 00:16:23,237:INFO:--> start_epoch: 0
- 2022-09-23 00:16:23,237:INFO:--> end_epoch: 30
- 2022-09-23 00:16:23,237:INFO:--> ema: True
- 2022-09-23 00:16:23,237:INFO:--> config_path: /code/model_utils/../default_config.yaml
- 2022-09-23 00:16:23,237:INFO:--> data_root: /dataset/coco2017/train2017
- 2022-09-23 00:16:23,237:INFO:--> annFile: /dataset/coco2017/annotations/instances_train2017.json
- 2022-09-23 00:16:23,238:INFO:--> logger: <LOGGER yolox (NOTSET)>
- 2022-09-23 00:16:23,238:INFO:
- 2022-09-23 00:16:23,238:INFO:Args:
- 2022-09-23 00:16:23,238:INFO:--> backbone: yolox_darknet53
- 2022-09-23 00:16:23,238:INFO:--> data_aug: True
- 2022-09-23 00:16:23,238:INFO:--> device_target: GPU
- 2022-09-23 00:16:23,238:INFO:--> outputs_dir: ./2022-09-23_time_00_16_23
- 2022-09-23 00:16:23,238:INFO:--> save_graphs: False
- 2022-09-23 00:16:23,239:INFO:--> lr_scheduler: yolox_warm_cos_lr
- 2022-09-23 00:16:23,239:INFO:--> max_epoch: 15
- 2022-09-23 00:16:23,239:INFO:--> total_epoch: 30
- 2022-09-23 00:16:23,239:INFO:--> data_dir: /dataset/coco2017
- 2022-09-23 00:16:23,239:INFO:--> yolox_no_aug_ckpt:
- 2022-09-23 00:16:23,239:INFO:--> need_profiler: 0
- 2022-09-23 00:16:23,239:INFO:--> pretrained:
- 2022-09-23 00:16:23,239:INFO:--> resume_yolox:
- 2022-09-23 00:16:23,239:INFO:--> flip_prob: 0.5
- 2022-09-23 00:16:23,239:INFO:--> hsv_prob: 1.0
- 2022-09-23 00:16:23,239:INFO:--> per_batch_size: 8
- 2022-09-23 00:16:23,239:INFO:--> depth_wise: False
- 2022-09-23 00:16:23,239:INFO:--> max_gt: 48
- 2022-09-23 00:16:23,239:INFO:--> num_classes: 80
- 2022-09-23 00:16:23,239:INFO:--> input_size: [640, 640]
- 2022-09-23 00:16:23,239:INFO:--> fpn_strides: [8, 16, 32]
- 2022-09-23 00:16:23,239:INFO:--> use_l1: False
- 2022-09-23 00:16:23,240:INFO:--> use_syc_bn: False
- 2022-09-23 00:16:23,240:INFO:--> n_candidate_k: 10
- 2022-09-23 00:16:23,240:INFO:--> lr: 0.01
- 2022-09-23 00:16:23,240:INFO:--> min_lr_ratio: 0.01
- 2022-09-23 00:16:23,240:INFO:--> warmup_epochs: 5
- 2022-09-23 00:16:23,240:INFO:--> weight_decay: 0.0005
- 2022-09-23 00:16:23,240:INFO:--> momentum: 0.9
- 2022-09-23 00:16:23,240:INFO:--> no_aug_epochs: 15
- 2022-09-23 00:16:23,240:INFO:--> log_interval: 56
- 2022-09-23 00:16:23,240:INFO:--> ckpt_interval: -1
- 2022-09-23 00:16:23,240:INFO:--> is_save_on_master: 1
- 2022-09-23 00:16:23,240:INFO:--> is_distributed: 0
- 2022-09-23 00:16:23,240:INFO:--> rank: 0
- 2022-09-23 00:16:23,240:INFO:--> group_size: 1
- 2022-09-23 00:16:23,240:INFO:--> bind_cpu: True
- 2022-09-23 00:16:23,240:INFO:--> device_num: 8
- 2022-09-23 00:16:23,240:INFO:--> enable_modelarts: False
- 2022-09-23 00:16:23,240:INFO:--> need_modelarts_dataset_unzip: True
- 2022-09-23 00:16:23,241:INFO:--> modelarts_dataset_unzip_name: coco2017
- 2022-09-23 00:16:23,241:INFO:--> data_url:
- 2022-09-23 00:16:23,241:INFO:--> train_url:
- 2022-09-23 00:16:23,241:INFO:--> checkpoint_url:
- 2022-09-23 00:16:23,241:INFO:--> data_path: /home/work/user-job-dir/inputs/data/
- 2022-09-23 00:16:23,241:INFO:--> output_path: /home/work/user-job-dir/outputs/model/
- 2022-09-23 00:16:23,241:INFO:--> load_path: /cache/checkpoint_path
- 2022-09-23 00:16:23,241:INFO:--> ckpt_path: ./
- 2022-09-23 00:16:23,241:INFO:--> log_path: val/outputs/
- 2022-09-23 00:16:23,241:INFO:--> val_ckpt: 0-2755_64.ckpt
- 2022-09-23 00:16:23,241:INFO:--> conf_thre: 0.001
- 2022-09-23 00:16:23,241:INFO:--> nms_thre: 0.65
- 2022-09-23 00:16:23,241:INFO:--> result_path:
- 2022-09-23 00:16:23,241:INFO:--> file_format: MINDIR
- 2022-09-23 00:16:23,241:INFO:--> export_bs: 1
- 2022-09-23 00:16:23,241:INFO:--> interval: 5
- 2022-09-23 00:16:23,241:INFO:--> run_eval: True
- 2022-09-23 00:16:23,242:INFO:--> start_epoch: 0
- 2022-09-23 00:16:23,242:INFO:--> end_epoch: 30
- 2022-09-23 00:16:23,242:INFO:--> ema: True
- 2022-09-23 00:16:23,242:INFO:--> config_path: /code/model_utils/../default_config.yaml
- 2022-09-23 00:16:23,242:INFO:--> data_root: /dataset/coco2017/train2017
- 2022-09-23 00:16:23,242:INFO:--> annFile: /dataset/coco2017/annotations/instances_train2017.json
- 2022-09-23 00:16:23,242:INFO:--> logger: <LOGGER yolox (NOTSET)>
- 2022-09-23 00:16:23,242:INFO:--> rank_save_ckpt_flag: 1
- 2022-09-23 00:16:23,242:INFO:
- 2022-09-23 00:16:26,289:INFO:Training backbone is: yolox_darknet53
- 2022-09-23 00:16:27,590:INFO:Network weights have been initialized...
- 2022-09-23 00:16:27,596:INFO:Finish getting network...
- 2:config.data_root /dataset/coco2017/train2017
- 2 config.annFile /dataset/coco2017/annotations/instances_train2017.json
- before ds: /dataset/coco2017/train2017
- before ds: /dataset/coco2017/annotations/instances_train2017.json
- loading annotations into memory...
- Done (t=14.76s)
- creating index...
- index created!
- loading annotations into memory...
- Done (t=0.46s)
- creating index...
- index created!
- 2022-09-23 00:16:43,981:INFO:Finish loading the val dataset!
- 2022-09-23 00:16:43,981:INFO:Finish loading training dataset! batch size:8
- 2022-09-23 00:16:46,097:INFO:14785 steps for one epoch.
- 2022-09-23 00:16:46,310:INFO:Learning rate scheduler:yolox_warm_cos_lr, base_lr:0.01, min lr ratio:0.01
- 2022-09-23 00:16:48,792:INFO:Add ema model
- loading annotations into memory...
- Done (t=0.45s)
- creating index...
- index created!
- =============================self.save_path ./2022-09-23_time_00_16_23/ckpt_0/
- 2022-09-23 00:16:49,278:INFO:Epoch number:15
- 2022-09-23 00:16:49,278:INFO:All steps number:221775
- 2022-09-23 00:16:49,278:INFO:==================Start Training=========================
- 2022-09-23 00:27:21,743:INFO:epoch: 0 step: [0/14785], loss: 16.5536, overflow: False, scale: 4194304, lr: 0.000000, time: 11302.65
- 2022-09-23 00:27:55,111:INFO:epoch: 0 step: [56/14785], loss: 16.6861, overflow: False, scale: 4194304, lr: 0.000000, time: 595.77
- 2022-09-23 00:28:28,714:INFO:epoch: 0 step: [112/14785], loss: 16.8601, overflow: False, scale: 4194304, lr: 0.000000, time: 600.00
- 2022-09-23 00:29:02,414:INFO:epoch: 0 step: [168/14785], loss: 15.1920, overflow: False, scale: 4194304, lr: 0.000000, time: 601.75
- 2022-09-23 00:29:36,009:INFO:epoch: 0 step: [224/14785], loss: 17.3856, overflow: False, scale: 4194304, lr: 0.000000, time: 599.88
- 2022-09-23 00:30:09,604:INFO:epoch: 0 step: [280/14785], loss: 17.1606, overflow: False, scale: 4194304, lr: 0.000000, time: 599.87
- 2022-09-23 00:30:43,212:INFO:epoch: 0 step: [336/14785], loss: 16.8648, overflow: False, scale: 4194304, lr: 0.000000, time: 600.10
- 2022-09-23 00:31:16,809:INFO:epoch: 0 step: [392/14785], loss: 14.8211, overflow: False, scale: 4194304, lr: 0.000000, time: 599.91
- 2022-09-23 00:31:50,094:INFO:epoch: 0 step: [448/14785], loss: 17.5050, overflow: False, scale: 4194304, lr: 0.000000, time: 594.33
- 2022-09-23 00:32:23,498:INFO:epoch: 0 step: [504/14785], loss: 15.1142, overflow: False, scale: 4194304, lr: 0.000000, time: 596.47
- 2022-09-23 00:32:56,907:INFO:epoch: 0 step: [560/14785], loss: 15.8946, overflow: False, scale: 4194304, lr: 0.000001, time: 596.56
- 2022-09-23 00:33:30,105:INFO:epoch: 0 step: [616/14785], loss: 15.6947, overflow: False, scale: 4194304, lr: 0.000001, time: 592.77
- 2022-09-23 00:34:03,323:INFO:epoch: 0 step: [672/14785], loss: 14.8325, overflow: False, scale: 4194304, lr: 0.000001, time: 593.16
- 2022-09-23 00:34:36,420:INFO:epoch: 0 step: [728/14785], loss: 16.1739, overflow: False, scale: 4194304, lr: 0.000001, time: 590.92
- 2022-09-23 00:35:09,713:INFO:epoch: 0 step: [784/14785], loss: 14.5879, overflow: False, scale: 4194304, lr: 0.000001, time: 594.49
- 2022-09-23 00:35:43,105:INFO:epoch: 0 step: [840/14785], loss: 15.1365, overflow: False, scale: 4194304, lr: 0.000001, time: 596.25
- 2022-09-23 00:36:16,499:INFO:epoch: 0 step: [896/14785], loss: 15.0237, overflow: False, scale: 4194304, lr: 0.000001, time: 596.26
- 2022-09-23 00:36:49,908:INFO:epoch: 0 step: [952/14785], loss: 14.8522, overflow: False, scale: 4194304, lr: 0.000002, time: 596.56
- 2022-09-23 00:37:23,304:INFO:epoch: 0 step: [1008/14785], loss: 13.0060, overflow: False, scale: 4194304, lr: 0.000002, time: 596.34
- 2022-09-23 00:37:56,299:INFO:epoch: 0 step: [1064/14785], loss: 13.3474, overflow: False, scale: 4194304, lr: 0.000002, time: 589.15
- 2022-09-23 00:38:29,606:INFO:epoch: 0 step: [1120/14785], loss: 14.1374, overflow: False, scale: 4194304, lr: 0.000002, time: 594.72
- 2022-09-23 00:39:02,903:INFO:epoch: 0 step: [1176/14785], loss: 13.0796, overflow: False, scale: 4194304, lr: 0.000003, time: 594.55
- 2022-09-23 00:39:36,107:INFO:epoch: 0 step: [1232/14785], loss: 13.1477, overflow: False, scale: 4194304, lr: 0.000003, time: 592.87
- 2022-09-23 00:40:09,448:INFO:epoch: 0 step: [1288/14785], loss: 13.2250, overflow: False, scale: 4194304, lr: 0.000003, time: 595.34
- 2022-09-23 00:40:42,957:INFO:epoch: 0 step: [1344/14785], loss: 13.4732, overflow: False, scale: 4194304, lr: 0.000003, time: 598.33
- 2022-09-23 00:41:16,404:INFO:epoch: 0 step: [1400/14785], loss: 12.8062, overflow: False, scale: 4194304, lr: 0.000004, time: 597.20
- 2022-09-23 00:41:49,708:INFO:epoch: 0 step: [1456/14785], loss: 12.9970, overflow: False, scale: 4194304, lr: 0.000004, time: 594.69
- 2022-09-23 00:42:23,102:INFO:epoch: 0 step: [1512/14785], loss: 13.4036, overflow: False, scale: 4194304, lr: 0.000004, time: 596.27
- 2022-09-23 00:42:56,302:INFO:epoch: 0 step: [1568/14785], loss: 13.3095, overflow: False, scale: 4194304, lr: 0.000005, time: 592.82
- 2022-09-23 00:43:29,508:INFO:epoch: 0 step: [1624/14785], loss: 13.3287, overflow: False, scale: 4194304, lr: 0.000005, time: 592.94
- 2022-09-23 00:44:02,970:INFO:epoch: 0 step: [1680/14785], loss: 12.5004, overflow: False, scale: 4194304, lr: 0.000005, time: 597.50
- 2022-09-23 00:44:36,387:INFO:epoch: 0 step: [1736/14785], loss: 12.6681, overflow: False, scale: 4194304, lr: 0.000006, time: 596.70
- 2022-09-23 00:45:09,794:INFO:epoch: 0 step: [1792/14785], loss: 12.5229, overflow: False, scale: 4194304, lr: 0.000006, time: 596.52
- 2022-09-23 00:45:43,039:INFO:epoch: 0 step: [1848/14785], loss: 12.1766, overflow: False, scale: 4194304, lr: 0.000006, time: 593.62
- 2022-09-23 00:46:16,109:INFO:epoch: 0 step: [1904/14785], loss: 12.3222, overflow: False, scale: 4194304, lr: 0.000007, time: 590.50
- 2022-09-23 00:46:49,397:INFO:epoch: 0 step: [1960/14785], loss: 12.2096, overflow: False, scale: 8388608, lr: 0.000007, time: 594.38
- 2022-09-23 00:47:22,701:INFO:epoch: 0 step: [2016/14785], loss: 12.3094, overflow: False, scale: 8388608, lr: 0.000007, time: 594.68
- 2022-09-23 00:47:56,208:INFO:epoch: 0 step: [2072/14785], loss: 11.9540, overflow: False, scale: 8388608, lr: 0.000008, time: 598.29
- 2022-09-23 00:48:29,501:INFO:epoch: 0 step: [2128/14785], loss: 12.2128, overflow: False, scale: 8388608, lr: 0.000008, time: 594.49
- 2022-09-23 00:49:02,904:INFO:epoch: 0 step: [2184/14785], loss: 11.7839, overflow: False, scale: 8388608, lr: 0.000009, time: 596.42
- 2022-09-23 00:49:36,304:INFO:epoch: 0 step: [2240/14785], loss: 12.2202, overflow: False, scale: 8388608, lr: 0.000009, time: 596.39
- 2022-09-23 00:50:09,559:INFO:epoch: 0 step: [2296/14785], loss: 11.9633, overflow: False, scale: 8388608, lr: 0.000010, time: 593.80
- 2022-09-23 00:50:42,702:INFO:epoch: 0 step: [2352/14785], loss: 12.1360, overflow: False, scale: 8388608, lr: 0.000010, time: 591.81
- 2022-09-23 00:51:16,205:INFO:epoch: 0 step: [2408/14785], loss: 12.0454, overflow: False, scale: 8388608, lr: 0.000011, time: 598.24
- 2022-09-23 00:51:49,292:INFO:epoch: 0 step: [2464/14785], loss: 11.7633, overflow: False, scale: 8388608, lr: 0.000011, time: 590.78
- 2022-09-23 00:52:22,504:INFO:epoch: 0 step: [2520/14785], loss: 11.6970, overflow: False, scale: 8388608, lr: 0.000012, time: 592.98
- 2022-09-23 00:52:55,703:INFO:epoch: 0 step: [2576/14785], loss: 11.6713, overflow: False, scale: 8388608, lr: 0.000012, time: 592.81
- 2022-09-23 00:53:29,110:INFO:epoch: 0 step: [2632/14785], loss: 11.5366, overflow: False, scale: 8388608, lr: 0.000013, time: 596.50
- 2022-09-23 00:54:02,397:INFO:epoch: 0 step: [2688/14785], loss: 11.5653, overflow: False, scale: 8388608, lr: 0.000013, time: 594.36
- 2022-09-23 00:54:35,696:INFO:epoch: 0 step: [2744/14785], loss: 11.8926, overflow: False, scale: 8388608, lr: 0.000014, time: 594.59
- 2022-09-23 00:55:09,096:INFO:epoch: 0 step: [2800/14785], loss: 11.5865, overflow: False, scale: 8388608, lr: 0.000014, time: 596.39
- 2022-09-23 00:55:42,369:INFO:epoch: 0 step: [2856/14785], loss: 11.4571, overflow: False, scale: 8388608, lr: 0.000015, time: 594.13
- 2022-09-23 00:56:15,902:INFO:epoch: 0 step: [2912/14785], loss: 11.5985, overflow: False, scale: 8388608, lr: 0.000016, time: 598.76
- 2022-09-23 00:56:49,099:INFO:epoch: 0 step: [2968/14785], loss: 11.5091, overflow: False, scale: 8388608, lr: 0.000016, time: 592.74
- 2022-09-23 00:57:22,496:INFO:epoch: 0 step: [3024/14785], loss: 11.8182, overflow: False, scale: 8388608, lr: 0.000017, time: 596.35
- 2022-09-23 00:57:56,004:INFO:epoch: 0 step: [3080/14785], loss: 12.1749, overflow: False, scale: 8388608, lr: 0.000017, time: 598.32
- 2022-09-23 00:58:29,200:INFO:epoch: 0 step: [3136/14785], loss: 12.1216, overflow: False, scale: 8388608, lr: 0.000018, time: 592.76
- 2022-09-23 00:59:02,621:INFO:epoch: 0 step: [3192/14785], loss: 11.4005, overflow: False, scale: 8388608, lr: 0.000019, time: 596.76
- 2022-09-23 00:59:35,995:INFO:epoch: 0 step: [3248/14785], loss: 11.6250, overflow: False, scale: 8388608, lr: 0.000019, time: 595.91
- 2022-09-23 01:00:09,200:INFO:epoch: 0 step: [3304/14785], loss: 11.2246, overflow: False, scale: 8388608, lr: 0.000020, time: 592.90
- 2022-09-23 01:00:42,305:INFO:epoch: 0 step: [3360/14785], loss: 11.6701, overflow: False, scale: 8388608, lr: 0.000021, time: 591.12
- 2022-09-23 01:01:15,596:INFO:epoch: 0 step: [3416/14785], loss: 11.2784, overflow: False, scale: 8388608, lr: 0.000021, time: 594.45
- 2022-09-23 01:01:49,006:INFO:epoch: 0 step: [3472/14785], loss: 11.2578, overflow: False, scale: 8388608, lr: 0.000022, time: 596.59
- 2022-09-23 01:02:22,295:INFO:epoch: 0 step: [3528/14785], loss: 11.4301, overflow: False, scale: 8388608, lr: 0.000023, time: 594.39
- 2022-09-23 01:02:55,603:INFO:epoch: 0 step: [3584/14785], loss: 11.8299, overflow: False, scale: 8388608, lr: 0.000024, time: 594.74
- 2022-09-23 01:03:29,103:INFO:epoch: 0 step: [3640/14785], loss: 11.5810, overflow: False, scale: 8388608, lr: 0.000024, time: 598.18
- 2022-09-23 01:04:02,312:INFO:epoch: 0 step: [3696/14785], loss: 11.2652, overflow: False, scale: 8388608, lr: 0.000025, time: 592.97
- 2022-09-23 01:04:35,330:INFO:epoch: 0 step: [3752/14785], loss: 11.4685, overflow: False, scale: 8388608, lr: 0.000026, time: 589.59
- 2022-09-23 01:05:08,544:INFO:epoch: 0 step: [3808/14785], loss: 11.8584, overflow: False, scale: 8388608, lr: 0.000027, time: 593.05
- 2022-09-23 01:05:41,599:INFO:epoch: 0 step: [3864/14785], loss: 11.5200, overflow: False, scale: 8388608, lr: 0.000027, time: 590.25
- 2022-09-23 01:06:15,106:INFO:epoch: 0 step: [3920/14785], loss: 11.6293, overflow: False, scale: 8388608, lr: 0.000028, time: 598.27
- 2022-09-23 01:06:48,602:INFO:epoch: 0 step: [3976/14785], loss: 11.4959, overflow: False, scale: 16777216, lr: 0.000029, time: 598.06
- 2022-09-23 01:07:21,908:INFO:epoch: 0 step: [4032/14785], loss: 11.1830, overflow: False, scale: 16777216, lr: 0.000030, time: 594.72
- 2022-09-23 01:07:55,503:INFO:epoch: 0 step: [4088/14785], loss: 11.4709, overflow: False, scale: 16777216, lr: 0.000031, time: 599.87
- 2022-09-23 01:08:29,002:INFO:epoch: 0 step: [4144/14785], loss: 11.4550, overflow: False, scale: 16777216, lr: 0.000031, time: 598.15
- 2022-09-23 01:09:02,394:INFO:epoch: 0 step: [4200/14785], loss: 11.4590, overflow: False, scale: 16777216, lr: 0.000032, time: 596.26
- 2022-09-23 01:09:35,612:INFO:epoch: 0 step: [4256/14785], loss: 11.3435, overflow: False, scale: 16777216, lr: 0.000033, time: 593.12
- 2022-09-23 01:10:08,994:INFO:epoch: 0 step: [4312/14785], loss: 11.7930, overflow: False, scale: 16777216, lr: 0.000034, time: 596.07
- 2022-09-23 01:10:42,199:INFO:epoch: 0 step: [4368/14785], loss: 11.6554, overflow: False, scale: 16777216, lr: 0.000035, time: 592.90
- 2022-09-23 01:11:15,352:INFO:epoch: 0 step: [4424/14785], loss: 11.3560, overflow: False, scale: 16777216, lr: 0.000036, time: 591.98
- 2022-09-23 01:11:48,504:INFO:epoch: 0 step: [4480/14785], loss: 11.4438, overflow: False, scale: 16777216, lr: 0.000037, time: 591.95
- 2022-09-23 01:12:21,701:INFO:epoch: 0 step: [4536/14785], loss: 11.1466, overflow: False, scale: 16777216, lr: 0.000038, time: 592.77
- 2022-09-23 01:12:54,698:INFO:epoch: 0 step: [4592/14785], loss: 11.2773, overflow: False, scale: 16777216, lr: 0.000039, time: 589.19
- 2022-09-23 01:13:27,605:INFO:epoch: 0 step: [4648/14785], loss: 11.4214, overflow: False, scale: 16777216, lr: 0.000040, time: 587.58
- 2022-09-23 01:14:00,600:INFO:epoch: 0 step: [4704/14785], loss: 11.5596, overflow: False, scale: 16777216, lr: 0.000041, time: 589.18
- 2022-09-23 01:14:33,695:INFO:epoch: 0 step: [4760/14785], loss: 11.7842, overflow: False, scale: 16777216, lr: 0.000041, time: 590.95
- 2022-09-23 01:15:07,081:INFO:epoch: 0 step: [4816/14785], loss: 11.1415, overflow: False, scale: 16777216, lr: 0.000042, time: 596.13
- 2022-09-23 01:15:40,204:INFO:epoch: 0 step: [4872/14785], loss: 11.4526, overflow: False, scale: 16777216, lr: 0.000043, time: 591.46
- 2022-09-23 01:16:13,297:INFO:epoch: 0 step: [4928/14785], loss: 11.5080, overflow: False, scale: 16777216, lr: 0.000044, time: 590.90
- 2022-09-23 01:16:46,694:INFO:epoch: 0 step: [4984/14785], loss: 11.4143, overflow: False, scale: 16777216, lr: 0.000045, time: 596.32
- 2022-09-23 01:17:20,203:INFO:epoch: 0 step: [5040/14785], loss: 11.3855, overflow: False, scale: 16777216, lr: 0.000046, time: 598.32
- 2022-09-23 01:17:53,439:INFO:epoch: 0 step: [5096/14785], loss: 11.3906, overflow: False, scale: 16777216, lr: 0.000048, time: 593.44
- 2022-09-23 01:18:26,802:INFO:epoch: 0 step: [5152/14785], loss: 11.2814, overflow: False, scale: 16777216, lr: 0.000049, time: 595.72
- 2022-09-23 01:19:00,208:INFO:epoch: 0 step: [5208/14785], loss: 11.2183, overflow: False, scale: 16777216, lr: 0.000050, time: 596.47
- 2022-09-23 01:19:33,660:INFO:epoch: 0 step: [5264/14785], loss: 11.2175, overflow: False, scale: 16777216, lr: 0.000051, time: 597.30
- 2022-09-23 01:20:07,206:INFO:epoch: 0 step: [5320/14785], loss: 11.6454, overflow: False, scale: 16777216, lr: 0.000052, time: 598.99
- 2022-09-23 01:20:40,703:INFO:epoch: 0 step: [5376/14785], loss: 11.7865, overflow: False, scale: 16777216, lr: 0.000053, time: 598.12
- 2022-09-23 01:21:14,198:INFO:epoch: 0 step: [5432/14785], loss: 11.4281, overflow: False, scale: 16777216, lr: 0.000054, time: 598.07
- 2022-09-23 01:21:47,799:INFO:epoch: 0 step: [5488/14785], loss: 10.9863, overflow: False, scale: 16777216, lr: 0.000055, time: 599.97
- 2022-09-23 01:22:21,303:INFO:epoch: 0 step: [5544/14785], loss: 11.5683, overflow: False, scale: 16777216, lr: 0.000056, time: 598.25
- 2022-09-23 01:22:54,903:INFO:epoch: 0 step: [5600/14785], loss: 11.2892, overflow: False, scale: 16777216, lr: 0.000057, time: 599.97
- 2022-09-23 01:23:28,503:INFO:epoch: 0 step: [5656/14785], loss: 10.9236, overflow: False, scale: 16777216, lr: 0.000059, time: 599.95
- 2022-09-23 01:24:02,104:INFO:epoch: 0 step: [5712/14785], loss: 11.4661, overflow: False, scale: 16777216, lr: 0.000060, time: 599.99
- 2022-09-23 01:24:35,395:INFO:epoch: 0 step: [5768/14785], loss: 11.1328, overflow: False, scale: 16777216, lr: 0.000061, time: 594.45
- 2022-09-23 01:25:08,806:INFO:epoch: 0 step: [5824/14785], loss: 11.5490, overflow: False, scale: 16777216, lr: 0.000062, time: 596.57
- 2022-09-23 01:25:42,401:INFO:epoch: 0 step: [5880/14785], loss: 11.2434, overflow: False, scale: 16777216, lr: 0.000063, time: 599.77
- 2022-09-23 01:26:15,907:INFO:epoch: 0 step: [5936/14785], loss: 11.6049, overflow: False, scale: 16777216, lr: 0.000064, time: 598.29
- 2022-09-23 01:26:49,407:INFO:epoch: 0 step: [5992/14785], loss: 10.9740, overflow: False, scale: 33554432, lr: 0.000066, time: 598.17
- 2022-09-23 01:27:22,797:INFO:epoch: 0 step: [6048/14785], loss: 11.5274, overflow: False, scale: 33554432, lr: 0.000067, time: 596.21
- 2022-09-23 01:27:56,294:INFO:epoch: 0 step: [6104/14785], loss: 11.1844, overflow: False, scale: 33554432, lr: 0.000068, time: 598.13
- 2022-09-23 01:28:29,806:INFO:epoch: 0 step: [6160/14785], loss: 11.2540, overflow: False, scale: 33554432, lr: 0.000069, time: 598.38
- 2022-09-23 01:29:03,205:INFO:epoch: 0 step: [6216/14785], loss: 11.7043, overflow: False, scale: 33554432, lr: 0.000071, time: 596.38
- 2022-09-23 01:29:36,527:INFO:epoch: 0 step: [6272/14785], loss: 11.9052, overflow: False, scale: 33554432, lr: 0.000072, time: 594.99
- 2022-09-23 01:30:09,997:INFO:epoch: 0 step: [6328/14785], loss: 10.8609, overflow: False, scale: 33554432, lr: 0.000073, time: 597.63
- 2022-09-23 01:30:43,401:INFO:epoch: 0 step: [6384/14785], loss: 10.9260, overflow: False, scale: 33554432, lr: 0.000075, time: 596.48
- 2022-09-23 01:31:16,998:INFO:epoch: 0 step: [6440/14785], loss: 11.1515, overflow: False, scale: 33554432, lr: 0.000076, time: 599.91
- 2022-09-23 01:31:50,395:INFO:epoch: 0 step: [6496/14785], loss: 11.2144, overflow: False, scale: 33554432, lr: 0.000077, time: 596.34
- 2022-09-23 01:32:23,910:INFO:epoch: 0 step: [6552/14785], loss: 11.8112, overflow: False, scale: 33554432, lr: 0.000079, time: 598.44
- 2022-09-23 01:32:57,305:INFO:epoch: 0 step: [6608/14785], loss: 11.1904, overflow: False, scale: 33554432, lr: 0.000080, time: 596.29
- 2022-09-23 01:33:30,803:INFO:epoch: 0 step: [6664/14785], loss: 11.6389, overflow: False, scale: 33554432, lr: 0.000081, time: 598.14
- 2022-09-23 01:34:04,195:INFO:epoch: 0 step: [6720/14785], loss: 11.3638, overflow: False, scale: 33554432, lr: 0.000083, time: 596.13
- 2022-09-23 01:34:37,602:INFO:epoch: 0 step: [6776/14785], loss: 11.4481, overflow: False, scale: 33554432, lr: 0.000084, time: 596.53
- 2022-09-23 01:35:10,939:INFO:epoch: 0 step: [6832/14785], loss: 11.6650, overflow: False, scale: 33554432, lr: 0.000085, time: 595.28
- 2022-09-23 01:35:44,399:INFO:epoch: 0 step: [6888/14785], loss: 11.6558, overflow: False, scale: 33554432, lr: 0.000087, time: 597.46
- 2022-09-23 01:36:18,012:INFO:epoch: 0 step: [6944/14785], loss: 10.8416, overflow: False, scale: 33554432, lr: 0.000088, time: 600.20
- 2022-09-23 01:36:51,492:INFO:epoch: 0 step: [7000/14785], loss: 11.1136, overflow: False, scale: 33554432, lr: 0.000090, time: 597.79
- 2022-09-23 01:37:25,008:INFO:epoch: 0 step: [7056/14785], loss: 11.1037, overflow: False, scale: 33554432, lr: 0.000091, time: 598.46
- 2022-09-23 01:37:58,603:INFO:epoch: 0 step: [7112/14785], loss: 11.4767, overflow: False, scale: 33554432, lr: 0.000093, time: 599.88
- 2022-09-23 01:38:32,099:INFO:epoch: 0 step: [7168/14785], loss: 11.2608, overflow: False, scale: 33554432, lr: 0.000094, time: 598.10
- 2022-09-23 01:39:05,605:INFO:epoch: 0 step: [7224/14785], loss: 10.9175, overflow: False, scale: 33554432, lr: 0.000096, time: 598.29
- 2022-09-23 01:39:38,915:INFO:epoch: 0 step: [7280/14785], loss: 11.0856, overflow: False, scale: 33554432, lr: 0.000097, time: 594.78
- 2022-09-23 01:40:12,388:INFO:epoch: 0 step: [7336/14785], loss: 11.2940, overflow: False, scale: 33554432, lr: 0.000099, time: 597.68
- 2022-09-23 01:40:45,702:INFO:epoch: 0 step: [7392/14785], loss: 11.6151, overflow: False, scale: 33554432, lr: 0.000100, time: 594.84
- 2022-09-23 01:41:19,105:INFO:epoch: 0 step: [7448/14785], loss: 10.9411, overflow: False, scale: 33554432, lr: 0.000102, time: 596.47
- 2022-09-23 01:41:52,696:INFO:epoch: 0 step: [7504/14785], loss: 10.8246, overflow: False, scale: 33554432, lr: 0.000103, time: 599.78
- 2022-09-23 01:42:26,193:INFO:epoch: 0 step: [7560/14785], loss: 11.7228, overflow: False, scale: 33554432, lr: 0.000105, time: 598.12
- 2022-09-23 01:42:59,502:INFO:epoch: 0 step: [7616/14785], loss: 11.0457, overflow: False, scale: 33554432, lr: 0.000106, time: 594.76
- 2022-09-23 01:43:33,005:INFO:epoch: 0 step: [7672/14785], loss: 11.4795, overflow: False, scale: 33554432, lr: 0.000108, time: 598.21
- 2022-09-23 01:44:06,596:INFO:epoch: 0 step: [7728/14785], loss: 10.8455, overflow: False, scale: 33554432, lr: 0.000109, time: 599.80
- 2022-09-23 01:44:40,110:INFO:epoch: 0 step: [7784/14785], loss: 10.8661, overflow: False, scale: 33554432, lr: 0.000111, time: 598.42
- 2022-09-23 01:45:13,400:INFO:epoch: 0 step: [7840/14785], loss: 11.2441, overflow: False, scale: 33554432, lr: 0.000113, time: 594.43
- 2022-09-23 01:45:47,004:INFO:epoch: 0 step: [7896/14785], loss: 11.2885, overflow: False, scale: 33554432, lr: 0.000114, time: 600.03
- 2022-09-23 01:46:20,505:INFO:epoch: 0 step: [7952/14785], loss: 11.4667, overflow: False, scale: 67108864, lr: 0.000116, time: 598.21
- 2022-09-23 01:46:53,999:INFO:epoch: 0 step: [8008/14785], loss: 11.4195, overflow: False, scale: 67108864, lr: 0.000117, time: 598.06
- 2022-09-23 01:47:27,603:INFO:epoch: 0 step: [8064/14785], loss: 11.0689, overflow: False, scale: 67108864, lr: 0.000119, time: 600.04
- 2022-09-23 01:48:01,099:INFO:epoch: 0 step: [8120/14785], loss: 11.1816, overflow: False, scale: 67108864, lr: 0.000121, time: 598.09
- 2022-09-23 01:48:34,698:INFO:epoch: 0 step: [8176/14785], loss: 10.7030, overflow: False, scale: 67108864, lr: 0.000122, time: 599.95
- 2022-09-23 01:49:08,302:INFO:epoch: 0 step: [8232/14785], loss: 11.0094, overflow: False, scale: 67108864, lr: 0.000124, time: 600.02
- 2022-09-23 01:49:41,700:INFO:epoch: 0 step: [8288/14785], loss: 10.8420, overflow: False, scale: 67108864, lr: 0.000126, time: 596.34
- 2022-09-23 01:50:15,203:INFO:epoch: 0 step: [8344/14785], loss: 10.9873, overflow: False, scale: 67108864, lr: 0.000127, time: 598.23
- 2022-09-23 01:50:48,598:INFO:epoch: 0 step: [8400/14785], loss: 10.9380, overflow: False, scale: 67108864, lr: 0.000129, time: 596.28
- 2022-09-23 01:51:22,105:INFO:epoch: 0 step: [8456/14785], loss: 11.1417, overflow: False, scale: 67108864, lr: 0.000131, time: 598.29
- 2022-09-23 01:51:55,594:INFO:epoch: 0 step: [8512/14785], loss: 11.6600, overflow: False, scale: 67108864, lr: 0.000133, time: 597.99
- 2022-09-23 01:52:29,129:INFO:epoch: 0 step: [8568/14785], loss: 11.2255, overflow: False, scale: 67108864, lr: 0.000134, time: 598.80
- 2022-09-23 01:53:02,503:INFO:epoch: 0 step: [8624/14785], loss: 11.4813, overflow: False, scale: 67108864, lr: 0.000136, time: 595.94
- 2022-09-23 01:53:36,092:INFO:epoch: 0 step: [8680/14785], loss: 11.2146, overflow: False, scale: 67108864, lr: 0.000138, time: 599.78
- 2022-09-23 01:54:09,702:INFO:epoch: 0 step: [8736/14785], loss: 10.9229, overflow: False, scale: 67108864, lr: 0.000140, time: 600.13
- 2022-09-23 01:54:43,204:INFO:epoch: 0 step: [8792/14785], loss: 11.3413, overflow: False, scale: 67108864, lr: 0.000141, time: 598.21
- 2022-09-23 01:55:16,793:INFO:epoch: 0 step: [8848/14785], loss: 11.3068, overflow: False, scale: 67108864, lr: 0.000143, time: 599.77
- 2022-09-23 01:55:50,208:INFO:epoch: 0 step: [8904/14785], loss: 10.9157, overflow: False, scale: 67108864, lr: 0.000145, time: 596.64
- 2022-09-23 01:56:23,606:INFO:epoch: 0 step: [8960/14785], loss: 11.0286, overflow: False, scale: 67108864, lr: 0.000147, time: 596.36
- 2022-09-23 01:56:57,201:INFO:epoch: 0 step: [9016/14785], loss: 11.2310, overflow: False, scale: 67108864, lr: 0.000149, time: 599.87
- 2022-09-23 01:57:30,805:INFO:epoch: 0 step: [9072/14785], loss: 11.1832, overflow: False, scale: 67108864, lr: 0.000151, time: 600.03
- 2022-09-23 01:58:04,312:INFO:epoch: 0 step: [9128/14785], loss: 11.2789, overflow: False, scale: 67108864, lr: 0.000152, time: 598.31
- 2022-09-23 01:58:37,819:INFO:epoch: 0 step: [9184/14785], loss: 10.7294, overflow: False, scale: 67108864, lr: 0.000154, time: 598.30
- 2022-09-23 01:59:11,304:INFO:epoch: 0 step: [9240/14785], loss: 11.1377, overflow: False, scale: 67108864, lr: 0.000156, time: 597.89
- 2022-09-23 01:59:44,902:INFO:epoch: 0 step: [9296/14785], loss: 11.3302, overflow: False, scale: 67108864, lr: 0.000158, time: 599.93
- 2022-09-23 02:00:18,401:INFO:epoch: 0 step: [9352/14785], loss: 11.0546, overflow: False, scale: 67108864, lr: 0.000160, time: 598.16
- 2022-09-23 02:00:52,006:INFO:epoch: 0 step: [9408/14785], loss: 11.0475, overflow: False, scale: 67108864, lr: 0.000162, time: 600.02
- 2022-09-23 02:01:25,500:INFO:epoch: 0 step: [9464/14785], loss: 11.0932, overflow: False, scale: 67108864, lr: 0.000164, time: 598.07
- 2022-09-23 02:01:59,099:INFO:epoch: 0 step: [9520/14785], loss: 11.0672, overflow: False, scale: 67108864, lr: 0.000166, time: 599.94
- 2022-09-23 02:02:32,700:INFO:epoch: 0 step: [9576/14785], loss: 10.6480, overflow: False, scale: 67108864, lr: 0.000168, time: 599.97
- 2022-09-23 02:03:06,109:INFO:epoch: 0 step: [9632/14785], loss: 10.9026, overflow: False, scale: 67108864, lr: 0.000170, time: 596.55
- 2022-09-23 02:03:39,588:INFO:epoch: 0 step: [9688/14785], loss: 10.7853, overflow: False, scale: 67108864, lr: 0.000172, time: 597.80
- 2022-09-23 02:04:13,201:INFO:epoch: 0 step: [9744/14785], loss: 10.4725, overflow: False, scale: 67108864, lr: 0.000174, time: 600.18
- 2022-09-23 02:04:46,743:INFO:epoch: 0 step: [9800/14785], loss: 10.5847, overflow: False, scale: 67108864, lr: 0.000176, time: 598.93
- 2022-09-23 02:05:20,298:INFO:epoch: 0 step: [9856/14785], loss: 10.9624, overflow: False, scale: 67108864, lr: 0.000178, time: 599.17
- 2022-09-23 02:05:53,706:INFO:epoch: 0 step: [9912/14785], loss: 10.5960, overflow: False, scale: 67108864, lr: 0.000180, time: 596.53
- 2022-09-23 02:06:27,193:INFO:epoch: 0 step: [9968/14785], loss: 10.7256, overflow: False, scale: 134217728, lr: 0.000182, time: 597.90
- 2022-09-23 02:07:00,800:INFO:epoch: 0 step: [10024/14785], loss: 10.6042, overflow: False, scale: 134217728, lr: 0.000184, time: 600.07
- 2022-09-23 02:07:34,103:INFO:epoch: 0 step: [10080/14785], loss: 10.7816, overflow: False, scale: 134217728, lr: 0.000186, time: 594.66
- 2022-09-23 02:08:07,504:INFO:epoch: 0 step: [10136/14785], loss: 10.9219, overflow: False, scale: 134217728, lr: 0.000188, time: 596.38
- 2022-09-23 02:08:40,993:INFO:epoch: 0 step: [10192/14785], loss: 10.9003, overflow: False, scale: 134217728, lr: 0.000190, time: 597.98
- 2022-09-23 02:09:14,515:INFO:epoch: 0 step: [10248/14785], loss: 11.0276, overflow: False, scale: 134217728, lr: 0.000192, time: 598.57
- 2022-09-23 02:09:47,999:INFO:epoch: 0 step: [10304/14785], loss: 11.0062, overflow: False, scale: 134217728, lr: 0.000194, time: 597.88
- 2022-09-23 02:10:21,397:INFO:epoch: 0 step: [10360/14785], loss: 10.5180, overflow: False, scale: 134217728, lr: 0.000196, time: 596.36
- 2022-09-23 02:10:54,708:INFO:epoch: 0 step: [10416/14785], loss: 10.1787, overflow: False, scale: 134217728, lr: 0.000199, time: 594.78
- 2022-09-23 02:11:28,198:INFO:epoch: 0 step: [10472/14785], loss: 10.7086, overflow: False, scale: 134217728, lr: 0.000201, time: 598.01
- 2022-09-23 02:12:01,488:INFO:epoch: 0 step: [10528/14785], loss: 10.6857, overflow: False, scale: 134217728, lr: 0.000203, time: 594.42
- 2022-09-23 02:12:34,892:INFO:epoch: 0 step: [10584/14785], loss: 10.4074, overflow: False, scale: 134217728, lr: 0.000205, time: 596.45
- 2022-09-23 02:13:08,302:INFO:epoch: 0 step: [10640/14785], loss: 10.5908, overflow: False, scale: 134217728, lr: 0.000207, time: 596.57
- 2022-09-23 02:13:41,493:INFO:epoch: 0 step: [10696/14785], loss: 10.8052, overflow: False, scale: 134217728, lr: 0.000209, time: 592.63
- 2022-09-23 02:14:14,913:INFO:epoch: 0 step: [10752/14785], loss: 10.6352, overflow: False, scale: 134217728, lr: 0.000212, time: 596.76
- 2022-09-23 02:14:48,301:INFO:epoch: 0 step: [10808/14785], loss: 11.4855, overflow: False, scale: 134217728, lr: 0.000214, time: 596.15
- 2022-09-23 02:15:21,710:INFO:epoch: 0 step: [10864/14785], loss: 10.3795, overflow: False, scale: 134217728, lr: 0.000216, time: 596.54
- 2022-09-23 02:15:55,309:INFO:epoch: 0 step: [10920/14785], loss: 10.5703, overflow: False, scale: 134217728, lr: 0.000218, time: 599.95
- 2022-09-23 02:16:28,696:INFO:epoch: 0 step: [10976/14785], loss: 11.0911, overflow: False, scale: 134217728, lr: 0.000220, time: 596.17
- 2022-09-23 02:17:02,001:INFO:epoch: 0 step: [11032/14785], loss: 10.2485, overflow: False, scale: 134217728, lr: 0.000223, time: 594.66
- 2022-09-23 02:17:35,494:INFO:epoch: 0 step: [11088/14785], loss: 10.8232, overflow: False, scale: 134217728, lr: 0.000225, time: 598.04
- 2022-09-23 02:18:08,980:INFO:epoch: 0 step: [11144/14785], loss: 10.6465, overflow: False, scale: 134217728, lr: 0.000227, time: 597.95
- 2022-09-23 02:18:42,503:INFO:epoch: 0 step: [11200/14785], loss: 10.7726, overflow: False, scale: 134217728, lr: 0.000230, time: 598.58
- 2022-09-23 02:19:16,005:INFO:epoch: 0 step: [11256/14785], loss: 10.4669, overflow: False, scale: 134217728, lr: 0.000232, time: 598.20
- 2022-09-23 02:19:49,309:INFO:epoch: 0 step: [11312/14785], loss: 10.7581, overflow: False, scale: 134217728, lr: 0.000234, time: 594.69
- 2022-09-23 02:20:22,700:INFO:epoch: 0 step: [11368/14785], loss: 11.0890, overflow: False, scale: 134217728, lr: 0.000237, time: 596.24
- 2022-09-23 02:20:55,905:INFO:epoch: 0 step: [11424/14785], loss: 10.6089, overflow: False, scale: 134217728, lr: 0.000239, time: 592.88
- 2022-09-23 02:21:29,385:INFO:epoch: 0 step: [11480/14785], loss: 11.1573, overflow: False, scale: 134217728, lr: 0.000241, time: 597.76
- 2022-09-23 02:22:02,888:INFO:epoch: 0 step: [11536/14785], loss: 10.4851, overflow: False, scale: 134217728, lr: 0.000244, time: 598.22
- 2022-09-23 02:22:36,484:INFO:epoch: 0 step: [11592/14785], loss: 10.1617, overflow: False, scale: 134217728, lr: 0.000246, time: 599.90
- 2022-09-23 02:23:09,900:INFO:epoch: 0 step: [11648/14785], loss: 10.7944, overflow: False, scale: 134217728, lr: 0.000248, time: 596.66
- 2022-09-23 02:23:43,400:INFO:epoch: 0 step: [11704/14785], loss: 10.6493, overflow: False, scale: 134217728, lr: 0.000251, time: 598.18
- 2022-09-23 02:24:17,001:INFO:epoch: 0 step: [11760/14785], loss: 10.5283, overflow: False, scale: 134217728, lr: 0.000253, time: 599.98
- 2022-09-23 02:24:50,301:INFO:epoch: 0 step: [11816/14785], loss: 10.9214, overflow: False, scale: 134217728, lr: 0.000256, time: 594.59
- 2022-09-23 02:25:23,800:INFO:epoch: 0 step: [11872/14785], loss: 10.6636, overflow: False, scale: 134217728, lr: 0.000258, time: 598.15
- 2022-09-23 02:25:57,006:INFO:epoch: 0 step: [11928/14785], loss: 10.6014, overflow: False, scale: 134217728, lr: 0.000260, time: 592.94
- 2022-09-23 02:26:30,399:INFO:epoch: 0 step: [11984/14785], loss: 11.5114, overflow: False, scale: 268435456, lr: 0.000263, time: 596.25
- 2022-09-23 02:27:03,896:INFO:epoch: 0 step: [12040/14785], loss: 10.6090, overflow: False, scale: 268435456, lr: 0.000265, time: 598.11
- 2022-09-23 02:27:37,304:INFO:epoch: 0 step: [12096/14785], loss: 10.7228, overflow: False, scale: 268435456, lr: 0.000268, time: 596.53
- 2022-09-23 02:28:10,705:INFO:epoch: 0 step: [12152/14785], loss: 11.0681, overflow: False, scale: 268435456, lr: 0.000270, time: 596.41
- 2022-09-23 02:28:43,903:INFO:epoch: 0 step: [12208/14785], loss: 10.4967, overflow: False, scale: 268435456, lr: 0.000273, time: 592.79
- 2022-09-23 02:29:17,502:INFO:epoch: 0 step: [12264/14785], loss: 10.7724, overflow: False, scale: 268435456, lr: 0.000275, time: 599.93
- 2022-09-23 02:29:50,801:INFO:epoch: 0 step: [12320/14785], loss: 10.6460, overflow: False, scale: 268435456, lr: 0.000278, time: 594.60
- 2022-09-23 02:30:24,013:INFO:epoch: 0 step: [12376/14785], loss: 11.2589, overflow: False, scale: 268435456, lr: 0.000280, time: 593.03
- 2022-09-23 02:30:57,408:INFO:epoch: 0 step: [12432/14785], loss: 10.1945, overflow: False, scale: 268435456, lr: 0.000283, time: 596.30
- 2022-09-23 02:31:30,906:INFO:epoch: 0 step: [12488/14785], loss: 10.6522, overflow: False, scale: 268435456, lr: 0.000285, time: 598.15
- 2022-09-23 02:32:04,511:INFO:epoch: 0 step: [12544/14785], loss: 10.7739, overflow: False, scale: 268435456, lr: 0.000288, time: 600.03
- 2022-09-23 02:32:37,904:INFO:epoch: 0 step: [12600/14785], loss: 10.2057, overflow: False, scale: 268435456, lr: 0.000291, time: 596.28
- 2022-09-23 02:33:11,299:INFO:epoch: 0 step: [12656/14785], loss: 10.6218, overflow: False, scale: 268435456, lr: 0.000293, time: 596.31
- 2022-09-23 02:33:44,903:INFO:epoch: 0 step: [12712/14785], loss: 10.1153, overflow: False, scale: 268435456, lr: 0.000296, time: 600.00
- 2022-09-23 02:34:18,306:INFO:epoch: 0 step: [12768/14785], loss: 9.8727, overflow: False, scale: 268435456, lr: 0.000298, time: 596.44
- 2022-09-23 02:34:51,806:INFO:epoch: 0 step: [12824/14785], loss: 10.6588, overflow: False, scale: 268435456, lr: 0.000301, time: 598.18
- 2022-09-23 02:35:25,397:INFO:epoch: 0 step: [12880/14785], loss: 10.6960, overflow: False, scale: 268435456, lr: 0.000304, time: 599.82
- 2022-09-23 02:35:59,003:INFO:epoch: 0 step: [12936/14785], loss: 10.5902, overflow: False, scale: 268435456, lr: 0.000306, time: 600.07
- 2022-09-23 02:36:32,505:INFO:epoch: 0 step: [12992/14785], loss: 11.0020, overflow: False, scale: 268435456, lr: 0.000309, time: 598.21
- 2022-09-23 02:37:05,801:INFO:epoch: 0 step: [13048/14785], loss: 10.7197, overflow: False, scale: 268435456, lr: 0.000312, time: 594.49
- 2022-09-23 02:37:39,404:INFO:epoch: 0 step: [13104/14785], loss: 10.6387, overflow: False, scale: 268435456, lr: 0.000314, time: 600.01
- 2022-09-23 02:38:12,741:INFO:epoch: 0 step: [13160/14785], loss: 10.2149, overflow: False, scale: 268435456, lr: 0.000317, time: 595.27
- 2022-09-23 02:38:46,106:INFO:epoch: 0 step: [13216/14785], loss: 10.8423, overflow: False, scale: 268435456, lr: 0.000320, time: 595.74
- 2022-09-23 02:39:19,303:INFO:epoch: 0 step: [13272/14785], loss: 10.2034, overflow: False, scale: 268435456, lr: 0.000322, time: 592.72
- 2022-09-23 02:39:52,905:INFO:epoch: 0 step: [13328/14785], loss: 11.0404, overflow: False, scale: 268435456, lr: 0.000325, time: 599.99
- 2022-09-23 02:40:26,395:INFO:epoch: 0 step: [13384/14785], loss: 10.4699, overflow: False, scale: 268435456, lr: 0.000328, time: 598.00
- 2022-09-23 02:41:00,000:INFO:epoch: 0 step: [13440/14785], loss: 10.2516, overflow: False, scale: 268435456, lr: 0.000331, time: 600.07
- 2022-09-23 02:41:33,502:INFO:epoch: 0 step: [13496/14785], loss: 10.0479, overflow: False, scale: 268435456, lr: 0.000333, time: 598.21
- 2022-09-23 02:42:07,101:INFO:epoch: 0 step: [13552/14785], loss: 10.1592, overflow: False, scale: 268435456, lr: 0.000336, time: 599.93
- 2022-09-23 02:42:40,695:INFO:epoch: 0 step: [13608/14785], loss: 9.9792, overflow: False, scale: 268435456, lr: 0.000339, time: 599.87
- 2022-09-23 02:43:14,186:INFO:epoch: 0 step: [13664/14785], loss: 10.4370, overflow: False, scale: 268435456, lr: 0.000342, time: 598.02
- 2022-09-23 02:43:47,504:INFO:epoch: 0 step: [13720/14785], loss: 10.7331, overflow: False, scale: 268435456, lr: 0.000344, time: 594.92
- 2022-09-23 02:44:20,997:INFO:epoch: 0 step: [13776/14785], loss: 10.2880, overflow: False, scale: 268435456, lr: 0.000347, time: 598.04
- 2022-09-23 02:44:54,309:INFO:epoch: 0 step: [13832/14785], loss: 9.8749, overflow: False, scale: 268435456, lr: 0.000350, time: 594.82
- 2022-09-23 02:45:27,605:INFO:epoch: 0 step: [13888/14785], loss: 10.5890, overflow: False, scale: 268435456, lr: 0.000353, time: 594.55
- 2022-09-23 02:46:00,902:INFO:epoch: 0 step: [13944/14785], loss: 10.4217, overflow: False, scale: 536870912, lr: 0.000356, time: 594.54
- 2022-09-23 02:46:34,295:INFO:epoch: 0 step: [14000/14785], loss: 10.1989, overflow: False, scale: 536870912, lr: 0.000359, time: 596.24
- 2022-09-23 02:47:07,506:INFO:epoch: 0 step: [14056/14785], loss: 10.5789, overflow: False, scale: 536870912, lr: 0.000362, time: 593.01
- 2022-09-23 02:47:40,822:INFO:epoch: 0 step: [14112/14785], loss: 9.9240, overflow: False, scale: 536870912, lr: 0.000364, time: 594.90
- 2022-09-23 02:48:14,243:INFO:epoch: 0 step: [14168/14785], loss: 10.0125, overflow: False, scale: 536870912, lr: 0.000367, time: 596.75
- 2022-09-23 02:48:47,601:INFO:epoch: 0 step: [14224/14785], loss: 10.6186, overflow: False, scale: 536870912, lr: 0.000370, time: 595.64
- 2022-09-23 02:49:21,100:INFO:epoch: 0 step: [14280/14785], loss: 10.4794, overflow: False, scale: 536870912, lr: 0.000373, time: 598.15
- 2022-09-23 02:49:54,692:INFO:epoch: 0 step: [14336/14785], loss: 9.8960, overflow: False, scale: 536870912, lr: 0.000376, time: 599.83
- 2022-09-23 02:50:28,197:INFO:epoch: 0 step: [14392/14785], loss: 10.0872, overflow: False, scale: 536870912, lr: 0.000379, time: 598.27
- 2022-09-23 02:51:01,612:INFO:epoch: 0 step: [14448/14785], loss: 10.0917, overflow: False, scale: 536870912, lr: 0.000382, time: 596.64
- 2022-09-23 02:51:34,992:INFO:epoch: 0 step: [14504/14785], loss: 10.3253, overflow: False, scale: 536870912, lr: 0.000385, time: 596.03
- 2022-09-23 02:52:08,601:INFO:epoch: 0 step: [14560/14785], loss: 10.4719, overflow: False, scale: 536870912, lr: 0.000388, time: 600.12
- 2022-09-23 02:52:42,099:INFO:epoch: 0 step: [14616/14785], loss: 10.6865, overflow: False, scale: 536870912, lr: 0.000391, time: 598.14
- 2022-09-23 02:53:15,318:INFO:epoch: 0 step: [14672/14785], loss: 10.5218, overflow: False, scale: 536870912, lr: 0.000394, time: 593.16
- 2022-09-23 02:53:48,803:INFO:epoch: 0 step: [14728/14785], loss: 10.4066, overflow: False, scale: 536870912, lr: 0.000397, time: 597.91
- 2022-09-23 02:54:24,828:INFO:epoch: 0 step: [14784/14785], loss: 9.7606, overflow: False, scale: 536870912, lr: 0.000400, time: 643.25
- 2022-09-23 02:54:58,295:INFO:epoch: 1 step: [55/14785], loss: 9.9135, overflow: False, scale: 536870912, lr: 0.000403, time: 596.86
- 2022-09-23 02:55:31,708:INFO:epoch: 1 step: [111/14785], loss: 10.0420, overflow: False, scale: 536870912, lr: 0.000406, time: 596.62
- 2022-09-23 02:56:05,189:INFO:epoch: 1 step: [167/14785], loss: 10.0121, overflow: False, scale: 536870912, lr: 0.000409, time: 597.83
- 2022-09-23 02:56:38,710:INFO:epoch: 1 step: [223/14785], loss: 10.8215, overflow: False, scale: 536870912, lr: 0.000412, time: 598.54
- 2022-09-23 02:57:12,202:INFO:epoch: 1 step: [279/14785], loss: 10.1943, overflow: False, scale: 536870912, lr: 0.000415, time: 598.03
- 2022-09-23 02:57:45,807:INFO:epoch: 1 step: [335/14785], loss: 10.0846, overflow: False, scale: 536870912, lr: 0.000418, time: 600.07
- 2022-09-23 02:58:19,105:INFO:epoch: 1 step: [391/14785], loss: 9.9233, overflow: False, scale: 536870912, lr: 0.000421, time: 594.55
- 2022-09-23 02:58:52,470:INFO:epoch: 1 step: [447/14785], loss: 10.0112, overflow: False, scale: 536870912, lr: 0.000425, time: 595.79
- 2022-09-23 02:59:25,798:INFO:epoch: 1 step: [503/14785], loss: 10.6138, overflow: False, scale: 536870912, lr: 0.000428, time: 595.10
- 2022-09-23 02:59:59,196:INFO:epoch: 1 step: [559/14785], loss: 10.2826, overflow: False, scale: 536870912, lr: 0.000431, time: 596.37
- 2022-09-23 03:00:32,556:INFO:epoch: 1 step: [615/14785], loss: 9.7378, overflow: False, scale: 536870912, lr: 0.000434, time: 595.66
- 2022-09-23 03:01:06,004:INFO:epoch: 1 step: [671/14785], loss: 9.7175, overflow: False, scale: 536870912, lr: 0.000437, time: 597.23
- 2022-09-23 03:01:39,401:INFO:epoch: 1 step: [727/14785], loss: 10.2615, overflow: False, scale: 536870912, lr: 0.000440, time: 596.35
- 2022-09-23 03:02:12,692:INFO:epoch: 1 step: [783/14785], loss: 9.8738, overflow: False, scale: 536870912, lr: 0.000444, time: 594.45
- 2022-09-23 03:02:46,104:INFO:epoch: 1 step: [839/14785], loss: 9.8551, overflow: False, scale: 536870912, lr: 0.000447, time: 596.61
- 2022-09-23 03:03:19,605:INFO:epoch: 1 step: [895/14785], loss: 9.6866, overflow: False, scale: 536870912, lr: 0.000450, time: 598.20
- 2022-09-23 03:03:53,005:INFO:epoch: 1 step: [951/14785], loss: 9.6938, overflow: False, scale: 536870912, lr: 0.000453, time: 596.40
- 2022-09-23 03:04:26,502:INFO:epoch: 1 step: [1007/14785], loss: 9.9726, overflow: False, scale: 536870912, lr: 0.000456, time: 598.13
- 2022-09-23 03:04:59,798:INFO:epoch: 1 step: [1063/14785], loss: 10.4103, overflow: False, scale: 536870912, lr: 0.000460, time: 594.53
- 2022-09-23 03:05:33,301:INFO:epoch: 1 step: [1119/14785], loss: 10.8234, overflow: False, scale: 536870912, lr: 0.000463, time: 598.21
- 2022-09-23 03:06:06,716:INFO:epoch: 1 step: [1175/14785], loss: 9.7978, overflow: False, scale: 1073741824, lr: 0.000466, time: 596.67
- 2022-09-23 03:06:39,901:INFO:epoch: 1 step: [1231/14785], loss: 10.4200, overflow: False, scale: 1073741824, lr: 0.000469, time: 592.53
- 2022-09-23 03:07:13,401:INFO:epoch: 1 step: [1287/14785], loss: 10.1477, overflow: False, scale: 1073741824, lr: 0.000473, time: 598.16
- 2022-09-23 03:07:46,908:INFO:epoch: 1 step: [1343/14785], loss: 9.9682, overflow: False, scale: 1073741824, lr: 0.000476, time: 598.31
- 2022-09-23 03:08:20,211:INFO:epoch: 1 step: [1399/14785], loss: 9.9840, overflow: False, scale: 1073741824, lr: 0.000479, time: 594.65
- 2022-09-23 03:08:53,703:INFO:epoch: 1 step: [1455/14785], loss: 9.4455, overflow: False, scale: 1073741824, lr: 0.000483, time: 598.04
- 2022-09-23 03:09:27,304:INFO:epoch: 1 step: [1511/14785], loss: 10.2112, overflow: False, scale: 1073741824, lr: 0.000486, time: 599.98
- 2022-09-23 03:10:00,898:INFO:epoch: 1 step: [1567/14785], loss: 9.9426, overflow: False, scale: 1073741824, lr: 0.000489, time: 599.81
- 2022-09-23 03:10:34,513:INFO:epoch: 1 step: [1623/14785], loss: 9.4957, overflow: False, scale: 1073741824, lr: 0.000493, time: 600.21
- 2022-09-23 03:11:08,005:INFO:epoch: 1 step: [1679/14785], loss: 10.0776, overflow: False, scale: 1073741824, lr: 0.000496, time: 598.04
- 2022-09-23 03:11:41,296:INFO:epoch: 1 step: [1735/14785], loss: 10.0944, overflow: False, scale: 1073741824, lr: 0.000499, time: 594.47
- 2022-09-23 03:12:14,725:INFO:epoch: 1 step: [1791/14785], loss: 10.0862, overflow: False, scale: 1073741824, lr: 0.000503, time: 596.91
- 2022-09-23 03:12:48,000:INFO:epoch: 1 step: [1847/14785], loss: 9.5503, overflow: False, scale: 1073741824, lr: 0.000506, time: 594.18
- 2022-09-23 03:13:21,501:INFO:epoch: 1 step: [1903/14785], loss: 10.0269, overflow: False, scale: 1073741824, lr: 0.000510, time: 598.16
- 2022-09-23 03:13:55,101:INFO:epoch: 1 step: [1959/14785], loss: 9.2692, overflow: False, scale: 1073741824, lr: 0.000513, time: 599.97
- 2022-09-23 03:14:28,698:INFO:epoch: 1 step: [2015/14785], loss: 9.5302, overflow: False, scale: 1073741824, lr: 0.000517, time: 599.90
- 2022-09-23 03:15:02,196:INFO:epoch: 1 step: [2071/14785], loss: 10.3091, overflow: False, scale: 1073741824, lr: 0.000520, time: 598.14
- 2022-09-23 03:15:35,504:INFO:epoch: 1 step: [2127/14785], loss: 9.7635, overflow: False, scale: 1073741824, lr: 0.000523, time: 594.75
- 2022-09-23 03:16:08,805:INFO:epoch: 1 step: [2183/14785], loss: 9.6887, overflow: False, scale: 1073741824, lr: 0.000527, time: 594.61
- 2022-09-23 03:16:42,212:INFO:epoch: 1 step: [2239/14785], loss: 9.9813, overflow: False, scale: 1073741824, lr: 0.000530, time: 596.49
- 2022-09-23 03:17:15,809:INFO:epoch: 1 step: [2295/14785], loss: 9.6722, overflow: False, scale: 1073741824, lr: 0.000534, time: 599.91
- 2022-09-23 03:17:49,304:INFO:epoch: 1 step: [2351/14785], loss: 9.8699, overflow: False, scale: 1073741824, lr: 0.000537, time: 598.10
- 2022-09-23 03:18:22,894:INFO:epoch: 1 step: [2407/14785], loss: 9.5002, overflow: False, scale: 1073741824, lr: 0.000541, time: 599.74
- 2022-09-23 03:18:56,503:INFO:epoch: 1 step: [2463/14785], loss: 9.3329, overflow: False, scale: 1073741824, lr: 0.000544, time: 600.12
- 2022-09-23 03:19:29,994:INFO:epoch: 1 step: [2519/14785], loss: 9.6709, overflow: False, scale: 1073741824, lr: 0.000548, time: 598.01
- 2022-09-23 03:20:03,206:INFO:epoch: 1 step: [2575/14785], loss: 9.7537, overflow: False, scale: 1073741824, lr: 0.000552, time: 593.02
- 2022-09-23 03:20:36,595:INFO:epoch: 1 step: [2631/14785], loss: 9.9433, overflow: False, scale: 1073741824, lr: 0.000555, time: 596.20
- 2022-09-23 03:21:10,194:INFO:epoch: 1 step: [2687/14785], loss: 9.4496, overflow: False, scale: 1073741824, lr: 0.000559, time: 599.92
- 2022-09-23 03:21:43,805:INFO:epoch: 1 step: [2743/14785], loss: 9.3328, overflow: False, scale: 1073741824, lr: 0.000562, time: 600.15
- 2022-09-23 03:22:17,202:INFO:epoch: 1 step: [2799/14785], loss: 9.3303, overflow: False, scale: 1073741824, lr: 0.000566, time: 596.35
- 2022-09-23 03:22:50,802:INFO:epoch: 1 step: [2855/14785], loss: 9.6691, overflow: False, scale: 1073741824, lr: 0.000569, time: 599.92
- 2022-09-23 03:23:24,295:INFO:epoch: 1 step: [2911/14785], loss: 9.4577, overflow: False, scale: 1073741824, lr: 0.000573, time: 598.04
- 2022-09-23 03:23:57,678:INFO:epoch: 1 step: [2967/14785], loss: 9.6184, overflow: False, scale: 1073741824, lr: 0.000577, time: 596.07
- 2022-09-23 03:24:30,973:INFO:epoch: 1 step: [3023/14785], loss: 10.2277, overflow: False, scale: 1073741824, lr: 0.000580, time: 594.50
- 2022-09-23 03:25:04,413:INFO:epoch: 1 step: [3079/14785], loss: 9.8070, overflow: False, scale: 1073741824, lr: 0.000584, time: 597.11
- 2022-09-23 03:25:37,805:INFO:epoch: 1 step: [3135/14785], loss: 9.3532, overflow: False, scale: 1073741824, lr: 0.000588, time: 596.26
- 2022-09-23 03:26:11,249:INFO:epoch: 1 step: [3191/14785], loss: 9.8893, overflow: False, scale: 2147483648, lr: 0.000591, time: 597.18
- 2022-09-23 03:26:44,600:INFO:epoch: 1 step: [3247/14785], loss: 9.8718, overflow: False, scale: 2147483648, lr: 0.000595, time: 595.53
- 2022-09-23 03:27:18,006:INFO:epoch: 1 step: [3303/14785], loss: 9.4158, overflow: False, scale: 2147483648, lr: 0.000599, time: 596.50
- 2022-09-23 03:27:51,201:INFO:epoch: 1 step: [3359/14785], loss: 10.1599, overflow: False, scale: 2147483648, lr: 0.000602, time: 592.75
- 2022-09-23 03:28:24,502:INFO:epoch: 1 step: [3415/14785], loss: 10.5004, overflow: False, scale: 2147483648, lr: 0.000606, time: 594.62
- 2022-09-23 03:28:57,993:INFO:epoch: 1 step: [3471/14785], loss: 10.0227, overflow: False, scale: 2147483648, lr: 0.000610, time: 597.99
- 2022-09-23 03:29:31,617:INFO:epoch: 1 step: [3527/14785], loss: 10.0540, overflow: False, scale: 2147483648, lr: 0.000614, time: 600.41
- 2022-09-23 03:30:05,007:INFO:epoch: 1 step: [3583/14785], loss: 9.1539, overflow: False, scale: 2147483648, lr: 0.000617, time: 596.20
- 2022-09-23 03:30:38,509:INFO:epoch: 1 step: [3639/14785], loss: 9.7598, overflow: False, scale: 2147483648, lr: 0.000621, time: 598.22
- 2022-09-23 03:31:12,098:INFO:epoch: 1 step: [3695/14785], loss: 9.9807, overflow: False, scale: 2147483648, lr: 0.000625, time: 599.77
- 2022-09-23 03:31:45,605:INFO:epoch: 1 step: [3751/14785], loss: 9.1850, overflow: False, scale: 2147483648, lr: 0.000629, time: 598.29
- 2022-09-23 03:32:19,102:INFO:epoch: 1 step: [3807/14785], loss: 9.7599, overflow: False, scale: 2147483648, lr: 0.000633, time: 598.12
- 2022-09-23 03:32:52,702:INFO:epoch: 1 step: [3863/14785], loss: 9.5363, overflow: False, scale: 2147483648, lr: 0.000636, time: 599.98
- 2022-09-23 03:33:26,107:INFO:epoch: 1 step: [3919/14785], loss: 10.0038, overflow: False, scale: 2147483648, lr: 0.000640, time: 596.49
- 2022-09-23 03:33:59,610:INFO:epoch: 1 step: [3975/14785], loss: 9.7335, overflow: False, scale: 2147483648, lr: 0.000644, time: 598.22
- 2022-09-23 03:34:32,997:INFO:epoch: 1 step: [4031/14785], loss: 9.7595, overflow: False, scale: 2147483648, lr: 0.000648, time: 596.15
- 2022-09-23 03:35:06,407:INFO:epoch: 1 step: [4087/14785], loss: 10.2106, overflow: False, scale: 2147483648, lr: 0.000652, time: 596.56
- 2022-09-23 03:35:39,810:INFO:epoch: 1 step: [4143/14785], loss: 9.9412, overflow: False, scale: 2147483648, lr: 0.000656, time: 596.42
- 2022-09-23 03:36:13,201:INFO:epoch: 1 step: [4199/14785], loss: 9.5799, overflow: False, scale: 2147483648, lr: 0.000660, time: 596.24
- 2022-09-23 03:36:46,593:INFO:epoch: 1 step: [4255/14785], loss: 9.5966, overflow: False, scale: 2147483648, lr: 0.000663, time: 596.24
- 2022-09-23 03:37:20,202:INFO:epoch: 1 step: [4311/14785], loss: 9.2931, overflow: False, scale: 2147483648, lr: 0.000667, time: 600.13
- 2022-09-23 03:37:53,700:INFO:epoch: 1 step: [4367/14785], loss: 9.5444, overflow: False, scale: 2147483648, lr: 0.000671, time: 598.14
- 2022-09-23 03:38:27,194:INFO:epoch: 1 step: [4423/14785], loss: 9.3101, overflow: False, scale: 2147483648, lr: 0.000675, time: 598.07
- 2022-09-23 03:39:00,613:INFO:epoch: 1 step: [4479/14785], loss: 9.1899, overflow: False, scale: 2147483648, lr: 0.000679, time: 596.71
- 2022-09-23 03:39:33,998:INFO:epoch: 1 step: [4535/14785], loss: 9.5395, overflow: False, scale: 2147483648, lr: 0.000683, time: 596.13
- 2022-09-23 03:40:07,597:INFO:epoch: 1 step: [4591/14785], loss: 9.9291, overflow: False, scale: 2147483648, lr: 0.000687, time: 599.94
- 2022-09-23 03:40:40,897:INFO:epoch: 1 step: [4647/14785], loss: 9.4937, overflow: False, scale: 2147483648, lr: 0.000691, time: 594.62
- 2022-09-23 03:41:14,203:INFO:epoch: 1 step: [4703/14785], loss: 9.2657, overflow: False, scale: 2147483648, lr: 0.000695, time: 594.70
- 2022-09-23 03:41:47,710:INFO:epoch: 1 step: [4759/14785], loss: 9.9678, overflow: False, scale: 2147483648, lr: 0.000699, time: 598.30
- 2022-09-23 03:42:21,298:INFO:epoch: 1 step: [4815/14785], loss: 9.3597, overflow: False, scale: 2147483648, lr: 0.000703, time: 599.74
- 2022-09-23 03:42:54,811:INFO:epoch: 1 step: [4871/14785], loss: 9.5553, overflow: False, scale: 2147483648, lr: 0.000707, time: 598.39
- 2022-09-23 03:43:28,305:INFO:epoch: 1 step: [4927/14785], loss: 9.8914, overflow: False, scale: 2147483648, lr: 0.000711, time: 598.07
- 2022-09-23 03:44:01,705:INFO:epoch: 1 step: [4983/14785], loss: 9.5019, overflow: False, scale: 2147483648, lr: 0.000715, time: 596.39
- 2022-09-23 03:44:35,194:INFO:epoch: 1 step: [5039/14785], loss: 9.4579, overflow: False, scale: 2147483648, lr: 0.000719, time: 597.99
- 2022-09-23 03:45:08,797:INFO:epoch: 1 step: [5095/14785], loss: 9.9354, overflow: False, scale: 2147483648, lr: 0.000723, time: 599.99
- 2022-09-23 03:45:42,197:INFO:epoch: 1 step: [5151/14785], loss: 9.7913, overflow: False, scale: 2147483648, lr: 0.000727, time: 596.39
- 2022-09-23 03:46:15,703:INFO:epoch: 1 step: [5207/14785], loss: 9.5369, overflow: False, scale: 4294967296, lr: 0.000731, time: 598.28
- 2022-09-23 03:46:49,306:INFO:epoch: 1 step: [5263/14785], loss: 9.3103, overflow: False, scale: 4294967296, lr: 0.000736, time: 600.00
- 2022-09-23 03:47:22,808:INFO:epoch: 1 step: [5319/14785], loss: 9.5749, overflow: False, scale: 4294967296, lr: 0.000740, time: 598.22
- 2022-09-23 03:47:56,306:INFO:epoch: 1 step: [5375/14785], loss: 8.7258, overflow: False, scale: 4294967296, lr: 0.000744, time: 598.12
- 2022-09-23 03:48:29,715:INFO:epoch: 1 step: [5431/14785], loss: 10.0820, overflow: False, scale: 4294967296, lr: 0.000748, time: 596.54
- 2022-09-23 03:49:03,218:INFO:epoch: 1 step: [5487/14785], loss: 9.1794, overflow: False, scale: 4294967296, lr: 0.000752, time: 598.22
- 2022-09-23 03:49:36,695:INFO:epoch: 1 step: [5543/14785], loss: 9.2224, overflow: False, scale: 4294967296, lr: 0.000756, time: 597.78
- 2022-09-23 03:50:10,300:INFO:epoch: 1 step: [5599/14785], loss: 9.7375, overflow: False, scale: 4294967296, lr: 0.000760, time: 600.04
- 2022-09-23 03:50:43,799:INFO:epoch: 1 step: [5655/14785], loss: 9.6366, overflow: False, scale: 4294967296, lr: 0.000765, time: 598.16
- 2022-09-23 03:51:17,297:INFO:epoch: 1 step: [5711/14785], loss: 9.4326, overflow: False, scale: 4294967296, lr: 0.000769, time: 598.12
- 2022-09-23 03:51:50,699:INFO:epoch: 1 step: [5767/14785], loss: 9.2774, overflow: False, scale: 4294967296, lr: 0.000773, time: 596.43
- 2022-09-23 03:52:24,199:INFO:epoch: 1 step: [5823/14785], loss: 9.7048, overflow: False, scale: 4294967296, lr: 0.000777, time: 598.18
- 2022-09-23 03:52:57,595:INFO:epoch: 1 step: [5879/14785], loss: 9.8731, overflow: False, scale: 4294967296, lr: 0.000781, time: 596.30
- 2022-09-23 03:53:31,207:INFO:epoch: 1 step: [5935/14785], loss: 9.3940, overflow: False, scale: 4294967296, lr: 0.000786, time: 600.17
- 2022-09-23 03:54:04,694:INFO:epoch: 1 step: [5991/14785], loss: 9.7185, overflow: False, scale: 4294967296, lr: 0.000790, time: 597.95
- 2022-09-23 03:54:37,997:INFO:epoch: 1 step: [6047/14785], loss: 9.8123, overflow: False, scale: 4294967296, lr: 0.000794, time: 594.66
- 2022-09-23 03:55:11,498:INFO:epoch: 1 step: [6103/14785], loss: 9.6827, overflow: False, scale: 4294967296, lr: 0.000798, time: 598.19
- 2022-09-23 03:55:44,905:INFO:epoch: 1 step: [6159/14785], loss: 9.5119, overflow: False, scale: 4294967296, lr: 0.000803, time: 596.50
- 2022-09-23 03:56:18,509:INFO:epoch: 1 step: [6215/14785], loss: 9.8186, overflow: False, scale: 4294967296, lr: 0.000807, time: 600.04
- 2022-09-23 03:56:51,895:INFO:epoch: 1 step: [6271/14785], loss: 9.0141, overflow: False, scale: 4294967296, lr: 0.000811, time: 596.14
- 2022-09-23 03:57:25,309:INFO:epoch: 1 step: [6327/14785], loss: 9.4876, overflow: False, scale: 4294967296, lr: 0.000816, time: 596.64
- 2022-09-23 03:57:58,697:INFO:epoch: 1 step: [6383/14785], loss: 9.3665, overflow: False, scale: 4294967296, lr: 0.000820, time: 596.17
- 2022-09-23 03:58:32,210:INFO:epoch: 1 step: [6439/14785], loss: 10.0529, overflow: False, scale: 4294967296, lr: 0.000824, time: 598.41
- 2022-09-23 03:59:05,696:INFO:epoch: 1 step: [6495/14785], loss: 9.4561, overflow: False, scale: 4294967296, lr: 0.000829, time: 597.93
- 2022-09-23 03:59:38,899:INFO:epoch: 1 step: [6551/14785], loss: 9.4037, overflow: False, scale: 4294967296, lr: 0.000833, time: 592.86
- 2022-09-23 04:00:12,402:INFO:epoch: 1 step: [6607/14785], loss: 9.6566, overflow: False, scale: 4294967296, lr: 0.000837, time: 598.21
- 2022-09-23 04:00:45,910:INFO:epoch: 1 step: [6663/14785], loss: 9.2493, overflow: False, scale: 4294967296, lr: 0.000842, time: 598.30
- 2022-09-23 04:01:19,302:INFO:epoch: 1 step: [6719/14785], loss: 9.5394, overflow: False, scale: 4294967296, lr: 0.000846, time: 596.22
- 2022-09-23 04:01:52,797:INFO:epoch: 1 step: [6775/14785], loss: 9.0807, overflow: False, scale: 4294967296, lr: 0.000851, time: 598.10
- 2022-09-23 04:02:26,405:INFO:epoch: 1 step: [6831/14785], loss: 9.5008, overflow: False, scale: 4294967296, lr: 0.000855, time: 600.09
- 2022-09-23 04:02:59,805:INFO:epoch: 1 step: [6887/14785], loss: 8.7618, overflow: False, scale: 4294967296, lr: 0.000860, time: 596.40
- 2022-09-23 04:03:33,311:INFO:epoch: 1 step: [6943/14785], loss: 8.9230, overflow: False, scale: 4294967296, lr: 0.000864, time: 598.28
- 2022-09-23 04:04:06,797:INFO:epoch: 1 step: [6999/14785], loss: 9.2629, overflow: False, scale: 4294967296, lr: 0.000868, time: 597.94
- 2022-09-23 04:04:40,197:INFO:epoch: 1 step: [7055/14785], loss: 9.1311, overflow: False, scale: 4294967296, lr: 0.000873, time: 596.36
- 2022-09-23 04:05:13,618:INFO:epoch: 1 step: [7111/14785], loss: 9.6274, overflow: False, scale: 4294967296, lr: 0.000877, time: 596.77
- 2022-09-23 04:05:47,199:INFO:epoch: 1 step: [7167/14785], loss: 9.7792, overflow: False, scale: 8589934592, lr: 0.000882, time: 599.60
- 2022-09-23 04:06:20,609:INFO:epoch: 1 step: [7223/14785], loss: 9.5661, overflow: False, scale: 8589934592, lr: 0.000886, time: 596.58
- 2022-09-23 04:06:54,106:INFO:epoch: 1 step: [7279/14785], loss: 9.0128, overflow: False, scale: 8589934592, lr: 0.000891, time: 598.12
- 2022-09-23 04:07:27,701:INFO:epoch: 1 step: [7335/14785], loss: 9.0903, overflow: False, scale: 8589934592, lr: 0.000895, time: 599.87
- 2022-09-23 04:08:01,306:INFO:epoch: 1 step: [7391/14785], loss: 9.2562, overflow: False, scale: 8589934592, lr: 0.000900, time: 600.03
- 2022-09-23 04:08:34,798:INFO:epoch: 1 step: [7447/14785], loss: 9.7419, overflow: False, scale: 8589934592, lr: 0.000905, time: 598.03
- 2022-09-23 04:09:08,393:INFO:epoch: 1 step: [7503/14785], loss: 8.8929, overflow: False, scale: 8589934592, lr: 0.000909, time: 599.88
- 2022-09-23 04:09:41,802:INFO:epoch: 1 step: [7559/14785], loss: 9.7744, overflow: False, scale: 8589934592, lr: 0.000914, time: 596.55
- 2022-09-23 04:10:15,297:INFO:epoch: 1 step: [7615/14785], loss: 9.5938, overflow: False, scale: 8589934592, lr: 0.000918, time: 598.07
- 2022-09-23 04:10:48,902:INFO:epoch: 1 step: [7671/14785], loss: 8.6351, overflow: False, scale: 8589934592, lr: 0.000923, time: 600.04
- 2022-09-23 04:11:22,200:INFO:epoch: 1 step: [7727/14785], loss: 9.8130, overflow: False, scale: 8589934592, lr: 0.000927, time: 594.58
- 2022-09-23 04:11:55,798:INFO:epoch: 1 step: [7783/14785], loss: 9.5215, overflow: False, scale: 8589934592, lr: 0.000932, time: 599.93
- 2022-09-23 04:12:29,407:INFO:epoch: 1 step: [7839/14785], loss: 8.8263, overflow: False, scale: 8589934592, lr: 0.000937, time: 600.13
- 2022-09-23 04:13:02,904:INFO:epoch: 1 step: [7895/14785], loss: 8.9529, overflow: False, scale: 8589934592, lr: 0.000941, time: 598.11
- 2022-09-23 04:13:36,308:INFO:epoch: 1 step: [7951/14785], loss: 9.7952, overflow: False, scale: 8589934592, lr: 0.000946, time: 596.46
- 2022-09-23 04:14:09,801:INFO:epoch: 1 step: [8007/14785], loss: 9.1061, overflow: False, scale: 8589934592, lr: 0.000951, time: 598.05
- 2022-09-23 04:14:43,306:INFO:epoch: 1 step: [8063/14785], loss: 9.8917, overflow: False, scale: 8589934592, lr: 0.000955, time: 598.26
- 2022-09-23 04:15:16,806:INFO:epoch: 1 step: [8119/14785], loss: 9.1132, overflow: False, scale: 8589934592, lr: 0.000960, time: 598.17
- 2022-09-23 04:15:50,190:INFO:epoch: 1 step: [8175/14785], loss: 9.8842, overflow: False, scale: 8589934592, lr: 0.000965, time: 596.12
- 2022-09-23 04:16:23,614:INFO:epoch: 1 step: [8231/14785], loss: 9.8217, overflow: False, scale: 8589934592, lr: 0.000969, time: 596.81
- 2022-09-23 04:16:57,097:INFO:epoch: 1 step: [8287/14785], loss: 10.0031, overflow: False, scale: 8589934592, lr: 0.000974, time: 597.87
- 2022-09-23 04:17:30,596:INFO:epoch: 1 step: [8343/14785], loss: 8.6643, overflow: False, scale: 8589934592, lr: 0.000979, time: 598.16
- 2022-09-23 04:18:04,096:INFO:epoch: 1 step: [8399/14785], loss: 9.0963, overflow: False, scale: 8589934592, lr: 0.000984, time: 598.18
- 2022-09-23 04:18:37,611:INFO:epoch: 1 step: [8455/14785], loss: 9.6278, overflow: False, scale: 8589934592, lr: 0.000988, time: 598.45
- 2022-09-23 04:19:11,090:INFO:epoch: 1 step: [8511/14785], loss: 9.0905, overflow: False, scale: 8589934592, lr: 0.000993, time: 597.80
- 2022-09-23 04:19:44,498:INFO:epoch: 1 step: [8567/14785], loss: 10.0063, overflow: False, scale: 8589934592, lr: 0.000998, time: 596.52
- 2022-09-23 04:20:17,901:INFO:epoch: 1 step: [8623/14785], loss: 8.5672, overflow: False, scale: 8589934592, lr: 0.001003, time: 596.46
- 2022-09-23 04:20:51,497:INFO:epoch: 1 step: [8679/14785], loss: 9.4380, overflow: False, scale: 8589934592, lr: 0.001008, time: 599.89
- 2022-09-23 04:21:25,011:INFO:epoch: 1 step: [8735/14785], loss: 8.8403, overflow: False, scale: 8589934592, lr: 0.001012, time: 598.43
- 2022-09-23 04:21:58,594:INFO:epoch: 1 step: [8791/14785], loss: 9.5717, overflow: False, scale: 8589934592, lr: 0.001017, time: 599.66
- 2022-09-23 04:22:31,805:INFO:epoch: 1 step: [8847/14785], loss: 9.7538, overflow: False, scale: 8589934592, lr: 0.001022, time: 593.02
- 2022-09-23 04:23:05,202:INFO:epoch: 1 step: [8903/14785], loss: 8.9386, overflow: False, scale: 8589934592, lr: 0.001027, time: 596.30
- 2022-09-23 04:23:38,697:INFO:epoch: 1 step: [8959/14785], loss: 8.9100, overflow: False, scale: 8589934592, lr: 0.001032, time: 598.09
- 2022-09-23 04:24:11,994:INFO:epoch: 1 step: [9015/14785], loss: 9.2104, overflow: False, scale: 8589934592, lr: 0.001037, time: 594.55
- 2022-09-23 04:24:45,418:INFO:epoch: 1 step: [9071/14785], loss: 8.9887, overflow: False, scale: 8589934592, lr: 0.001041, time: 596.81
- 2022-09-23 04:25:19,006:INFO:epoch: 1 step: [9127/14785], loss: 9.2940, overflow: False, scale: 8589934592, lr: 0.001046, time: 599.75
- 2022-09-23 04:25:52,513:INFO:epoch: 1 step: [9183/14785], loss: 9.2786, overflow: False, scale: 17179869184, lr: 0.001051, time: 598.30
- 2022-09-23 04:26:25,923:INFO:epoch: 1 step: [9239/14785], loss: 8.9857, overflow: False, scale: 17179869184, lr: 0.001056, time: 596.57
- 2022-09-23 04:26:59,201:INFO:epoch: 1 step: [9295/14785], loss: 9.2741, overflow: False, scale: 17179869184, lr: 0.001061, time: 594.21
- 2022-09-23 04:27:32,549:INFO:epoch: 1 step: [9351/14785], loss: 9.2751, overflow: False, scale: 17179869184, lr: 0.001066, time: 595.45
- 2022-09-23 04:28:06,099:INFO:epoch: 1 step: [9407/14785], loss: 8.8163, overflow: False, scale: 17179869184, lr: 0.001071, time: 599.07
- 2022-09-23 04:28:39,506:INFO:epoch: 1 step: [9463/14785], loss: 9.4368, overflow: False, scale: 17179869184, lr: 0.001076, time: 596.50
- 2022-09-23 04:29:12,906:INFO:epoch: 1 step: [9519/14785], loss: 9.4674, overflow: False, scale: 17179869184, lr: 0.001081, time: 596.41
- 2022-09-23 04:29:46,292:INFO:epoch: 1 step: [9575/14785], loss: 8.5828, overflow: False, scale: 17179869184, lr: 0.001086, time: 596.13
- 2022-09-23 04:30:19,789:INFO:epoch: 1 step: [9631/14785], loss: 9.2048, overflow: False, scale: 17179869184, lr: 0.001091, time: 598.13
- 2022-09-23 04:30:53,305:INFO:epoch: 1 step: [9687/14785], loss: 9.0925, overflow: False, scale: 17179869184, lr: 0.001096, time: 598.45
- 2022-09-23 04:31:26,795:INFO:epoch: 1 step: [9743/14785], loss: 9.1000, overflow: False, scale: 17179869184, lr: 0.001101, time: 597.99
- 2022-09-23 04:32:00,108:INFO:epoch: 1 step: [9799/14785], loss: 8.7560, overflow: False, scale: 17179869184, lr: 0.001106, time: 594.82
- 2022-09-23 04:32:33,399:INFO:epoch: 1 step: [9855/14785], loss: 8.5790, overflow: False, scale: 17179869184, lr: 0.001111, time: 594.43
- 2022-09-23 04:33:06,910:INFO:epoch: 1 step: [9911/14785], loss: 9.0419, overflow: False, scale: 17179869184, lr: 0.001116, time: 598.37
- 2022-09-23 04:33:40,097:INFO:epoch: 1 step: [9967/14785], loss: 9.1333, overflow: False, scale: 17179869184, lr: 0.001121, time: 592.57
- 2022-09-23 04:34:13,611:INFO:epoch: 1 step: [10023/14785], loss: 8.6272, overflow: False, scale: 17179869184, lr: 0.001126, time: 598.44
- 2022-09-23 04:34:47,100:INFO:epoch: 1 step: [10079/14785], loss: 9.8033, overflow: False, scale: 17179869184, lr: 0.001131, time: 597.97
- 2022-09-23 04:35:20,501:INFO:epoch: 1 step: [10135/14785], loss: 9.5774, overflow: False, scale: 17179869184, lr: 0.001136, time: 596.41
- 2022-09-23 04:35:53,892:INFO:epoch: 1 step: [10191/14785], loss: 9.1902, overflow: False, scale: 17179869184, lr: 0.001142, time: 596.21
- 2022-09-23 04:36:27,397:INFO:epoch: 1 step: [10247/14785], loss: 8.9970, overflow: False, scale: 17179869184, lr: 0.001147, time: 598.27
- 2022-09-23 04:37:00,901:INFO:epoch: 1 step: [10303/14785], loss: 9.4398, overflow: False, scale: 17179869184, lr: 0.001152, time: 598.24
- 2022-09-23 04:37:34,402:INFO:epoch: 1 step: [10359/14785], loss: 8.8445, overflow: False, scale: 17179869184, lr: 0.001157, time: 598.20
- 2022-09-23 04:38:07,963:INFO:epoch: 1 step: [10415/14785], loss: 9.1666, overflow: False, scale: 17179869184, lr: 0.001162, time: 599.26
- 2022-09-23 04:38:41,505:INFO:epoch: 1 step: [10471/14785], loss: 9.4863, overflow: False, scale: 17179869184, lr: 0.001167, time: 598.92
- 2022-09-23 04:39:14,897:INFO:epoch: 1 step: [10527/14785], loss: 8.9336, overflow: False, scale: 17179869184, lr: 0.001172, time: 596.24
- 2022-09-23 04:39:48,402:INFO:epoch: 1 step: [10583/14785], loss: 9.1719, overflow: False, scale: 17179869184, lr: 0.001178, time: 598.26
- 2022-09-23 04:40:21,804:INFO:epoch: 1 step: [10639/14785], loss: 9.3856, overflow: False, scale: 17179869184, lr: 0.001183, time: 596.43
- 2022-09-23 04:40:55,196:INFO:epoch: 1 step: [10695/14785], loss: 9.1431, overflow: False, scale: 17179869184, lr: 0.001188, time: 596.26
- 2022-09-23 04:41:28,498:INFO:epoch: 1 step: [10751/14785], loss: 10.0657, overflow: False, scale: 17179869184, lr: 0.001193, time: 594.65
- 2022-09-23 04:42:01,904:INFO:epoch: 1 step: [10807/14785], loss: 9.0490, overflow: False, scale: 17179869184, lr: 0.001199, time: 596.48
- 2022-09-23 04:42:35,204:INFO:epoch: 1 step: [10863/14785], loss: 8.5793, overflow: False, scale: 17179869184, lr: 0.001204, time: 594.62
- 2022-09-23 04:43:08,709:INFO:epoch: 1 step: [10919/14785], loss: 9.1504, overflow: False, scale: 17179869184, lr: 0.001209, time: 598.26
- 2022-09-23 04:43:42,212:INFO:epoch: 1 step: [10975/14785], loss: 8.9307, overflow: False, scale: 17179869184, lr: 0.001214, time: 598.23
- 2022-09-23 04:44:15,702:INFO:epoch: 1 step: [11031/14785], loss: 8.9479, overflow: False, scale: 17179869184, lr: 0.001220, time: 598.00
- 2022-09-23 04:44:49,309:INFO:epoch: 1 step: [11087/14785], loss: 9.1552, overflow: False, scale: 17179869184, lr: 0.001225, time: 600.07
- 2022-09-23 04:45:22,715:INFO:epoch: 1 step: [11143/14785], loss: 8.9551, overflow: False, scale: 17179869184, lr: 0.001230, time: 596.50
- 2022-09-23 04:45:56,198:INFO:epoch: 1 step: [11199/14785], loss: 8.9501, overflow: False, scale: 34359738368, lr: 0.001236, time: 597.86
- 2022-09-23 04:46:29,801:INFO:epoch: 1 step: [11255/14785], loss: 9.1171, overflow: False, scale: 34359738368, lr: 0.001241, time: 599.99
- 2022-09-23 04:47:03,399:INFO:epoch: 1 step: [11311/14785], loss: 9.4663, overflow: False, scale: 34359738368, lr: 0.001246, time: 599.91
- 2022-09-23 04:47:36,718:INFO:epoch: 1 step: [11367/14785], loss: 9.3201, overflow: False, scale: 34359738368, lr: 0.001252, time: 594.93
- 2022-09-23 04:48:10,197:INFO:epoch: 1 step: [11423/14785], loss: 8.7724, overflow: False, scale: 34359738368, lr: 0.001257, time: 597.81
- 2022-09-23 04:48:43,805:INFO:epoch: 1 step: [11479/14785], loss: 9.2772, overflow: False, scale: 34359738368, lr: 0.001262, time: 600.12
- 2022-09-23 04:49:17,202:INFO:epoch: 1 step: [11535/14785], loss: 9.1816, overflow: False, scale: 34359738368, lr: 0.001268, time: 596.34
- 2022-09-23 04:49:50,606:INFO:epoch: 1 step: [11591/14785], loss: 9.0535, overflow: False, scale: 34359738368, lr: 0.001273, time: 596.47
- 2022-09-23 04:50:24,006:INFO:epoch: 1 step: [11647/14785], loss: 8.8178, overflow: False, scale: 34359738368, lr: 0.001279, time: 596.37
- 2022-09-23 04:50:57,502:INFO:epoch: 1 step: [11703/14785], loss: 8.2567, overflow: False, scale: 34359738368, lr: 0.001284, time: 598.11
- 2022-09-23 04:51:31,112:INFO:epoch: 1 step: [11759/14785], loss: 8.8954, overflow: False, scale: 34359738368, lr: 0.001289, time: 600.15
- 2022-09-23 04:52:04,601:INFO:epoch: 1 step: [11815/14785], loss: 8.6515, overflow: False, scale: 34359738368, lr: 0.001295, time: 597.98
- 2022-09-23 04:52:38,098:INFO:epoch: 1 step: [11871/14785], loss: 8.9237, overflow: False, scale: 34359738368, lr: 0.001300, time: 598.12
- 2022-09-23 04:53:11,704:INFO:epoch: 1 step: [11927/14785], loss: 8.5546, overflow: False, scale: 34359738368, lr: 0.001306, time: 600.05
- 2022-09-23 04:53:45,093:INFO:epoch: 1 step: [11983/14785], loss: 8.6092, overflow: False, scale: 34359738368, lr: 0.001311, time: 596.19
- 2022-09-23 04:54:18,303:INFO:epoch: 1 step: [12039/14785], loss: 8.1982, overflow: False, scale: 34359738368, lr: 0.001317, time: 593.01
- 2022-09-23 04:54:51,757:INFO:epoch: 1 step: [12095/14785], loss: 8.6023, overflow: False, scale: 34359738368, lr: 0.001322, time: 597.35
- 2022-09-23 04:55:25,205:INFO:epoch: 1 step: [12151/14785], loss: 8.7860, overflow: False, scale: 34359738368, lr: 0.001328, time: 597.26
- 2022-09-23 04:55:58,804:INFO:epoch: 1 step: [12207/14785], loss: 9.0460, overflow: False, scale: 34359738368, lr: 0.001333, time: 599.93
- 2022-09-23 04:56:32,302:INFO:epoch: 1 step: [12263/14785], loss: 9.1677, overflow: False, scale: 34359738368, lr: 0.001339, time: 598.14
- 2022-09-23 04:57:05,693:INFO:epoch: 1 step: [12319/14785], loss: 8.8656, overflow: False, scale: 34359738368, lr: 0.001344, time: 596.24
- 2022-09-23 04:57:39,213:INFO:epoch: 1 step: [12375/14785], loss: 8.9426, overflow: False, scale: 34359738368, lr: 0.001350, time: 598.53
- 2022-09-23 04:58:12,697:INFO:epoch: 1 step: [12431/14785], loss: 8.6160, overflow: False, scale: 34359738368, lr: 0.001355, time: 597.90
- 2022-09-23 04:58:46,198:INFO:epoch: 1 step: [12487/14785], loss: 8.8030, overflow: False, scale: 34359738368, lr: 0.001361, time: 598.19
- 2022-09-23 04:59:19,506:INFO:epoch: 1 step: [12543/14785], loss: 7.9534, overflow: False, scale: 34359738368, lr: 0.001367, time: 594.75
- 2022-09-23 04:59:53,109:INFO:epoch: 1 step: [12599/14785], loss: 9.1145, overflow: False, scale: 34359738368, lr: 0.001372, time: 600.02
- 2022-09-23 05:00:26,509:INFO:epoch: 1 step: [12655/14785], loss: 8.4000, overflow: False, scale: 34359738368, lr: 0.001378, time: 596.38
- 2022-09-23 05:01:00,007:INFO:epoch: 1 step: [12711/14785], loss: 8.6712, overflow: False, scale: 34359738368, lr: 0.001384, time: 598.14
- 2022-09-23 05:01:33,300:INFO:epoch: 1 step: [12767/14785], loss: 8.8747, overflow: False, scale: 34359738368, lr: 0.001389, time: 594.49
- 2022-09-23 05:02:06,798:INFO:epoch: 1 step: [12823/14785], loss: 8.9567, overflow: False, scale: 34359738368, lr: 0.001395, time: 598.13
- 2022-09-23 05:02:39,998:INFO:epoch: 1 step: [12879/14785], loss: 9.4060, overflow: False, scale: 34359738368, lr: 0.001400, time: 592.81
- 2022-09-23 05:03:13,301:INFO:epoch: 1 step: [12935/14785], loss: 8.4930, overflow: False, scale: 34359738368, lr: 0.001406, time: 594.64
- 2022-09-23 05:03:46,787:INFO:epoch: 1 step: [12991/14785], loss: 8.9740, overflow: False, scale: 34359738368, lr: 0.001412, time: 597.91
- 2022-09-23 05:04:20,404:INFO:epoch: 1 step: [13047/14785], loss: 8.8689, overflow: False, scale: 34359738368, lr: 0.001418, time: 600.27
- 2022-09-23 05:04:53,896:INFO:epoch: 1 step: [13103/14785], loss: 9.0985, overflow: False, scale: 34359738368, lr: 0.001423, time: 598.02
- 2022-09-23 05:05:27,305:INFO:epoch: 1 step: [13159/14785], loss: 9.0903, overflow: False, scale: 68719476736, lr: 0.001429, time: 596.57
- 2022-09-23 05:06:00,898:INFO:epoch: 1 step: [13215/14785], loss: 8.9772, overflow: False, scale: 68719476736, lr: 0.001435, time: 599.84
- 2022-09-23 05:06:34,122:INFO:epoch: 1 step: [13271/14785], loss: 8.5632, overflow: False, scale: 68719476736, lr: 0.001440, time: 593.24
- 2022-09-23 05:07:07,501:INFO:epoch: 1 step: [13327/14785], loss: 8.9135, overflow: False, scale: 68719476736, lr: 0.001446, time: 596.00
- 2022-09-23 05:07:41,059:INFO:epoch: 1 step: [13383/14785], loss: 9.6293, overflow: False, scale: 68719476736, lr: 0.001452, time: 599.22
- 2022-09-23 05:08:14,606:INFO:epoch: 1 step: [13439/14785], loss: 8.6038, overflow: False, scale: 68719476736, lr: 0.001458, time: 598.99
- 2022-09-23 05:08:48,205:INFO:epoch: 1 step: [13495/14785], loss: 7.9577, overflow: False, scale: 68719476736, lr: 0.001464, time: 599.95
- 2022-09-23 05:09:21,703:INFO:epoch: 1 step: [13551/14785], loss: 8.0801, overflow: False, scale: 68719476736, lr: 0.001469, time: 598.11
- 2022-09-23 05:09:55,205:INFO:epoch: 1 step: [13607/14785], loss: 8.7831, overflow: False, scale: 68719476736, lr: 0.001475, time: 598.22
- 2022-09-23 05:10:28,805:INFO:epoch: 1 step: [13663/14785], loss: 10.1177, overflow: False, scale: 68719476736, lr: 0.001481, time: 599.95
- 2022-09-23 05:11:02,311:INFO:epoch: 1 step: [13719/14785], loss: 8.8395, overflow: False, scale: 68719476736, lr: 0.001487, time: 598.27
- 2022-09-23 05:11:35,798:INFO:epoch: 1 step: [13775/14785], loss: 8.9450, overflow: False, scale: 68719476736, lr: 0.001493, time: 597.94
- 2022-09-23 05:12:09,102:INFO:epoch: 1 step: [13831/14785], loss: 8.8830, overflow: False, scale: 68719476736, lr: 0.001499, time: 594.68
- 2022-09-23 05:12:42,703:INFO:epoch: 1 step: [13887/14785], loss: 8.6471, overflow: False, scale: 68719476736, lr: 0.001504, time: 599.98
- 2022-09-23 05:13:16,175:INFO:epoch: 1 step: [13943/14785], loss: 8.9486, overflow: False, scale: 68719476736, lr: 0.001510, time: 597.67
- 2022-09-23 05:13:49,705:INFO:epoch: 1 step: [13999/14785], loss: 9.4674, overflow: False, scale: 68719476736, lr: 0.001516, time: 598.62
- 2022-09-23 05:14:23,196:INFO:epoch: 1 step: [14055/14785], loss: 8.3987, overflow: False, scale: 68719476736, lr: 0.001522, time: 598.01
- 2022-09-23 05:14:56,700:INFO:epoch: 1 step: [14111/14785], loss: 9.3581, overflow: False, scale: 68719476736, lr: 0.001528, time: 598.25
- 2022-09-23 05:15:30,206:INFO:epoch: 1 step: [14167/14785], loss: 9.1411, overflow: False, scale: 68719476736, lr: 0.001534, time: 598.28
- 2022-09-23 05:16:03,697:INFO:epoch: 1 step: [14223/14785], loss: 8.4327, overflow: False, scale: 68719476736, lr: 0.001540, time: 598.03
- 2022-09-23 05:16:37,210:INFO:epoch: 1 step: [14279/14785], loss: 8.7081, overflow: False, scale: 68719476736, lr: 0.001546, time: 598.38
- 2022-09-23 05:17:10,706:INFO:epoch: 1 step: [14335/14785], loss: 8.5417, overflow: False, scale: 68719476736, lr: 0.001552, time: 598.12
- 2022-09-23 05:17:44,300:INFO:epoch: 1 step: [14391/14785], loss: 8.1278, overflow: False, scale: 68719476736, lr: 0.001558, time: 599.85
- 2022-09-23 05:18:17,597:INFO:epoch: 1 step: [14447/14785], loss: 8.1599, overflow: False, scale: 68719476736, lr: 0.001564, time: 594.57
- 2022-09-23 05:18:50,915:INFO:epoch: 1 step: [14503/14785], loss: 8.6053, overflow: False, scale: 68719476736, lr: 0.001570, time: 594.92
- 2022-09-23 05:19:24,371:INFO:epoch: 1 step: [14559/14785], loss: 8.9080, overflow: False, scale: 68719476736, lr: 0.001576, time: 597.37
- 2022-09-23 05:19:58,000:INFO:epoch: 1 step: [14615/14785], loss: 8.1768, overflow: False, scale: 68719476736, lr: 0.001582, time: 600.48
- 2022-09-23 05:20:31,296:INFO:epoch: 1 step: [14671/14785], loss: 8.4994, overflow: False, scale: 68719476736, lr: 0.001588, time: 594.53
- 2022-09-23 05:21:04,822:INFO:epoch: 1 step: [14727/14785], loss: 8.7476, overflow: False, scale: 68719476736, lr: 0.001594, time: 598.65
- 2022-09-23 05:21:38,204:INFO:epoch: 1 step: [14783/14785], loss: 8.2223, overflow: False, scale: 68719476736, lr: 0.001600, time: 596.07
- 2022-09-23 05:22:13,562:INFO:epoch: 2 step: [54/14785], loss: 8.4084, overflow: False, scale: 68719476736, lr: 0.001606, time: 631.34
- 2022-09-23 05:22:46,901:INFO:epoch: 2 step: [110/14785], loss: 8.9394, overflow: False, scale: 68719476736, lr: 0.001612, time: 595.31
- 2022-09-23 05:23:20,305:INFO:epoch: 2 step: [166/14785], loss: 8.5618, overflow: False, scale: 68719476736, lr: 0.001618, time: 596.47
- 2022-09-23 05:23:53,605:INFO:epoch: 2 step: [222/14785], loss: 9.2533, overflow: False, scale: 68719476736, lr: 0.001624, time: 594.61
- 2022-09-23 05:24:27,202:INFO:epoch: 2 step: [278/14785], loss: 8.1937, overflow: False, scale: 68719476736, lr: 0.001630, time: 599.92
- 2022-09-23 05:25:00,499:INFO:epoch: 2 step: [334/14785], loss: 9.3451, overflow: False, scale: 68719476736, lr: 0.001636, time: 594.56
- 2022-09-23 05:25:33,999:INFO:epoch: 2 step: [390/14785], loss: 7.9416, overflow: False, scale: 137438953472, lr: 0.001643, time: 598.17
- 2022-09-23 05:26:07,498:INFO:epoch: 2 step: [446/14785], loss: 8.2450, overflow: False, scale: 137438953472, lr: 0.001649, time: 598.15
- 2022-09-23 05:26:41,029:INFO:epoch: 2 step: [502/14785], loss: 8.9656, overflow: False, scale: 137438953472, lr: 0.001655, time: 598.73
- 2022-09-23 05:27:14,494:INFO:epoch: 2 step: [558/14785], loss: 9.0547, overflow: False, scale: 137438953472, lr: 0.001661, time: 597.56
- 2022-09-23 05:27:47,899:INFO:epoch: 2 step: [614/14785], loss: 8.2176, overflow: False, scale: 137438953472, lr: 0.001667, time: 596.49
- 2022-09-23 05:28:21,299:INFO:epoch: 2 step: [670/14785], loss: 8.0190, overflow: False, scale: 137438953472, lr: 0.001673, time: 596.38
- 2022-09-23 05:28:54,699:INFO:epoch: 2 step: [726/14785], loss: 8.0467, overflow: False, scale: 137438953472, lr: 0.001680, time: 596.39
- 2022-09-23 05:29:28,195:INFO:epoch: 2 step: [782/14785], loss: 8.7046, overflow: False, scale: 137438953472, lr: 0.001686, time: 598.11
- 2022-09-23 05:30:01,708:INFO:epoch: 2 step: [838/14785], loss: 8.8456, overflow: False, scale: 137438953472, lr: 0.001692, time: 598.39
- 2022-09-23 05:30:35,304:INFO:epoch: 2 step: [894/14785], loss: 9.1878, overflow: False, scale: 137438953472, lr: 0.001698, time: 599.88
- 2022-09-23 05:31:08,696:INFO:epoch: 2 step: [950/14785], loss: 8.5812, overflow: False, scale: 137438953472, lr: 0.001705, time: 596.24
- 2022-09-23 05:31:41,995:INFO:epoch: 2 step: [1006/14785], loss: 8.4849, overflow: False, scale: 137438953472, lr: 0.001711, time: 594.58
- 2022-09-23 05:32:15,502:INFO:epoch: 2 step: [1062/14785], loss: 8.6839, overflow: False, scale: 137438953472, lr: 0.001717, time: 598.31
- 2022-09-23 05:32:49,012:INFO:epoch: 2 step: [1118/14785], loss: 9.0987, overflow: False, scale: 137438953472, lr: 0.001723, time: 598.36
- 2022-09-23 05:33:22,402:INFO:epoch: 2 step: [1174/14785], loss: 8.6762, overflow: False, scale: 137438953472, lr: 0.001730, time: 596.22
- 2022-09-23 05:33:55,804:INFO:epoch: 2 step: [1230/14785], loss: 8.5430, overflow: False, scale: 137438953472, lr: 0.001736, time: 596.43
- 2022-09-23 05:34:29,402:INFO:epoch: 2 step: [1286/14785], loss: 8.2631, overflow: False, scale: 137438953472, lr: 0.001742, time: 599.84
- 2022-09-23 05:35:02,953:INFO:epoch: 2 step: [1342/14785], loss: 8.5058, overflow: False, scale: 137438953472, lr: 0.001749, time: 599.10
- 2022-09-23 05:35:36,504:INFO:epoch: 2 step: [1398/14785], loss: 8.8853, overflow: False, scale: 137438953472, lr: 0.001755, time: 599.07
- 2022-09-23 05:36:09,898:INFO:epoch: 2 step: [1454/14785], loss: 8.3059, overflow: False, scale: 137438953472, lr: 0.001761, time: 596.28
- 2022-09-23 05:36:43,413:INFO:epoch: 2 step: [1510/14785], loss: 8.9970, overflow: False, scale: 137438953472, lr: 0.001768, time: 598.44
- 2022-09-23 05:37:16,900:INFO:epoch: 2 step: [1566/14785], loss: 8.9060, overflow: False, scale: 137438953472, lr: 0.001774, time: 597.92
- 2022-09-23 05:37:50,400:INFO:epoch: 2 step: [1622/14785], loss: 8.3835, overflow: False, scale: 137438953472, lr: 0.001780, time: 598.17
- 2022-09-23 05:38:23,702:INFO:epoch: 2 step: [1678/14785], loss: 7.7378, overflow: False, scale: 137438953472, lr: 0.001787, time: 594.65
- 2022-09-23 05:38:57,201:INFO:epoch: 2 step: [1734/14785], loss: 8.6546, overflow: False, scale: 137438953472, lr: 0.001793, time: 598.15
- 2022-09-23 05:39:30,616:INFO:epoch: 2 step: [1790/14785], loss: 8.7968, overflow: False, scale: 137438953472, lr: 0.001800, time: 596.65
- 2022-09-23 05:40:03,888:INFO:epoch: 2 step: [1846/14785], loss: 8.1595, overflow: False, scale: 137438953472, lr: 0.001806, time: 594.09
- 2022-09-23 05:40:37,289:INFO:epoch: 2 step: [1902/14785], loss: 8.3783, overflow: False, scale: 137438953472, lr: 0.001813, time: 596.41
- 2022-09-23 05:41:10,700:INFO:epoch: 2 step: [1958/14785], loss: 8.7106, overflow: False, scale: 137438953472, lr: 0.001819, time: 596.60
- 2022-09-23 05:41:44,199:INFO:epoch: 2 step: [2014/14785], loss: 8.5017, overflow: False, scale: 137438953472, lr: 0.001825, time: 598.15
- 2022-09-23 05:42:17,588:INFO:epoch: 2 step: [2070/14785], loss: 8.3189, overflow: False, scale: 137438953472, lr: 0.001832, time: 596.18
- 2022-09-23 05:42:50,992:INFO:epoch: 2 step: [2126/14785], loss: 8.0332, overflow: False, scale: 137438953472, lr: 0.001838, time: 596.46
- 2022-09-23 05:43:24,405:INFO:epoch: 2 step: [2182/14785], loss: 7.7771, overflow: False, scale: 137438953472, lr: 0.001845, time: 596.62
- 2022-09-23 05:43:57,916:INFO:epoch: 2 step: [2238/14785], loss: 8.5042, overflow: False, scale: 137438953472, lr: 0.001851, time: 598.39
- 2022-09-23 05:44:31,296:INFO:epoch: 2 step: [2294/14785], loss: 8.3703, overflow: False, scale: 137438953472, lr: 0.001858, time: 596.03
- 2022-09-23 05:45:04,595:INFO:epoch: 2 step: [2350/14785], loss: 8.8804, overflow: False, scale: 137438953472, lr: 0.001865, time: 594.59
- 2022-09-23 05:45:38,213:INFO:epoch: 2 step: [2406/14785], loss: 8.7177, overflow: False, scale: 274877906944, lr: 0.001871, time: 600.28
- 2022-09-23 05:46:11,807:INFO:epoch: 2 step: [2462/14785], loss: 8.3903, overflow: False, scale: 274877906944, lr: 0.001878, time: 599.85
- 2022-09-23 05:46:45,411:INFO:epoch: 2 step: [2518/14785], loss: 8.9127, overflow: False, scale: 274877906944, lr: 0.001884, time: 600.05
- 2022-09-23 05:47:19,008:INFO:epoch: 2 step: [2574/14785], loss: 8.6602, overflow: False, scale: 274877906944, lr: 0.001891, time: 599.92
- 2022-09-23 05:47:52,402:INFO:epoch: 2 step: [2630/14785], loss: 8.0990, overflow: False, scale: 274877906944, lr: 0.001897, time: 596.28
- 2022-09-23 05:48:25,981:INFO:epoch: 2 step: [2686/14785], loss: 9.0403, overflow: False, scale: 274877906944, lr: 0.001904, time: 599.57
- 2022-09-23 05:48:59,288:INFO:epoch: 2 step: [2742/14785], loss: 8.3040, overflow: False, scale: 274877906944, lr: 0.001911, time: 594.72
- 2022-09-23 05:49:32,703:INFO:epoch: 2 step: [2798/14785], loss: 8.4526, overflow: False, scale: 274877906944, lr: 0.001917, time: 596.64
- 2022-09-23 05:50:06,191:INFO:epoch: 2 step: [2854/14785], loss: 9.0865, overflow: False, scale: 274877906944, lr: 0.001924, time: 597.97
- 2022-09-23 05:50:39,697:INFO:epoch: 2 step: [2910/14785], loss: 9.0085, overflow: False, scale: 274877906944, lr: 0.001931, time: 598.28
- 2022-09-23 05:51:13,211:INFO:epoch: 2 step: [2966/14785], loss: 9.0092, overflow: False, scale: 274877906944, lr: 0.001937, time: 598.43
- 2022-09-23 05:51:46,802:INFO:epoch: 2 step: [3022/14785], loss: 8.0457, overflow: False, scale: 274877906944, lr: 0.001944, time: 599.79
- 2022-09-23 05:52:20,406:INFO:epoch: 2 step: [3078/14785], loss: 8.3786, overflow: False, scale: 274877906944, lr: 0.001951, time: 600.04
- 2022-09-23 05:52:53,798:INFO:epoch: 2 step: [3134/14785], loss: 8.4215, overflow: False, scale: 274877906944, lr: 0.001957, time: 596.25
- 2022-09-23 05:53:27,309:INFO:epoch: 2 step: [3190/14785], loss: 7.8594, overflow: False, scale: 274877906944, lr: 0.001964, time: 598.36
- 2022-09-23 05:54:00,814:INFO:epoch: 2 step: [3246/14785], loss: 8.3120, overflow: False, scale: 274877906944, lr: 0.001971, time: 598.25
- 2022-09-23 05:54:34,089:INFO:epoch: 2 step: [3302/14785], loss: 8.8624, overflow: False, scale: 274877906944, lr: 0.001977, time: 594.14
- 2022-09-23 05:55:07,598:INFO:epoch: 2 step: [3358/14785], loss: 8.4326, overflow: False, scale: 274877906944, lr: 0.001984, time: 598.35
- 2022-09-23 05:55:41,100:INFO:epoch: 2 step: [3414/14785], loss: 8.0675, overflow: False, scale: 274877906944, lr: 0.001991, time: 598.17
- 2022-09-23 05:56:14,712:INFO:epoch: 2 step: [3470/14785], loss: 8.3576, overflow: False, scale: 274877906944, lr: 0.001998, time: 600.18
- 2022-09-23 05:56:48,300:INFO:epoch: 2 step: [3526/14785], loss: 8.9452, overflow: False, scale: 274877906944, lr: 0.002004, time: 599.76
- 2022-09-23 05:57:21,888:INFO:epoch: 2 step: [3582/14785], loss: 8.2252, overflow: False, scale: 274877906944, lr: 0.002011, time: 599.71
- 2022-09-23 05:57:55,308:INFO:epoch: 2 step: [3638/14785], loss: 8.8926, overflow: False, scale: 274877906944, lr: 0.002018, time: 596.76
- 2022-09-23 05:58:28,902:INFO:epoch: 2 step: [3694/14785], loss: 8.4339, overflow: False, scale: 274877906944, lr: 0.002025, time: 599.85
- 2022-09-23 05:59:02,514:INFO:epoch: 2 step: [3750/14785], loss: 8.5513, overflow: False, scale: 274877906944, lr: 0.002032, time: 600.17
- 2022-09-23 05:59:35,902:INFO:epoch: 2 step: [3806/14785], loss: 7.9383, overflow: False, scale: 274877906944, lr: 0.002039, time: 596.19
- 2022-09-23 06:00:09,511:INFO:epoch: 2 step: [3862/14785], loss: 8.5998, overflow: False, scale: 274877906944, lr: 0.002045, time: 600.13
- 2022-09-23 06:00:43,100:INFO:epoch: 2 step: [3918/14785], loss: 8.1415, overflow: False, scale: 274877906944, lr: 0.002052, time: 599.77
- 2022-09-23 06:01:16,594:INFO:epoch: 2 step: [3974/14785], loss: 8.8996, overflow: False, scale: 274877906944, lr: 0.002059, time: 598.08
- 2022-09-23 06:01:50,134:INFO:epoch: 2 step: [4030/14785], loss: 8.2534, overflow: False, scale: 274877906944, lr: 0.002066, time: 598.89
- 2022-09-23 06:02:23,505:INFO:epoch: 2 step: [4086/14785], loss: 8.9520, overflow: False, scale: 274877906944, lr: 0.002073, time: 595.86
- 2022-09-23 06:02:57,095:INFO:epoch: 2 step: [4142/14785], loss: 9.2037, overflow: False, scale: 274877906944, lr: 0.002080, time: 599.80
- 2022-09-23 06:03:30,509:INFO:epoch: 2 step: [4198/14785], loss: 8.6261, overflow: False, scale: 274877906944, lr: 0.002087, time: 596.62
- 2022-09-23 06:04:04,001:INFO:epoch: 2 step: [4254/14785], loss: 8.9077, overflow: False, scale: 274877906944, lr: 0.002094, time: 598.05
- 2022-09-23 06:04:37,611:INFO:epoch: 2 step: [4310/14785], loss: 8.9284, overflow: False, scale: 274877906944, lr: 0.002101, time: 600.12
- 2022-09-23 06:05:11,098:INFO:epoch: 2 step: [4366/14785], loss: 8.2107, overflow: False, scale: 274877906944, lr: 0.002107, time: 597.94
- 2022-09-23 06:05:44,702:INFO:epoch: 2 step: [4422/14785], loss: 8.0823, overflow: False, scale: 549755813888, lr: 0.002114, time: 600.02
- 2022-09-23 06:06:18,305:INFO:epoch: 2 step: [4478/14785], loss: 8.2090, overflow: False, scale: 549755813888, lr: 0.002121, time: 600.03
- 2022-09-23 06:06:51,752:INFO:epoch: 2 step: [4534/14785], loss: 8.1061, overflow: False, scale: 549755813888, lr: 0.002128, time: 597.24
- 2022-09-23 06:07:25,300:INFO:epoch: 2 step: [4590/14785], loss: 8.5850, overflow: False, scale: 549755813888, lr: 0.002135, time: 599.03
- 2022-09-23 06:07:58,704:INFO:epoch: 2 step: [4646/14785], loss: 8.1525, overflow: False, scale: 549755813888, lr: 0.002142, time: 596.45
- 2022-09-23 06:08:32,001:INFO:epoch: 2 step: [4702/14785], loss: 8.8808, overflow: False, scale: 549755813888, lr: 0.002149, time: 594.56
- 2022-09-23 06:09:05,513:INFO:epoch: 2 step: [4758/14785], loss: 8.4109, overflow: False, scale: 549755813888, lr: 0.002156, time: 598.40
- 2022-09-23 06:09:38,702:INFO:epoch: 2 step: [4814/14785], loss: 8.2950, overflow: False, scale: 549755813888, lr: 0.002163, time: 592.61
- 2022-09-23 06:10:12,066:INFO:epoch: 2 step: [4870/14785], loss: 8.4552, overflow: False, scale: 549755813888, lr: 0.002171, time: 595.75
- 2022-09-23 06:10:45,495:INFO:epoch: 2 step: [4926/14785], loss: 8.4356, overflow: False, scale: 549755813888, lr: 0.002178, time: 596.91
- 2022-09-23 06:11:19,000:INFO:epoch: 2 step: [4982/14785], loss: 8.0225, overflow: False, scale: 549755813888, lr: 0.002185, time: 598.27
- 2022-09-23 06:11:52,606:INFO:epoch: 2 step: [5038/14785], loss: 8.4430, overflow: False, scale: 549755813888, lr: 0.002192, time: 600.06
- 2022-09-23 06:12:26,207:INFO:epoch: 2 step: [5094/14785], loss: 8.6124, overflow: False, scale: 549755813888, lr: 0.002199, time: 599.98
- 2022-09-23 06:12:59,701:INFO:epoch: 2 step: [5150/14785], loss: 7.9609, overflow: False, scale: 549755813888, lr: 0.002206, time: 598.05
- 2022-09-23 06:13:33,302:INFO:epoch: 2 step: [5206/14785], loss: 8.7263, overflow: False, scale: 549755813888, lr: 0.002213, time: 599.99
- 2022-09-23 06:14:06,901:INFO:epoch: 2 step: [5262/14785], loss: 8.7606, overflow: False, scale: 549755813888, lr: 0.002220, time: 599.95
- 2022-09-23 06:14:40,303:INFO:epoch: 2 step: [5318/14785], loss: 8.5328, overflow: False, scale: 549755813888, lr: 0.002227, time: 596.40
- 2022-09-23 06:15:13,804:INFO:epoch: 2 step: [5374/14785], loss: 8.3351, overflow: False, scale: 549755813888, lr: 0.002235, time: 598.20
- 2022-09-23 06:15:47,204:INFO:epoch: 2 step: [5430/14785], loss: 8.5267, overflow: False, scale: 549755813888, lr: 0.002242, time: 596.40
- 2022-09-23 06:16:20,811:INFO:epoch: 2 step: [5486/14785], loss: 8.3085, overflow: False, scale: 549755813888, lr: 0.002249, time: 600.09
- 2022-09-23 06:16:54,400:INFO:epoch: 2 step: [5542/14785], loss: 8.3998, overflow: False, scale: 549755813888, lr: 0.002256, time: 599.75
- 2022-09-23 06:17:27,894:INFO:epoch: 2 step: [5598/14785], loss: 8.3264, overflow: False, scale: 549755813888, lr: 0.002263, time: 598.08
- 2022-09-23 06:18:01,405:INFO:epoch: 2 step: [5654/14785], loss: 8.1848, overflow: False, scale: 549755813888, lr: 0.002270, time: 598.36
- 2022-09-23 06:18:34,801:INFO:epoch: 2 step: [5710/14785], loss: 8.3497, overflow: False, scale: 549755813888, lr: 0.002278, time: 596.32
- 2022-09-23 06:19:08,198:INFO:epoch: 2 step: [5766/14785], loss: 7.9720, overflow: False, scale: 549755813888, lr: 0.002285, time: 596.34
- 2022-09-23 06:19:41,698:INFO:epoch: 2 step: [5822/14785], loss: 8.0764, overflow: False, scale: 549755813888, lr: 0.002292, time: 598.17
- 2022-09-23 06:20:15,111:INFO:epoch: 2 step: [5878/14785], loss: 8.5998, overflow: False, scale: 549755813888, lr: 0.002299, time: 596.61
- 2022-09-23 06:20:48,710:INFO:epoch: 2 step: [5934/14785], loss: 7.7923, overflow: False, scale: 549755813888, lr: 0.002307, time: 599.96
- 2022-09-23 06:21:22,297:INFO:epoch: 2 step: [5990/14785], loss: 8.4827, overflow: False, scale: 549755813888, lr: 0.002314, time: 599.71
- 2022-09-23 06:21:55,902:INFO:epoch: 2 step: [6046/14785], loss: 8.8309, overflow: False, scale: 549755813888, lr: 0.002321, time: 600.07
- 2022-09-23 06:22:29,301:INFO:epoch: 2 step: [6102/14785], loss: 9.0007, overflow: False, scale: 549755813888, lr: 0.002329, time: 596.35
- 2022-09-23 06:23:02,901:INFO:epoch: 2 step: [6158/14785], loss: 8.7716, overflow: False, scale: 549755813888, lr: 0.002336, time: 599.96
- 2022-09-23 06:23:36,499:INFO:epoch: 2 step: [6214/14785], loss: 8.4443, overflow: False, scale: 549755813888, lr: 0.002343, time: 599.93
- 2022-09-23 06:24:10,101:INFO:epoch: 2 step: [6270/14785], loss: 8.1949, overflow: False, scale: 549755813888, lr: 0.002351, time: 599.98
- 2022-09-23 06:24:43,615:INFO:epoch: 2 step: [6326/14785], loss: 8.5530, overflow: False, scale: 549755813888, lr: 0.002358, time: 598.40
- 2022-09-23 06:25:17,200:INFO:epoch: 2 step: [6382/14785], loss: 8.8868, overflow: False, scale: 1099511627776, lr: 0.002365, time: 599.68
- 2022-09-23 06:25:50,810:INFO:epoch: 2 step: [6438/14785], loss: 8.1346, overflow: False, scale: 1099511627776, lr: 0.002373, time: 600.13
- 2022-09-23 06:26:24,198:INFO:epoch: 2 step: [6494/14785], loss: 8.6281, overflow: False, scale: 1099511627776, lr: 0.002380, time: 596.18
- 2022-09-23 06:26:57,704:INFO:epoch: 2 step: [6550/14785], loss: 8.0781, overflow: False, scale: 1099511627776, lr: 0.002387, time: 598.28
- 2022-09-23 06:27:31,005:INFO:epoch: 2 step: [6606/14785], loss: 8.0840, overflow: False, scale: 1099511627776, lr: 0.002395, time: 594.62
- 2022-09-23 06:28:04,408:INFO:epoch: 2 step: [6662/14785], loss: 8.1585, overflow: False, scale: 1099511627776, lr: 0.002402, time: 596.44
- 2022-09-23 06:28:38,010:INFO:epoch: 2 step: [6718/14785], loss: 7.9578, overflow: False, scale: 1099511627776, lr: 0.002410, time: 600.01
- 2022-09-23 06:29:11,596:INFO:epoch: 2 step: [6774/14785], loss: 8.4871, overflow: False, scale: 1099511627776, lr: 0.002417, time: 599.70
- 2022-09-23 06:29:45,209:INFO:epoch: 2 step: [6830/14785], loss: 8.7218, overflow: False, scale: 1099511627776, lr: 0.002425, time: 600.20
- 2022-09-23 06:30:18,701:INFO:epoch: 2 step: [6886/14785], loss: 8.2062, overflow: False, scale: 1099511627776, lr: 0.002432, time: 598.04
- 2022-09-23 06:30:52,111:INFO:epoch: 2 step: [6942/14785], loss: 8.1467, overflow: False, scale: 1099511627776, lr: 0.002440, time: 596.54
- 2022-09-23 06:31:25,699:INFO:epoch: 2 step: [6998/14785], loss: 7.8881, overflow: False, scale: 1099511627776, lr: 0.002447, time: 599.76
- 2022-09-23 06:31:59,190:INFO:epoch: 2 step: [7054/14785], loss: 8.1597, overflow: False, scale: 1099511627776, lr: 0.002455, time: 598.03
- 2022-09-23 06:32:32,689:INFO:epoch: 2 step: [7110/14785], loss: 8.3214, overflow: False, scale: 1099511627776, lr: 0.002462, time: 598.14
- 2022-09-23 06:33:06,007:INFO:epoch: 2 step: [7166/14785], loss: 7.9301, overflow: False, scale: 1099511627776, lr: 0.002470, time: 594.91
- 2022-09-23 06:33:39,597:INFO:epoch: 2 step: [7222/14785], loss: 8.2218, overflow: False, scale: 1099511627776, lr: 0.002477, time: 599.76
- 2022-09-23 06:34:13,205:INFO:epoch: 2 step: [7278/14785], loss: 8.7436, overflow: False, scale: 1099511627776, lr: 0.002485, time: 600.10
- 2022-09-23 06:34:46,609:INFO:epoch: 2 step: [7334/14785], loss: 8.3064, overflow: False, scale: 1099511627776, lr: 0.002492, time: 596.47
- 2022-09-23 06:35:20,199:INFO:epoch: 2 step: [7390/14785], loss: 8.0904, overflow: False, scale: 1099511627776, lr: 0.002500, time: 599.79
- 2022-09-23 06:35:53,800:INFO:epoch: 2 step: [7446/14785], loss: 8.3922, overflow: False, scale: 1099511627776, lr: 0.002507, time: 599.98
- 2022-09-23 06:36:27,197:INFO:epoch: 2 step: [7502/14785], loss: 8.1099, overflow: False, scale: 1099511627776, lr: 0.002515, time: 596.34
- 2022-09-23 06:37:00,709:INFO:epoch: 2 step: [7558/14785], loss: 7.2665, overflow: False, scale: 1099511627776, lr: 0.002523, time: 598.39
- 2022-09-23 06:37:34,103:INFO:epoch: 2 step: [7614/14785], loss: 8.3456, overflow: False, scale: 1099511627776, lr: 0.002530, time: 596.27
- 2022-09-23 06:38:07,622:INFO:epoch: 2 step: [7670/14785], loss: 8.3008, overflow: False, scale: 1099511627776, lr: 0.002538, time: 598.52
- 2022-09-23 06:38:41,204:INFO:epoch: 2 step: [7726/14785], loss: 7.8513, overflow: False, scale: 1099511627776, lr: 0.002545, time: 599.63
- 2022-09-23 06:39:14,792:INFO:epoch: 2 step: [7782/14785], loss: 7.5152, overflow: False, scale: 1099511627776, lr: 0.002553, time: 599.75
- 2022-09-23 06:39:48,207:INFO:epoch: 2 step: [7838/14785], loss: 7.5957, overflow: False, scale: 1099511627776, lr: 0.002561, time: 596.65
- 2022-09-23 06:40:21,501:INFO:epoch: 2 step: [7894/14785], loss: 8.8925, overflow: False, scale: 1099511627776, lr: 0.002568, time: 594.50
- 2022-09-23 06:40:54,914:INFO:epoch: 2 step: [7950/14785], loss: 8.4441, overflow: False, scale: 1099511627776, lr: 0.002576, time: 596.63
- 2022-09-23 06:41:28,382:INFO:epoch: 2 step: [8006/14785], loss: 8.3690, overflow: False, scale: 1099511627776, lr: 0.002584, time: 597.60
- 2022-09-23 06:42:01,896:INFO:epoch: 2 step: [8062/14785], loss: 7.5921, overflow: False, scale: 1099511627776, lr: 0.002592, time: 598.44
- 2022-09-23 06:42:35,398:INFO:epoch: 2 step: [8118/14785], loss: 7.9802, overflow: False, scale: 1099511627776, lr: 0.002599, time: 598.21
- 2022-09-23 06:43:08,904:INFO:epoch: 2 step: [8174/14785], loss: 8.3621, overflow: False, scale: 1099511627776, lr: 0.002607, time: 598.27
- 2022-09-23 06:43:42,492:INFO:epoch: 2 step: [8230/14785], loss: 8.0490, overflow: False, scale: 1099511627776, lr: 0.002615, time: 599.75
- 2022-09-23 06:44:16,000:INFO:epoch: 2 step: [8286/14785], loss: 8.2416, overflow: False, scale: 1099511627776, lr: 0.002622, time: 598.31
- 2022-09-23 06:44:49,397:INFO:epoch: 2 step: [8342/14785], loss: 8.5163, overflow: False, scale: 1099511627776, lr: 0.002630, time: 596.35
- 2022-09-23 06:45:23,005:INFO:epoch: 2 step: [8398/14785], loss: 7.8025, overflow: False, scale: 2199023255552, lr: 0.002638, time: 600.10
- 2022-09-23 06:45:56,408:INFO:epoch: 2 step: [8454/14785], loss: 8.0980, overflow: False, scale: 2199023255552, lr: 0.002646, time: 596.45
- 2022-09-23 06:46:30,006:INFO:epoch: 2 step: [8510/14785], loss: 8.4122, overflow: False, scale: 2199023255552, lr: 0.002654, time: 599.93
- 2022-09-23 06:47:03,295:INFO:epoch: 2 step: [8566/14785], loss: 9.1459, overflow: False, scale: 2199023255552, lr: 0.002661, time: 594.41
- 2022-09-23 06:47:36,806:INFO:epoch: 2 step: [8622/14785], loss: 8.2284, overflow: False, scale: 2199023255552, lr: 0.002669, time: 598.36
- 2022-09-23 06:48:10,397:INFO:epoch: 2 step: [8678/14785], loss: 8.2170, overflow: False, scale: 2199023255552, lr: 0.002677, time: 599.78
- 2022-09-23 06:48:43,897:INFO:epoch: 2 step: [8734/14785], loss: 8.1528, overflow: False, scale: 2199023255552, lr: 0.002685, time: 598.18
- 2022-09-23 06:49:17,269:INFO:epoch: 2 step: [8790/14785], loss: 8.4778, overflow: False, scale: 2199023255552, lr: 0.002693, time: 595.87
- 2022-09-23 06:49:50,810:INFO:epoch: 2 step: [8846/14785], loss: 8.1463, overflow: False, scale: 2199023255552, lr: 0.002701, time: 598.91
- 2022-09-23 06:50:24,302:INFO:epoch: 2 step: [8902/14785], loss: 8.3661, overflow: False, scale: 2199023255552, lr: 0.002709, time: 598.04
- 2022-09-23 06:50:57,603:INFO:epoch: 2 step: [8958/14785], loss: 8.6549, overflow: False, scale: 2199023255552, lr: 0.002716, time: 594.63
- 2022-09-23 06:51:31,198:INFO:epoch: 2 step: [9014/14785], loss: 8.3516, overflow: False, scale: 2199023255552, lr: 0.002724, time: 599.84
- 2022-09-23 06:52:04,613:INFO:epoch: 2 step: [9070/14785], loss: 8.3291, overflow: False, scale: 2199023255552, lr: 0.002732, time: 596.66
- 2022-09-23 06:52:38,101:INFO:epoch: 2 step: [9126/14785], loss: 8.4161, overflow: False, scale: 2199023255552, lr: 0.002740, time: 597.99
- 2022-09-23 06:53:11,593:INFO:epoch: 2 step: [9182/14785], loss: 8.2235, overflow: False, scale: 2199023255552, lr: 0.002748, time: 598.02
- 2022-09-23 06:53:45,007:INFO:epoch: 2 step: [9238/14785], loss: 8.3579, overflow: False, scale: 2199023255552, lr: 0.002756, time: 596.62
- 2022-09-23 06:54:18,508:INFO:epoch: 2 step: [9294/14785], loss: 7.7905, overflow: False, scale: 2199023255552, lr: 0.002764, time: 598.18
- 2022-09-23 06:54:51,993:INFO:epoch: 2 step: [9350/14785], loss: 8.4164, overflow: False, scale: 2199023255552, lr: 0.002772, time: 597.93
- 2022-09-23 06:55:25,505:INFO:epoch: 2 step: [9406/14785], loss: 7.9425, overflow: False, scale: 2199023255552, lr: 0.002780, time: 598.38
- 2022-09-23 06:55:58,991:INFO:epoch: 2 step: [9462/14785], loss: 8.6012, overflow: False, scale: 2199023255552, lr: 0.002788, time: 597.92
- 2022-09-23 06:56:32,511:INFO:epoch: 2 step: [9518/14785], loss: 8.2207, overflow: False, scale: 2199023255552, lr: 0.002796, time: 598.53
- 2022-09-23 06:57:05,897:INFO:epoch: 2 step: [9574/14785], loss: 7.6349, overflow: False, scale: 2199023255552, lr: 0.002804, time: 596.13
- 2022-09-23 06:57:39,299:INFO:epoch: 2 step: [9630/14785], loss: 8.2929, overflow: False, scale: 2199023255552, lr: 0.002812, time: 596.42
- 2022-09-23 06:58:12,595:INFO:epoch: 2 step: [9686/14785], loss: 8.2409, overflow: False, scale: 2199023255552, lr: 0.002820, time: 594.52
- 2022-09-23 06:58:46,098:INFO:epoch: 2 step: [9742/14785], loss: 8.6332, overflow: False, scale: 2199023255552, lr: 0.002828, time: 598.24
- 2022-09-23 06:59:19,593:INFO:epoch: 2 step: [9798/14785], loss: 8.4920, overflow: False, scale: 2199023255552, lr: 0.002836, time: 598.07
- 2022-09-23 06:59:53,004:INFO:epoch: 2 step: [9854/14785], loss: 9.1503, overflow: False, scale: 2199023255552, lr: 0.002844, time: 596.58
- 2022-09-23 07:00:26,504:INFO:epoch: 2 step: [9910/14785], loss: 9.0082, overflow: False, scale: 2199023255552, lr: 0.002852, time: 598.18
- 2022-09-23 07:01:00,001:INFO:epoch: 2 step: [9966/14785], loss: 8.0877, overflow: False, scale: 2199023255552, lr: 0.002860, time: 598.11
- 2022-09-23 07:01:33,469:INFO:epoch: 2 step: [10022/14785], loss: 8.5167, overflow: False, scale: 2199023255552, lr: 0.002868, time: 597.61
- 2022-09-23 07:02:07,005:INFO:epoch: 2 step: [10078/14785], loss: 8.1587, overflow: False, scale: 2199023255552, lr: 0.002877, time: 598.83
- 2022-09-23 07:02:40,398:INFO:epoch: 2 step: [10134/14785], loss: 8.3003, overflow: False, scale: 2199023255552, lr: 0.002885, time: 596.25
- 2022-09-23 07:03:13,898:INFO:epoch: 2 step: [10190/14785], loss: 8.7537, overflow: False, scale: 2199023255552, lr: 0.002893, time: 598.17
- 2022-09-23 07:03:47,414:INFO:epoch: 2 step: [10246/14785], loss: 7.6989, overflow: False, scale: 2199023255552, lr: 0.002901, time: 598.47
- 2022-09-23 07:04:20,912:INFO:epoch: 2 step: [10302/14785], loss: 7.9370, overflow: False, scale: 2199023255552, lr: 0.002909, time: 598.14
- 2022-09-23 07:04:54,298:INFO:epoch: 2 step: [10358/14785], loss: 7.8111, overflow: False, scale: 2199023255552, lr: 0.002917, time: 596.13
- 2022-09-23 07:05:27,807:INFO:epoch: 2 step: [10414/14785], loss: 8.4511, overflow: False, scale: 4398046511104, lr: 0.002926, time: 598.33
- 2022-09-23 07:06:01,206:INFO:epoch: 2 step: [10470/14785], loss: 8.8819, overflow: False, scale: 4398046511104, lr: 0.002934, time: 596.36
- 2022-09-23 07:06:34,696:INFO:epoch: 2 step: [10526/14785], loss: 8.0852, overflow: False, scale: 4398046511104, lr: 0.002942, time: 598.01
- 2022-09-23 07:07:08,308:INFO:epoch: 2 step: [10582/14785], loss: 8.1357, overflow: False, scale: 4398046511104, lr: 0.002950, time: 600.17
- 2022-09-23 07:07:41,800:INFO:epoch: 2 step: [10638/14785], loss: 7.7354, overflow: False, scale: 4398046511104, lr: 0.002958, time: 598.03
- 2022-09-23 07:08:15,408:INFO:epoch: 2 step: [10694/14785], loss: 7.7319, overflow: False, scale: 4398046511104, lr: 0.002967, time: 600.10
- 2022-09-23 07:08:49,007:INFO:epoch: 2 step: [10750/14785], loss: 7.6585, overflow: False, scale: 4398046511104, lr: 0.002975, time: 599.96
- 2022-09-23 07:09:22,310:INFO:epoch: 2 step: [10806/14785], loss: 7.5222, overflow: False, scale: 4398046511104, lr: 0.002983, time: 594.64
- 2022-09-23 07:09:55,797:INFO:epoch: 2 step: [10862/14785], loss: 7.9663, overflow: False, scale: 4398046511104, lr: 0.002992, time: 597.96
- 2022-09-23 07:10:29,206:INFO:epoch: 2 step: [10918/14785], loss: 8.1606, overflow: False, scale: 4398046511104, lr: 0.003000, time: 596.52
- 2022-09-23 07:11:02,806:INFO:epoch: 2 step: [10974/14785], loss: 8.7833, overflow: False, scale: 4398046511104, lr: 0.003008, time: 599.97
- 2022-09-23 07:11:36,409:INFO:epoch: 2 step: [11030/14785], loss: 8.4437, overflow: False, scale: 4398046511104, lr: 0.003016, time: 600.02
- 2022-09-23 07:12:10,012:INFO:epoch: 2 step: [11086/14785], loss: 8.7033, overflow: False, scale: 4398046511104, lr: 0.003025, time: 600.03
- 2022-09-23 07:12:43,502:INFO:epoch: 2 step: [11142/14785], loss: 8.1016, overflow: False, scale: 4398046511104, lr: 0.003033, time: 597.99
- 2022-09-23 07:13:16,793:INFO:epoch: 2 step: [11198/14785], loss: 8.1872, overflow: False, scale: 4398046511104, lr: 0.003041, time: 594.45
- 2022-09-23 07:13:50,108:INFO:epoch: 2 step: [11254/14785], loss: 8.2344, overflow: False, scale: 4398046511104, lr: 0.003050, time: 594.88
- 2022-09-23 07:14:23,704:INFO:epoch: 2 step: [11310/14785], loss: 7.7251, overflow: False, scale: 4398046511104, lr: 0.003058, time: 599.88
- 2022-09-23 07:14:57,109:INFO:epoch: 2 step: [11366/14785], loss: 7.6802, overflow: False, scale: 4398046511104, lr: 0.003067, time: 596.47
- 2022-09-23 07:15:30,701:INFO:epoch: 2 step: [11422/14785], loss: 7.4442, overflow: False, scale: 4398046511104, lr: 0.003075, time: 599.83
- 2022-09-23 07:16:04,206:INFO:epoch: 2 step: [11478/14785], loss: 8.3256, overflow: False, scale: 4398046511104, lr: 0.003083, time: 598.25
- 2022-09-23 07:16:37,606:INFO:epoch: 2 step: [11534/14785], loss: 7.5317, overflow: False, scale: 4398046511104, lr: 0.003092, time: 596.39
- 2022-09-23 07:17:11,202:INFO:epoch: 2 step: [11590/14785], loss: 8.0832, overflow: False, scale: 4398046511104, lr: 0.003100, time: 599.88
- 2022-09-23 07:17:44,794:INFO:epoch: 2 step: [11646/14785], loss: 7.4542, overflow: False, scale: 4398046511104, lr: 0.003109, time: 599.81
- 2022-09-23 07:18:18,306:INFO:epoch: 2 step: [11702/14785], loss: 8.2493, overflow: False, scale: 4398046511104, lr: 0.003117, time: 598.40
- 2022-09-23 07:18:51,693:INFO:epoch: 2 step: [11758/14785], loss: 8.2194, overflow: False, scale: 4398046511104, lr: 0.003126, time: 596.18
- 2022-09-23 07:19:25,300:INFO:epoch: 2 step: [11814/14785], loss: 7.6567, overflow: False, scale: 4398046511104, lr: 0.003134, time: 600.07
- 2022-09-23 07:19:58,885:INFO:epoch: 2 step: [11870/14785], loss: 8.3092, overflow: False, scale: 4398046511104, lr: 0.003143, time: 599.69
- 2022-09-23 07:20:32,398:INFO:epoch: 2 step: [11926/14785], loss: 8.0968, overflow: False, scale: 4398046511104, lr: 0.003151, time: 598.41
- 2022-09-23 07:21:05,900:INFO:epoch: 2 step: [11982/14785], loss: 8.0801, overflow: False, scale: 4398046511104, lr: 0.003160, time: 598.23
- 2022-09-23 07:21:39,405:INFO:epoch: 2 step: [12038/14785], loss: 8.0914, overflow: False, scale: 4398046511104, lr: 0.003168, time: 598.27
- 2022-09-23 07:22:12,901:INFO:epoch: 2 step: [12094/14785], loss: 8.0873, overflow: False, scale: 4398046511104, lr: 0.003177, time: 598.11
- 2022-09-23 07:22:46,497:INFO:epoch: 2 step: [12150/14785], loss: 7.9032, overflow: False, scale: 4398046511104, lr: 0.003185, time: 599.89
- 2022-09-23 07:23:20,001:INFO:epoch: 2 step: [12206/14785], loss: 7.8224, overflow: False, scale: 4398046511104, lr: 0.003194, time: 598.25
- 2022-09-23 07:23:53,605:INFO:epoch: 2 step: [12262/14785], loss: 9.0335, overflow: False, scale: 4398046511104, lr: 0.003202, time: 600.02
- 2022-09-23 07:24:27,003:INFO:epoch: 2 step: [12318/14785], loss: 7.6576, overflow: False, scale: 4398046511104, lr: 0.003211, time: 596.37
- 2022-09-23 07:25:00,502:INFO:epoch: 2 step: [12374/14785], loss: 8.0723, overflow: False, scale: 8796093022208, lr: 0.003219, time: 598.17
- 2022-09-23 07:25:33,999:INFO:epoch: 2 step: [12430/14785], loss: 7.6742, overflow: False, scale: 8796093022208, lr: 0.003228, time: 598.11
- 2022-09-23 07:26:07,395:INFO:epoch: 2 step: [12486/14785], loss: 8.1286, overflow: False, scale: 8796093022208, lr: 0.003237, time: 596.29
- 2022-09-23 07:26:40,998:INFO:epoch: 2 step: [12542/14785], loss: 7.5492, overflow: False, scale: 8796093022208, lr: 0.003245, time: 600.01
- 2022-09-23 07:27:14,514:INFO:epoch: 2 step: [12598/14785], loss: 8.4293, overflow: False, scale: 8796093022208, lr: 0.003254, time: 598.45
- 2022-09-23 07:27:48,006:INFO:epoch: 2 step: [12654/14785], loss: 8.0734, overflow: False, scale: 8796093022208, lr: 0.003263, time: 598.04
- 2022-09-23 07:28:21,612:INFO:epoch: 2 step: [12710/14785], loss: 8.8160, overflow: False, scale: 8796093022208, lr: 0.003271, time: 600.07
- 2022-09-23 07:28:55,208:INFO:epoch: 2 step: [12766/14785], loss: 8.0542, overflow: False, scale: 8796093022208, lr: 0.003280, time: 599.90
- 2022-09-23 07:29:28,804:INFO:epoch: 2 step: [12822/14785], loss: 8.3231, overflow: False, scale: 8796093022208, lr: 0.003289, time: 599.88
- 2022-09-23 07:30:02,406:INFO:epoch: 2 step: [12878/14785], loss: 8.9164, overflow: False, scale: 8796093022208, lr: 0.003297, time: 599.97
- 2022-09-23 07:30:35,999:INFO:epoch: 2 step: [12934/14785], loss: 8.4117, overflow: False, scale: 8796093022208, lr: 0.003306, time: 599.83
- 2022-09-23 07:31:09,505:INFO:epoch: 2 step: [12990/14785], loss: 8.0391, overflow: False, scale: 8796093022208, lr: 0.003315, time: 598.28
- 2022-09-23 07:31:43,006:INFO:epoch: 2 step: [13046/14785], loss: 8.4928, overflow: False, scale: 8796093022208, lr: 0.003323, time: 598.18
- 2022-09-23 07:32:16,501:INFO:epoch: 2 step: [13102/14785], loss: 7.9634, overflow: False, scale: 8796093022208, lr: 0.003332, time: 598.10
- 2022-09-23 07:32:50,104:INFO:epoch: 2 step: [13158/14785], loss: 8.3897, overflow: False, scale: 8796093022208, lr: 0.003341, time: 600.02
- 2022-09-23 07:33:23,705:INFO:epoch: 2 step: [13214/14785], loss: 8.5690, overflow: False, scale: 8796093022208, lr: 0.003350, time: 599.98
- 2022-09-23 07:33:57,255:INFO:epoch: 2 step: [13270/14785], loss: 8.0601, overflow: False, scale: 8796093022208, lr: 0.003358, time: 599.04
- 2022-09-23 07:34:30,802:INFO:epoch: 2 step: [13326/14785], loss: 8.4462, overflow: False, scale: 8796093022208, lr: 0.003367, time: 599.00
- 2022-09-23 07:35:04,212:INFO:epoch: 2 step: [13382/14785], loss: 7.8790, overflow: False, scale: 8796093022208, lr: 0.003376, time: 596.57
- 2022-09-23 07:35:37,395:INFO:epoch: 2 step: [13438/14785], loss: 7.9485, overflow: False, scale: 8796093022208, lr: 0.003385, time: 592.52
- 2022-09-23 07:36:11,002:INFO:epoch: 2 step: [13494/14785], loss: 7.9178, overflow: False, scale: 8796093022208, lr: 0.003394, time: 600.04
- 2022-09-23 07:36:44,301:INFO:epoch: 2 step: [13550/14785], loss: 8.0136, overflow: False, scale: 8796093022208, lr: 0.003402, time: 594.59
- 2022-09-23 07:37:17,904:INFO:epoch: 2 step: [13606/14785], loss: 8.2976, overflow: False, scale: 8796093022208, lr: 0.003411, time: 600.01
- 2022-09-23 07:37:51,405:INFO:epoch: 2 step: [13662/14785], loss: 7.5978, overflow: False, scale: 8796093022208, lr: 0.003420, time: 598.16
- 2022-09-23 07:38:24,703:INFO:epoch: 2 step: [13718/14785], loss: 8.3217, overflow: False, scale: 8796093022208, lr: 0.003429, time: 594.56
- 2022-09-23 07:38:58,303:INFO:epoch: 2 step: [13774/14785], loss: 8.0739, overflow: False, scale: 8796093022208, lr: 0.003438, time: 599.93
- 2022-09-23 07:39:31,800:INFO:epoch: 2 step: [13830/14785], loss: 8.0632, overflow: False, scale: 8796093022208, lr: 0.003447, time: 598.13
- 2022-09-23 07:40:05,311:INFO:epoch: 2 step: [13886/14785], loss: 7.9676, overflow: False, scale: 8796093022208, lr: 0.003456, time: 598.36
- 2022-09-23 07:40:38,741:INFO:epoch: 2 step: [13942/14785], loss: 7.7935, overflow: False, scale: 8796093022208, lr: 0.003465, time: 596.94
- 2022-09-23 07:41:12,102:INFO:epoch: 2 step: [13998/14785], loss: 7.9007, overflow: False, scale: 8796093022208, lr: 0.003474, time: 595.69
- 2022-09-23 07:41:45,599:INFO:epoch: 2 step: [14054/14785], loss: 8.2485, overflow: False, scale: 8796093022208, lr: 0.003482, time: 598.09
- 2022-09-23 07:42:19,105:INFO:epoch: 2 step: [14110/14785], loss: 8.1614, overflow: False, scale: 8796093022208, lr: 0.003491, time: 598.27
- 2022-09-23 07:42:52,701:INFO:epoch: 2 step: [14166/14785], loss: 8.4888, overflow: False, scale: 8796093022208, lr: 0.003500, time: 599.89
- 2022-09-23 07:43:26,102:INFO:epoch: 2 step: [14222/14785], loss: 8.4557, overflow: False, scale: 8796093022208, lr: 0.003509, time: 596.41
- 2022-09-23 07:43:59,636:INFO:epoch: 2 step: [14278/14785], loss: 8.4993, overflow: False, scale: 8796093022208, lr: 0.003518, time: 598.78
- 2022-09-23 07:44:33,203:INFO:epoch: 2 step: [14334/14785], loss: 7.8991, overflow: False, scale: 8796093022208, lr: 0.003527, time: 599.37
- 2022-09-23 07:45:06,708:INFO:epoch: 2 step: [14390/14785], loss: 8.2570, overflow: False, scale: 17592186044416, lr: 0.003536, time: 598.25
- 2022-09-23 07:45:40,297:INFO:epoch: 2 step: [14446/14785], loss: 7.8106, overflow: False, scale: 17592186044416, lr: 0.003545, time: 599.78
- 2022-09-23 07:46:13,902:INFO:epoch: 2 step: [14502/14785], loss: 7.9461, overflow: False, scale: 17592186044416, lr: 0.003554, time: 600.04
- 2022-09-23 07:46:47,501:INFO:epoch: 2 step: [14558/14785], loss: 9.1582, overflow: False, scale: 17592186044416, lr: 0.003563, time: 599.95
- 2022-09-23 07:47:21,102:INFO:epoch: 2 step: [14614/14785], loss: 7.8870, overflow: False, scale: 17592186044416, lr: 0.003572, time: 599.95
- 2022-09-23 07:47:54,499:INFO:epoch: 2 step: [14670/14785], loss: 8.4806, overflow: False, scale: 17592186044416, lr: 0.003582, time: 596.34
- 2022-09-23 07:48:27,914:INFO:epoch: 2 step: [14726/14785], loss: 7.7803, overflow: False, scale: 17592186044416, lr: 0.003591, time: 596.66
- 2022-09-23 07:49:01,500:INFO:epoch: 2 step: [14782/14785], loss: 8.1621, overflow: False, scale: 17592186044416, lr: 0.003600, time: 599.70
- 2022-09-23 07:49:35,098:INFO:epoch: 3 step: [53/14785], loss: 8.9873, overflow: False, scale: 17592186044416, lr: 0.003609, time: 599.93
- 2022-09-23 07:50:10,594:INFO:epoch: 3 step: [109/14785], loss: 8.4713, overflow: False, scale: 17592186044416, lr: 0.003618, time: 633.82
- 2022-09-23 07:50:43,804:INFO:epoch: 3 step: [165/14785], loss: 7.6440, overflow: False, scale: 17592186044416, lr: 0.003627, time: 593.00
- 2022-09-23 07:51:17,303:INFO:epoch: 3 step: [221/14785], loss: 8.1341, overflow: False, scale: 17592186044416, lr: 0.003636, time: 598.13
- 2022-09-23 07:51:50,897:INFO:epoch: 3 step: [277/14785], loss: 7.6728, overflow: False, scale: 17592186044416, lr: 0.003645, time: 599.84
- 2022-09-23 07:52:24,397:INFO:epoch: 3 step: [333/14785], loss: 8.4928, overflow: False, scale: 17592186044416, lr: 0.003654, time: 598.19
- 2022-09-23 07:52:58,005:INFO:epoch: 3 step: [389/14785], loss: 7.8681, overflow: False, scale: 17592186044416, lr: 0.003664, time: 600.12
- 2022-09-23 07:53:31,597:INFO:epoch: 3 step: [445/14785], loss: 8.1238, overflow: False, scale: 17592186044416, lr: 0.003673, time: 599.77
- 2022-09-23 07:54:05,156:INFO:epoch: 3 step: [501/14785], loss: 7.9441, overflow: False, scale: 17592186044416, lr: 0.003682, time: 599.21
- 2022-09-23 07:54:38,698:INFO:epoch: 3 step: [557/14785], loss: 7.9390, overflow: False, scale: 17592186044416, lr: 0.003691, time: 598.92
- 2022-09-23 07:55:12,197:INFO:epoch: 3 step: [613/14785], loss: 8.3677, overflow: False, scale: 17592186044416, lr: 0.003700, time: 598.17
- 2022-09-23 07:55:45,497:INFO:epoch: 3 step: [669/14785], loss: 7.8319, overflow: False, scale: 17592186044416, lr: 0.003710, time: 594.59
- 2022-09-23 07:56:19,004:INFO:epoch: 3 step: [725/14785], loss: 8.3687, overflow: False, scale: 17592186044416, lr: 0.003719, time: 598.30
- 2022-09-23 07:56:52,604:INFO:epoch: 3 step: [781/14785], loss: 8.1727, overflow: False, scale: 17592186044416, lr: 0.003728, time: 599.97
- 2022-09-23 07:57:26,199:INFO:epoch: 3 step: [837/14785], loss: 8.1735, overflow: False, scale: 17592186044416, lr: 0.003737, time: 599.87
- 2022-09-23 07:57:59,699:INFO:epoch: 3 step: [893/14785], loss: 7.5125, overflow: False, scale: 17592186044416, lr: 0.003747, time: 598.18
- 2022-09-23 07:58:33,302:INFO:epoch: 3 step: [949/14785], loss: 7.8804, overflow: False, scale: 17592186044416, lr: 0.003756, time: 600.03
- 2022-09-23 07:59:06,804:INFO:epoch: 3 step: [1005/14785], loss: 7.8816, overflow: False, scale: 17592186044416, lr: 0.003765, time: 598.22
- 2022-09-23 07:59:40,404:INFO:epoch: 3 step: [1061/14785], loss: 7.7671, overflow: False, scale: 17592186044416, lr: 0.003774, time: 599.96
- 2022-09-23 08:00:13,993:INFO:epoch: 3 step: [1117/14785], loss: 8.3539, overflow: False, scale: 17592186044416, lr: 0.003784, time: 599.76
- 2022-09-23 08:00:47,585:INFO:epoch: 3 step: [1173/14785], loss: 7.8714, overflow: False, scale: 17592186044416, lr: 0.003793, time: 599.82
- 2022-09-23 08:01:21,101:INFO:epoch: 3 step: [1229/14785], loss: 8.1295, overflow: False, scale: 17592186044416, lr: 0.003802, time: 598.47
- 2022-09-23 08:01:54,597:INFO:epoch: 3 step: [1285/14785], loss: 8.3944, overflow: False, scale: 17592186044416, lr: 0.003812, time: 598.11
- 2022-09-23 08:02:28,104:INFO:epoch: 3 step: [1341/14785], loss: 7.7683, overflow: False, scale: 17592186044416, lr: 0.003821, time: 598.30
- 2022-09-23 08:03:01,612:INFO:epoch: 3 step: [1397/14785], loss: 8.8425, overflow: False, scale: 17592186044416, lr: 0.003831, time: 598.31
- 2022-09-23 08:03:35,192:INFO:epoch: 3 step: [1453/14785], loss: 7.7409, overflow: False, scale: 17592186044416, lr: 0.003840, time: 599.61
- 2022-09-23 08:04:08,645:INFO:epoch: 3 step: [1509/14785], loss: 7.6313, overflow: False, scale: 17592186044416, lr: 0.003849, time: 597.33
- 2022-09-23 08:04:41,903:INFO:epoch: 3 step: [1565/14785], loss: 8.4691, overflow: False, scale: 17592186044416, lr: 0.003859, time: 593.85
- 2022-09-23 08:05:15,401:INFO:epoch: 3 step: [1621/14785], loss: 7.2954, overflow: False, scale: 35184372088832, lr: 0.003868, time: 598.13
- 2022-09-23 08:05:48,801:INFO:epoch: 3 step: [1677/14785], loss: 8.1136, overflow: False, scale: 35184372088832, lr: 0.003878, time: 596.38
- 2022-09-23 08:06:22,399:INFO:epoch: 3 step: [1733/14785], loss: 8.2779, overflow: False, scale: 35184372088832, lr: 0.003887, time: 599.92
- 2022-09-23 08:06:56,007:INFO:epoch: 3 step: [1789/14785], loss: 7.8997, overflow: False, scale: 35184372088832, lr: 0.003896, time: 600.05
- 2022-09-23 08:07:29,504:INFO:epoch: 3 step: [1845/14785], loss: 7.9540, overflow: False, scale: 35184372088832, lr: 0.003906, time: 598.12
- 2022-09-23 08:08:03,101:INFO:epoch: 3 step: [1901/14785], loss: 8.1390, overflow: False, scale: 35184372088832, lr: 0.003915, time: 599.91
- 2022-09-23 08:08:36,698:INFO:epoch: 3 step: [1957/14785], loss: 7.7080, overflow: False, scale: 35184372088832, lr: 0.003925, time: 599.91
- 2022-09-23 08:09:10,202:INFO:epoch: 3 step: [2013/14785], loss: 8.0684, overflow: False, scale: 35184372088832, lr: 0.003934, time: 598.25
- 2022-09-23 08:09:43,811:INFO:epoch: 3 step: [2069/14785], loss: 8.3030, overflow: False, scale: 35184372088832, lr: 0.003944, time: 600.13
- 2022-09-23 08:10:17,396:INFO:epoch: 3 step: [2125/14785], loss: 7.8652, overflow: False, scale: 35184372088832, lr: 0.003953, time: 599.70
- 2022-09-23 08:10:51,003:INFO:epoch: 3 step: [2181/14785], loss: 7.7528, overflow: False, scale: 35184372088832, lr: 0.003963, time: 600.06
- 2022-09-23 08:11:24,502:INFO:epoch: 3 step: [2237/14785], loss: 8.3960, overflow: False, scale: 35184372088832, lr: 0.003972, time: 598.15
- 2022-09-23 08:11:57,898:INFO:epoch: 3 step: [2293/14785], loss: 7.9892, overflow: False, scale: 35184372088832, lr: 0.003982, time: 596.30
- 2022-09-23 08:12:31,505:INFO:epoch: 3 step: [2349/14785], loss: 7.2502, overflow: False, scale: 35184372088832, lr: 0.003992, time: 600.06
- 2022-09-23 08:13:05,001:INFO:epoch: 3 step: [2405/14785], loss: 8.4162, overflow: False, scale: 35184372088832, lr: 0.004001, time: 598.11
- 2022-09-23 08:13:38,502:INFO:epoch: 3 step: [2461/14785], loss: 8.0423, overflow: False, scale: 35184372088832, lr: 0.004011, time: 598.18
- 2022-09-23 08:14:12,013:INFO:epoch: 3 step: [2517/14785], loss: 8.6227, overflow: False, scale: 35184372088832, lr: 0.004020, time: 598.36
- 2022-09-23 08:14:45,429:INFO:epoch: 3 step: [2573/14785], loss: 7.8151, overflow: False, scale: 35184372088832, lr: 0.004030, time: 596.65
- 2022-09-23 08:15:18,903:INFO:epoch: 3 step: [2629/14785], loss: 7.9266, overflow: False, scale: 35184372088832, lr: 0.004040, time: 597.70
- 2022-09-23 08:15:52,400:INFO:epoch: 3 step: [2685/14785], loss: 7.4292, overflow: False, scale: 35184372088832, lr: 0.004049, time: 598.11
- 2022-09-23 08:16:25,801:INFO:epoch: 3 step: [2741/14785], loss: 8.3216, overflow: False, scale: 35184372088832, lr: 0.004059, time: 596.39
- 2022-09-23 08:16:59,305:INFO:epoch: 3 step: [2797/14785], loss: 7.6187, overflow: False, scale: 35184372088832, lr: 0.004069, time: 598.25
- 2022-09-23 08:17:32,701:INFO:epoch: 3 step: [2853/14785], loss: 8.2369, overflow: False, scale: 35184372088832, lr: 0.004078, time: 596.32
- 2022-09-23 08:18:06,101:INFO:epoch: 3 step: [2909/14785], loss: 7.8522, overflow: False, scale: 35184372088832, lr: 0.004088, time: 596.38
- 2022-09-23 08:18:39,704:INFO:epoch: 3 step: [2965/14785], loss: 7.7968, overflow: False, scale: 35184372088832, lr: 0.004098, time: 600.00
- 2022-09-23 08:19:13,107:INFO:epoch: 3 step: [3021/14785], loss: 7.5548, overflow: False, scale: 35184372088832, lr: 0.004107, time: 596.45
- 2022-09-23 08:19:46,597:INFO:epoch: 3 step: [3077/14785], loss: 7.9566, overflow: False, scale: 35184372088832, lr: 0.004117, time: 598.00
- 2022-09-23 08:20:20,098:INFO:epoch: 3 step: [3133/14785], loss: 7.6571, overflow: False, scale: 35184372088832, lr: 0.004127, time: 598.20
- 2022-09-23 08:20:53,496:INFO:epoch: 3 step: [3189/14785], loss: 7.6651, overflow: False, scale: 35184372088832, lr: 0.004136, time: 596.35
- 2022-09-23 08:21:27,012:INFO:epoch: 3 step: [3245/14785], loss: 7.5660, overflow: False, scale: 35184372088832, lr: 0.004146, time: 598.47
- 2022-09-23 08:22:00,501:INFO:epoch: 3 step: [3301/14785], loss: 8.1831, overflow: False, scale: 35184372088832, lr: 0.004156, time: 597.95
- 2022-09-23 08:22:34,102:INFO:epoch: 3 step: [3357/14785], loss: 7.5474, overflow: False, scale: 35184372088832, lr: 0.004166, time: 599.97
- 2022-09-23 08:23:07,499:INFO:epoch: 3 step: [3413/14785], loss: 8.2359, overflow: False, scale: 35184372088832, lr: 0.004176, time: 596.35
- 2022-09-23 08:23:41,100:INFO:epoch: 3 step: [3469/14785], loss: 8.3008, overflow: False, scale: 35184372088832, lr: 0.004185, time: 599.98
- 2022-09-23 08:24:14,603:INFO:epoch: 3 step: [3525/14785], loss: 7.6504, overflow: False, scale: 35184372088832, lr: 0.004195, time: 598.24
- 2022-09-23 08:24:47,811:INFO:epoch: 3 step: [3581/14785], loss: 7.9263, overflow: False, scale: 35184372088832, lr: 0.004205, time: 592.94
- 2022-09-23 08:25:21,393:INFO:epoch: 3 step: [3637/14785], loss: 7.8920, overflow: False, scale: 70368744177664, lr: 0.004215, time: 599.62
- 2022-09-23 08:25:54,802:INFO:epoch: 3 step: [3693/14785], loss: 7.7425, overflow: False, scale: 70368744177664, lr: 0.004225, time: 596.56
- 2022-09-23 08:26:28,202:INFO:epoch: 3 step: [3749/14785], loss: 7.5706, overflow: False, scale: 70368744177664, lr: 0.004234, time: 596.40
- 2022-09-23 08:27:01,704:INFO:epoch: 3 step: [3805/14785], loss: 7.7810, overflow: False, scale: 70368744177664, lr: 0.004244, time: 598.20
- 2022-09-23 08:27:35,303:INFO:epoch: 3 step: [3861/14785], loss: 7.7617, overflow: False, scale: 70368744177664, lr: 0.004254, time: 599.95
- 2022-09-23 08:28:08,905:INFO:epoch: 3 step: [3917/14785], loss: 9.2110, overflow: False, scale: 70368744177664, lr: 0.004264, time: 600.01
- 2022-09-23 08:28:42,472:INFO:epoch: 3 step: [3973/14785], loss: 7.9032, overflow: False, scale: 70368744177664, lr: 0.004274, time: 599.37
- 2022-09-23 08:29:16,003:INFO:epoch: 3 step: [4029/14785], loss: 8.0373, overflow: False, scale: 70368744177664, lr: 0.004284, time: 598.74
- 2022-09-23 08:29:49,343:INFO:epoch: 3 step: [4085/14785], loss: 8.1247, overflow: False, scale: 70368744177664, lr: 0.004294, time: 595.30
- 2022-09-23 08:30:22,750:INFO:epoch: 3 step: [4141/14785], loss: 7.5327, overflow: False, scale: 70368744177664, lr: 0.004304, time: 596.50
- 2022-09-23 08:30:56,202:INFO:epoch: 3 step: [4197/14785], loss: 8.7255, overflow: False, scale: 70368744177664, lr: 0.004314, time: 597.32
- 2022-09-23 08:31:29,708:INFO:epoch: 3 step: [4253/14785], loss: 7.3365, overflow: False, scale: 70368744177664, lr: 0.004324, time: 598.28
- 2022-09-23 08:32:03,200:INFO:epoch: 3 step: [4309/14785], loss: 8.5059, overflow: False, scale: 70368744177664, lr: 0.004334, time: 598.02
- 2022-09-23 08:32:36,811:INFO:epoch: 3 step: [4365/14785], loss: 7.9243, overflow: False, scale: 70368744177664, lr: 0.004344, time: 600.15
- 2022-09-23 08:33:10,306:INFO:epoch: 3 step: [4421/14785], loss: 7.4992, overflow: False, scale: 70368744177664, lr: 0.004354, time: 598.07
- 2022-09-23 08:33:43,699:INFO:epoch: 3 step: [4477/14785], loss: 7.7766, overflow: False, scale: 70368744177664, lr: 0.004364, time: 596.25
- 2022-09-23 08:34:16,989:INFO:epoch: 3 step: [4533/14785], loss: 8.3288, overflow: False, scale: 70368744177664, lr: 0.004374, time: 594.42
- 2022-09-23 08:34:50,403:INFO:epoch: 3 step: [4589/14785], loss: 7.8698, overflow: False, scale: 70368744177664, lr: 0.004384, time: 596.64
- 2022-09-23 08:35:24,000:INFO:epoch: 3 step: [4645/14785], loss: 8.2168, overflow: False, scale: 70368744177664, lr: 0.004394, time: 599.92
- 2022-09-23 08:35:57,401:INFO:epoch: 3 step: [4701/14785], loss: 7.5791, overflow: False, scale: 70368744177664, lr: 0.004404, time: 596.40
- 2022-09-23 08:36:30,999:INFO:epoch: 3 step: [4757/14785], loss: 7.5852, overflow: False, scale: 70368744177664, lr: 0.004414, time: 599.93
- 2022-09-23 08:37:04,409:INFO:epoch: 3 step: [4813/14785], loss: 7.7378, overflow: False, scale: 70368744177664, lr: 0.004424, time: 596.56
- 2022-09-23 08:37:37,903:INFO:epoch: 3 step: [4869/14785], loss: 7.4293, overflow: False, scale: 70368744177664, lr: 0.004434, time: 598.05
- 2022-09-23 08:38:11,420:INFO:epoch: 3 step: [4925/14785], loss: 8.0609, overflow: False, scale: 70368744177664, lr: 0.004444, time: 598.47
- 2022-09-23 08:38:44,804:INFO:epoch: 3 step: [4981/14785], loss: 8.4511, overflow: False, scale: 70368744177664, lr: 0.004454, time: 596.11
- 2022-09-23 08:39:18,398:INFO:epoch: 3 step: [5037/14785], loss: 8.2756, overflow: False, scale: 70368744177664, lr: 0.004464, time: 599.84
- 2022-09-23 08:39:52,005:INFO:epoch: 3 step: [5093/14785], loss: 7.4059, overflow: False, scale: 70368744177664, lr: 0.004474, time: 600.08
- 2022-09-23 08:40:25,302:INFO:epoch: 3 step: [5149/14785], loss: 8.2564, overflow: False, scale: 70368744177664, lr: 0.004485, time: 594.55
- 2022-09-23 08:40:58,797:INFO:epoch: 3 step: [5205/14785], loss: 8.2676, overflow: False, scale: 70368744177664, lr: 0.004495, time: 598.10
- 2022-09-23 08:41:32,309:INFO:epoch: 3 step: [5261/14785], loss: 8.3226, overflow: False, scale: 70368744177664, lr: 0.004505, time: 598.38
- 2022-09-23 08:42:05,698:INFO:epoch: 3 step: [5317/14785], loss: 8.7964, overflow: False, scale: 70368744177664, lr: 0.004515, time: 596.19
- 2022-09-23 08:42:39,205:INFO:epoch: 3 step: [5373/14785], loss: 8.3960, overflow: False, scale: 70368744177664, lr: 0.004525, time: 598.32
- 2022-09-23 08:43:12,638:INFO:epoch: 3 step: [5429/14785], loss: 7.2621, overflow: False, scale: 70368744177664, lr: 0.004535, time: 596.98
- 2022-09-23 08:43:46,006:INFO:epoch: 3 step: [5485/14785], loss: 8.0490, overflow: False, scale: 70368744177664, lr: 0.004546, time: 595.79
- 2022-09-23 08:44:19,509:INFO:epoch: 3 step: [5541/14785], loss: 7.7109, overflow: False, scale: 70368744177664, lr: 0.004556, time: 598.24
- 2022-09-23 08:44:53,113:INFO:epoch: 3 step: [5597/14785], loss: 8.0946, overflow: False, scale: 140737488355328, lr: 0.004566, time: 599.99
- 2022-09-23 08:45:26,593:INFO:epoch: 3 step: [5653/14785], loss: 6.9604, overflow: False, scale: 140737488355328, lr: 0.004576, time: 597.81
- 2022-09-23 08:46:00,097:INFO:epoch: 3 step: [5709/14785], loss: 7.8083, overflow: False, scale: 140737488355328, lr: 0.004587, time: 598.25
- 2022-09-23 08:46:33,602:INFO:epoch: 3 step: [5765/14785], loss: 8.1077, overflow: False, scale: 140737488355328, lr: 0.004597, time: 598.27
- 2022-09-23 08:47:07,104:INFO:epoch: 3 step: [5821/14785], loss: 8.5436, overflow: False, scale: 140737488355328, lr: 0.004607, time: 598.21
- 2022-09-23 08:47:40,497:INFO:epoch: 3 step: [5877/14785], loss: 7.6906, overflow: False, scale: 140737488355328, lr: 0.004617, time: 596.26
- 2022-09-23 08:48:14,105:INFO:epoch: 3 step: [5933/14785], loss: 7.9475, overflow: False, scale: 140737488355328, lr: 0.004628, time: 600.12
- 2022-09-23 08:48:47,605:INFO:epoch: 3 step: [5989/14785], loss: 7.7492, overflow: False, scale: 140737488355328, lr: 0.004638, time: 598.18
- 2022-09-23 08:49:21,007:INFO:epoch: 3 step: [6045/14785], loss: 8.0148, overflow: False, scale: 140737488355328, lr: 0.004648, time: 596.42
- 2022-09-23 08:49:54,593:INFO:epoch: 3 step: [6101/14785], loss: 7.7977, overflow: False, scale: 140737488355328, lr: 0.004659, time: 599.71
- 2022-09-23 08:50:28,008:INFO:epoch: 3 step: [6157/14785], loss: 7.8615, overflow: False, scale: 140737488355328, lr: 0.004669, time: 596.65
- 2022-09-23 08:51:01,596:INFO:epoch: 3 step: [6213/14785], loss: 7.8746, overflow: False, scale: 140737488355328, lr: 0.004679, time: 599.76
- 2022-09-23 08:51:35,120:INFO:epoch: 3 step: [6269/14785], loss: 7.5681, overflow: False, scale: 140737488355328, lr: 0.004690, time: 598.61
- 2022-09-23 08:52:08,511:INFO:epoch: 3 step: [6325/14785], loss: 7.7431, overflow: False, scale: 140737488355328, lr: 0.004700, time: 596.19
- 2022-09-23 08:52:42,114:INFO:epoch: 3 step: [6381/14785], loss: 7.4964, overflow: False, scale: 140737488355328, lr: 0.004710, time: 600.03
- 2022-09-23 08:53:15,607:INFO:epoch: 3 step: [6437/14785], loss: 7.8416, overflow: False, scale: 140737488355328, lr: 0.004721, time: 598.05
- 2022-09-23 08:53:49,004:INFO:epoch: 3 step: [6493/14785], loss: 8.0705, overflow: False, scale: 140737488355328, lr: 0.004731, time: 596.34
- 2022-09-23 08:54:22,607:INFO:epoch: 3 step: [6549/14785], loss: 7.8886, overflow: False, scale: 140737488355328, lr: 0.004742, time: 600.00
- 2022-09-23 08:54:56,092:INFO:epoch: 3 step: [6605/14785], loss: 8.1607, overflow: False, scale: 140737488355328, lr: 0.004752, time: 597.90
- 2022-09-23 08:55:29,604:INFO:epoch: 3 step: [6661/14785], loss: 7.8203, overflow: False, scale: 140737488355328, lr: 0.004763, time: 598.38
- 2022-09-23 08:56:02,901:INFO:epoch: 3 step: [6717/14785], loss: 7.5477, overflow: False, scale: 140737488355328, lr: 0.004773, time: 594.55
- 2022-09-23 08:56:36,297:INFO:epoch: 3 step: [6773/14785], loss: 7.8754, overflow: False, scale: 140737488355328, lr: 0.004784, time: 596.32
- 2022-09-23 08:57:09,798:INFO:epoch: 3 step: [6829/14785], loss: 8.4466, overflow: False, scale: 140737488355328, lr: 0.004794, time: 598.18
- 2022-09-23 08:57:43,303:INFO:epoch: 3 step: [6885/14785], loss: 7.2785, overflow: False, scale: 140737488355328, lr: 0.004805, time: 598.26
- 2022-09-23 08:58:16,698:INFO:epoch: 3 step: [6941/14785], loss: 7.8774, overflow: False, scale: 140737488355328, lr: 0.004815, time: 596.29
- 2022-09-23 08:58:50,192:INFO:epoch: 3 step: [6997/14785], loss: 8.4721, overflow: False, scale: 140737488355328, lr: 0.004826, time: 598.07
- 2022-09-23 08:59:23,741:INFO:epoch: 3 step: [7053/14785], loss: 8.1676, overflow: False, scale: 140737488355328, lr: 0.004836, time: 599.05
- 2022-09-23 08:59:57,300:INFO:epoch: 3 step: [7109/14785], loss: 7.7157, overflow: False, scale: 140737488355328, lr: 0.004847, time: 599.23
- 2022-09-23 09:00:30,740:INFO:epoch: 3 step: [7165/14785], loss: 7.1154, overflow: False, scale: 140737488355328, lr: 0.004857, time: 597.07
- 2022-09-23 09:01:04,305:INFO:epoch: 3 step: [7221/14785], loss: 7.7274, overflow: False, scale: 140737488355328, lr: 0.004868, time: 599.34
- 2022-09-23 09:01:37,704:INFO:epoch: 3 step: [7277/14785], loss: 7.2275, overflow: False, scale: 140737488355328, lr: 0.004878, time: 596.38
- 2022-09-23 09:02:11,091:INFO:epoch: 3 step: [7333/14785], loss: 7.7084, overflow: False, scale: 140737488355328, lr: 0.004889, time: 596.14
- 2022-09-23 09:02:44,503:INFO:epoch: 3 step: [7389/14785], loss: 7.8185, overflow: False, scale: 140737488355328, lr: 0.004900, time: 596.59
- 2022-09-23 09:03:18,105:INFO:epoch: 3 step: [7445/14785], loss: 8.1332, overflow: False, scale: 140737488355328, lr: 0.004910, time: 600.00
- 2022-09-23 09:03:51,299:INFO:epoch: 3 step: [7501/14785], loss: 7.9648, overflow: False, scale: 140737488355328, lr: 0.004921, time: 592.70
- 2022-09-23 09:04:24,794:INFO:epoch: 3 step: [7557/14785], loss: 8.1156, overflow: False, scale: 140737488355328, lr: 0.004931, time: 598.09
- 2022-09-23 09:04:58,205:INFO:epoch: 3 step: [7613/14785], loss: 7.6913, overflow: False, scale: 281474976710656, lr: 0.004942, time: 596.59
- 2022-09-23 09:05:31,427:INFO:epoch: 3 step: [7669/14785], loss: 7.3895, overflow: False, scale: 281474976710656, lr: 0.004953, time: 593.23
- 2022-09-23 09:06:04,801:INFO:epoch: 3 step: [7725/14785], loss: 7.7700, overflow: False, scale: 281474976710656, lr: 0.004963, time: 595.92
- 2022-09-23 09:06:38,401:INFO:epoch: 3 step: [7781/14785], loss: 8.2672, overflow: False, scale: 281474976710656, lr: 0.004974, time: 599.96
- 2022-09-23 09:07:11,897:INFO:epoch: 3 step: [7837/14785], loss: 7.3003, overflow: False, scale: 281474976710656, lr: 0.004985, time: 598.11
- 2022-09-23 09:07:45,403:INFO:epoch: 3 step: [7893/14785], loss: 7.7949, overflow: False, scale: 281474976710656, lr: 0.004995, time: 598.28
- 2022-09-23 09:08:18,891:INFO:epoch: 3 step: [7949/14785], loss: 8.1513, overflow: False, scale: 281474976710656, lr: 0.005006, time: 597.97
- 2022-09-23 09:08:52,350:INFO:epoch: 3 step: [8005/14785], loss: 8.3106, overflow: False, scale: 281474976710656, lr: 0.005017, time: 597.45
- 2022-09-23 09:09:25,736:INFO:epoch: 3 step: [8061/14785], loss: 8.2889, overflow: False, scale: 281474976710656, lr: 0.005028, time: 596.15
- 2022-09-23 09:09:59,197:INFO:epoch: 3 step: [8117/14785], loss: 7.6538, overflow: False, scale: 281474976710656, lr: 0.005038, time: 597.47
- 2022-09-23 09:10:32,702:INFO:epoch: 3 step: [8173/14785], loss: 7.6376, overflow: False, scale: 281474976710656, lr: 0.005049, time: 598.28
- 2022-09-23 09:11:06,203:INFO:epoch: 3 step: [8229/14785], loss: 7.4103, overflow: False, scale: 281474976710656, lr: 0.005060, time: 598.17
- 2022-09-23 09:11:39,702:INFO:epoch: 3 step: [8285/14785], loss: 8.1295, overflow: False, scale: 281474976710656, lr: 0.005071, time: 598.16
- 2022-09-23 09:12:13,106:INFO:epoch: 3 step: [8341/14785], loss: 8.1455, overflow: False, scale: 281474976710656, lr: 0.005081, time: 596.46
- 2022-09-23 09:12:46,699:INFO:epoch: 3 step: [8397/14785], loss: 8.4289, overflow: False, scale: 281474976710656, lr: 0.005092, time: 599.83
- 2022-09-23 09:13:20,207:INFO:epoch: 3 step: [8453/14785], loss: 8.0361, overflow: False, scale: 281474976710656, lr: 0.005103, time: 598.33
- 2022-09-23 09:13:53,699:INFO:epoch: 3 step: [8509/14785], loss: 7.5352, overflow: False, scale: 281474976710656, lr: 0.005114, time: 598.01
- 2022-09-23 09:14:27,098:INFO:epoch: 3 step: [8565/14785], loss: 6.9701, overflow: False, scale: 281474976710656, lr: 0.005125, time: 596.38
- 2022-09-23 09:15:00,605:INFO:epoch: 3 step: [8621/14785], loss: 7.4480, overflow: False, scale: 281474976710656, lr: 0.005136, time: 598.30
- 2022-09-23 09:15:34,073:INFO:epoch: 3 step: [8677/14785], loss: 7.4090, overflow: False, scale: 281474976710656, lr: 0.005146, time: 597.61
- 2022-09-23 09:16:07,706:INFO:epoch: 3 step: [8733/14785], loss: 7.6934, overflow: False, scale: 281474976710656, lr: 0.005157, time: 600.56
- 2022-09-23 09:16:41,100:INFO:epoch: 3 step: [8789/14785], loss: 7.5939, overflow: False, scale: 281474976710656, lr: 0.005168, time: 596.27
- 2022-09-23 09:17:14,405:INFO:epoch: 3 step: [8845/14785], loss: 8.0759, overflow: False, scale: 281474976710656, lr: 0.005179, time: 594.69
- 2022-09-23 09:17:47,805:INFO:epoch: 3 step: [8901/14785], loss: 7.6618, overflow: False, scale: 281474976710656, lr: 0.005190, time: 596.36
- 2022-09-23 09:18:21,297:INFO:epoch: 3 step: [8957/14785], loss: 8.1196, overflow: False, scale: 281474976710656, lr: 0.005201, time: 598.01
- 2022-09-23 09:18:54,901:INFO:epoch: 3 step: [9013/14785], loss: 7.2952, overflow: False, scale: 281474976710656, lr: 0.005212, time: 600.04
- 2022-09-23 09:19:28,494:INFO:epoch: 3 step: [9069/14785], loss: 7.2716, overflow: False, scale: 281474976710656, lr: 0.005223, time: 599.85
- 2022-09-23 09:20:02,093:INFO:epoch: 3 step: [9125/14785], loss: 7.7438, overflow: False, scale: 281474976710656, lr: 0.005234, time: 599.95
- 2022-09-23 09:20:35,602:INFO:epoch: 3 step: [9181/14785], loss: 8.1173, overflow: False, scale: 281474976710656, lr: 0.005245, time: 598.32
- 2022-09-23 09:21:09,107:INFO:epoch: 3 step: [9237/14785], loss: 7.0854, overflow: False, scale: 281474976710656, lr: 0.005256, time: 598.27
- 2022-09-23 09:21:42,695:INFO:epoch: 3 step: [9293/14785], loss: 8.1466, overflow: False, scale: 281474976710656, lr: 0.005267, time: 599.74
- 2022-09-23 09:22:16,210:INFO:epoch: 3 step: [9349/14785], loss: 8.1263, overflow: False, scale: 281474976710656, lr: 0.005278, time: 598.44
- 2022-09-23 09:22:49,804:INFO:epoch: 3 step: [9405/14785], loss: 7.9009, overflow: False, scale: 281474976710656, lr: 0.005289, time: 599.87
- 2022-09-23 09:23:23,312:INFO:epoch: 3 step: [9461/14785], loss: 7.3733, overflow: False, scale: 281474976710656, lr: 0.005300, time: 598.32
- 2022-09-23 09:23:56,796:INFO:epoch: 3 step: [9517/14785], loss: 7.2113, overflow: False, scale: 281474976710656, lr: 0.005311, time: 597.87
- 2022-09-23 09:24:30,407:INFO:epoch: 3 step: [9573/14785], loss: 7.9221, overflow: False, scale: 281474976710656, lr: 0.005322, time: 600.18
- 2022-09-23 09:25:03,997:INFO:epoch: 3 step: [9629/14785], loss: 7.7532, overflow: False, scale: 562949953421312, lr: 0.005333, time: 599.79
- 2022-09-23 09:25:37,507:INFO:epoch: 3 step: [9685/14785], loss: 7.4033, overflow: False, scale: 562949953421312, lr: 0.005344, time: 598.35
- 2022-09-23 09:26:10,907:INFO:epoch: 3 step: [9741/14785], loss: 7.5950, overflow: False, scale: 562949953421312, lr: 0.005355, time: 596.39
- 2022-09-23 09:26:44,503:INFO:epoch: 3 step: [9797/14785], loss: 8.0397, overflow: False, scale: 562949953421312, lr: 0.005366, time: 599.89
- 2022-09-23 09:27:18,091:INFO:epoch: 3 step: [9853/14785], loss: 7.6292, overflow: False, scale: 562949953421312, lr: 0.005377, time: 599.70
- 2022-09-23 09:27:51,295:INFO:epoch: 3 step: [9909/14785], loss: 8.4338, overflow: False, scale: 562949953421312, lr: 0.005388, time: 592.89
- 2022-09-23 09:28:24,903:INFO:epoch: 3 step: [9965/14785], loss: 7.6972, overflow: False, scale: 562949953421312, lr: 0.005399, time: 600.09
- 2022-09-23 09:28:58,511:INFO:epoch: 3 step: [10021/14785], loss: 8.0339, overflow: False, scale: 562949953421312, lr: 0.005411, time: 600.09
- 2022-09-23 09:29:32,095:INFO:epoch: 3 step: [10077/14785], loss: 7.7092, overflow: False, scale: 562949953421312, lr: 0.005422, time: 599.69
- 2022-09-23 09:30:05,596:INFO:epoch: 3 step: [10133/14785], loss: 8.5311, overflow: False, scale: 562949953421312, lr: 0.005433, time: 598.18
- 2022-09-23 09:30:39,102:INFO:epoch: 3 step: [10189/14785], loss: 7.1055, overflow: False, scale: 562949953421312, lr: 0.005444, time: 598.30
- 2022-09-23 09:31:12,601:INFO:epoch: 3 step: [10245/14785], loss: 7.2167, overflow: False, scale: 562949953421312, lr: 0.005455, time: 598.16
- 2022-09-23 09:31:46,201:INFO:epoch: 3 step: [10301/14785], loss: 8.1514, overflow: False, scale: 562949953421312, lr: 0.005466, time: 599.96
- 2022-09-23 09:32:19,710:INFO:epoch: 3 step: [10357/14785], loss: 7.8543, overflow: False, scale: 562949953421312, lr: 0.005478, time: 598.35
- 2022-09-23 09:32:53,293:INFO:epoch: 3 step: [10413/14785], loss: 7.3372, overflow: False, scale: 562949953421312, lr: 0.005489, time: 599.65
- 2022-09-23 09:33:26,904:INFO:epoch: 3 step: [10469/14785], loss: 7.0783, overflow: False, scale: 562949953421312, lr: 0.005500, time: 600.16
- 2022-09-23 09:34:00,505:INFO:epoch: 3 step: [10525/14785], loss: 7.4147, overflow: False, scale: 562949953421312, lr: 0.005511, time: 599.99
- 2022-09-23 09:34:33,998:INFO:epoch: 3 step: [10581/14785], loss: 7.5174, overflow: False, scale: 562949953421312, lr: 0.005523, time: 598.05
- 2022-09-23 09:35:07,600:INFO:epoch: 3 step: [10637/14785], loss: 7.6998, overflow: False, scale: 562949953421312, lr: 0.005534, time: 600.00
- 2022-09-23 09:35:40,998:INFO:epoch: 3 step: [10693/14785], loss: 8.1337, overflow: False, scale: 562949953421312, lr: 0.005545, time: 596.34
- 2022-09-23 09:36:14,609:INFO:epoch: 3 step: [10749/14785], loss: 8.4911, overflow: False, scale: 562949953421312, lr: 0.005556, time: 600.16
- 2022-09-23 09:36:48,036:INFO:epoch: 3 step: [10805/14785], loss: 7.5142, overflow: False, scale: 562949953421312, lr: 0.005568, time: 596.87
- 2022-09-23 09:37:21,601:INFO:epoch: 3 step: [10861/14785], loss: 7.8894, overflow: False, scale: 562949953421312, lr: 0.005579, time: 599.34
- 2022-09-23 09:37:55,199:INFO:epoch: 3 step: [10917/14785], loss: 8.0198, overflow: False, scale: 562949953421312, lr: 0.005590, time: 599.93
- 2022-09-23 09:38:28,804:INFO:epoch: 3 step: [10973/14785], loss: 7.3866, overflow: False, scale: 562949953421312, lr: 0.005602, time: 600.03
- 2022-09-23 09:39:02,400:INFO:epoch: 3 step: [11029/14785], loss: 7.8302, overflow: False, scale: 562949953421312, lr: 0.005613, time: 599.91
- 2022-09-23 09:39:35,899:INFO:epoch: 3 step: [11085/14785], loss: 7.7090, overflow: False, scale: 562949953421312, lr: 0.005624, time: 598.16
- 2022-09-23 09:40:09,502:INFO:epoch: 3 step: [11141/14785], loss: 7.0556, overflow: False, scale: 562949953421312, lr: 0.005636, time: 599.99
- 2022-09-23 09:40:42,993:INFO:epoch: 3 step: [11197/14785], loss: 7.3633, overflow: False, scale: 562949953421312, lr: 0.005647, time: 598.03
- 2022-09-23 09:41:16,607:INFO:epoch: 3 step: [11253/14785], loss: 7.7138, overflow: False, scale: 562949953421312, lr: 0.005659, time: 600.19
- 2022-09-23 09:41:50,099:INFO:epoch: 3 step: [11309/14785], loss: 7.7421, overflow: False, scale: 562949953421312, lr: 0.005670, time: 598.04
- 2022-09-23 09:42:23,506:INFO:epoch: 3 step: [11365/14785], loss: 7.5455, overflow: False, scale: 562949953421312, lr: 0.005681, time: 596.50
- 2022-09-23 09:42:56,897:INFO:epoch: 3 step: [11421/14785], loss: 7.5035, overflow: False, scale: 562949953421312, lr: 0.005693, time: 596.24
- 2022-09-23 09:43:30,310:INFO:epoch: 3 step: [11477/14785], loss: 7.8766, overflow: False, scale: 562949953421312, lr: 0.005704, time: 596.61
- 2022-09-23 09:44:03,770:INFO:epoch: 3 step: [11533/14785], loss: 8.0438, overflow: False, scale: 562949953421312, lr: 0.005716, time: 597.46
- 2022-09-23 09:44:37,302:INFO:epoch: 3 step: [11589/14785], loss: 7.6512, overflow: False, scale: 1125899906842624, lr: 0.005727, time: 598.74
- 2022-09-23 09:45:10,807:INFO:epoch: 3 step: [11645/14785], loss: 7.9007, overflow: False, scale: 1125899906842624, lr: 0.005739, time: 598.27
- 2022-09-23 09:45:44,399:INFO:epoch: 3 step: [11701/14785], loss: 7.6309, overflow: False, scale: 1125899906842624, lr: 0.005750, time: 599.82
- 2022-09-23 09:46:17,922:INFO:epoch: 3 step: [11757/14785], loss: 8.0143, overflow: False, scale: 1125899906842624, lr: 0.005762, time: 598.59
- 2022-09-23 09:46:51,506:INFO:epoch: 3 step: [11813/14785], loss: 8.4843, overflow: False, scale: 1125899906842624, lr: 0.005773, time: 599.59
- 2022-09-23 09:47:24,998:INFO:epoch: 3 step: [11869/14785], loss: 8.1050, overflow: False, scale: 1125899906842624, lr: 0.005785, time: 598.04
- 2022-09-23 09:47:58,597:INFO:epoch: 3 step: [11925/14785], loss: 7.9870, overflow: False, scale: 1125899906842624, lr: 0.005796, time: 599.95
- 2022-09-23 09:48:32,201:INFO:epoch: 3 step: [11981/14785], loss: 8.1738, overflow: False, scale: 1125899906842624, lr: 0.005808, time: 600.04
- 2022-09-23 09:49:05,794:INFO:epoch: 3 step: [12037/14785], loss: 7.9493, overflow: False, scale: 1125899906842624, lr: 0.005819, time: 599.83
- 2022-09-23 09:49:39,305:INFO:epoch: 3 step: [12093/14785], loss: 8.0919, overflow: False, scale: 1125899906842624, lr: 0.005831, time: 598.31
- 2022-09-23 09:50:12,911:INFO:epoch: 3 step: [12149/14785], loss: 7.9952, overflow: False, scale: 1125899906842624, lr: 0.005842, time: 600.07
- 2022-09-23 09:50:46,504:INFO:epoch: 3 step: [12205/14785], loss: 7.4228, overflow: False, scale: 1125899906842624, lr: 0.005854, time: 599.85
- 2022-09-23 09:51:20,113:INFO:epoch: 3 step: [12261/14785], loss: 7.6135, overflow: False, scale: 1125899906842624, lr: 0.005866, time: 600.11
- 2022-09-23 09:51:53,709:INFO:epoch: 3 step: [12317/14785], loss: 7.8765, overflow: False, scale: 1125899906842624, lr: 0.005877, time: 599.88
- 2022-09-23 09:52:27,105:INFO:epoch: 3 step: [12373/14785], loss: 7.7789, overflow: False, scale: 1125899906842624, lr: 0.005889, time: 596.31
- 2022-09-23 09:53:00,604:INFO:epoch: 3 step: [12429/14785], loss: 8.1402, overflow: False, scale: 1125899906842624, lr: 0.005900, time: 598.17
- 2022-09-23 09:53:34,192:INFO:epoch: 3 step: [12485/14785], loss: 7.2919, overflow: False, scale: 1125899906842624, lr: 0.005912, time: 599.75
- 2022-09-23 09:54:07,715:INFO:epoch: 3 step: [12541/14785], loss: 7.4219, overflow: False, scale: 1125899906842624, lr: 0.005924, time: 598.57
- 2022-09-23 09:54:41,296:INFO:epoch: 3 step: [12597/14785], loss: 7.9015, overflow: False, scale: 1125899906842624, lr: 0.005935, time: 599.64
- 2022-09-23 09:55:14,901:INFO:epoch: 3 step: [12653/14785], loss: 7.5108, overflow: False, scale: 1125899906842624, lr: 0.005947, time: 600.05
- 2022-09-23 09:55:48,403:INFO:epoch: 3 step: [12709/14785], loss: 8.2776, overflow: False, scale: 1125899906842624, lr: 0.005959, time: 598.22
- 2022-09-23 09:56:21,600:INFO:epoch: 3 step: [12765/14785], loss: 7.8473, overflow: False, scale: 1125899906842624, lr: 0.005970, time: 592.75
- 2022-09-23 09:56:55,002:INFO:epoch: 3 step: [12821/14785], loss: 7.9643, overflow: False, scale: 1125899906842624, lr: 0.005982, time: 596.42
- 2022-09-23 09:57:28,499:INFO:epoch: 3 step: [12877/14785], loss: 8.1497, overflow: False, scale: 1125899906842624, lr: 0.005994, time: 598.11
- 2022-09-23 09:58:02,101:INFO:epoch: 3 step: [12933/14785], loss: 7.6737, overflow: False, scale: 1125899906842624, lr: 0.006006, time: 600.01
- 2022-09-23 09:58:35,707:INFO:epoch: 3 step: [12989/14785], loss: 7.8460, overflow: False, scale: 1125899906842624, lr: 0.006017, time: 600.02
- 2022-09-23 09:59:09,302:INFO:epoch: 3 step: [13045/14785], loss: 6.8598, overflow: False, scale: 1125899906842624, lr: 0.006029, time: 599.87
- 2022-09-23 09:59:42,796:INFO:epoch: 3 step: [13101/14785], loss: 7.7931, overflow: False, scale: 1125899906842624, lr: 0.006041, time: 598.05
- 2022-09-23 10:00:16,299:INFO:epoch: 3 step: [13157/14785], loss: 7.6455, overflow: False, scale: 1125899906842624, lr: 0.006053, time: 598.24
- 2022-09-23 10:00:49,901:INFO:epoch: 3 step: [13213/14785], loss: 7.6061, overflow: False, scale: 1125899906842624, lr: 0.006064, time: 599.99
- 2022-09-23 10:01:23,492:INFO:epoch: 3 step: [13269/14785], loss: 7.8975, overflow: False, scale: 1125899906842624, lr: 0.006076, time: 599.80
- 2022-09-23 10:01:56,903:INFO:epoch: 3 step: [13325/14785], loss: 7.5450, overflow: False, scale: 1125899906842624, lr: 0.006088, time: 596.59
- 2022-09-23 10:02:30,408:INFO:epoch: 3 step: [13381/14785], loss: 7.5289, overflow: False, scale: 1125899906842624, lr: 0.006100, time: 598.25
- 2022-09-23 10:03:03,903:INFO:epoch: 3 step: [13437/14785], loss: 7.6992, overflow: False, scale: 1125899906842624, lr: 0.006112, time: 598.08
- 2022-09-23 10:03:37,509:INFO:epoch: 3 step: [13493/14785], loss: 7.1409, overflow: False, scale: 1125899906842624, lr: 0.006124, time: 600.08
- 2022-09-23 10:04:11,006:INFO:epoch: 3 step: [13549/14785], loss: 8.2995, overflow: False, scale: 1125899906842624, lr: 0.006135, time: 598.13
- 2022-09-23 10:04:44,501:INFO:epoch: 3 step: [13605/14785], loss: 7.6074, overflow: False, scale: 2251799813685248, lr: 0.006147, time: 598.08
- 2022-09-23 10:05:18,098:INFO:epoch: 3 step: [13661/14785], loss: 7.9699, overflow: False, scale: 2251799813685248, lr: 0.006159, time: 599.90
- 2022-09-23 10:05:51,699:INFO:epoch: 3 step: [13717/14785], loss: 7.9385, overflow: False, scale: 2251799813685248, lr: 0.006171, time: 599.99
- 2022-09-23 10:06:25,299:INFO:epoch: 3 step: [13773/14785], loss: 8.2858, overflow: False, scale: 2251799813685248, lr: 0.006183, time: 599.95
- 2022-09-23 10:06:58,899:INFO:epoch: 3 step: [13829/14785], loss: 7.4713, overflow: False, scale: 2251799813685248, lr: 0.006195, time: 599.95
- 2022-09-23 10:07:32,170:INFO:epoch: 3 step: [13885/14785], loss: 8.1484, overflow: False, scale: 2251799813685248, lr: 0.006207, time: 594.09
- 2022-09-23 10:08:05,714:INFO:epoch: 3 step: [13941/14785], loss: 7.4568, overflow: False, scale: 2251799813685248, lr: 0.006219, time: 598.96
- 2022-09-23 10:08:39,309:INFO:epoch: 3 step: [13997/14785], loss: 7.5732, overflow: False, scale: 2251799813685248, lr: 0.006231, time: 599.89
- 2022-09-23 10:09:12,696:INFO:epoch: 3 step: [14053/14785], loss: 7.8093, overflow: False, scale: 2251799813685248, lr: 0.006243, time: 596.15
- 2022-09-23 10:09:46,205:INFO:epoch: 3 step: [14109/14785], loss: 7.4875, overflow: False, scale: 2251799813685248, lr: 0.006255, time: 598.34
- 2022-09-23 10:10:19,709:INFO:epoch: 3 step: [14165/14785], loss: 8.3403, overflow: False, scale: 2251799813685248, lr: 0.006267, time: 598.24
- 2022-09-23 10:10:53,311:INFO:epoch: 3 step: [14221/14785], loss: 7.8039, overflow: False, scale: 2251799813685248, lr: 0.006279, time: 600.00
- 2022-09-23 10:11:26,697:INFO:epoch: 3 step: [14277/14785], loss: 7.8858, overflow: False, scale: 2251799813685248, lr: 0.006291, time: 596.13
- 2022-09-23 10:12:00,319:INFO:epoch: 3 step: [14333/14785], loss: 7.1339, overflow: False, scale: 2251799813685248, lr: 0.006303, time: 600.34
- 2022-09-23 10:12:33,801:INFO:epoch: 3 step: [14389/14785], loss: 7.3460, overflow: False, scale: 2251799813685248, lr: 0.006315, time: 597.86
- 2022-09-23 10:13:07,205:INFO:epoch: 3 step: [14445/14785], loss: 7.6939, overflow: False, scale: 2251799813685248, lr: 0.006327, time: 596.47
- 2022-09-23 10:13:40,797:INFO:epoch: 3 step: [14501/14785], loss: 7.2273, overflow: False, scale: 2251799813685248, lr: 0.006339, time: 599.82
- 2022-09-23 10:14:14,399:INFO:epoch: 3 step: [14557/14785], loss: 7.3744, overflow: False, scale: 2251799813685248, lr: 0.006351, time: 600.00
- 2022-09-23 10:14:48,006:INFO:epoch: 3 step: [14613/14785], loss: 8.0774, overflow: False, scale: 2251799813685248, lr: 0.006363, time: 600.09
- 2022-09-23 10:15:21,403:INFO:epoch: 3 step: [14669/14785], loss: 7.8654, overflow: False, scale: 2251799813685248, lr: 0.006375, time: 596.30
- 2022-09-23 10:15:55,009:INFO:epoch: 3 step: [14725/14785], loss: 7.8628, overflow: False, scale: 2251799813685248, lr: 0.006387, time: 600.06
- 2022-09-23 10:16:28,600:INFO:epoch: 3 step: [14781/14785], loss: 8.0557, overflow: False, scale: 2251799813685248, lr: 0.006399, time: 599.80
- 2022-09-23 10:17:02,197:INFO:epoch: 4 step: [52/14785], loss: 8.1698, overflow: False, scale: 2251799813685248, lr: 0.006411, time: 599.92
- 2022-09-23 10:17:35,811:INFO:epoch: 4 step: [108/14785], loss: 8.2502, overflow: False, scale: 2251799813685248, lr: 0.006424, time: 600.21
- 2022-09-23 10:18:11,276:INFO:epoch: 4 step: [164/14785], loss: 7.4554, overflow: False, scale: 2251799813685248, lr: 0.006436, time: 633.26
- 2022-09-23 10:18:44,900:INFO:epoch: 4 step: [220/14785], loss: 7.9990, overflow: False, scale: 2251799813685248, lr: 0.006448, time: 600.41
- 2022-09-23 10:19:18,512:INFO:epoch: 4 step: [276/14785], loss: 7.5975, overflow: False, scale: 2251799813685248, lr: 0.006460, time: 600.17
- 2022-09-23 10:19:52,010:INFO:epoch: 4 step: [332/14785], loss: 7.7407, overflow: False, scale: 2251799813685248, lr: 0.006472, time: 598.13
- 2022-09-23 10:20:25,600:INFO:epoch: 4 step: [388/14785], loss: 7.1797, overflow: False, scale: 2251799813685248, lr: 0.006484, time: 599.79
- 2022-09-23 10:20:59,102:INFO:epoch: 4 step: [444/14785], loss: 7.5294, overflow: False, scale: 2251799813685248, lr: 0.006497, time: 598.21
- 2022-09-23 10:21:32,594:INFO:epoch: 4 step: [500/14785], loss: 7.5416, overflow: False, scale: 2251799813685248, lr: 0.006509, time: 598.03
- 2022-09-23 10:22:06,195:INFO:epoch: 4 step: [556/14785], loss: 7.5429, overflow: False, scale: 2251799813685248, lr: 0.006521, time: 599.99
- 2022-09-23 10:22:39,500:INFO:epoch: 4 step: [612/14785], loss: 7.1578, overflow: False, scale: 2251799813685248, lr: 0.006533, time: 594.68
- 2022-09-23 10:23:12,801:INFO:epoch: 4 step: [668/14785], loss: 7.5387, overflow: False, scale: 2251799813685248, lr: 0.006546, time: 594.63
- 2022-09-23 10:23:46,107:INFO:epoch: 4 step: [724/14785], loss: 7.8868, overflow: False, scale: 2251799813685248, lr: 0.006558, time: 594.73
- 2022-09-23 10:24:19,599:INFO:epoch: 4 step: [780/14785], loss: 7.8721, overflow: False, scale: 2251799813685248, lr: 0.006570, time: 598.03
- 2022-09-23 10:24:53,200:INFO:epoch: 4 step: [836/14785], loss: 7.4028, overflow: False, scale: 4503599627370496, lr: 0.006582, time: 599.97
- 2022-09-23 10:25:26,796:INFO:epoch: 4 step: [892/14785], loss: 7.7758, overflow: False, scale: 4503599627370496, lr: 0.006595, time: 599.90
- 2022-09-23 10:26:00,307:INFO:epoch: 4 step: [948/14785], loss: 8.0831, overflow: False, scale: 4503599627370496, lr: 0.006607, time: 598.37
- 2022-09-23 10:26:33,592:INFO:epoch: 4 step: [1004/14785], loss: 7.6105, overflow: False, scale: 4503599627370496, lr: 0.006619, time: 594.34
- 2022-09-23 10:27:06,997:INFO:epoch: 4 step: [1060/14785], loss: 7.5247, overflow: False, scale: 4503599627370496, lr: 0.006632, time: 596.49
- 2022-09-23 10:27:40,501:INFO:epoch: 4 step: [1116/14785], loss: 7.6434, overflow: False, scale: 4503599627370496, lr: 0.006644, time: 598.25
- 2022-09-23 10:28:13,900:INFO:epoch: 4 step: [1172/14785], loss: 7.2710, overflow: False, scale: 4503599627370496, lr: 0.006656, time: 596.38
- 2022-09-23 10:28:47,321:INFO:epoch: 4 step: [1228/14785], loss: 7.2411, overflow: False, scale: 4503599627370496, lr: 0.006669, time: 596.77
- 2022-09-23 10:29:20,600:INFO:epoch: 4 step: [1284/14785], loss: 8.2686, overflow: False, scale: 4503599627370496, lr: 0.006681, time: 594.24
- 2022-09-23 10:29:54,100:INFO:epoch: 4 step: [1340/14785], loss: 7.7404, overflow: False, scale: 4503599627370496, lr: 0.006694, time: 598.18
- 2022-09-23 10:30:27,624:INFO:epoch: 4 step: [1396/14785], loss: 8.1513, overflow: False, scale: 4503599627370496, lr: 0.006706, time: 598.60
- 2022-09-23 10:31:01,094:INFO:epoch: 4 step: [1452/14785], loss: 7.5975, overflow: False, scale: 4503599627370496, lr: 0.006718, time: 597.65
- 2022-09-23 10:31:34,706:INFO:epoch: 4 step: [1508/14785], loss: 7.8067, overflow: False, scale: 4503599627370496, lr: 0.006731, time: 600.18
- 2022-09-23 10:32:08,099:INFO:epoch: 4 step: [1564/14785], loss: 7.7881, overflow: False, scale: 4503599627370496, lr: 0.006743, time: 596.26
- 2022-09-23 10:32:41,599:INFO:epoch: 4 step: [1620/14785], loss: 7.3041, overflow: False, scale: 4503599627370496, lr: 0.006756, time: 598.15
- 2022-09-23 10:33:15,004:INFO:epoch: 4 step: [1676/14785], loss: 8.0858, overflow: False, scale: 4503599627370496, lr: 0.006768, time: 596.48
- 2022-09-23 10:33:48,491:INFO:epoch: 4 step: [1732/14785], loss: 7.7170, overflow: False, scale: 4503599627370496, lr: 0.006781, time: 597.95
- 2022-09-23 10:34:21,999:INFO:epoch: 4 step: [1788/14785], loss: 7.5149, overflow: False, scale: 4503599627370496, lr: 0.006793, time: 598.29
- 2022-09-23 10:34:55,605:INFO:epoch: 4 step: [1844/14785], loss: 7.7376, overflow: False, scale: 4503599627370496, lr: 0.006806, time: 600.09
- 2022-09-23 10:35:29,005:INFO:epoch: 4 step: [1900/14785], loss: 7.6456, overflow: False, scale: 4503599627370496, lr: 0.006818, time: 596.39
- 2022-09-23 10:36:02,502:INFO:epoch: 4 step: [1956/14785], loss: 7.5228, overflow: False, scale: 4503599627370496, lr: 0.006831, time: 598.13
- 2022-09-23 10:36:36,101:INFO:epoch: 4 step: [2012/14785], loss: 8.0658, overflow: False, scale: 4503599627370496, lr: 0.006843, time: 599.96
- 2022-09-23 10:37:09,604:INFO:epoch: 4 step: [2068/14785], loss: 8.2319, overflow: False, scale: 4503599627370496, lr: 0.006856, time: 598.22
- 2022-09-23 10:37:43,112:INFO:epoch: 4 step: [2124/14785], loss: 7.6770, overflow: False, scale: 4503599627370496, lr: 0.006868, time: 598.32
- 2022-09-23 10:38:16,603:INFO:epoch: 4 step: [2180/14785], loss: 8.2719, overflow: False, scale: 4503599627370496, lr: 0.006881, time: 598.01
- 2022-09-23 10:38:50,004:INFO:epoch: 4 step: [2236/14785], loss: 9.0970, overflow: False, scale: 4503599627370496, lr: 0.006893, time: 596.42
- 2022-09-23 10:39:23,498:INFO:epoch: 4 step: [2292/14785], loss: 8.0632, overflow: False, scale: 4503599627370496, lr: 0.006906, time: 598.06
- 2022-09-23 10:39:56,998:INFO:epoch: 4 step: [2348/14785], loss: 7.6642, overflow: False, scale: 4503599627370496, lr: 0.006919, time: 598.18
- 2022-09-23 10:40:30,501:INFO:epoch: 4 step: [2404/14785], loss: 7.9436, overflow: False, scale: 4503599627370496, lr: 0.006931, time: 598.22
- 2022-09-23 10:41:03,899:INFO:epoch: 4 step: [2460/14785], loss: 7.7621, overflow: False, scale: 4503599627370496, lr: 0.006944, time: 596.35
- 2022-09-23 10:41:37,402:INFO:epoch: 4 step: [2516/14785], loss: 7.6894, overflow: False, scale: 4503599627370496, lr: 0.006956, time: 598.24
- 2022-09-23 10:42:10,800:INFO:epoch: 4 step: [2572/14785], loss: 7.7824, overflow: False, scale: 4503599627370496, lr: 0.006969, time: 596.36
- 2022-09-23 10:42:44,297:INFO:epoch: 4 step: [2628/14785], loss: 7.7665, overflow: False, scale: 4503599627370496, lr: 0.006982, time: 598.12
- 2022-09-23 10:43:17,798:INFO:epoch: 4 step: [2684/14785], loss: 7.6140, overflow: False, scale: 4503599627370496, lr: 0.006994, time: 598.18
- 2022-09-23 10:43:51,197:INFO:epoch: 4 step: [2740/14785], loss: 7.9054, overflow: False, scale: 4503599627370496, lr: 0.007007, time: 596.38
- 2022-09-23 10:44:24,613:INFO:epoch: 4 step: [2796/14785], loss: 6.9942, overflow: False, scale: 4503599627370496, lr: 0.007020, time: 596.65
- 2022-09-23 10:44:58,206:INFO:epoch: 4 step: [2852/14785], loss: 7.4309, overflow: False, scale: 9007199254740992, lr: 0.007032, time: 599.84
- 2022-09-23 10:45:31,597:INFO:epoch: 4 step: [2908/14785], loss: 7.7582, overflow: False, scale: 9007199254740992, lr: 0.007045, time: 596.24
- 2022-09-23 10:46:04,997:INFO:epoch: 4 step: [2964/14785], loss: 8.3042, overflow: False, scale: 9007199254740992, lr: 0.007058, time: 596.40
- 2022-09-23 10:46:38,405:INFO:epoch: 4 step: [3020/14785], loss: 7.5475, overflow: False, scale: 9007199254740992, lr: 0.007071, time: 596.53
- 2022-09-23 10:47:11,706:INFO:epoch: 4 step: [3076/14785], loss: 8.0018, overflow: False, scale: 9007199254740992, lr: 0.007083, time: 594.58
- 2022-09-23 10:47:45,205:INFO:epoch: 4 step: [3132/14785], loss: 7.6469, overflow: False, scale: 9007199254740992, lr: 0.007096, time: 598.18
- 2022-09-23 10:48:18,702:INFO:epoch: 4 step: [3188/14785], loss: 7.3528, overflow: False, scale: 9007199254740992, lr: 0.007109, time: 598.11
- 2022-09-23 10:48:52,295:INFO:epoch: 4 step: [3244/14785], loss: 7.0762, overflow: False, scale: 9007199254740992, lr: 0.007122, time: 599.84
- 2022-09-23 10:49:25,897:INFO:epoch: 4 step: [3300/14785], loss: 8.6157, overflow: False, scale: 9007199254740992, lr: 0.007134, time: 599.98
- 2022-09-23 10:49:59,407:INFO:epoch: 4 step: [3356/14785], loss: 8.1479, overflow: False, scale: 9007199254740992, lr: 0.007147, time: 598.35
- 2022-09-23 10:50:32,991:INFO:epoch: 4 step: [3412/14785], loss: 7.0311, overflow: False, scale: 9007199254740992, lr: 0.007160, time: 599.68
- 2022-09-23 10:51:06,504:INFO:epoch: 4 step: [3468/14785], loss: 8.2124, overflow: False, scale: 9007199254740992, lr: 0.007173, time: 598.40
- 2022-09-23 10:51:39,808:INFO:epoch: 4 step: [3524/14785], loss: 7.4175, overflow: False, scale: 9007199254740992, lr: 0.007186, time: 594.67
- 2022-09-23 10:52:13,208:INFO:epoch: 4 step: [3580/14785], loss: 7.5775, overflow: False, scale: 9007199254740992, lr: 0.007199, time: 596.38
- 2022-09-23 10:52:46,596:INFO:epoch: 4 step: [3636/14785], loss: 7.4100, overflow: False, scale: 9007199254740992, lr: 0.007211, time: 596.16
- 2022-09-23 10:53:20,095:INFO:epoch: 4 step: [3692/14785], loss: 7.8757, overflow: False, scale: 9007199254740992, lr: 0.007224, time: 598.18
- 2022-09-23 10:53:53,595:INFO:epoch: 4 step: [3748/14785], loss: 7.8256, overflow: False, scale: 9007199254740992, lr: 0.007237, time: 598.18
- 2022-09-23 10:54:27,100:INFO:epoch: 4 step: [3804/14785], loss: 7.6583, overflow: False, scale: 9007199254740992, lr: 0.007250, time: 598.26
- 2022-09-23 10:55:00,300:INFO:epoch: 4 step: [3860/14785], loss: 8.0542, overflow: False, scale: 9007199254740992, lr: 0.007263, time: 592.79
- 2022-09-23 10:55:33,524:INFO:epoch: 4 step: [3916/14785], loss: 7.5698, overflow: False, scale: 9007199254740992, lr: 0.007276, time: 593.25
- 2022-09-23 10:56:06,794:INFO:epoch: 4 step: [3972/14785], loss: 7.7710, overflow: False, scale: 9007199254740992, lr: 0.007289, time: 594.07
- 2022-09-23 10:56:40,327:INFO:epoch: 4 step: [4028/14785], loss: 7.9415, overflow: False, scale: 9007199254740992, lr: 0.007302, time: 598.77
- 2022-09-23 10:57:13,697:INFO:epoch: 4 step: [4084/14785], loss: 7.6188, overflow: False, scale: 9007199254740992, lr: 0.007315, time: 595.84
- 2022-09-23 10:57:47,103:INFO:epoch: 4 step: [4140/14785], loss: 8.3125, overflow: False, scale: 9007199254740992, lr: 0.007328, time: 596.50
- 2022-09-23 10:58:20,615:INFO:epoch: 4 step: [4196/14785], loss: 7.1147, overflow: False, scale: 9007199254740992, lr: 0.007341, time: 598.38
- 2022-09-23 10:58:53,998:INFO:epoch: 4 step: [4252/14785], loss: 6.7903, overflow: False, scale: 9007199254740992, lr: 0.007354, time: 596.09
- 2022-09-23 10:59:27,234:INFO:epoch: 4 step: [4308/14785], loss: 7.8935, overflow: False, scale: 9007199254740992, lr: 0.007367, time: 593.46
- 2022-09-23 11:00:00,602:INFO:epoch: 4 step: [4364/14785], loss: 7.9908, overflow: False, scale: 9007199254740992, lr: 0.007380, time: 595.83
- 2022-09-23 11:00:33,969:INFO:epoch: 4 step: [4420/14785], loss: 8.3536, overflow: False, scale: 9007199254740992, lr: 0.007393, time: 595.78
- 2022-09-23 11:01:07,302:INFO:epoch: 4 step: [4476/14785], loss: 7.1306, overflow: False, scale: 9007199254740992, lr: 0.007406, time: 595.20
- 2022-09-23 11:01:40,705:INFO:epoch: 4 step: [4532/14785], loss: 8.5359, overflow: False, scale: 9007199254740992, lr: 0.007419, time: 596.43
- 2022-09-23 11:02:14,296:INFO:epoch: 4 step: [4588/14785], loss: 7.8927, overflow: False, scale: 9007199254740992, lr: 0.007432, time: 599.79
- 2022-09-23 11:02:47,611:INFO:epoch: 4 step: [4644/14785], loss: 8.1395, overflow: False, scale: 9007199254740992, lr: 0.007445, time: 594.86
- 2022-09-23 11:03:20,899:INFO:epoch: 4 step: [4700/14785], loss: 7.1061, overflow: False, scale: 9007199254740992, lr: 0.007458, time: 594.39
- 2022-09-23 11:03:54,402:INFO:epoch: 4 step: [4756/14785], loss: 7.5811, overflow: False, scale: 9007199254740992, lr: 0.007471, time: 598.22
- 2022-09-23 11:04:27,701:INFO:epoch: 4 step: [4812/14785], loss: 7.7640, overflow: False, scale: 18014398509481984, lr: 0.007484, time: 594.61
- 2022-09-23 11:05:01,198:INFO:epoch: 4 step: [4868/14785], loss: 7.8421, overflow: False, scale: 18014398509481984, lr: 0.007497, time: 598.11
- 2022-09-23 11:05:34,600:INFO:epoch: 4 step: [4924/14785], loss: 8.1060, overflow: False, scale: 18014398509481984, lr: 0.007510, time: 596.43
- 2022-09-23 11:06:08,114:INFO:epoch: 4 step: [4980/14785], loss: 8.1742, overflow: False, scale: 18014398509481984, lr: 0.007523, time: 598.41
- 2022-09-23 11:06:41,506:INFO:epoch: 4 step: [5036/14785], loss: 7.3624, overflow: False, scale: 18014398509481984, lr: 0.007537, time: 596.23
- 2022-09-23 11:07:15,010:INFO:epoch: 4 step: [5092/14785], loss: 6.9597, overflow: False, scale: 18014398509481984, lr: 0.007550, time: 598.26
- 2022-09-23 11:07:48,497:INFO:epoch: 4 step: [5148/14785], loss: 7.2502, overflow: False, scale: 18014398509481984, lr: 0.007563, time: 597.93
- 2022-09-23 11:08:22,005:INFO:epoch: 4 step: [5204/14785], loss: 7.3416, overflow: False, scale: 18014398509481984, lr: 0.007576, time: 598.31
- 2022-09-23 11:08:55,604:INFO:epoch: 4 step: [5260/14785], loss: 7.5743, overflow: False, scale: 18014398509481984, lr: 0.007589, time: 599.95
- 2022-09-23 11:09:29,104:INFO:epoch: 4 step: [5316/14785], loss: 8.1268, overflow: False, scale: 18014398509481984, lr: 0.007603, time: 598.18
- 2022-09-23 11:10:02,504:INFO:epoch: 4 step: [5372/14785], loss: 7.4765, overflow: False, scale: 18014398509481984, lr: 0.007616, time: 596.39
- 2022-09-23 11:10:36,103:INFO:epoch: 4 step: [5428/14785], loss: 8.1455, overflow: False, scale: 18014398509481984, lr: 0.007629, time: 599.94
- 2022-09-23 11:11:09,613:INFO:epoch: 4 step: [5484/14785], loss: 7.9964, overflow: False, scale: 18014398509481984, lr: 0.007642, time: 598.33
- 2022-09-23 11:11:43,002:INFO:epoch: 4 step: [5540/14785], loss: 7.8240, overflow: False, scale: 18014398509481984, lr: 0.007655, time: 596.21
- 2022-09-23 11:12:16,604:INFO:epoch: 4 step: [5596/14785], loss: 7.9914, overflow: False, scale: 18014398509481984, lr: 0.007669, time: 600.00
- 2022-09-23 11:12:50,020:INFO:epoch: 4 step: [5652/14785], loss: 7.2331, overflow: False, scale: 18014398509481984, lr: 0.007682, time: 596.66
- 2022-09-23 11:13:23,506:INFO:epoch: 4 step: [5708/14785], loss: 7.8934, overflow: False, scale: 18014398509481984, lr: 0.007695, time: 597.91
- 2022-09-23 11:13:57,002:INFO:epoch: 4 step: [5764/14785], loss: 9.1231, overflow: False, scale: 18014398509481984, lr: 0.007709, time: 598.09
- 2022-09-23 11:14:30,601:INFO:epoch: 4 step: [5820/14785], loss: 8.2552, overflow: False, scale: 18014398509481984, lr: 0.007722, time: 599.95
- 2022-09-23 11:15:04,004:INFO:epoch: 4 step: [5876/14785], loss: 7.8988, overflow: False, scale: 18014398509481984, lr: 0.007735, time: 596.46
- 2022-09-23 11:15:37,496:INFO:epoch: 4 step: [5932/14785], loss: 7.2860, overflow: False, scale: 18014398509481984, lr: 0.007749, time: 598.02
- 2022-09-23 11:16:11,001:INFO:epoch: 4 step: [5988/14785], loss: 7.6631, overflow: False, scale: 18014398509481984, lr: 0.007762, time: 598.25
- 2022-09-23 11:16:44,505:INFO:epoch: 4 step: [6044/14785], loss: 7.4820, overflow: False, scale: 18014398509481984, lr: 0.007775, time: 598.22
- 2022-09-23 11:17:18,008:INFO:epoch: 4 step: [6100/14785], loss: 7.5705, overflow: False, scale: 18014398509481984, lr: 0.007789, time: 598.23
- 2022-09-23 11:17:51,462:INFO:epoch: 4 step: [6156/14785], loss: 7.9639, overflow: False, scale: 18014398509481984, lr: 0.007802, time: 597.35
- 2022-09-23 11:18:24,901:INFO:epoch: 4 step: [6212/14785], loss: 7.9861, overflow: False, scale: 18014398509481984, lr: 0.007815, time: 597.08
- 2022-09-23 11:18:58,248:INFO:epoch: 4 step: [6268/14785], loss: 8.7722, overflow: False, scale: 18014398509481984, lr: 0.007829, time: 595.45
- 2022-09-23 11:19:31,700:INFO:epoch: 4 step: [6324/14785], loss: 7.6058, overflow: False, scale: 18014398509481984, lr: 0.007842, time: 597.31
- 2022-09-23 11:20:05,102:INFO:epoch: 4 step: [6380/14785], loss: 7.3750, overflow: False, scale: 18014398509481984, lr: 0.007856, time: 596.43
- 2022-09-23 11:20:38,599:INFO:epoch: 4 step: [6436/14785], loss: 7.5915, overflow: False, scale: 18014398509481984, lr: 0.007869, time: 598.13
- 2022-09-23 11:21:12,209:INFO:epoch: 4 step: [6492/14785], loss: 7.9257, overflow: False, scale: 18014398509481984, lr: 0.007882, time: 600.13
- 2022-09-23 11:21:45,700:INFO:epoch: 4 step: [6548/14785], loss: 7.4168, overflow: False, scale: 18014398509481984, lr: 0.007896, time: 598.01
- 2022-09-23 11:22:19,109:INFO:epoch: 4 step: [6604/14785], loss: 7.3796, overflow: False, scale: 18014398509481984, lr: 0.007909, time: 596.55
- 2022-09-23 11:22:52,702:INFO:epoch: 4 step: [6660/14785], loss: 7.3040, overflow: False, scale: 18014398509481984, lr: 0.007923, time: 599.85
- 2022-09-23 11:23:26,104:INFO:epoch: 4 step: [6716/14785], loss: 7.3933, overflow: False, scale: 18014398509481984, lr: 0.007936, time: 596.42
- 2022-09-23 11:23:59,598:INFO:epoch: 4 step: [6772/14785], loss: 7.4504, overflow: False, scale: 18014398509481984, lr: 0.007950, time: 598.08
- 2022-09-23 11:24:32,995:INFO:epoch: 4 step: [6828/14785], loss: 7.9336, overflow: False, scale: 36028797018963968, lr: 0.007963, time: 596.31
- 2022-09-23 11:25:06,602:INFO:epoch: 4 step: [6884/14785], loss: 7.6026, overflow: False, scale: 36028797018963968, lr: 0.007977, time: 600.08
- 2022-09-23 11:25:40,097:INFO:epoch: 4 step: [6940/14785], loss: 7.3812, overflow: False, scale: 36028797018963968, lr: 0.007990, time: 598.10
- 2022-09-23 11:26:13,701:INFO:epoch: 4 step: [6996/14785], loss: 8.2406, overflow: False, scale: 36028797018963968, lr: 0.008004, time: 600.03
- 2022-09-23 11:26:47,195:INFO:epoch: 4 step: [7052/14785], loss: 7.2763, overflow: False, scale: 36028797018963968, lr: 0.008018, time: 598.05
- 2022-09-23 11:27:20,706:INFO:epoch: 4 step: [7108/14785], loss: 7.7429, overflow: False, scale: 36028797018963968, lr: 0.008031, time: 598.39
- 2022-09-23 11:27:54,305:INFO:epoch: 4 step: [7164/14785], loss: 8.2412, overflow: False, scale: 36028797018963968, lr: 0.008045, time: 599.92
- 2022-09-23 11:28:27,805:INFO:epoch: 4 step: [7220/14785], loss: 7.8223, overflow: False, scale: 36028797018963968, lr: 0.008058, time: 598.19
- 2022-09-23 11:29:01,404:INFO:epoch: 4 step: [7276/14785], loss: 7.6823, overflow: False, scale: 36028797018963968, lr: 0.008072, time: 599.94
- 2022-09-23 11:29:35,001:INFO:epoch: 4 step: [7332/14785], loss: 8.5656, overflow: False, scale: 36028797018963968, lr: 0.008086, time: 599.91
- 2022-09-23 11:30:08,500:INFO:epoch: 4 step: [7388/14785], loss: 8.7967, overflow: False, scale: 36028797018963968, lr: 0.008099, time: 598.14
- 2022-09-23 11:30:41,703:INFO:epoch: 4 step: [7444/14785], loss: 7.7907, overflow: False, scale: 36028797018963968, lr: 0.008113, time: 592.88
- 2022-09-23 11:31:15,110:INFO:epoch: 4 step: [7500/14785], loss: 7.7314, overflow: False, scale: 36028797018963968, lr: 0.008126, time: 596.51
- 2022-09-23 11:31:48,506:INFO:epoch: 4 step: [7556/14785], loss: 8.0847, overflow: False, scale: 36028797018963968, lr: 0.008140, time: 596.34
- 2022-09-23 11:32:22,009:INFO:epoch: 4 step: [7612/14785], loss: 7.6628, overflow: False, scale: 36028797018963968, lr: 0.008154, time: 598.23
- 2022-09-23 11:32:55,510:INFO:epoch: 4 step: [7668/14785], loss: 7.5476, overflow: False, scale: 36028797018963968, lr: 0.008167, time: 598.20
- 2022-09-23 11:33:29,003:INFO:epoch: 4 step: [7724/14785], loss: 7.6441, overflow: False, scale: 36028797018963968, lr: 0.008181, time: 598.05
- 2022-09-23 11:34:02,492:INFO:epoch: 4 step: [7780/14785], loss: 8.1367, overflow: False, scale: 36028797018963968, lr: 0.008195, time: 597.97
- 2022-09-23 11:34:36,101:INFO:epoch: 4 step: [7836/14785], loss: 8.1759, overflow: False, scale: 36028797018963968, lr: 0.008209, time: 600.11
- 2022-09-23 11:35:09,712:INFO:epoch: 4 step: [7892/14785], loss: 8.0601, overflow: False, scale: 36028797018963968, lr: 0.008222, time: 600.15
- 2022-09-23 11:35:43,196:INFO:epoch: 4 step: [7948/14785], loss: 7.5058, overflow: False, scale: 36028797018963968, lr: 0.008236, time: 597.87
- 2022-09-23 11:36:16,498:INFO:epoch: 4 step: [8004/14785], loss: 7.9468, overflow: False, scale: 36028797018963968, lr: 0.008250, time: 594.62
- 2022-09-23 11:36:49,910:INFO:epoch: 4 step: [8060/14785], loss: 7.7423, overflow: False, scale: 36028797018963968, lr: 0.008264, time: 596.55
- 2022-09-23 11:37:23,212:INFO:epoch: 4 step: [8116/14785], loss: 7.3982, overflow: False, scale: 36028797018963968, lr: 0.008277, time: 594.64
- 2022-09-23 11:37:56,807:INFO:epoch: 4 step: [8172/14785], loss: 7.3238, overflow: False, scale: 36028797018963968, lr: 0.008291, time: 599.86
- 2022-09-23 11:38:30,406:INFO:epoch: 4 step: [8228/14785], loss: 7.3074, overflow: False, scale: 36028797018963968, lr: 0.008305, time: 599.95
- 2022-09-23 11:39:04,000:INFO:epoch: 4 step: [8284/14785], loss: 7.8159, overflow: False, scale: 36028797018963968, lr: 0.008319, time: 599.86
- 2022-09-23 11:39:37,603:INFO:epoch: 4 step: [8340/14785], loss: 8.1597, overflow: False, scale: 36028797018963968, lr: 0.008333, time: 600.03
- 2022-09-23 11:40:11,191:INFO:epoch: 4 step: [8396/14785], loss: 7.3122, overflow: False, scale: 36028797018963968, lr: 0.008346, time: 599.75
- 2022-09-23 11:40:44,433:INFO:epoch: 4 step: [8452/14785], loss: 7.2171, overflow: False, scale: 36028797018963968, lr: 0.008360, time: 593.58
- 2022-09-23 11:41:17,801:INFO:epoch: 4 step: [8508/14785], loss: 8.1041, overflow: False, scale: 36028797018963968, lr: 0.008374, time: 595.81
- 2022-09-23 11:41:51,304:INFO:epoch: 4 step: [8564/14785], loss: 7.3065, overflow: False, scale: 36028797018963968, lr: 0.008388, time: 598.20
- 2022-09-23 11:42:24,901:INFO:epoch: 4 step: [8620/14785], loss: 7.4586, overflow: False, scale: 36028797018963968, lr: 0.008402, time: 599.93
- 2022-09-23 11:42:58,192:INFO:epoch: 4 step: [8676/14785], loss: 7.5243, overflow: False, scale: 36028797018963968, lr: 0.008416, time: 594.42
- 2022-09-23 11:43:31,601:INFO:epoch: 4 step: [8732/14785], loss: 7.4743, overflow: False, scale: 36028797018963968, lr: 0.008430, time: 596.54
- 2022-09-23 11:44:05,094:INFO:epoch: 4 step: [8788/14785], loss: 7.4466, overflow: False, scale: 36028797018963968, lr: 0.008444, time: 598.06
- 2022-09-23 11:44:38,706:INFO:epoch: 4 step: [8844/14785], loss: 7.6074, overflow: False, scale: 72057594037927936, lr: 0.008458, time: 600.18
- 2022-09-23 11:45:11,909:INFO:epoch: 4 step: [8900/14785], loss: 7.5162, overflow: False, scale: 72057594037927936, lr: 0.008471, time: 592.87
- 2022-09-23 11:45:45,402:INFO:epoch: 4 step: [8956/14785], loss: 7.8894, overflow: False, scale: 72057594037927936, lr: 0.008485, time: 598.05
- 2022-09-23 11:46:18,800:INFO:epoch: 4 step: [9012/14785], loss: 8.1095, overflow: False, scale: 72057594037927936, lr: 0.008499, time: 596.33
- 2022-09-23 11:46:52,310:INFO:epoch: 4 step: [9068/14785], loss: 7.2566, overflow: False, scale: 72057594037927936, lr: 0.008513, time: 598.36
- 2022-09-23 11:47:25,705:INFO:epoch: 4 step: [9124/14785], loss: 7.2896, overflow: False, scale: 72057594037927936, lr: 0.008527, time: 596.27
- 2022-09-23 11:47:59,101:INFO:epoch: 4 step: [9180/14785], loss: 7.6775, overflow: False, scale: 72057594037927936, lr: 0.008541, time: 596.31
- 2022-09-23 11:48:32,465:INFO:epoch: 4 step: [9236/14785], loss: 7.2886, overflow: False, scale: 72057594037927936, lr: 0.008555, time: 595.74
- 2022-09-23 11:49:06,100:INFO:epoch: 4 step: [9292/14785], loss: 7.5481, overflow: False, scale: 72057594037927936, lr: 0.008569, time: 600.57
- 2022-09-23 11:49:39,709:INFO:epoch: 4 step: [9348/14785], loss: 7.3532, overflow: False, scale: 72057594037927936, lr: 0.008583, time: 600.11
- 2022-09-23 11:50:13,144:INFO:epoch: 4 step: [9404/14785], loss: 7.6594, overflow: False, scale: 72057594037927936, lr: 0.008597, time: 597.02
- 2022-09-23 11:50:46,705:INFO:epoch: 4 step: [9460/14785], loss: 8.1305, overflow: False, scale: 72057594037927936, lr: 0.008611, time: 599.25
- 2022-09-23 11:51:20,194:INFO:epoch: 4 step: [9516/14785], loss: 7.4807, overflow: False, scale: 72057594037927936, lr: 0.008626, time: 597.97
- 2022-09-23 11:51:53,596:INFO:epoch: 4 step: [9572/14785], loss: 8.1032, overflow: False, scale: 72057594037927936, lr: 0.008640, time: 596.45
- 2022-09-23 11:52:27,104:INFO:epoch: 4 step: [9628/14785], loss: 7.8981, overflow: False, scale: 72057594037927936, lr: 0.008654, time: 598.32
- 2022-09-23 11:53:00,498:INFO:epoch: 4 step: [9684/14785], loss: 7.6368, overflow: False, scale: 72057594037927936, lr: 0.008668, time: 596.29
- 2022-09-23 11:53:33,994:INFO:epoch: 4 step: [9740/14785], loss: 8.1714, overflow: False, scale: 72057594037927936, lr: 0.008682, time: 598.09
- 2022-09-23 11:54:07,601:INFO:epoch: 4 step: [9796/14785], loss: 7.8770, overflow: False, scale: 72057594037927936, lr: 0.008696, time: 600.08
- 2022-09-23 11:54:41,206:INFO:epoch: 4 step: [9852/14785], loss: 7.4940, overflow: False, scale: 72057594037927936, lr: 0.008710, time: 600.05
- 2022-09-23 11:55:14,803:INFO:epoch: 4 step: [9908/14785], loss: 6.9927, overflow: False, scale: 72057594037927936, lr: 0.008724, time: 599.92
- 2022-09-23 11:55:48,310:INFO:epoch: 4 step: [9964/14785], loss: 7.5985, overflow: False, scale: 72057594037927936, lr: 0.008738, time: 598.28
- 2022-09-23 11:56:21,701:INFO:epoch: 4 step: [10020/14785], loss: 7.2443, overflow: False, scale: 72057594037927936, lr: 0.008753, time: 596.24
- 2022-09-23 11:56:55,114:INFO:epoch: 4 step: [10076/14785], loss: 8.1369, overflow: False, scale: 72057594037927936, lr: 0.008767, time: 596.62
- 2022-09-23 11:57:28,806:INFO:epoch: 4 step: [10132/14785], loss: 7.9510, overflow: False, scale: 72057594037927936, lr: 0.008781, time: 601.60
- 2022-09-23 11:58:02,301:INFO:epoch: 4 step: [10188/14785], loss: 7.8609, overflow: False, scale: 72057594037927936, lr: 0.008795, time: 598.08
- 2022-09-23 11:58:35,901:INFO:epoch: 4 step: [10244/14785], loss: 7.0770, overflow: False, scale: 72057594037927936, lr: 0.008809, time: 599.96
- 2022-09-23 11:59:09,305:INFO:epoch: 4 step: [10300/14785], loss: 7.8191, overflow: False, scale: 72057594037927936, lr: 0.008824, time: 596.47
- 2022-09-23 11:59:42,802:INFO:epoch: 4 step: [10356/14785], loss: 8.2266, overflow: False, scale: 72057594037927936, lr: 0.008838, time: 598.10
- 2022-09-23 12:00:16,205:INFO:epoch: 4 step: [10412/14785], loss: 7.6349, overflow: False, scale: 72057594037927936, lr: 0.008852, time: 596.42
- 2022-09-23 12:00:49,710:INFO:epoch: 4 step: [10468/14785], loss: 8.2964, overflow: False, scale: 72057594037927936, lr: 0.008866, time: 598.26
- 2022-09-23 12:01:23,301:INFO:epoch: 4 step: [10524/14785], loss: 7.6119, overflow: False, scale: 72057594037927936, lr: 0.008881, time: 599.81
- 2022-09-23 12:01:56,898:INFO:epoch: 4 step: [10580/14785], loss: 8.0623, overflow: False, scale: 72057594037927936, lr: 0.008895, time: 599.88
- 2022-09-23 12:02:30,390:INFO:epoch: 4 step: [10636/14785], loss: 7.7764, overflow: False, scale: 72057594037927936, lr: 0.008909, time: 598.04
- 2022-09-23 12:03:04,009:INFO:epoch: 4 step: [10692/14785], loss: 7.4997, overflow: False, scale: 72057594037927936, lr: 0.008924, time: 600.31
- 2022-09-23 12:03:37,516:INFO:epoch: 4 step: [10748/14785], loss: 9.1100, overflow: False, scale: 72057594037927936, lr: 0.008938, time: 598.30
- 2022-09-23 12:04:11,105:INFO:epoch: 4 step: [10804/14785], loss: 8.4552, overflow: False, scale: 144115188075855872, lr: 0.008952, time: 599.77
- 2022-09-23 12:04:44,503:INFO:epoch: 4 step: [10860/14785], loss: 7.9760, overflow: False, scale: 144115188075855872, lr: 0.008967, time: 596.35
- 2022-09-23 12:05:17,705:INFO:epoch: 4 step: [10916/14785], loss: 7.5971, overflow: False, scale: 144115188075855872, lr: 0.008981, time: 592.84
- 2022-09-23 12:05:51,201:INFO:epoch: 4 step: [10972/14785], loss: 7.7254, overflow: False, scale: 144115188075855872, lr: 0.008995, time: 598.11
- 2022-09-23 12:06:24,596:INFO:epoch: 4 step: [11028/14785], loss: 7.4734, overflow: False, scale: 144115188075855872, lr: 0.009010, time: 596.29
- 2022-09-23 12:06:58,106:INFO:epoch: 4 step: [11084/14785], loss: 7.3275, overflow: False, scale: 144115188075855872, lr: 0.009024, time: 598.36
- 2022-09-23 12:07:31,599:INFO:epoch: 4 step: [11140/14785], loss: 7.5072, overflow: False, scale: 144115188075855872, lr: 0.009038, time: 598.06
- 2022-09-23 12:08:05,198:INFO:epoch: 4 step: [11196/14785], loss: 7.6452, overflow: False, scale: 144115188075855872, lr: 0.009053, time: 599.94
- 2022-09-23 12:08:38,804:INFO:epoch: 4 step: [11252/14785], loss: 7.7930, overflow: False, scale: 144115188075855872, lr: 0.009067, time: 600.06
- 2022-09-23 12:09:12,304:INFO:epoch: 4 step: [11308/14785], loss: 8.0090, overflow: False, scale: 144115188075855872, lr: 0.009082, time: 598.19
- 2022-09-23 12:09:45,700:INFO:epoch: 4 step: [11364/14785], loss: 7.8237, overflow: False, scale: 144115188075855872, lr: 0.009096, time: 596.33
- 2022-09-23 12:10:19,207:INFO:epoch: 4 step: [11420/14785], loss: 8.0672, overflow: False, scale: 144115188075855872, lr: 0.009111, time: 598.29
- 2022-09-23 12:10:52,793:INFO:epoch: 4 step: [11476/14785], loss: 7.2396, overflow: False, scale: 144115188075855872, lr: 0.009125, time: 599.71
- 2022-09-23 12:11:26,304:INFO:epoch: 4 step: [11532/14785], loss: 7.4777, overflow: False, scale: 144115188075855872, lr: 0.009140, time: 598.39
- 2022-09-23 12:11:59,907:INFO:epoch: 4 step: [11588/14785], loss: 7.1247, overflow: False, scale: 144115188075855872, lr: 0.009154, time: 600.00
- 2022-09-23 12:12:33,298:INFO:epoch: 4 step: [11644/14785], loss: 8.4224, overflow: False, scale: 144115188075855872, lr: 0.009169, time: 596.23
- 2022-09-23 12:13:06,704:INFO:epoch: 4 step: [11700/14785], loss: 8.2490, overflow: False, scale: 144115188075855872, lr: 0.009183, time: 596.49
- 2022-09-23 12:13:40,301:INFO:epoch: 4 step: [11756/14785], loss: 7.7570, overflow: False, scale: 144115188075855872, lr: 0.009198, time: 599.91
- 2022-09-23 12:14:13,605:INFO:epoch: 4 step: [11812/14785], loss: 7.2719, overflow: False, scale: 144115188075855872, lr: 0.009212, time: 594.68
- 2022-09-23 12:14:47,002:INFO:epoch: 4 step: [11868/14785], loss: 7.5572, overflow: False, scale: 144115188075855872, lr: 0.009227, time: 596.34
- 2022-09-23 12:15:20,601:INFO:epoch: 4 step: [11924/14785], loss: 7.7664, overflow: False, scale: 144115188075855872, lr: 0.009241, time: 599.94
- 2022-09-23 12:15:54,106:INFO:epoch: 4 step: [11980/14785], loss: 7.6763, overflow: False, scale: 144115188075855872, lr: 0.009256, time: 598.23
- 2022-09-23 12:16:27,605:INFO:epoch: 4 step: [12036/14785], loss: 7.6662, overflow: False, scale: 144115188075855872, lr: 0.009270, time: 598.18
- 2022-09-23 12:17:01,205:INFO:epoch: 4 step: [12092/14785], loss: 7.3300, overflow: False, scale: 144115188075855872, lr: 0.009285, time: 599.97
- 2022-09-23 12:17:34,708:INFO:epoch: 4 step: [12148/14785], loss: 7.6591, overflow: False, scale: 144115188075855872, lr: 0.009300, time: 598.21
- 2022-09-23 12:18:08,100:INFO:epoch: 4 step: [12204/14785], loss: 7.0702, overflow: False, scale: 144115188075855872, lr: 0.009314, time: 596.26
- 2022-09-23 12:18:41,601:INFO:epoch: 4 step: [12260/14785], loss: 7.2323, overflow: False, scale: 144115188075855872, lr: 0.009329, time: 598.19
- 2022-09-23 12:19:15,197:INFO:epoch: 4 step: [12316/14785], loss: 7.6242, overflow: False, scale: 144115188075855872, lr: 0.009343, time: 599.89
- 2022-09-23 12:19:48,810:INFO:epoch: 4 step: [12372/14785], loss: 7.7029, overflow: False, scale: 144115188075855872, lr: 0.009358, time: 600.20
- 2022-09-23 12:20:22,400:INFO:epoch: 4 step: [12428/14785], loss: 6.9800, overflow: False, scale: 144115188075855872, lr: 0.009373, time: 599.79
- 2022-09-23 12:20:55,897:INFO:epoch: 4 step: [12484/14785], loss: 7.4734, overflow: False, scale: 144115188075855872, lr: 0.009387, time: 598.12
- 2022-09-23 12:21:29,301:INFO:epoch: 4 step: [12540/14785], loss: 8.0371, overflow: False, scale: 144115188075855872, lr: 0.009402, time: 596.47
- 2022-09-23 12:22:02,596:INFO:epoch: 4 step: [12596/14785], loss: 6.9025, overflow: False, scale: 144115188075855872, lr: 0.009417, time: 594.49
- 2022-09-23 12:22:36,195:INFO:epoch: 4 step: [12652/14785], loss: 7.6787, overflow: False, scale: 144115188075855872, lr: 0.009432, time: 599.94
- 2022-09-23 12:23:09,801:INFO:epoch: 4 step: [12708/14785], loss: 7.4361, overflow: False, scale: 144115188075855872, lr: 0.009446, time: 600.06
- 2022-09-23 12:23:43,399:INFO:epoch: 4 step: [12764/14785], loss: 8.3277, overflow: False, scale: 144115188075855872, lr: 0.009461, time: 599.93
- 2022-09-23 12:24:16,798:INFO:epoch: 4 step: [12820/14785], loss: 7.4233, overflow: False, scale: 288230376151711744, lr: 0.009476, time: 596.38
- 2022-09-23 12:24:50,274:INFO:epoch: 4 step: [12876/14785], loss: 7.2905, overflow: False, scale: 288230376151711744, lr: 0.009490, time: 597.74
- 2022-09-23 12:25:23,606:INFO:epoch: 4 step: [12932/14785], loss: 7.0465, overflow: False, scale: 288230376151711744, lr: 0.009505, time: 595.19
- 2022-09-23 12:25:57,204:INFO:epoch: 4 step: [12988/14785], loss: 7.6562, overflow: False, scale: 288230376151711744, lr: 0.009520, time: 599.93
- 2022-09-23 12:26:30,811:INFO:epoch: 4 step: [13044/14785], loss: 7.8122, overflow: False, scale: 288230376151711744, lr: 0.009535, time: 600.08
- 2022-09-23 12:27:04,300:INFO:epoch: 4 step: [13100/14785], loss: 7.3925, overflow: False, scale: 288230376151711744, lr: 0.009550, time: 597.98
- 2022-09-23 12:27:37,814:INFO:epoch: 4 step: [13156/14785], loss: 7.8582, overflow: False, scale: 288230376151711744, lr: 0.009564, time: 598.42
- 2022-09-23 12:28:11,276:INFO:epoch: 4 step: [13212/14785], loss: 7.6567, overflow: False, scale: 288230376151711744, lr: 0.009579, time: 597.50
- 2022-09-23 12:28:44,700:INFO:epoch: 4 step: [13268/14785], loss: 7.9387, overflow: False, scale: 288230376151711744, lr: 0.009594, time: 596.83
- 2022-09-23 12:29:18,211:INFO:epoch: 4 step: [13324/14785], loss: 7.7969, overflow: False, scale: 288230376151711744, lr: 0.009609, time: 598.37
- 2022-09-23 12:29:51,504:INFO:epoch: 4 step: [13380/14785], loss: 7.0993, overflow: False, scale: 288230376151711744, lr: 0.009624, time: 594.46
- 2022-09-23 12:30:25,112:INFO:epoch: 4 step: [13436/14785], loss: 7.8118, overflow: False, scale: 288230376151711744, lr: 0.009639, time: 600.12
- 2022-09-23 12:30:58,611:INFO:epoch: 4 step: [13492/14785], loss: 7.3794, overflow: False, scale: 288230376151711744, lr: 0.009654, time: 598.17
- 2022-09-23 12:31:31,816:INFO:epoch: 4 step: [13548/14785], loss: 7.4724, overflow: False, scale: 288230376151711744, lr: 0.009668, time: 592.91
- 2022-09-23 12:32:05,203:INFO:epoch: 4 step: [13604/14785], loss: 7.7035, overflow: False, scale: 288230376151711744, lr: 0.009683, time: 596.15
- 2022-09-23 12:32:38,698:INFO:epoch: 4 step: [13660/14785], loss: 7.5271, overflow: False, scale: 288230376151711744, lr: 0.009698, time: 598.08
- 2022-09-23 12:33:12,096:INFO:epoch: 4 step: [13716/14785], loss: 6.9615, overflow: False, scale: 288230376151711744, lr: 0.009713, time: 596.34
- 2022-09-23 12:33:45,479:INFO:epoch: 4 step: [13772/14785], loss: 7.1186, overflow: False, scale: 288230376151711744, lr: 0.009728, time: 596.08
- 2022-09-23 12:34:19,003:INFO:epoch: 4 step: [13828/14785], loss: 7.3956, overflow: False, scale: 288230376151711744, lr: 0.009743, time: 598.61
- 2022-09-23 12:34:52,501:INFO:epoch: 4 step: [13884/14785], loss: 7.6596, overflow: False, scale: 288230376151711744, lr: 0.009758, time: 598.14
- 2022-09-23 12:35:25,790:INFO:epoch: 4 step: [13940/14785], loss: 8.0211, overflow: False, scale: 288230376151711744, lr: 0.009773, time: 594.40
- 2022-09-23 12:35:59,402:INFO:epoch: 4 step: [13996/14785], loss: 7.9722, overflow: False, scale: 288230376151711744, lr: 0.009788, time: 600.17
- 2022-09-23 12:36:32,894:INFO:epoch: 4 step: [14052/14785], loss: 7.3620, overflow: False, scale: 288230376151711744, lr: 0.009803, time: 598.03
- 2022-09-23 12:37:06,406:INFO:epoch: 4 step: [14108/14785], loss: 8.0710, overflow: False, scale: 288230376151711744, lr: 0.009818, time: 598.39
- 2022-09-23 12:37:40,008:INFO:epoch: 4 step: [14164/14785], loss: 7.8799, overflow: False, scale: 288230376151711744, lr: 0.009833, time: 599.99
- 2022-09-23 12:38:13,396:INFO:epoch: 4 step: [14220/14785], loss: 7.7440, overflow: False, scale: 288230376151711744, lr: 0.009848, time: 596.17
- 2022-09-23 12:38:46,712:INFO:epoch: 4 step: [14276/14785], loss: 7.9227, overflow: False, scale: 288230376151711744, lr: 0.009863, time: 594.88
- 2022-09-23 12:39:20,286:INFO:epoch: 4 step: [14332/14785], loss: 7.4171, overflow: False, scale: 288230376151711744, lr: 0.009878, time: 599.49
- 2022-09-23 12:39:53,906:INFO:epoch: 4 step: [14388/14785], loss: 8.1945, overflow: False, scale: 288230376151711744, lr: 0.009893, time: 600.30
- 2022-09-23 12:40:27,501:INFO:epoch: 4 step: [14444/14785], loss: 7.1478, overflow: False, scale: 288230376151711744, lr: 0.009908, time: 599.86
- 2022-09-23 12:41:01,109:INFO:epoch: 4 step: [14500/14785], loss: 7.5511, overflow: False, scale: 288230376151711744, lr: 0.009923, time: 600.11
- 2022-09-23 12:41:34,506:INFO:epoch: 4 step: [14556/14785], loss: 7.7657, overflow: False, scale: 288230376151711744, lr: 0.009938, time: 596.34
- 2022-09-23 12:42:08,005:INFO:epoch: 4 step: [14612/14785], loss: 6.8297, overflow: False, scale: 288230376151711744, lr: 0.009954, time: 598.16
- 2022-09-23 12:42:41,312:INFO:epoch: 4 step: [14668/14785], loss: 8.1030, overflow: False, scale: 288230376151711744, lr: 0.009969, time: 594.73
- 2022-09-23 12:43:14,908:INFO:epoch: 4 step: [14724/14785], loss: 7.3608, overflow: False, scale: 288230376151711744, lr: 0.009984, time: 599.88
- 2022-09-23 12:43:14,979:INFO:Start inference...
- 2022-09-23 12:43:18,582:INFO:eval dataset size, 625
- 2022-09-23 12:48:28,434:INFO:Calculating mAP...625
- Evaluate in main process...
- 2022-09-23 12:48:49,861:INFO:result file path: ./2022-09-23_time_00_16_23/predict_2022_09_23_12_48_28.json
- Loading and preparing results...
- DONE (t=9.55s)
- creating index...
- index created!
- Running per image evaluation...
- Evaluate annotation type *bbox*
- DONE (t=79.83s).
- Accumulating evaluation results...
- DONE (t=27.07s).
- =====================file_name: ./2022-09-23_time_00_16_23/ckpt_0/best.ckpt
- 2022-09-23 12:50:50,334:INFO:Best result 0.07428548350796693 at 1320 epoch
- 2022-09-23 12:50:50,337:INFO:
- =============coco eval result=========
- Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.074
- Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.147
- Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.068
- Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.050
- Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.101
- Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.076
- Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.105
- Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.187
- Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.205
- Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.097
- Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.242
- Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.230
-
- 2022-09-23 12:50:50,337:INFO:Ending inference...
- 2022-09-23 12:51:23,607:INFO:epoch: 4 step: [14780/14785], loss: 8.2460, overflow: False, scale: 288230376151711744, lr: 0.009999, time: 8726.73
- 2022-09-23 12:51:57,010:INFO:epoch: 5 step: [51/14785], loss: 8.4899, overflow: False, scale: 576460752303423488, lr: 0.010000, time: 596.44
- 2022-09-23 12:52:30,603:INFO:epoch: 5 step: [107/14785], loss: 7.5757, overflow: False, scale: 576460752303423488, lr: 0.010000, time: 599.83
- 2022-09-23 12:53:04,104:INFO:epoch: 5 step: [163/14785], loss: 8.0620, overflow: False, scale: 576460752303423488, lr: 0.010000, time: 598.18
- 2022-09-23 12:53:39,375:INFO:epoch: 5 step: [219/14785], loss: 6.7666, overflow: False, scale: 576460752303423488, lr: 0.010000, time: 629.80
- 2022-09-23 12:54:12,908:INFO:epoch: 5 step: [275/14785], loss: 7.5971, overflow: False, scale: 576460752303423488, lr: 0.010000, time: 598.78
- 2022-09-23 12:54:46,381:INFO:epoch: 5 step: [331/14785], loss: 7.4100, overflow: False, scale: 576460752303423488, lr: 0.010000, time: 597.65
- 2022-09-23 12:55:19,901:INFO:epoch: 5 step: [387/14785], loss: 7.5730, overflow: False, scale: 576460752303423488, lr: 0.010000, time: 598.52
- 2022-09-23 12:55:52,889:INFO:epoch: 5 step: [443/14785], loss: 7.6361, overflow: False, scale: 576460752303423488, lr: 0.010000, time: 589.02
- 2022-09-23 12:56:26,091:INFO:epoch: 5 step: [499/14785], loss: 7.4013, overflow: False, scale: 576460752303423488, lr: 0.010000, time: 592.84
- 2022-09-23 12:56:58,902:INFO:epoch: 5 step: [555/14785], loss: 7.8492, overflow: False, scale: 576460752303423488, lr: 0.010000, time: 585.85
- 2022-09-23 12:57:32,005:INFO:epoch: 5 step: [611/14785], loss: 7.7421, overflow: False, scale: 576460752303423488, lr: 0.010000, time: 591.07
- 2022-09-23 12:58:05,266:INFO:epoch: 5 step: [667/14785], loss: 7.6930, overflow: False, scale: 576460752303423488, lr: 0.010000, time: 593.91
- 2022-09-23 12:58:38,197:INFO:epoch: 5 step: [723/14785], loss: 7.6498, overflow: False, scale: 576460752303423488, lr: 0.009999, time: 588.01
- 2022-09-23 12:59:11,189:INFO:epoch: 5 step: [779/14785], loss: 8.3869, overflow: False, scale: 576460752303423488, lr: 0.009999, time: 589.10
- 2022-09-23 12:59:44,396:INFO:epoch: 5 step: [835/14785], loss: 7.6017, overflow: False, scale: 576460752303423488, lr: 0.009999, time: 592.94
- 2022-09-23 13:00:17,505:INFO:epoch: 5 step: [891/14785], loss: 7.1774, overflow: False, scale: 576460752303423488, lr: 0.009999, time: 591.18
- 2022-09-23 13:00:50,302:INFO:epoch: 5 step: [947/14785], loss: 7.3915, overflow: False, scale: 576460752303423488, lr: 0.009999, time: 585.62
- 2022-09-23 13:01:23,104:INFO:epoch: 5 step: [1003/14785], loss: 7.5121, overflow: False, scale: 576460752303423488, lr: 0.009999, time: 585.70
- 2022-09-23 13:01:56,097:INFO:epoch: 5 step: [1059/14785], loss: 7.3280, overflow: False, scale: 576460752303423488, lr: 0.009999, time: 589.12
- 2022-09-23 13:02:29,257:INFO:epoch: 5 step: [1115/14785], loss: 7.2136, overflow: False, scale: 576460752303423488, lr: 0.009999, time: 592.09
- 2022-09-23 13:03:02,401:INFO:epoch: 5 step: [1171/14785], loss: 7.0959, overflow: False, scale: 576460752303423488, lr: 0.009998, time: 591.82
- 2022-09-23 13:03:35,297:INFO:epoch: 5 step: [1227/14785], loss: 8.0077, overflow: False, scale: 576460752303423488, lr: 0.009998, time: 587.40
- 2022-09-23 13:04:08,095:INFO:epoch: 5 step: [1283/14785], loss: 7.2634, overflow: False, scale: 576460752303423488, lr: 0.009998, time: 585.64
- 2022-09-23 13:04:40,991:INFO:epoch: 5 step: [1339/14785], loss: 6.8226, overflow: False, scale: 576460752303423488, lr: 0.009998, time: 587.38
- 2022-09-23 13:05:14,068:INFO:epoch: 5 step: [1395/14785], loss: 7.8122, overflow: False, scale: 576460752303423488, lr: 0.009998, time: 590.61
- 2022-09-23 13:05:46,949:INFO:epoch: 5 step: [1451/14785], loss: 7.0142, overflow: False, scale: 576460752303423488, lr: 0.009998, time: 587.10
- 2022-09-23 13:06:20,099:INFO:epoch: 5 step: [1507/14785], loss: 7.4382, overflow: False, scale: 576460752303423488, lr: 0.009997, time: 591.92
- 2022-09-23 13:06:53,401:INFO:epoch: 5 step: [1563/14785], loss: 6.9521, overflow: False, scale: 576460752303423488, lr: 0.009997, time: 594.64
- 2022-09-23 13:07:26,498:INFO:epoch: 5 step: [1619/14785], loss: 7.9279, overflow: False, scale: 576460752303423488, lr: 0.009997, time: 590.97
- 2022-09-23 13:07:59,446:INFO:epoch: 5 step: [1675/14785], loss: 8.1234, overflow: False, scale: 576460752303423488, lr: 0.009997, time: 588.33
- 2022-09-23 13:08:32,189:INFO:epoch: 5 step: [1731/14785], loss: 7.5731, overflow: False, scale: 576460752303423488, lr: 0.009997, time: 584.65
- 2022-09-23 13:09:05,139:INFO:epoch: 5 step: [1787/14785], loss: 7.8834, overflow: False, scale: 576460752303423488, lr: 0.009996, time: 588.33
- 2022-09-23 13:09:38,196:INFO:epoch: 5 step: [1843/14785], loss: 6.6215, overflow: False, scale: 576460752303423488, lr: 0.009996, time: 590.27
- 2022-09-23 13:10:11,158:INFO:epoch: 5 step: [1899/14785], loss: 7.1004, overflow: False, scale: 576460752303423488, lr: 0.009996, time: 588.55
- 2022-09-23 13:10:44,499:INFO:epoch: 5 step: [1955/14785], loss: 7.8949, overflow: False, scale: 576460752303423488, lr: 0.009996, time: 595.34
- 2022-09-23 13:11:17,699:INFO:epoch: 5 step: [2011/14785], loss: 7.5890, overflow: False, scale: 576460752303423488, lr: 0.009995, time: 592.80
- 2022-09-23 13:11:50,545:INFO:epoch: 5 step: [2067/14785], loss: 7.6109, overflow: False, scale: 576460752303423488, lr: 0.009995, time: 586.49
- 2022-09-23 13:12:23,691:INFO:epoch: 5 step: [2123/14785], loss: 7.8677, overflow: False, scale: 576460752303423488, lr: 0.009995, time: 591.84
- 2022-09-23 13:12:56,597:INFO:epoch: 5 step: [2179/14785], loss: 8.4522, overflow: False, scale: 576460752303423488, lr: 0.009995, time: 587.56
- 2022-09-23 13:13:29,497:INFO:epoch: 5 step: [2235/14785], loss: 7.3896, overflow: False, scale: 576460752303423488, lr: 0.009994, time: 587.45
- 2022-09-23 13:14:02,395:INFO:epoch: 5 step: [2291/14785], loss: 7.2422, overflow: False, scale: 576460752303423488, lr: 0.009994, time: 587.43
- 2022-09-23 13:14:35,593:INFO:epoch: 5 step: [2347/14785], loss: 8.3914, overflow: False, scale: 576460752303423488, lr: 0.009994, time: 592.77
- 2022-09-23 13:15:08,522:INFO:epoch: 5 step: [2403/14785], loss: 7.4264, overflow: False, scale: 576460752303423488, lr: 0.009994, time: 587.96
- 2022-09-23 13:15:41,428:INFO:epoch: 5 step: [2459/14785], loss: 7.5503, overflow: False, scale: 576460752303423488, lr: 0.009993, time: 587.56
- 2022-09-23 13:16:14,485:INFO:epoch: 5 step: [2515/14785], loss: 7.4714, overflow: False, scale: 576460752303423488, lr: 0.009993, time: 590.26
- 2022-09-23 13:16:47,404:INFO:epoch: 5 step: [2571/14785], loss: 7.9643, overflow: False, scale: 576460752303423488, lr: 0.009993, time: 587.78
- 2022-09-23 13:17:20,490:INFO:epoch: 5 step: [2627/14785], loss: 7.4967, overflow: False, scale: 576460752303423488, lr: 0.009992, time: 590.78
- 2022-09-23 13:17:53,698:INFO:epoch: 5 step: [2683/14785], loss: 7.5087, overflow: False, scale: 576460752303423488, lr: 0.009992, time: 592.95
- 2022-09-23 13:18:26,801:INFO:epoch: 5 step: [2739/14785], loss: 7.0859, overflow: False, scale: 576460752303423488, lr: 0.009992, time: 591.07
- 2022-09-23 13:18:59,899:INFO:epoch: 5 step: [2795/14785], loss: 8.3587, overflow: False, scale: 576460752303423488, lr: 0.009991, time: 591.00
- 2022-09-23 13:19:32,915:INFO:epoch: 5 step: [2851/14785], loss: 7.0523, overflow: False, scale: 576460752303423488, lr: 0.009991, time: 589.53
- 2022-09-23 13:20:05,897:INFO:epoch: 5 step: [2907/14785], loss: 6.9255, overflow: False, scale: 576460752303423488, lr: 0.009991, time: 588.92
- 2022-09-23 13:20:38,899:INFO:epoch: 5 step: [2963/14785], loss: 7.0900, overflow: False, scale: 576460752303423488, lr: 0.009990, time: 589.26
- 2022-09-23 13:21:11,899:INFO:epoch: 5 step: [3019/14785], loss: 7.2082, overflow: False, scale: 576460752303423488, lr: 0.009990, time: 589.26
- 2022-09-23 13:21:44,901:INFO:epoch: 5 step: [3075/14785], loss: 7.5060, overflow: False, scale: 576460752303423488, lr: 0.009989, time: 589.28
- 2022-09-23 13:22:18,307:INFO:epoch: 5 step: [3131/14785], loss: 8.0521, overflow: False, scale: 576460752303423488, lr: 0.009989, time: 596.47
- 2022-09-23 13:22:51,506:INFO:epoch: 5 step: [3187/14785], loss: 6.9682, overflow: False, scale: 576460752303423488, lr: 0.009989, time: 592.79
- 2022-09-23 13:23:24,816:INFO:epoch: 5 step: [3243/14785], loss: 7.8089, overflow: False, scale: 576460752303423488, lr: 0.009988, time: 594.77
- 2022-09-23 13:23:58,106:INFO:epoch: 5 step: [3299/14785], loss: 7.5650, overflow: False, scale: 576460752303423488, lr: 0.009988, time: 594.42
- 2022-09-23 13:24:31,300:INFO:epoch: 5 step: [3355/14785], loss: 8.0065, overflow: False, scale: 576460752303423488, lr: 0.009987, time: 592.71
- 2022-09-23 13:25:04,405:INFO:epoch: 5 step: [3411/14785], loss: 7.2881, overflow: False, scale: 576460752303423488, lr: 0.009987, time: 591.11
- 2022-09-23 13:25:37,401:INFO:epoch: 5 step: [3467/14785], loss: 7.2093, overflow: False, scale: 576460752303423488, lr: 0.009987, time: 589.17
- 2022-09-23 13:26:10,904:INFO:epoch: 5 step: [3523/14785], loss: 8.2399, overflow: False, scale: 576460752303423488, lr: 0.009986, time: 598.23
- 2022-09-23 13:26:43,904:INFO:epoch: 5 step: [3579/14785], loss: 7.0074, overflow: False, scale: 576460752303423488, lr: 0.009986, time: 589.23
- 2022-09-23 13:27:16,901:INFO:epoch: 5 step: [3635/14785], loss: 7.8259, overflow: False, scale: 576460752303423488, lr: 0.009985, time: 589.16
- 2022-09-23 13:27:50,097:INFO:epoch: 5 step: [3691/14785], loss: 8.1177, overflow: False, scale: 576460752303423488, lr: 0.009985, time: 592.73
- 2022-09-23 13:28:23,105:INFO:epoch: 5 step: [3747/14785], loss: 8.4058, overflow: False, scale: 576460752303423488, lr: 0.009984, time: 589.37
- 2022-09-23 13:28:56,203:INFO:epoch: 5 step: [3803/14785], loss: 7.9069, overflow: False, scale: 576460752303423488, lr: 0.009984, time: 590.97
- 2022-09-23 13:29:29,395:INFO:epoch: 5 step: [3859/14785], loss: 6.8241, overflow: False, scale: 576460752303423488, lr: 0.009983, time: 592.66
- 2022-09-23 13:30:02,308:INFO:epoch: 5 step: [3915/14785], loss: 7.2017, overflow: False, scale: 576460752303423488, lr: 0.009983, time: 587.68
- 2022-09-23 13:30:35,222:INFO:epoch: 5 step: [3971/14785], loss: 7.9996, overflow: False, scale: 576460752303423488, lr: 0.009982, time: 587.71
- 2022-09-23 13:31:08,332:INFO:epoch: 5 step: [4027/14785], loss: 7.3788, overflow: False, scale: 576460752303423488, lr: 0.009982, time: 591.20
- 2022-09-23 13:31:41,508:INFO:epoch: 5 step: [4083/14785], loss: 7.9437, overflow: False, scale: 1152921504606846976, lr: 0.009981, time: 592.39
- 2022-09-23 13:32:14,411:INFO:epoch: 5 step: [4139/14785], loss: 7.5749, overflow: False, scale: 576460752303423488, lr: 0.009981, time: 587.52
- 2022-09-23 13:32:47,466:INFO:epoch: 5 step: [4195/14785], loss: 7.8015, overflow: False, scale: 576460752303423488, lr: 0.009980, time: 590.23
- 2022-09-23 13:33:20,491:INFO:epoch: 5 step: [4251/14785], loss: 7.0654, overflow: False, scale: 576460752303423488, lr: 0.009980, time: 589.67
- 2022-09-23 13:33:53,297:INFO:epoch: 5 step: [4307/14785], loss: 7.2211, overflow: False, scale: 576460752303423488, lr: 0.009979, time: 585.78
- 2022-09-23 13:34:26,405:INFO:epoch: 5 step: [4363/14785], loss: 7.8442, overflow: False, scale: 576460752303423488, lr: 0.009979, time: 591.16
- 2022-09-23 13:34:59,422:INFO:epoch: 5 step: [4419/14785], loss: 7.9805, overflow: False, scale: 576460752303423488, lr: 0.009978, time: 589.55
- 2022-09-23 13:35:32,524:INFO:epoch: 5 step: [4475/14785], loss: 8.1211, overflow: False, scale: 576460752303423488, lr: 0.009978, time: 591.08
- 2022-09-23 13:36:05,791:INFO:epoch: 5 step: [4531/14785], loss: 7.2694, overflow: False, scale: 576460752303423488, lr: 0.009977, time: 594.00
- 2022-09-23 13:36:39,102:INFO:epoch: 5 step: [4587/14785], loss: 8.1189, overflow: False, scale: 576460752303423488, lr: 0.009977, time: 594.80
- 2022-09-23 13:37:12,304:INFO:epoch: 5 step: [4643/14785], loss: 7.6487, overflow: False, scale: 576460752303423488, lr: 0.009976, time: 592.85
- 2022-09-23 13:37:45,502:INFO:epoch: 5 step: [4699/14785], loss: 7.7231, overflow: False, scale: 576460752303423488, lr: 0.009975, time: 592.79
- 2022-09-23 13:38:18,393:INFO:epoch: 5 step: [4755/14785], loss: 7.9806, overflow: False, scale: 576460752303423488, lr: 0.009975, time: 587.29
- 2022-09-23 13:38:51,490:INFO:epoch: 5 step: [4811/14785], loss: 7.8457, overflow: False, scale: 576460752303423488, lr: 0.009974, time: 590.96
- 2022-09-23 13:39:24,602:INFO:epoch: 5 step: [4867/14785], loss: 7.1969, overflow: False, scale: 576460752303423488, lr: 0.009974, time: 591.27
- 2022-09-23 13:39:57,751:INFO:epoch: 5 step: [4923/14785], loss: 7.2909, overflow: False, scale: 576460752303423488, lr: 0.009973, time: 591.88
- 2022-09-23 13:40:30,731:INFO:epoch: 5 step: [4979/14785], loss: 7.4156, overflow: False, scale: 576460752303423488, lr: 0.009972, time: 588.88
- 2022-09-23 13:41:03,808:INFO:epoch: 5 step: [5035/14785], loss: 7.3036, overflow: False, scale: 576460752303423488, lr: 0.009972, time: 590.62
- 2022-09-23 13:41:37,008:INFO:epoch: 5 step: [5091/14785], loss: 7.3820, overflow: False, scale: 576460752303423488, lr: 0.009971, time: 592.81
- 2022-09-23 13:42:10,200:INFO:epoch: 5 step: [5147/14785], loss: 8.1927, overflow: False, scale: 576460752303423488, lr: 0.009970, time: 592.66
- 2022-09-23 13:42:43,278:INFO:epoch: 5 step: [5203/14785], loss: 7.4894, overflow: False, scale: 576460752303423488, lr: 0.009970, time: 590.64
- 2022-09-23 13:43:16,502:INFO:epoch: 5 step: [5259/14785], loss: 7.7981, overflow: False, scale: 576460752303423488, lr: 0.009969, time: 593.25
- 2022-09-23 13:43:49,806:INFO:epoch: 5 step: [5315/14785], loss: 7.3909, overflow: False, scale: 576460752303423488, lr: 0.009968, time: 594.64
- 2022-09-23 13:44:23,011:INFO:epoch: 5 step: [5371/14785], loss: 8.6340, overflow: False, scale: 576460752303423488, lr: 0.009968, time: 592.90
- 2022-09-23 13:44:56,304:INFO:epoch: 5 step: [5427/14785], loss: 8.1797, overflow: False, scale: 576460752303423488, lr: 0.009967, time: 594.47
- 2022-09-23 13:45:29,419:INFO:epoch: 5 step: [5483/14785], loss: 7.1876, overflow: False, scale: 576460752303423488, lr: 0.009966, time: 591.30
- 2022-09-23 13:46:02,806:INFO:epoch: 5 step: [5539/14785], loss: 7.0700, overflow: False, scale: 288230376151711744, lr: 0.009966, time: 596.15
- 2022-09-23 13:46:36,122:INFO:epoch: 5 step: [5595/14785], loss: 7.8495, overflow: False, scale: 288230376151711744, lr: 0.009965, time: 594.86
- 2022-09-23 13:47:09,300:INFO:epoch: 5 step: [5651/14785], loss: 7.6650, overflow: False, scale: 288230376151711744, lr: 0.009964, time: 592.43
- 2022-09-23 13:47:42,390:INFO:epoch: 5 step: [5707/14785], loss: 7.5270, overflow: False, scale: 288230376151711744, lr: 0.009964, time: 590.84
- 2022-09-23 13:48:15,498:INFO:epoch: 5 step: [5763/14785], loss: 7.5973, overflow: False, scale: 288230376151711744, lr: 0.009963, time: 591.17
- 2022-09-23 13:48:48,791:INFO:epoch: 5 step: [5819/14785], loss: 8.4166, overflow: False, scale: 288230376151711744, lr: 0.009962, time: 594.46
- 2022-09-23 13:49:21,893:INFO:epoch: 5 step: [5875/14785], loss: 7.5448, overflow: False, scale: 288230376151711744, lr: 0.009961, time: 591.07
- 2022-09-23 13:49:55,171:INFO:epoch: 5 step: [5931/14785], loss: 7.6211, overflow: False, scale: 288230376151711744, lr: 0.009961, time: 594.23
- 2022-09-23 13:50:28,469:INFO:epoch: 5 step: [5987/14785], loss: 7.4978, overflow: False, scale: 288230376151711744, lr: 0.009960, time: 594.56
- 2022-09-23 13:51:01,604:INFO:epoch: 5 step: [6043/14785], loss: 6.8955, overflow: False, scale: 288230376151711744, lr: 0.009959, time: 591.66
- 2022-09-23 13:51:34,702:INFO:epoch: 5 step: [6099/14785], loss: 7.1719, overflow: False, scale: 288230376151711744, lr: 0.009958, time: 591.00
- 2022-09-23 13:52:07,790:INFO:epoch: 5 step: [6155/14785], loss: 7.7816, overflow: False, scale: 288230376151711744, lr: 0.009958, time: 590.80
- 2022-09-23 13:52:41,003:INFO:epoch: 5 step: [6211/14785], loss: 7.3397, overflow: False, scale: 288230376151711744, lr: 0.009957, time: 593.05
- 2022-09-23 13:53:14,306:INFO:epoch: 5 step: [6267/14785], loss: 8.2664, overflow: False, scale: 288230376151711744, lr: 0.009956, time: 594.65
- 2022-09-23 13:53:47,625:INFO:epoch: 5 step: [6323/14785], loss: 7.7229, overflow: False, scale: 288230376151711744, lr: 0.009955, time: 594.92
- 2022-09-23 13:54:20,703:INFO:epoch: 5 step: [6379/14785], loss: 7.3162, overflow: False, scale: 288230376151711744, lr: 0.009955, time: 590.63
- 2022-09-23 13:54:54,030:INFO:epoch: 5 step: [6435/14785], loss: 7.6105, overflow: False, scale: 288230376151711744, lr: 0.009954, time: 595.07
- 2022-09-23 13:55:27,189:INFO:epoch: 5 step: [6491/14785], loss: 6.5706, overflow: False, scale: 288230376151711744, lr: 0.009953, time: 592.08
- 2022-09-23 13:56:00,487:INFO:epoch: 5 step: [6547/14785], loss: 8.0298, overflow: False, scale: 288230376151711744, lr: 0.009952, time: 594.57
- 2022-09-23 13:56:33,897:INFO:epoch: 5 step: [6603/14785], loss: 6.9575, overflow: False, scale: 288230376151711744, lr: 0.009951, time: 596.56
- 2022-09-23 13:57:07,100:INFO:epoch: 5 step: [6659/14785], loss: 7.5058, overflow: False, scale: 288230376151711744, lr: 0.009951, time: 592.87
- 2022-09-23 13:57:40,605:INFO:epoch: 5 step: [6715/14785], loss: 7.4131, overflow: False, scale: 288230376151711744, lr: 0.009950, time: 598.29
- 2022-09-23 13:58:14,106:INFO:epoch: 5 step: [6771/14785], loss: 7.5391, overflow: False, scale: 288230376151711744, lr: 0.009949, time: 598.19
- 2022-09-23 13:58:47,502:INFO:epoch: 5 step: [6827/14785], loss: 6.9683, overflow: False, scale: 288230376151711744, lr: 0.009948, time: 596.31
- 2022-09-23 13:59:20,899:INFO:epoch: 5 step: [6883/14785], loss: 8.6543, overflow: False, scale: 288230376151711744, lr: 0.009947, time: 596.26
- 2022-09-23 13:59:54,204:INFO:epoch: 5 step: [6939/14785], loss: 6.9219, overflow: False, scale: 288230376151711744, lr: 0.009946, time: 594.38
- 2022-09-23 14:00:27,499:INFO:epoch: 5 step: [6995/14785], loss: 7.8899, overflow: False, scale: 288230376151711744, lr: 0.009945, time: 594.51
- 2022-09-23 14:01:00,896:INFO:epoch: 5 step: [7051/14785], loss: 7.6415, overflow: False, scale: 288230376151711744, lr: 0.009945, time: 596.31
- 2022-09-23 14:01:34,184:INFO:epoch: 5 step: [7107/14785], loss: 7.1775, overflow: False, scale: 288230376151711744, lr: 0.009944, time: 594.39
- 2022-09-23 14:02:07,389:INFO:epoch: 5 step: [7163/14785], loss: 7.5933, overflow: False, scale: 288230376151711744, lr: 0.009943, time: 592.91
- 2022-09-23 14:02:40,598:INFO:epoch: 5 step: [7219/14785], loss: 7.1161, overflow: False, scale: 288230376151711744, lr: 0.009942, time: 592.98
- 2022-09-23 14:03:13,905:INFO:epoch: 5 step: [7275/14785], loss: 7.5602, overflow: False, scale: 288230376151711744, lr: 0.009941, time: 594.71
- 2022-09-23 14:03:47,194:INFO:epoch: 5 step: [7331/14785], loss: 7.7526, overflow: False, scale: 288230376151711744, lr: 0.009940, time: 594.42
- 2022-09-23 14:04:20,495:INFO:epoch: 5 step: [7387/14785], loss: 8.3045, overflow: False, scale: 288230376151711744, lr: 0.009939, time: 594.64
- 2022-09-23 14:04:53,760:INFO:epoch: 5 step: [7443/14785], loss: 7.1469, overflow: False, scale: 288230376151711744, lr: 0.009938, time: 593.96
- 2022-09-23 14:05:27,206:INFO:epoch: 5 step: [7499/14785], loss: 7.2681, overflow: False, scale: 288230376151711744, lr: 0.009937, time: 597.20
- 2022-09-23 14:06:00,299:INFO:epoch: 5 step: [7555/14785], loss: 7.5964, overflow: False, scale: 576460752303423488, lr: 0.009936, time: 590.91
- 2022-09-23 14:06:33,595:INFO:epoch: 5 step: [7611/14785], loss: 7.3726, overflow: False, scale: 576460752303423488, lr: 0.009935, time: 594.52
- 2022-09-23 14:07:06,811:INFO:epoch: 5 step: [7667/14785], loss: 8.3655, overflow: False, scale: 576460752303423488, lr: 0.009934, time: 593.08
- 2022-09-23 14:07:39,989:INFO:epoch: 5 step: [7723/14785], loss: 8.0532, overflow: False, scale: 576460752303423488, lr: 0.009933, time: 592.44
- 2022-09-23 14:08:13,005:INFO:epoch: 5 step: [7779/14785], loss: 7.3561, overflow: False, scale: 576460752303423488, lr: 0.009933, time: 589.52
- 2022-09-23 14:08:46,009:INFO:epoch: 5 step: [7835/14785], loss: 8.3106, overflow: False, scale: 576460752303423488, lr: 0.009932, time: 589.31
- 2022-09-23 14:09:19,398:INFO:epoch: 5 step: [7891/14785], loss: 7.8121, overflow: False, scale: 576460752303423488, lr: 0.009931, time: 596.20
- 2022-09-23 14:09:52,794:INFO:epoch: 5 step: [7947/14785], loss: 7.7699, overflow: False, scale: 576460752303423488, lr: 0.009930, time: 596.29
- 2022-09-23 14:10:26,196:INFO:epoch: 5 step: [8003/14785], loss: 7.4007, overflow: False, scale: 576460752303423488, lr: 0.009929, time: 596.42
- 2022-09-23 14:10:59,235:INFO:epoch: 5 step: [8059/14785], loss: 7.7011, overflow: False, scale: 576460752303423488, lr: 0.009928, time: 589.93
- 2022-09-23 14:11:32,602:INFO:epoch: 5 step: [8115/14785], loss: 7.0818, overflow: False, scale: 576460752303423488, lr: 0.009927, time: 595.80
- 2022-09-23 14:12:05,700:INFO:epoch: 5 step: [8171/14785], loss: 7.3430, overflow: False, scale: 576460752303423488, lr: 0.009926, time: 590.98
- 2022-09-23 14:12:38,747:INFO:epoch: 5 step: [8227/14785], loss: 7.0365, overflow: False, scale: 576460752303423488, lr: 0.009925, time: 590.08
- 2022-09-23 14:13:12,027:INFO:epoch: 5 step: [8283/14785], loss: 7.6389, overflow: False, scale: 576460752303423488, lr: 0.009924, time: 594.23
- 2022-09-23 14:13:45,099:INFO:epoch: 5 step: [8339/14785], loss: 7.3107, overflow: False, scale: 576460752303423488, lr: 0.009922, time: 590.52
- 2022-09-23 14:14:18,296:INFO:epoch: 5 step: [8395/14785], loss: 7.0060, overflow: False, scale: 576460752303423488, lr: 0.009921, time: 592.75
- 2022-09-23 14:14:51,603:INFO:epoch: 5 step: [8451/14785], loss: 6.9972, overflow: False, scale: 576460752303423488, lr: 0.009920, time: 594.72
- 2022-09-23 14:15:24,791:INFO:epoch: 5 step: [8507/14785], loss: 7.7734, overflow: False, scale: 576460752303423488, lr: 0.009919, time: 592.62
- 2022-09-23 14:15:58,201:INFO:epoch: 5 step: [8563/14785], loss: 7.3237, overflow: False, scale: 576460752303423488, lr: 0.009918, time: 596.56
- 2022-09-23 14:16:31,088:INFO:epoch: 5 step: [8619/14785], loss: 7.8811, overflow: False, scale: 576460752303423488, lr: 0.009917, time: 587.22
- 2022-09-23 14:17:04,304:INFO:epoch: 5 step: [8675/14785], loss: 7.5676, overflow: False, scale: 576460752303423488, lr: 0.009916, time: 593.09
- 2022-09-23 14:17:37,595:INFO:epoch: 5 step: [8731/14785], loss: 7.6435, overflow: False, scale: 576460752303423488, lr: 0.009915, time: 594.44
- 2022-09-23 14:18:10,798:INFO:epoch: 5 step: [8787/14785], loss: 6.8394, overflow: False, scale: 576460752303423488, lr: 0.009914, time: 592.86
- 2022-09-23 14:18:44,097:INFO:epoch: 5 step: [8843/14785], loss: 7.5755, overflow: False, scale: 576460752303423488, lr: 0.009913, time: 594.57
- 2022-09-23 14:19:17,397:INFO:epoch: 5 step: [8899/14785], loss: 7.4968, overflow: False, scale: 576460752303423488, lr: 0.009912, time: 594.59
- 2022-09-23 14:19:50,495:INFO:epoch: 5 step: [8955/14785], loss: 7.3099, overflow: False, scale: 576460752303423488, lr: 0.009911, time: 591.00
- 2022-09-23 14:20:23,704:INFO:epoch: 5 step: [9011/14785], loss: 7.1477, overflow: False, scale: 576460752303423488, lr: 0.009910, time: 592.98
- 2022-09-23 14:20:56,801:INFO:epoch: 5 step: [9067/14785], loss: 7.8064, overflow: False, scale: 576460752303423488, lr: 0.009908, time: 590.96
- 2022-09-23 14:21:30,153:INFO:epoch: 5 step: [9123/14785], loss: 7.6021, overflow: False, scale: 576460752303423488, lr: 0.009907, time: 595.53
- 2022-09-23 14:22:03,601:INFO:epoch: 5 step: [9179/14785], loss: 7.2893, overflow: False, scale: 576460752303423488, lr: 0.009906, time: 597.24
- 2022-09-23 14:22:36,798:INFO:epoch: 5 step: [9235/14785], loss: 7.8980, overflow: False, scale: 576460752303423488, lr: 0.009905, time: 592.76
- 2022-09-23 14:23:10,200:INFO:epoch: 5 step: [9291/14785], loss: 7.2011, overflow: False, scale: 576460752303423488, lr: 0.009904, time: 596.42
- 2022-09-23 14:23:43,601:INFO:epoch: 5 step: [9347/14785], loss: 7.6925, overflow: False, scale: 576460752303423488, lr: 0.009903, time: 596.39
- 2022-09-23 14:24:16,905:INFO:epoch: 5 step: [9403/14785], loss: 7.5320, overflow: False, scale: 576460752303423488, lr: 0.009902, time: 594.68
- 2022-09-23 14:24:50,109:INFO:epoch: 5 step: [9459/14785], loss: 8.5405, overflow: False, scale: 576460752303423488, lr: 0.009900, time: 592.89
- 2022-09-23 14:25:23,502:INFO:epoch: 5 step: [9515/14785], loss: 7.5059, overflow: False, scale: 576460752303423488, lr: 0.009899, time: 596.26
- 2022-09-23 14:25:56,833:INFO:epoch: 5 step: [9571/14785], loss: 7.8898, overflow: False, scale: 288230376151711744, lr: 0.009898, time: 595.15
- 2022-09-23 14:26:30,301:INFO:epoch: 5 step: [9627/14785], loss: 7.6068, overflow: False, scale: 288230376151711744, lr: 0.009897, time: 597.59
- 2022-09-23 14:27:03,699:INFO:epoch: 5 step: [9683/14785], loss: 7.0117, overflow: False, scale: 288230376151711744, lr: 0.009896, time: 596.35
- 2022-09-23 14:27:37,298:INFO:epoch: 5 step: [9739/14785], loss: 8.0315, overflow: False, scale: 288230376151711744, lr: 0.009894, time: 599.92
- 2022-09-23 14:28:10,604:INFO:epoch: 5 step: [9795/14785], loss: 7.2039, overflow: False, scale: 288230376151711744, lr: 0.009893, time: 594.70
- 2022-09-23 14:28:43,701:INFO:epoch: 5 step: [9851/14785], loss: 7.7796, overflow: False, scale: 288230376151711744, lr: 0.009892, time: 590.95
- 2022-09-23 14:29:17,099:INFO:epoch: 5 step: [9907/14785], loss: 7.6380, overflow: False, scale: 288230376151711744, lr: 0.009891, time: 596.34
- 2022-09-23 14:29:50,408:INFO:epoch: 5 step: [9963/14785], loss: 7.6972, overflow: False, scale: 288230376151711744, lr: 0.009889, time: 594.77
- 2022-09-23 14:30:23,504:INFO:epoch: 5 step: [10019/14785], loss: 6.7547, overflow: False, scale: 288230376151711744, lr: 0.009888, time: 590.96
- 2022-09-23 14:30:56,800:INFO:epoch: 5 step: [10075/14785], loss: 6.4360, overflow: False, scale: 288230376151711744, lr: 0.009887, time: 594.53
- 2022-09-23 14:31:29,902:INFO:epoch: 5 step: [10131/14785], loss: 7.2390, overflow: False, scale: 288230376151711744, lr: 0.009886, time: 591.06
- 2022-09-23 14:32:03,107:INFO:epoch: 5 step: [10187/14785], loss: 7.4041, overflow: False, scale: 288230376151711744, lr: 0.009884, time: 592.91
- 2022-09-23 14:32:36,299:INFO:epoch: 5 step: [10243/14785], loss: 7.4273, overflow: False, scale: 288230376151711744, lr: 0.009883, time: 592.66
- 2022-09-23 14:33:09,607:INFO:epoch: 5 step: [10299/14785], loss: 7.5998, overflow: False, scale: 288230376151711744, lr: 0.009882, time: 594.73
- 2022-09-23 14:33:42,806:INFO:epoch: 5 step: [10355/14785], loss: 7.0808, overflow: False, scale: 288230376151711744, lr: 0.009881, time: 592.80
- 2022-09-23 14:34:16,226:INFO:epoch: 5 step: [10411/14785], loss: 7.0017, overflow: False, scale: 288230376151711744, lr: 0.009879, time: 596.72
- 2022-09-23 14:34:49,604:INFO:epoch: 5 step: [10467/14785], loss: 7.9497, overflow: False, scale: 288230376151711744, lr: 0.009878, time: 596.00
- 2022-09-23 14:35:22,695:INFO:epoch: 5 step: [10523/14785], loss: 7.2432, overflow: False, scale: 288230376151711744, lr: 0.009877, time: 590.86
- 2022-09-23 14:35:56,003:INFO:epoch: 5 step: [10579/14785], loss: 7.4348, overflow: False, scale: 288230376151711744, lr: 0.009875, time: 594.73
- 2022-09-23 14:36:28,999:INFO:epoch: 5 step: [10635/14785], loss: 7.4807, overflow: False, scale: 288230376151711744, lr: 0.009874, time: 589.18
- 2022-09-23 14:37:02,500:INFO:epoch: 5 step: [10691/14785], loss: 7.8076, overflow: False, scale: 288230376151711744, lr: 0.009873, time: 598.19
- 2022-09-23 14:37:35,844:INFO:epoch: 5 step: [10747/14785], loss: 7.4116, overflow: False, scale: 288230376151711744, lr: 0.009871, time: 595.38
- 2022-09-23 14:38:08,996:INFO:epoch: 5 step: [10803/14785], loss: 7.0933, overflow: False, scale: 288230376151711744, lr: 0.009870, time: 591.95
- 2022-09-23 14:38:42,103:INFO:epoch: 5 step: [10859/14785], loss: 7.8137, overflow: False, scale: 288230376151711744, lr: 0.009869, time: 591.16
- 2022-09-23 14:39:15,604:INFO:epoch: 5 step: [10915/14785], loss: 7.0381, overflow: False, scale: 288230376151711744, lr: 0.009867, time: 598.18
- 2022-09-23 14:39:48,806:INFO:epoch: 5 step: [10971/14785], loss: 8.1372, overflow: False, scale: 288230376151711744, lr: 0.009866, time: 592.85
- 2022-09-23 14:40:22,009:INFO:epoch: 5 step: [11027/14785], loss: 7.1012, overflow: False, scale: 288230376151711744, lr: 0.009865, time: 592.86
- 2022-09-23 14:40:55,199:INFO:epoch: 5 step: [11083/14785], loss: 7.6192, overflow: False, scale: 288230376151711744, lr: 0.009863, time: 592.61
- 2022-09-23 14:41:28,399:INFO:epoch: 5 step: [11139/14785], loss: 7.3086, overflow: False, scale: 288230376151711744, lr: 0.009862, time: 592.82
- 2022-09-23 14:42:01,802:INFO:epoch: 5 step: [11195/14785], loss: 6.9933, overflow: False, scale: 288230376151711744, lr: 0.009861, time: 596.43
- 2022-09-23 14:42:35,002:INFO:epoch: 5 step: [11251/14785], loss: 7.9492, overflow: False, scale: 288230376151711744, lr: 0.009859, time: 592.82
- 2022-09-23 14:43:08,254:INFO:epoch: 5 step: [11307/14785], loss: 7.0828, overflow: False, scale: 288230376151711744, lr: 0.009858, time: 593.75
- 2022-09-23 14:43:41,791:INFO:epoch: 5 step: [11363/14785], loss: 7.4143, overflow: False, scale: 288230376151711744, lr: 0.009856, time: 598.83
- 2022-09-23 14:44:15,101:INFO:epoch: 5 step: [11419/14785], loss: 6.7950, overflow: False, scale: 288230376151711744, lr: 0.009855, time: 594.79
- 2022-09-23 14:44:48,404:INFO:epoch: 5 step: [11475/14785], loss: 7.2114, overflow: False, scale: 288230376151711744, lr: 0.009854, time: 594.65
- 2022-09-23 14:45:21,803:INFO:epoch: 5 step: [11531/14785], loss: 7.5335, overflow: False, scale: 576460752303423488, lr: 0.009852, time: 596.37
- 2022-09-23 14:45:55,395:INFO:epoch: 5 step: [11587/14785], loss: 7.3527, overflow: False, scale: 576460752303423488, lr: 0.009851, time: 599.80
- 2022-09-23 14:46:28,882:INFO:epoch: 5 step: [11643/14785], loss: 6.8999, overflow: False, scale: 576460752303423488, lr: 0.009849, time: 597.94
- 2022-09-23 14:47:02,106:INFO:epoch: 5 step: [11699/14785], loss: 7.1295, overflow: False, scale: 576460752303423488, lr: 0.009848, time: 593.24
- 2022-09-23 14:47:35,406:INFO:epoch: 5 step: [11755/14785], loss: 7.5324, overflow: False, scale: 576460752303423488, lr: 0.009846, time: 594.59
- 2022-09-23 14:48:08,796:INFO:epoch: 5 step: [11811/14785], loss: 7.0313, overflow: False, scale: 576460752303423488, lr: 0.009845, time: 596.22
- 2022-09-23 14:48:41,804:INFO:epoch: 5 step: [11867/14785], loss: 7.0395, overflow: False, scale: 576460752303423488, lr: 0.009843, time: 589.38
- 2022-09-23 14:49:14,795:INFO:epoch: 5 step: [11923/14785], loss: 8.5622, overflow: False, scale: 576460752303423488, lr: 0.009842, time: 589.08
- 2022-09-23 14:49:47,991:INFO:epoch: 5 step: [11979/14785], loss: 7.8068, overflow: False, scale: 576460752303423488, lr: 0.009841, time: 592.73
- 2022-09-23 14:50:21,298:INFO:epoch: 5 step: [12035/14785], loss: 7.0416, overflow: False, scale: 576460752303423488, lr: 0.009839, time: 594.72
- 2022-09-23 14:50:54,798:INFO:epoch: 5 step: [12091/14785], loss: 7.9298, overflow: False, scale: 576460752303423488, lr: 0.009838, time: 598.16
- 2022-09-23 14:51:28,061:INFO:epoch: 5 step: [12147/14785], loss: 7.4459, overflow: False, scale: 576460752303423488, lr: 0.009836, time: 593.96
- 2022-09-23 14:52:01,304:INFO:epoch: 5 step: [12203/14785], loss: 7.1705, overflow: False, scale: 576460752303423488, lr: 0.009835, time: 593.59
- 2022-09-23 14:52:34,439:INFO:epoch: 5 step: [12259/14785], loss: 8.1349, overflow: False, scale: 576460752303423488, lr: 0.009833, time: 591.63
- 2022-09-23 14:53:07,393:INFO:epoch: 5 step: [12315/14785], loss: 7.4902, overflow: False, scale: 576460752303423488, lr: 0.009831, time: 588.42
- 2022-09-23 14:53:40,602:INFO:epoch: 5 step: [12371/14785], loss: 7.8545, overflow: False, scale: 576460752303423488, lr: 0.009830, time: 592.97
|