#2 Finally got it installed, but it fails at runtime with the error below. Any help appreciated.

Closed
created 2 years ago by dohappy2 · 7 comments
dohappy2 commented 2 years ago
Traceback (most recent call last):
  File "run_inference.py", line 206, in <module>
    run_eval(args_opt)
  File "run_inference.py", line 144, in run_eval
    model_predict = get_model_2b6_fp16(args)
  File "run_inference.py", line 84, in get_model_2b6_fp16
    pangu = PANGUALPHA_fp16(config)
  File "/home/w/pangu-alpha-gpu/inference/pangu_dropout_recompute_eos_fp16.py", line 752, in __init__
    self.backbone = PANGUALPHA_Model(config)
  File "/home/w/pangu-alpha-gpu/inference/pangu_dropout_recompute_eos_fp16.py", line 621, in __init__
    per_block = Block(config, i + 1).set_comm_fusion(int(i / fusion_group_size) + 2)
  File "/home/w/pangu-alpha-gpu/inference/pangu_dropout_recompute_eos_fp16.py", line 465, in __init__
    self.attention = Attention(config, scale, layer_idx)
  File "/home/w/pangu-alpha-gpu/inference/pangu_dropout_recompute_eos_fp16.py", line 300, in __init__
    config.embedding_size, dtype=mstype.float16).to_float(
  File "/home/w/.local/lib/python3.7/site-packages/mindspore/_extends/utils.py", line 43, in deco
    fn(self, *args, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'dtype'
dohappy2 commented 2 years ago
Poster
I have also already made the source-code changes described in the appendix.
yands started working 2 years ago
yands commented 2 years ago
Owner
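This error most likely means that the `__init__` function of `Dense()` has no `dtype` parameter. Please check whether the `__init__()` function of `class Dense()` has been changed to `def __init__(self, in_channels, out_channels, weight_init='normal', bias_init='zeros', has_bias=True, activation=None, dtype=mstype.float32):`. Alternatively, try running the program with the image file I provided.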
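One way to confirm whether a constructor accepts `dtype` before launching the full model is to inspect its signature. The following is a minimal sketch, not part of the project: `accepts_keyword`, `UnpatchedDense`, and `PatchedDense` are hypothetical names used only for illustration, and the two stand-in classes merely imitate an unpatched and a patched layer.

```python
import inspect


def accepts_keyword(cls, name):
    """Return True if cls.__init__ can be called with keyword argument `name`."""
    params = inspect.signature(cls.__init__).parameters
    if name in params:
        return True
    # A **kwargs catch-all also accepts arbitrary keyword arguments.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for an unpatched and a patched layer class.
class UnpatchedDense:
    def __init__(self, in_channels, out_channels, has_bias=True, activation=None):
        pass


class PatchedDense:
    def __init__(self, in_channels, out_channels, has_bias=True, activation=None,
                 dtype="float32"):
        pass


print(accepts_keyword(UnpatchedDense, "dtype"))  # False
print(accepts_keyword(PatchedDense, "dtype"))    # True
```

In a real installation one would pass the installed class (for example `mindspore.nn.Dense`) instead of the stand-ins. If the constructor is wrapped by a decorator that does not preserve its signature, the result may be inconclusive, because the wrapper's own `**kwargs` would be reported. Calling `inspect.getsourcefile` on the class can additionally show which file Python actually loads, which helps when the edit may have been made to a different copy.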
dohappy2 commented 2 years ago
Poster
Do you mean I should modify `class Dense()` in mindspore/nn/layer/basic.py?
yands commented 2 years ago
Owner
Yes.
dohappy2 commented 2 years ago
Poster
I must have modified the wrong file, which is why I got the error at the top. The checkpoint now loads, with the following output:

#### Load ckpt success!!! ####
Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.656 seconds.
Prefix dict has been built successfully.
[WARNING] ME(16521:140086779144000,MainProcess):2021-06-04-22:54:02.896.251 [mindspore/common/_decorator.py:32]......
[ERROR] KERNEL(16521,python3.7.5):2021-06-04-22:54:09.211.623 [mindspore/ccsrc/backend/kernel_compiler/gpu/gpu_kernel_factory.cc:115] CheckSM] Half precision ops can be used on Devices which computing capacity is >= 6, but the current device's computing capacity is 5
Traceback (most recent call last):
  File "run_inference.py", line 206, in <module>
    run_eval(args_opt)
  File "run_inference.py", line 169, in run_eval
    output_ids = generate(model_predict, input_ids, 1024, 9)
  File "/home/w/pangu-alpha-gpu/inference/generate.py", line 70, in generate
    logits = model.predict(inputs).asnumpy()
  File "/home/w/.local/lib/python3.7/site-packages/mindspore/train/model.py", line 791, in predict
    result = self._predict_network(*predict_data)
  File "/home/w/.local/lib/python3.7/site-packages/mindspore/nn/cell.py", line 341, in __call__
    out = self.compile_and_run(*inputs)
  File "/home/w/.local/lib/python3.7/site-packages/mindspore/nn/cell.py", line 608, in compile_and_run
    self.compile(*inputs)
  File "/home/w/.local/lib/python3.7/site-packages/mindspore/nn/cell.py", line 595, in compile
    _executor.compile(self, *inputs, phase=self.phase, auto_parallel_mode=self._auto_parallel_mode)
  File "/home/w/.local/lib/python3.7/site-packages/mindspore/common/api.py", line 494, in compile
    result = self._executor.compile(obj, args_list, phase, use_vm)
RuntimeError: mindspore/ccsrc/backend/kernel_compiler/gpu/gpu_kernel_factory.cc:115 CheckSM] Half precision ops can be used on Devices which computing capacity is >= 6, but the current device's computing capacity is 5

The error now says that half-precision ops can only be used on devices with a compute capability of 6 or higher, but the current device's compute capability is 5. My machine runs Ubuntu 18.04 with 32 GB of RAM and a TITAN X 12 GB graphics card, and the command I ran is `python run_inference.py --model=2B6_fp16 --load_ckpt_path=PanguAlpha_2.6B_fp16.ckpt`. According to the documentation this should run, but when I watched the GPU during the run, no GPU memory was in use, and the process exited with the error when RAM usage had reached only 9 GB. Any help from the experts would be appreciated.
dohappy2 commented 2 years ago
Poster
I tried the Docker image as well, and it also reports insufficient compute capability:
RuntimeError: mindspore/ccsrc/backend/kernel_compiler/gpu/gpu_kernel_factory.cc:115 CheckSM] Half precision ops can be used on Devices which computing capacity is >= 6, but the current device's computing capacity is 5
yands commented 2 years ago
Owner
Then MindSpore probably does not allow GPUs with low compute capability to run this. I can't solve this problem for now, sorry.
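The rule stated in the error message can be checked ahead of time once a card's compute capability is known. The sketch below is only an illustration under stated assumptions: `parse_capability_error` and `half_precision_supported` are hypothetical helpers, the threshold of 6 is taken from the message quoted in this thread rather than from MindSpore's source, and the capability value itself has to come from NVIDIA's published tables or from driver tooling.

```python
import re

# Matches the wording of the error message quoted in this thread.
ERROR_PATTERN = re.compile(
    r"computing capacity is >= (\d+), "
    r"but the current device's computing capacity is (\d+)"
)


def parse_capability_error(message):
    """Return (required_major, current_major) from the message, or None."""
    match = ERROR_PATTERN.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def half_precision_supported(compute_capability, required_major=6):
    """Compare the major version of a capability such as '5.2' with the threshold."""
    major = int(str(compute_capability).split(".")[0])
    return major >= required_major


message = (
    "Half precision ops can be used on Devices which computing capacity is >= 6, "
    "but the current device's computing capacity is 5"
)
print(parse_capability_error(message))   # (6, 5)
print(half_precision_supported("5.2"))   # False
print(half_precision_supported("6.1"))   # True
```

For reference, NVIDIA lists the Maxwell-generation GTX TITAN X at compute capability 5.2 and the Pascal-generation TITAN X at 6.1. A reported capability of 5 would be consistent with the Maxwell card.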
yands closed this issue 2 years ago