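I finally managed to get it installed, but it throws an error at runtime. The output is below. Any help would be appreciated.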

Traceback (most recent call last):
  File "run_inference.py", line 206, in <module>
    run_eval(args_opt)
  File "run_inference.py", line 144, in run_eval
    model_predict = get_model_2b6_fp16(args)
  File "run_inference.py", line 84, in get_model_2b6_fp16
    pangu = PANGUALPHA_fp16(config)
  File "/home/w/pangu-alpha-gpu/inference/pangu_dropout_recompute_eos_fp16.py", line 752, in __init__
    self.backbone = PANGUALPHA_Model(config)
  File "/home/w/pangu-alpha-gpu/inference/pangu_dropout_recompute_eos_fp16.py", line 621, in __init__
    per_block = Block(config, i + 1).set_comm_fusion(int(i / fusion_group_size) + 2)
  File "/home/w/pangu-alpha-gpu/inference/pangu_dropout_recompute_eos_fp16.py", line 465, in __init__
    self.attention = Attention(config, scale, layer_idx)
  File "/home/w/pangu-alpha-gpu/inference/pangu_dropout_recompute_eos_fp16.py", line 300, in __init__
    config.embedding_size, dtype=mstype.float16).to_float(
  File "/home/w/.local/lib/python3.7/site-packages/mindspore/_extends/utils.py", line 43, in deco
    fn(self, *args, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'dtype'

I have also already applied the source-code modifications described in the appendix.

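This error most likely means that the __init__ function of Dense() has no dtype parameter. Please check whether the __init__() function of class Dense() has been changed to:
def __init__(self, in_channels, out_channels, weight_init='normal', bias_init='zeros', has_bias=True, activation=None, dtype=mstype.float32):

Alternatively, try running the program with the image I provided.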
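
One way to check this without rereading the source by hand is to ask Python which file a class was loaded from and whether its constructor declares dtype. The following is only a minimal sketch: the helper names accepts_parameter and describe are hypothetical, and the two stand-in classes exist only so the example runs without MindSpore. It also assumes that any decorator wrapped around __init__ preserves the original signature, which is not confirmed here for this MindSpore version.

```python
import inspect


def accepts_parameter(cls, name):
    """Return True if the constructor of cls declares a parameter called name."""
    return name in inspect.signature(cls).parameters


def describe(cls, name="dtype"):
    """Report the file cls was loaded from and whether its constructor declares name."""
    try:
        source = inspect.getsourcefile(cls)
    except (TypeError, OSError):
        # Raised when no source file can be determined for the class.
        source = None
    return {"source_file": source, "accepts": accepts_parameter(cls, name)}


# Hypothetical stand-ins so the sketch runs without MindSpore installed.
class UnpatchedDense:
    def __init__(self, in_channels, out_channels, has_bias=True, activation=None):
        pass


class PatchedDense:
    def __init__(self, in_channels, out_channels, has_bias=True, activation=None,
                 dtype="float32"):
        pass


if __name__ == "__main__":
    for cls in (UnpatchedDense, PatchedDense):
        print(cls.__name__, describe(cls))
```

In a real environment the same call would be made on the installed class, for example describe(mindspore.nn.Dense). If source_file is not the file that was edited, the patch went into a different copy of MindSpore than the one Python imports.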


Do I need to modify class Dense() in mindspore/nn/layer/basic.py?

Yes.


I must have modified the wrong file, which is what caused the error at the top. The checkpoint now loads, and the program prints:
#### Load ckpt success!!! ####
Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.656 seconds.
Prefix dict has been built successfully.
[WARNING] ME(16521:140086779144000,MainProcess):2021-06-04-22:54:02.896.251 [mindspore/common/_decorator.py:32]......
[ERROR] KERNEL(16521,python3.7.5):2021-06-04-22:54:09.211.623 [mindspore/ccsrc/backend/kernel_compiler/gpu/gpu_kernel_factory.cc:115] CheckSM] Half precision ops can be used on Devices which computing capacity is >= 6, but the current device's computing capacity is 5
Traceback (most recent call last):
  File "run_inference.py", line 206, in <module>
    run_eval(args_opt)
  File "run_inference.py", line 169, in run_eval
    output_ids = generate(model_predict, input_ids, 1024, 9)
  File "/home/w/pangu-alpha-gpu/inference/generate.py", line 70, in generate
    logits = model.predict(inputs).asnumpy()
  File "/home/w/.local/lib/python3.7/site-packages/mindspore/train/model.py", line 791, in predict
    result = self._predict_network(*predict_data)
  File "/home/w/.local/lib/python3.7/site-packages/mindspore/nn/cell.py", line 341, in __call__
    out = self.compile_and_run(*inputs)
  File "/home/w/.local/lib/python3.7/site-packages/mindspore/nn/cell.py", line 608, in compile_and_run
    self.compile(*inputs)
  File "/home/w/.local/lib/python3.7/site-packages/mindspore/nn/cell.py", line 595, in compile
    _executor.compile(self, *inputs, phase=self.phase, auto_parallel_mode=self._auto_parallel_mode)
  File "/home/w/.local/lib/python3.7/site-packages/mindspore/common/api.py", line 494, in compile
    result = self._executor.compile(obj, args_list, phase, use_vm)
RuntimeError: mindspore/ccsrc/backend/kernel_compiler/gpu/gpu_kernel_factory.cc:115 CheckSM] Half precision ops can be used on Devices which computing capacity is >= 6, but the current device's computing capacity is 5
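The error now says that half-precision ops can only be used on devices whose compute capability is 6 or higher, but the current device's compute capability is 5.
My machine runs Ubuntu 18.04 with 32 GB of RAM and a TITAN X GPU with 12 GB of memory. The command I ran is python run_inference.py --model=2B6_fp16 --load_ckpt_path=PanguAlpha_2.6B_fp16.ckpt. According to the documentation this should run, but when I watched the GPU during the run, no GPU memory was being used, and the program exited with the error after host memory usage had reached only about 9 GB. Any help would be appreciated.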


I also tried the Docker image, and it reports the same insufficient compute capability error: RuntimeError: mindspore/ccsrc/backend/kernel_compiler/gpu/gpu_kernel_factory.cc:115 CheckSM] Half precision ops can be used on Devices which computing capacity is >= 6, but the current device's computing capacity is 5

Then MindSpore probably does not allow these ops to run on GPUs with low compute capability. I can't solve this problem for now either. Sorry.
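
The requirement quoted in the CheckSM error can be checked before launching inference. The sketch below is an illustration under stated assumptions, not MindSpore's own check: the helper names are hypothetical, the threshold 6 is taken from the error message, and the nvidia-smi query relies on the compute_cap field, which only newer drivers support. When the tool or the field is unavailable, the query returns an empty list.

```python
import subprocess

MIN_FP16_MAJOR = 6  # threshold quoted in the CheckSM error message


def parse_compute_capability(text):
    """Parse a string such as '5.2' into a (major, minor) tuple of ints."""
    major, _, minor = text.strip().partition(".")
    return int(major), int(minor or 0)


def meets_fp16_requirement(capability):
    """True if the (major, minor) capability passes the '>= 6' rule from the log."""
    return capability[0] >= MIN_FP16_MAJOR


def query_compute_capabilities():
    """Ask nvidia-smi for each GPU's compute capability; return [] if unavailable.

    Assumption: the installed nvidia-smi supports the compute_cap query field.
    """
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True, text=True, check=True, timeout=10,
        ).stdout
    except (OSError, subprocess.SubprocessError):
        return []
    caps = []
    for line in out.splitlines():
        try:
            caps.append(parse_compute_capability(line))
        except ValueError:
            continue  # skip blank or non-numeric lines
    return caps


if __name__ == "__main__":
    for cap in query_compute_capabilities():
        print(cap, "fp16 ok" if meets_fp16_requirement(cap) else "fp16 not supported")
```

Note that the name TITAN X was used for both a Maxwell card (compute capability 5.2) and a Pascal card (compute capability 6.1). The value 5 in the log is consistent with the Maxwell card, which would explain why the fp16 model is rejected on this machine.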

