#602 Previously working code, unchanged, now fails when saving the model to userhome: 'TorchHistory.add_log_hooks_to_pytorch_module.<locals>.<lambda>'

Open
created 1 year ago by Tangsj2 · 2 comments
Tangsj2 commented 1 year ago
Hello! Until August 18 this ran successfully on the Octopus (章鱼) platform, and the code has not been changed since. The training job saves to the directory /userhome/. If I comment out the model saving, everything runs normally, but the algorithm needs to reload the model. The failure happens when the model is saved:

    self.save_model(f"/userhome/_save_dir/2/{epoch}.pt")
  File "/code/mouge.py", line 65, in save_model
    torch.save(self.get(), model_path)
  File "/root/miniconda3/envs/mouge/lib/python3.9/site-packages/torch/serialization.py", line 379, in save
    _save(obj, opened_zipfile, pickle_module, pickle_protocol)
  File "/root/miniconda3/envs/romi/lib/python3.9/site-packages/torch/serialization.py", line 589, in _save
    pickler.dump(obj)
AttributeError: Can't pickle local object 'TorchHistory.add_log_hooks_to_pytorch_module.<locals>.<lambda>'

At the moment any call to torch.save raises this error. Requesting support! Many thanks!
yangxzh1 commented 1 year ago
Collaborator
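Searching Baidu for the key phrase AttributeError: Can't pickle local object 'TorchHistory.add_log_hooks_to_pytorch_module turns up many solutions to similar problems. Most of them concern serialization in a multithreading context. You could try looking at https://www.jb51.net/article/184308.htm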
yangxzh1 added the help wanted label 1 year ago
yangxzh1 added the 跟进中 (in progress) label 1 year ago
Tangsj2 commented 1 year ago
Poster
It ran successfully until August 18 and started failing on the 23rd. The error occurs every time a model .pt file is saved to /userhome/. With the model save commented out it runs normally, but the algorithm needs to reload the model.

after the model being trained, store it
Failed. Killing the trainer.
Traceback (most recent call last):
  File "/code/mouge.py", line 364, in train
    if self.args['model_save_path'] is not None: torch.save(self.model, self.args['model_save_path'])
  File "/root/miniconda3/envs/romi/lib/python3.9/site-packages/torch/serialization.py", line 379, in save
    _save(obj, opened_zipfile, pickle_module, pickle_protocol)
  File "/root/miniconda3/envs/romi/lib/python3.9/site-packages/torch/serialization.py", line 589, in _save
    pickler.dump(obj)
AttributeError: Can't pickle local object 'TorchHistory.add_log_hooks_to_pytorch_module.<locals>.<lambda>'
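
A possible reading of the traceback, not confirmed anywhere in this thread: torch.save(self.model, ...) pickles the whole module object, including any hooks attached to it, and pickle cannot serialize a lambda defined inside another function. The name TorchHistory.add_log_hooks_to_pytorch_module looks like the hook that wandb.watch() registers, but that is an assumption; the thread does not say which logging library is in use. The sketch below reproduces the same kind of failure with the standard library only. TinyModel, add_log_hook, and can_pickle are hypothetical stand-ins, not code from mouge.py.

```python
import pickle


class TinyModel:
    """Hypothetical stand-in for a model: plain weights plus attached hooks."""

    def __init__(self):
        self.weights = {"w": [0.1, 0.2], "b": [0.0]}
        self.hooks = {}

    def state_dict(self):
        # Returns plain data only, no hooks (same idea as nn.Module.state_dict()).
        return dict(self.weights)


def add_log_hook(model):
    # A lambda created inside a function is a "local object": pickle cannot
    # find it by importable name, so it cannot be serialized.
    model.hooks["log"] = lambda output: print("logged", output)


def can_pickle(obj):
    """Return True if pickle can serialize obj, False on a pickling error."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


model = TinyModel()
add_log_hook(model)

# Pickling the whole object drags the hook lambda along and fails.
whole_model_ok = can_pickle(model)
# Pickling only the plain state succeeds and round-trips.
state_only_ok = can_pickle(model.state_dict())
restored = pickle.loads(pickle.dumps(model.state_dict()))
print(whole_model_ok, state_only_ok, restored == model.weights)
```

If this reading is right, the analogous workaround in PyTorch would be to save torch.save(model.state_dict(), path) and reload with model.load_state_dict(torch.load(path)), which keeps hooks out of the saved file. Whether that fits the reload logic in mouge.py is for the poster to judge.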