MSAdapter is an adaptation tool for the MindSpore ecosystem. It quickly migrates code written for third-party frameworks such as PyTorch and JAX onto MindSpore without changing users' existing habits, helping users make efficient use of Ascend compute power. Specifically, msadapter migrates PyTorch training scripts to execute on the MindSpore framework, so that PyTorch code achieves high performance on Ascend hardware while PyTorch users keep writing code the way they always have.
| Branch | Release date | Compatible MindSpore version | mindspeed | mindspeed-llm | mindspeed-mm | megatron |
|---|---|---|---|---|---|---|
| v0.5.0 | 2025-09-30 | 2.7.1 | 2.2.0_core_r0.12.1 | 2.2.0 | 2.2.0 | core_v0.12.1 |
| v0.6.0 | 2025-12-30 | 2.7.2 | 2.3.0_core_r0.12.1 | 2.3.0 | 2.3.0 | core_v0.12.1 |
| v0.7.0 | 2026-01-30 | 2.8.0 | 2.3.0_core_r0.12.1 | 2.3.0 | 2.3.0 | core_v0.12.1 |
For more details on installation, tutorials, and the API, please refer to the tutorial documentation.

First, check the release notes above to choose matching msadapter and MindSpore versions.

Install MindSpore by following the installation guide on the MindSpore website.
To use msadapter directly from source, add it (and its bundled third-party adaptations) to `PYTHONPATH`:

```shell
export PYTHONPATH=${MindSpeed_Core_MS_PATH}/msadapter/:$PYTHONPATH
export PYTHONPATH=${MindSpeed_Core_MS_PATH}/msadapter/msa_thirdparty:$PYTHONPATH
```
Step 1: Download the source

```shell
git clone https://gitee.com/mindspore/msadapter.git
```

Step 2: Build

```shell
cd msadapter
bash scripts/build.sh
```

After the build completes, new `build` and `dist` folders appear under the msadapter directory.

Step 3: Install

```shell
pip install ${MindSpeed_Core_MS_PATH}/msadapter/dist/*.whl
# /*/site-packages is the package installation path of your Python
# environment; it can be obtained with `pip show msadapter`.
export PYTHONPATH=/*/site-packages/msa_thirdparty:$PYTHONPATH
```
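The site-packages path can also be resolved programmatically instead of copied by hand. A sketch (shown with `pip` itself as the example package, since msadapter may not be installed on the machine where you try this; substitute `msadapter` for real use):

```shell
# Extract the "Location:" field from `pip show`, which is the
# site-packages directory containing the installed package.
SITE_PACKAGES=$(pip show pip | awk '/^Location:/ {print $2}')
echo "$SITE_PACKAGES"

# For msadapter, the equivalent would be:
#   SITE_PACKAGES=$(pip show msadapter | awk '/^Location:/ {print $2}')
#   export PYTHONPATH="$SITE_PACKAGES/msa_thirdparty:$PYTHONPATH"
```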
Code to control whether msadapter is used in a script:

```python
msadapter.enable_torch_proxy(True)   # enable the torch proxy
msadapter.enable_torch_proxy(False)  # disable the torch proxy
```
Original PyTorch code:

```python
import torch
from torch import nn
from torch.nn import functional as F

net = nn.Linear(10, 1)
```

With msadapter imported first, the same code executes on the MindSpore backend:

```python
import msadapter  # switch execution to the MindSpore backend

import torch
from torch import nn
from torch.nn import functional as F

net = nn.Linear(10, 1)
```

Alternatively, import the msadapter modules directly:

```python
import msadapter
from msadapter import nn
from msadapter.nn import functional as F

net = nn.Linear(10, 1)
```
Training must be launched with msrun; launching with torchrun is not supported.
```shell
NPUS_PER_NODE=8
MASTER_ADDR=localhost
MASTER_PORT=6099
NNODES=2
NODE_RANK=0
WORLD_SIZE=$(($NPUS_PER_NODE*$NNODES))

# --log_dir: directory where per-device log files are stored
# --join=True: also print the logs to the console
DISTRIBUTED_ARGS="
    --node_rank $NODE_RANK \
    --master_addr $MASTER_ADDR \
    --master_port $MASTER_PORT \
    --worker_num $WORLD_SIZE \
    --local_worker_num $NPUS_PER_NODE \
    --log_dir=msrun_log \
    --join=True \
"

msrun $DISTRIBUTED_ARGS pretrain_gpt.py
```
After installing msadapter, you can use it as follows:
```python
import msadapter
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

# 1. Working with data
# Download training data from open datasets.
training_data = datasets.FashionMNIST(root="data", train=True, download=True, transform=ToTensor())
# Download test data from open datasets.
test_data = datasets.FashionMNIST(root="data", train=False, download=True, transform=ToTensor())

# 2. Defining the model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

if __name__ == '__main__':
    train_dataloader = DataLoader(training_data, batch_size=64)
    test_dataloader = DataLoader(test_data, batch_size=64)

    # 3. Create the model
    model = NeuralNetwork()

    classes = [
        "T-shirt/top",
        "Trouser",
        "Pullover",
        "Dress",
        "Coat",
        "Sandal",
        "Shirt",
        "Sneaker",
        "Bag",
        "Ankle boot",
    ]

    # 4. Predict
    model.eval()
    x, y = test_data[0][0], test_data[0][1]
    with torch.no_grad():
        pred = model(x)
        predicted, actual = classes[pred[0].argmax(0)], classes[y]
        print(f'Predicted: "{predicted}", Actual: "{actual}"')
```
Once msadapter is installed, modules imported under torch names are automatically converted to the corresponding msadapter modules at execution time (automatic conversion is currently supported for torch, torchvision, torch_npu, torchair, and related modules); then simply run the main `.py` entry file. For more usage patterns, refer to the user guide.
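The automatic conversion above relies on Python's import machinery. As a minimal illustration of the general mechanism (not msadapter's actual implementation), registering a module object in `sys.modules` under another name makes later imports of that name resolve to the substitute, with no file lookup at all; the `fake_backend`/`fake_torch` names here are invented for the demo:

```python
import sys
import types

# Build a stand-in backend module exposing one function.
backend = types.ModuleType("fake_backend")
backend.add = lambda a, b: a + b

# Register it under a different name: any later `import fake_torch`
# now resolves to the backend module instead of a real package.
sys.modules["fake_torch"] = backend

import fake_torch  # served straight from sys.modules

print(fake_torch.add(2, 3))  # prints 5
```

A real proxy layer would intercept imports more selectively (e.g. via a `sys.meta_path` finder), but the resolution step it exploits is the same.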
Currently, MSAdapter has the following usage limitations.

`DataLoader` with `pin_memory=True` is not supported. Example code:
```python
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

training_data = datasets.FashionMNIST(root="data", train=True, download=True, transform=ToTensor())
train_dataloader = DataLoader(training_data, batch_size=64, pin_memory=True)
for batch, (X, y) in enumerate(train_dataloader):
    X, y = X.cuda(), y.cuda()
```
The error message is as follows (as a workaround, set `pin_memory=False` when constructing the `DataLoader`):
```text
Traceback (most recent call last):
  File "/path/to/your/torch/utils/data/_utils/pin_memory.py", line 98, in pin_memory
    clone[i] = pin_memory(item, device)
  File "/path/to/your/torch/utils/data/_utils/pin_memory.py", line 64, in pin_memory
    return data.pin_memory(device)
TypeError: pin_memory() takes 1 positional argument but 2 were given
```
Checkpoints saved with `mindspore.save_checkpoint` cannot be loaded with `torch.load`. Example code:
```python
import torch
from torch import nn
import mindspore as ms

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(28*28, 512)

    def forward(self, x):
        logits = self.linear(x)
        return logits

class myNN(ms.nn.Cell):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(28*28, 512)

    def construct(self, x):
        logits = self.linear(x)
        return logits

model = myNN()
ms.save_checkpoint(model, "./net.ckpt")

model2 = NeuralNetwork()
model2.load_state_dict(torch.load("./net.ckpt"))
```
The error message is as follows:
```text
Traceback (most recent call last):
  File "/path/to/your/demo.py", line 99, in <module>
    model2.load_state_dict(torch.load("./net.ckpt"))
  File "/path/to/your/torch/serialization.py", line 1020, in load
    return _legacy_load(opened_file, pickle_module, **pickle_load_args)
  File "/path/to/your/torch/serialization.py", line 1118, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
EOFError: Ran out of input
```
Mixing native MindSpore code with torch imports is not supported, because importing torch changes the behavior of some MindSpore interfaces. Example code:
```python
from mindspore import Tensor

a = Tensor([2, 2])
print(f'before import torch: a.shape={a.shape}')
import torch
print(f'after import torch: a.shape={a.shape}')
```
The output is shown below; as can be seen, after `import torch`, the original behavior of `mindspore.Tensor.shape` has changed.
```text
before import torch: a.shape=(2,)
after import torch: a.shape=torch.Size([2])
```
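If code must tolerate both behaviors, one defensive pattern is to normalize shapes through `tuple(...)`. This is a sketch resting on the assumption that the proxied `.shape` returns a `torch.Size`-like object, which in PyTorch is a `tuple` subclass; `FakeSize` below is a hypothetical stand-in so the idea can be shown without either framework installed:

```python
# FakeSize stands in for torch.Size, which subclasses tuple but
# prints differently from a plain tuple.
class FakeSize(tuple):
    def __repr__(self):
        return f"Size({list(self)})"

shape_before = (2,)             # .shape before `import torch`
shape_after = FakeSize((2,))    # .shape after the proxy takes effect

# Normalizing through tuple() makes both forms compare and print identically.
print(tuple(shape_before) == tuple(shape_after))  # prints True
```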
The MindSpore interfaces that do not support mixed execution are listed in the table below:

| Module | Affected interfaces |
|---|---|
| mindspore.Tensor / mindspore.StubTensor | `is_shared`, `softmax`, `type_`, `retain_grad`, `shape`, `to_dense`, `_base`, `data`, `numel`, `nelement`, `repeat`, `cuda`, `npu`, `cpu`, `size`, `dim`, `clone`, `log_softmax`, `narrow`, `view`, `__or__`, `device`, `__and__`, `__xor__`, `__iter__`, `__reduce_ex__`, `expand`, `detach`, `T`, `transpose`, `mean`, `clamp`, `is_cuda`, `is_cpu`, `repeat_interleave`, `is_sparse`, `requires_grad`, `requires_grad_`, `unsqueeze`, `__pow__`, `float`, `backward`, `split`, `norm`, `record_stream`, `data_ptr`, `pin_memory`, `grad`, `__imul__`, `reshape`, `squeeze`, `element_size`, `exponential_` |