MSAdapter is a tool that adapts the PyTorch interface to MindSpore. Its goal is to let PyTorch code run efficiently on Ascend hardware without changing existing PyTorch users' habits: in most cases, replacing import torch with import ms_adapter.pytorch in the PyTorch source is enough to make the model trainable on Ascend.
MSAdapter currently supports training in MindSpore's PYNATIVE mode. GRAPH-mode training requires code adjustments, and parts of the training loop must be written by the user. We have done extensive interface-alignment work, but due to framework design differences some interfaces are unnecessary in MindSpore or cannot be aligned, and must be removed or modified.
When using MindSpore on the Ascend platform, there is no need to move the model or data to a GPU; any torch.cuda calls or to-GPU/CPU transfers can simply be deleted. For example:
torch.cuda
torch.Tensor.to
torch.nn.Module.to
Gradient-control operations such as no_grad and grad are likewise unnecessary in MindSpore:
torch.no_grad
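As an illustration, a hypothetical PyTorch evaluation snippet (model and images are placeholder names, not from this document) migrates by simply deleting the device-transfer and gradient-control lines:

```
# Original PyTorch code (hypothetical example):
#     device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
#     model = model.to(device)
#     with torch.no_grad():
#         logits = model(images.to(device))
#
# After migration to MSAdapter -- the .to(device) calls and the no_grad
# context are simply removed:
#     import ms_adapter.pytorch as torch
#     logits = model(images)
```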
Except that pin_memory=True in DataLoader has no effect, MSAdapter's data-processing interfaces are almost fully aligned with PyTorch. You only need to change the data-processing imports to import from ms_adapter, for example:
from ms_adapter.pytorch.utils.data import DataLoader
from ms_adapter.torchvision import datasets, transforms
from ms_adapter.torchvision.transforms import InterpolationMode

transform = transforms.Compose([transforms.Resize((224, 224), interpolation=InterpolationMode.BICUBIC),
                                transforms.ToTensor(),
                                transforms.Normalize(mean=[0.4914, 0.4822, 0.4465], std=[0.247, 0.2435, 0.2616])])
train_images = datasets.CIFAR10('./', train=True, download=True, transform=transform)
train_data = DataLoader(train_images, batch_size=128, shuffle=True, num_workers=2, drop_last=True)
We have developed a large set of model-building interfaces whose behavior and outputs match PyTorch; you only need to change the model-building imports to import from ms_adapter. Note that some operations are still unsupported, such as custom backward propagation. If an interface or feature you need is missing, please report it to us via an issue.
from ms_adapter.pytorch.nn import Module, Linear, Flatten

class MLP(Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.flatten = Flatten()
        self.line1 = Linear(in_features=1024, out_features=64)
        self.line2 = Linear(in_features=64, out_features=128, bias=False)
        self.line3 = Linear(in_features=128, out_features=10)

    def forward(self, inputs):
        x = self.flatten(inputs)
        x = self.line1(x)
        x = self.line2(x)
        x = self.line3(x)
        return x
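As a quick sanity check of the layer dimensions, the same forward pass can be sketched with plain NumPy matrix products (the weights below are random placeholders rather than the real initialization; only the shapes follow the class definition above):

```python
import numpy as np

rng = np.random.default_rng(0)
w1 = rng.standard_normal((1024, 64)); b1 = np.zeros(64)
w2 = rng.standard_normal((64, 128))               # line2 has bias=False
w3 = rng.standard_normal((128, 10)); b3 = np.zeros(10)

# Flatten a batch of 8 inputs of shape (4, 16, 16) into (8, 1024)
x = rng.standard_normal((8, 4, 16, 16)).reshape(8, -1)
out = (x @ w1 + b1) @ w2 @ w3 + b3
print(out.shape)  # (8, 10)
```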
For simple model training you can call MindSpore's native interfaces, such as mindspore.Model.train, or the single-step training style built from ms.nn.WithLossCell and ms.nn.TrainOneStepCell. Usage is as follows:
Method 1: functional iterative training
from mindspore import ops
from ms_adapter.pytorch import nn
import ms_adapter.pytorch as torch

model = LeNet()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), learning_rate=0.1, momentum=0.9, weight_decay=1e-4)

# Define the forward pass
def forward_fn(data, label):
    logits = model(data)
    loss = criterion(logits, label)
    return loss, logits

# Build the gradient function
grad_fn = ops.value_and_grad(forward_fn, None, optimizer.parameters, has_aux=True)

# One training step
def train_step(data, label):
    (loss, _), grads = grad_fn(data, label)
    loss = ops.depend(loss, optimizer(grads))
    return loss

# Iterate over the data
for i, (input, target) in enumerate(train_data):
    loss = train_step(input, target)
Method 2: training with MindSpore's Model.train
import mindspore as ms
from mindspore.dataset import GeneratorDataset
from mindspore.train.callback import LossMonitor, TimeMonitor
from ms_adapter.pytorch import nn
import ms_adapter.pytorch as torch

model = LeNet()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), learning_rate=0.1, momentum=0.9, weight_decay=1e-4)
model = ms.Model(model, criterion, optimizer, metrics={'accuracy'})
dataset = GeneratorDataset(source=train_data, column_names=["data", "label"])
model.train(epochs, dataset, callbacks=[TimeMonitor(), LossMonitor()])  # epochs is set by the user
Method 3: iterative training with WithLossCell and TrainOneStepCell
import mindspore as ms
from ms_adapter.pytorch import nn
import ms_adapter.pytorch as torch

model = LeNet()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), learning_rate=0.1, momentum=0.9, weight_decay=1e-4)
loss_net = ms.nn.WithLossCell(model, criterion)
train_net = ms.nn.TrainOneStepCell(loss_net, optimizer)
for i in range(epochs):
    for X, y in train_data:
        loss = train_net(X, y)
Q: After setting context.set_context(mode=context.GRAPH_MODE), an error like the following appears: Tensor.add_ is an in-place operation and "x.add_()" is not encouraged to use in MindSpore static graph mode. Please use "x = x.add()" or other API instead.
A: In-place interfaces are currently unsupported in GRAPH mode; modify the code as the error message suggests. Note that even in PYNATIVE mode, in-place interfaces are discouraged: in MSAdapter they currently bring no memory savings, and they introduce uncertainty into backward gradient computation.
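The difference matters because other references to the same tensor observe an in-place mutation. This is illustrated here with NumPy rather than MSAdapter, since the aliasing semantics are analogous to x.add_() versus x = x.add():

```python
import numpy as np

x = np.ones(3)
alias = x
np.add(x, 1.0, out=x)   # in-place, like x.add_(1.0): alias sees the change
print(alias)            # [2. 2. 2.]

x = np.ones(3)
alias = x
x = x + 1.0             # out-of-place, like x = x.add(1.0): alias is untouched
print(alias)            # [1. 1. 1.]
```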
Q: Running the code raises an error like: AttributeError: module 'ms_adapter.pytorch' has no attribute 'xxx'.
A: First check whether 'xxx' is an interface supported by torch 1.12. MSAdapter does not support interfaces and parameters that the PyTorch website marks as deprecated or soon to be deprecated; please use an equivalent interface instead. If the corresponding PyTorch version does support it but MSAdapter does not yet, you are welcome to contribute the code to the MSAdapter project, or report the need by creating a task (New issue).