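This tutorial shows how to call a pretrained model in MindCV and run classification prediction on a test image.
Calling the registry.list_models function in mindcv.models lists the names of all available network models. Each parameter configuration of a network is listed as a separate model, for example resnet18 / resnet34 / resnet50 / resnet101 / resnet152.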
import sys
sys.path.append("..")
from mindcv.models import registry
registry.list_models()
['BiTresnet50',
'RepMLPNet_B224',
'RepMLPNet_B256',
'RepMLPNet_D256',
'RepMLPNet_L256',
'RepMLPNet_T224',
'RepMLPNet_T256',
'convit_base',
'convit_base_plus',
'convit_small',
...
'visformer_small',
'visformer_small_v2',
'visformer_tiny',
'visformer_tiny_v2',
'vit_b_16_224',
'vit_b_16_384',
'vit_b_32_224',
'vit_b_32_384',
'vit_l_16_224',
'vit_l_16_384',
'vit_l_32_224',
'xception']
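Because the full listing is long, it can help to narrow it down to one model family. The following is a minimal sketch that filters a list of names with the standard-library fnmatch module; the helper filter_models is hypothetical (not part of MindCV), and the sample names are copied from the output above.

```python
from fnmatch import fnmatchcase


def filter_models(names, pattern):
    """Return the names that match a shell-style wildcard pattern, sorted."""
    return sorted(name for name in names if fnmatchcase(name, pattern))


# A few names copied from the listing above; in practice, pass the
# list returned by registry.list_models() instead.
sample_names = ["convit_base", "convit_small", "vit_b_16_224", "vit_b_32_224", "xception"]
print(filter_models(sample_names, "vit_b_*"))  # ['vit_b_16_224', 'vit_b_32_224']
```

The pattern has to match the whole name, so "vit_b_*" does not pick up convit_base.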
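Taking the resnet50 model as an example, we describe two ways to load a model checkpoint with the create_model function in mindcv.models.
1). When the pretrained argument of the interface is set to True, the network weights are downloaded automatically.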
from mindcv.models import create_model
model = create_model(model_name='resnet50', num_classes=1000, pretrained=True)
# Switch the network to inference mode
model.set_train(False)
102453248B [00:16, 6092186.31B/s]
ResNet<
(conv1): Conv2d<input_channels=3, output_channels=64, kernel_size=(7, 7), stride=(2, 2), pad_mode=pad, padding=3, dilation=(1, 1), group=1, has_bias=False, weight_init=normal, bias_init=zeros, format=NCHW>
(bn1): BatchNorm2d<num_features=64, eps=1e-05, momentum=0.9, gamma=Parameter (name=bn1.gamma, shape=(64,), dtype=Float32, requires_grad=True), beta=Parameter (name=bn1.beta, shape=(64,), dtype=Float32, requires_grad=True), moving_mean=Parameter (name=bn1.moving_mean, shape=(64,), dtype=Float32, requires_grad=False), moving_variance=Parameter (name=bn1.moving_variance, shape=(64,), dtype=Float32, requires_grad=False)>
(relu): ReLU<>
(max_pool): MaxPool2d<kernel_size=3, stride=2, pad_mode=SAME>
...
(pool): GlobalAvgPooling<>
(classifier): Dense<input_channels=2048, output_channels=1000, has_bias=True>
>
2). When the checkpoint_path argument of the interface is set to a file path, the model parameters are loaded from a local file with the .ckpt extension.
from mindcv.models import create_model
model = create_model(model_name='resnet50', num_classes=1000, checkpoint_path='./resnet50_224.ckpt')
# Switch the network to inference mode
model.set_train(False)
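The two loading methods can be selected with a small wrapper that also checks the local file before the model is built. This is only a sketch under stated assumptions: checkpoint_kwargs is a hypothetical helper, not a MindCV function, and it only prepares the keyword arguments that the two calls above pass to create_model.

```python
from pathlib import Path


def checkpoint_kwargs(checkpoint_path=None):
    """Build keyword arguments for create_model (hypothetical helper).

    Without a path, request the automatically downloaded weights.
    With a path, require an existing file that ends in .ckpt.
    """
    if checkpoint_path is None:
        return {"pretrained": True}
    path = Path(checkpoint_path)
    if path.suffix != ".ckpt":
        raise ValueError(f"expected a .ckpt file, got: {path}")
    if not path.is_file():
        raise FileNotFoundError(f"checkpoint not found: {path}")
    return {"checkpoint_path": str(path)}
```

A call such as create_model(model_name='resnet50', num_classes=1000, **checkpoint_kwargs('./resnet50_224.ckpt')) would then fail early with a clear message if the file is missing or has the wrong extension.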
Here we download an image from Wikipedia as the test image and use the create_dataset function in mindcv.data to build a custom dataset for this single image.
from mindcv.data import create_dataset
num_workers = 1
# Path to the dataset directory
data_dir = "./data/"
dataset = create_dataset(root=data_dir, split='test', num_parallel_workers=num_workers)
# Display the image
from PIL import Image
Image.open("./data/test/dog/dog.jpg")
The dataset directory is structured as follows:
data/
└─ test
   ├─ dog
   │  ├─ dog.jpg
   │  └─ ……
   └─ ……
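The layout above can be prepared with a few lines of standard-library code. The sketch below copies one downloaded image into data_dir/test/label/; the helper name place_test_image and its default arguments are illustrative assumptions, chosen to match the paths used in this tutorial.

```python
import shutil
from pathlib import Path


def place_test_image(image_path, data_dir="./data", label="dog"):
    """Copy one image into <data_dir>/test/<label>/ and return the new path."""
    target_dir = Path(data_dir) / "test" / label
    # Create data/test/<label> if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    # Copying into a directory keeps the original file name.
    return Path(shutil.copy(image_path, target_dir))
```

For example, place_test_image("dog.jpg") would produce ./data/test/dog/dog.jpg, the file opened with PIL above.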
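Calling the create_transforms function returns the data processing strategy (transform list) that the pretrained model uses for the ImageNet dataset.
We pass the resulting transform list to the create_loader function, specify batch_size=1 and the other arguments, and the test data is ready: the function returns a Dataset object that serves as the model input.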
from mindcv.data import create_transforms, create_loader
transforms_list = create_transforms(dataset_name='imagenet', is_training=False)
data_loader = create_loader(
    dataset=dataset,
    batch_size=1,
    is_training=False,
    num_classes=1000,
    transform=transforms_list,
    num_parallel_workers=num_workers
)
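To give a feel for what an evaluation-time transform list typically does, here is a NumPy-only sketch of center cropping, per-channel normalization, and HWC-to-CHW conversion. It is not the MindCV implementation: the crop size of 224 and the mean/std constants are the commonly used ImageNet values and are assumptions here, and the resize step is omitted. The actual operations are whatever create_transforms returns.

```python
import numpy as np

# Commonly used ImageNet channel statistics on the 0-255 scale (assumed values).
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32) * 255
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32) * 255


def center_crop(image, size):
    """Cut a size x size window from the middle of an HWC image."""
    h, w = image.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return image[top:top + size, left:left + size]


def preprocess(image_hwc, crop_size=224):
    """Center-crop, normalize per channel, and convert HWC -> CHW."""
    cropped = center_crop(image_hwc, crop_size).astype(np.float32)
    normalized = (cropped - MEAN) / STD
    return normalized.transpose(2, 0, 1)


# A blank 256 x 256 RGB image stands in for a decoded photo.
dummy = np.zeros((256, 256, 3), dtype=np.uint8)
print(preprocess(dummy).shape)  # (3, 224, 224)
```

The sketch expects an image at least crop_size pixels on each side.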
Pass the image from the custom dataset to the model to obtain the inference result. Here the Squeeze operator from mindspore.ops removes the batch dimension.
import ast
import mindspore.ops as P
import numpy as np
images, _ = next(data_loader.create_tuple_iterator())
# Remove the batch dimension from the model output
output = P.Squeeze()(model(images))
pred = int(np.argmax(output.asnumpy()))
with open("imagenet1000_clsidx_to_labels.txt") as f:
    # Parse the dict literal safely instead of calling eval()
    idx2label = ast.literal_eval(f.read())
print('predict: {}'.format(idx2label[pred]))
predict: Labrador retriever
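A single argmax hides how confident the model is. As a sketch, the helper below turns the output vector into probabilities with a softmax and returns the k most likely classes. It assumes the model output holds unnormalized scores (the printed structure above ends in a Dense classifier with no softmax layer); top_k_predictions is a hypothetical helper, not part of MindCV.

```python
import numpy as np


def top_k_predictions(logits, k=5):
    """Return (class_index, probability) pairs for the k highest-scoring classes."""
    logits = np.asarray(logits, dtype=np.float64)
    # Subtract the maximum before exponentiating for numerical stability.
    shifted = logits - logits.max()
    probs = np.exp(shifted) / np.exp(shifted).sum()
    # Sort ascending, reverse to descending, keep the first k indices.
    top = np.argsort(probs)[::-1][:k]
    return [(int(i), float(probs[i])) for i in top]
```

With the variables from the code above, top_k_predictions(output.asnumpy(), k=5) would give index and probability pairs that can be mapped to names through idx2label.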