ResMLP: Feedforward networks for image classification with data-efficient training

This repository contains PyTorch evaluation code, training code and pretrained models for the following projects:

DeiT (Data-Efficient Image Transformers), ICML 2021
CaiT (Going deeper with Image Transformers), ICCV 2021 (Oral)
ResMLP (ResMLP: Feedforward networks for image classification with data-efficient training)
PatchConvnet (Augmenting Convolutional networks with attention-based aggregation)
3Things (Three things everyone should know about Vision Transformers)
DeiT III (DeiT III: Revenge of the ViT)

ResMLP obtain good performance given its simplicity:

For details see ResMLP: Feedforward networks for image classification with data-efficient training by Hugo Touvron, Piotr Bojanowski, Mathilde Caron, Matthieu Cord, Alaaeldin El-Nouby, Edouard Grave, Gautier Izacard, Armand Joulin, Gabriel Synnaeve, Jakob Verbeek and Hervé Jégou.

If you use this code for a paper please cite:

@article{touvron2021resmlp,
  title={ResMLP: Feedforward networks for image classification with data-efficient training},
  author={Hugo Touvron and Piotr Bojanowski and Mathilde Caron and Matthieu Cord and Alaaeldin El-Nouby and Edouard Grave and Gautier Izacard and Armand Joulin and Gabriel Synnaeve and Jakob Verbeek and Herv'e J'egou},
  journal={arXiv preprint arXiv:2105.03404},
  year={2021},
}

Model Zoo

We provide baseline ResMLP models pretrained on ImageNet1k 2012, using the distilled version of our method:

name	acc@1	res	FLOPs	#params	url
ResMLP-S12	77.8	224	3B	15M	model
ResMLP-S24	80.8	224	6B	30M	model
ResMLP-S36	81.1	224	23B	116M	model
ResMLP-B24	83.6	224	100B	129M	model

Model pretrained on ImageNet-22k with finetuning on ImageNet1k 2012:

name	acc@1	res	FLOPs	#params	url
ResMLP-B24	84.4	224	100B	129M	model

Models pretrained with DINO without finetuning:

name	acc@1 (knn)	res	FLOPs	#params	url
ResMLP-S12	62.6	224	3B	15M	model
ResMLP-S24	69.4	224	6B	30M	model

The models are also available via torch hub.
Before using it, make sure you have the pytorch-image-models package timm==0.3.2 by Ross Wightman installed.

Evaluation transforms

ResMLP employs a slightly different pre-processing, in particular a crop-ratio of 0.9 at test time. To reproduce the results of our paper please use the following pre-processing:

def get_test_transforms(input_size):
    mean, std = [0.485, 0.456, 0.406],[0.229, 0.224, 0.225]    
    transformations = {}
    Rs_size=int(input_size/0.9)
    transformations= transforms.Compose(
        [transforms.Resize(Rs_size, interpolation=3),
         transforms.CenterCrop(input_size),
         transforms.ToTensor(),
         transforms.Normalize(mean, std)])
    return transformations

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.

Contributing

We actively welcome your pull requests! Please see CONTRIBUTING.md and CODE_OF_CONDUCT.md for more info.

3.9 KiB Raw Permalink Blame History

ResMLP: Feedforward networks for image classification with data-efficient training

Model Zoo

Evaluation transforms

License

Contributing

3.9 KiB

Raw Permalink Blame History