1. Decompose the model into 3 (or 2) modules: backbone, (neck,) head. The neck is usually not involved in recognition tasks.
2. For each module:
   a. If it is already implemented in MindOCR, skip it; you can get the module through the build_{module} function.
   b. If not, implement it following the module format guideline.
3. Define your model in two ways:
   a. Write a model py file that includes the model class and specification functions, following the model format guideline. This lets users invoke a pre-defined model easily, e.g. model = build_model('dbnet_r50', pretrained=True).
   b. Configure the architecture in a yaml file, following the yaml format guideline. This lets users quickly modify a base architecture in the yaml file.
4. To verify the correctness of the written model, run test_model.py:
   python tests/ut/test_model.py --config /path/to/yaml_config_file

Backbone
File naming format: models/backbones/{task}_{backbone}.py, e.g. det_resnet.py (the task prefix is necessary because the same backbone may differ between det and rec).
Class naming: e.g. class DetResNet
Class __init__ args: no restriction; define them as the model needs.
Class attributes: out_channels (List), describing the channels of each output feature, e.g. self.out_channels = [256, 512, 1024, 2048]
Class construct args: x (Tensor)
Class construct return: features (List[Tensor]), the features extracted from different layers of the backbone, with dim order [bs, channels, …]. Expected shape of each feature: [bs, channels, H, W].

Neck
File naming format: models/necks/{neck_name}.py, e.g. fpn.py
Class naming: e.g. class FPN
Class __init__ args: MUST contain in_channels as the first parameter, e.g. __init__(self, in_channels, out_channels=256, **kwargs).
Class attributes: out_channels, describing the channels of the output feature, e.g. self.out_channels = 256
Class construct args: features (List[Tensor])
Class construct return: feature (Tensor), the output feature, with dim order [bs, channels, …]

Head
File naming format: models/heads/{head_name}.py, e.g. dbhead.py
Class naming: e.g. class DBHead
Class __init__ args: MUST contain in_channels as the first parameter, e.g. __init__(self, in_channels, out_channels=2, **kwargs).
Class construct args: feature (Tensor)
Class construct return: prediction (Union[Tensor, dict]). If it is a dict, the key names should match the keys used in the loss function, e.g. {'maps': out, 'score': score}.

Note: if the model architecture has no neck, as in CRNN, you can skip writing one. BaseModel will select the last of the features (List[Tensor]) output by the backbone and forward it to the head module.
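
The module contracts above can be sketched without any framework. The following is a minimal illustration, not MindOCR's BaseModel: tensors are replaced by plain shape tuples, and ToyBackbone, ToyNeck, ToyHead and compose are hypothetical names. It also assumes that, when there is no neck, the head's in_channels is taken from the last entry of the backbone's out_channels, which the text implies but does not state.

```python
from typing import List, Tuple

Shape = Tuple[int, ...]  # stand-in for a Tensor: only the shape is tracked


class ToyBackbone:
    """Backbone contract: out_channels is a list, construct returns a list."""

    def __init__(self):
        self.out_channels = [256, 512, 1024, 2048]

    def construct(self, x: Shape) -> List[Shape]:
        bs, _, h, w = x
        # One feature per stage; stage i is downsampled by a factor of 2 ** (i + 2).
        return [(bs, c, h // 2 ** (i + 2), w // 2 ** (i + 2))
                for i, c in enumerate(self.out_channels)]


class ToyNeck:
    """Neck contract: in_channels comes first, out_channels is a single int."""

    def __init__(self, in_channels: List[int], out_channels: int = 256, **kwargs):
        self.in_channels = in_channels
        self.out_channels = out_channels

    def construct(self, features: List[Shape]) -> Shape:
        bs, _, h, w = features[0]  # pretend to fuse at the first feature's resolution
        return (bs, self.out_channels, h, w)


class ToyHead:
    """Head contract: in_channels comes first, construct takes one feature."""

    def __init__(self, in_channels: int, out_channels: int = 2, **kwargs):
        self.in_channels = in_channels
        self.out_channels = out_channels

    def construct(self, feature: Shape) -> Shape:
        bs, c, h, w = feature
        assert c == self.in_channels, "head in_channels must match the incoming feature"
        return (bs, self.out_channels, h, w)


def compose(backbone, head_cls, neck_cls=None, **head_kwargs):
    """Wire the modules together; without a neck, use the backbone's last feature."""
    if neck_cls is not None:
        neck = neck_cls(backbone.out_channels)
        head = head_cls(neck.out_channels, **head_kwargs)
    else:
        neck = None
        head = head_cls(backbone.out_channels[-1], **head_kwargs)

    def forward(x: Shape) -> Shape:
        features = backbone.construct(x)
        feature = neck.construct(features) if neck is not None else features[-1]
        return head.construct(feature)

    return forward


det_forward = compose(ToyBackbone(), ToyHead, neck_cls=ToyNeck)
rec_forward = compose(ToyBackbone(), ToyHead, out_channels=30)
print(det_forward((1, 3, 640, 640)))  # (1, 2, 160, 160)
print(rec_forward((1, 3, 32, 128)))   # (1, 30, 1, 4)
```

Passing in_channels as the first constructor argument is what lets the composing code chain the modules purely from each module's out_channels attribute.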

Model py file
File naming format: models/{task}_{model_class_name}.py, e.g. det_dbnet.py
Class naming: e.g. class DBNet
Class must inherit from BaseModel, e.g. class DBNet(BaseModel)
Specification function naming: {model_class_name}_{specification}, e.g. def dbnet_r50() (no task prefix is needed, assuming no single model solves two tasks).
Specification function args: e.g. def dbnet_r50(pretrained=False, **kwargs).

Note: once you finish writing the model specification function, you can use it in the yaml file for training or inference as follows:
# in a yaml file
model:
  name: dbnet_r50  # model specification function name
  pretrained: False

or use it via the build_model function:

# in a python script
model = build_model('dbnet_r50', pretrained=False)
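
One way to picture how a specification function name resolves to a model is a name-to-function registry. The sketch below is an assumption about the general pattern, not the contents of _registry.py or builder.py: register_model, toy_build_model and the placeholder DBNet class are hypothetical, and the real class would inherit from BaseModel.

```python
_registry = {}  # hypothetical table: specification function name -> function


def register_model(fn):
    """Record a specification function under its own name."""
    _registry[fn.__name__] = fn
    return fn


class DBNet:
    """Placeholder for the real model class, which would inherit from BaseModel."""

    def __init__(self, backbone_name, pretrained=False):
        self.backbone_name = backbone_name
        self.pretrained = pretrained


@register_model
def dbnet_r50(pretrained=False, **kwargs):
    # A specification function pins the architecture choices for one variant.
    return DBNet(backbone_name="det_resnet50", pretrained=pretrained, **kwargs)


def toy_build_model(name, **kwargs):
    """Look up a specification function by name and call it."""
    if name not in _registry:
        raise ValueError(f"unknown model '{name}', available: {sorted(_registry)}")
    return _registry[name](**kwargs)


# The yaml `model:` section maps onto the same call: `name` selects the
# function and the remaining keys become keyword arguments.
model_cfg = {"name": "dbnet_r50", "pretrained": False}
kwargs = dict(model_cfg)
name = kwargs.pop("name")
model = toy_build_model(name, **kwargs)
print(type(model).__name__, model.backbone_name, model.pretrained)
```

Keeping the lookup keyed by function name is why the specification function name is the only identifier the yaml file needs.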

To define/configure the model architecture in a yaml file, follow the keys in the examples below.

Detection model:
model:                      # R
  type: det
  backbone:                 # R
    name: det_resnet50      # R, backbone specification function name
    pretrained: False
  neck:                     # R
    name: FPN               # R, neck class name
    out_channels: 256       # D, neck class __init__ arg
    #use_asf: True
  head:                     # R
    name: ConvHead          # R, head class name
    out_channels: 2         # D, head class __init__ arg
    k: 50

Recognition model:
model:                      # R
  type: rec
  backbone:                 # R
    name: resnet50          # R
    pretrained: False
  head:                     # R
    name: ConvHead          # R
    out_channels: 30        # D

(R: required. D: depends on the model.)
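
As a rough illustration of the R markers, the check below walks an already-parsed config dict and reports missing required keys. It is a sketch, not part of MindOCR: check_model_config is a hypothetical helper, and it treats the neck section as optional because the recognition example omits it.

```python
def check_model_config(cfg: dict) -> list:
    """Return a list of problems; an empty list means the required keys exist."""
    model = cfg.get("model")
    if not isinstance(model, dict):
        return ["missing required section: model"]

    problems = []
    # backbone and head are marked R in both examples, and so is their name key.
    for section in ("backbone", "head"):
        sub = model.get(section)
        if not isinstance(sub, dict):
            problems.append(f"missing required section: model.{section}")
        elif "name" not in sub:
            problems.append(f"missing required key: model.{section}.name")

    # Assumption: neck may be absent, but when present it must name a class.
    neck = model.get("neck")
    if neck is not None and (not isinstance(neck, dict) or "name" not in neck):
        problems.append("missing required key: model.neck.name")
    return problems


rec_cfg = {
    "model": {
        "type": "rec",
        "backbone": {"name": "resnet50", "pretrained": False},
        "head": {"name": "ConvHead", "out_channels": 30},
    }
}
bad_cfg = {
    "model": {
        "type": "det",
        "backbone": {"pretrained": False},
        "neck": {"out_channels": 256},
    }
}
print(check_model_config(rec_cfg))  # []
print(check_model_config(bad_cfg))
```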
This is forked from https://github.com/mindspore-lab/mindocr