MindOCR supports inference with third-party models (PaddleOCR, MMOCR, etc.), and this document lists the adapted models. The performance tests were run on Ascend 310P; some models do not yet have evaluation data. The tables below cover text detection, text recognition, and text direction classification models, in that order.
name | model | backbone | dataset | F-score(%) | FPS | source | config | download | reference |
---|---|---|---|---|---|---|---|---|---|
ch_pp_det_OCRv4 | DBNet | MobileNetV3 | / | / | / | PaddleOCR | yaml | infer model | ch_PP-OCRv4_det |
ch_pp_server_det_v2.0 | DBNet | ResNet18_vd | MLT17 | 46.22 | 21.65 | PaddleOCR | yaml | infer model | ch_ppocr_server_v2.0_det |
ch_pp_det_OCRv3 | DBNet | MobileNetV3 | MLT17 | 33.89 | 22.40 | PaddleOCR | yaml | infer model | ch_PP-OCRv3_det |
ch_pp_det_OCRv2 | DBNet | MobileNetV3 | MLT17 | 42.99 | 21.90 | PaddleOCR | yaml | infer model | ch_PP-OCRv2_det |
ch_pp_mobile_det_v2.0_slim | DBNet | MobileNetV3 | MLT17 | 31.66 | 19.88 | PaddleOCR | yaml | infer model | ch_ppocr_mobile_slim_v2.0_det |
ch_pp_mobile_det_v2.0 | DBNet | MobileNetV3 | MLT17 | 31.56 | 21.96 | PaddleOCR | yaml | infer model | ch_ppocr_mobile_v2.0_det |
en_pp_det_OCRv3 | DBNet | MobileNetV3 | IC15 | 42.14 | 55.55 | PaddleOCR | yaml | infer model | en_PP-OCRv3_det |
ml_pp_det_OCRv3 | DBNet | MobileNetV3 | MLT17 | 66.01 | 22.48 | PaddleOCR | yaml | infer model | ml_PP-OCRv3_det |
en_pp_det_dbnet_resnet50vd | DBNet | ResNet50_vd | IC15 | 79.89 | 21.17 | PaddleOCR | yaml | infer model | DBNet |
en_pp_det_psenet_resnet50vd | PSE | ResNet50_vd | IC15 | 80.44 | 7.75 | PaddleOCR | yaml | train model | PSE |
en_pp_det_east_resnet50vd | EAST | ResNet50_vd | IC15 | 85.58 | 20.70 | PaddleOCR | yaml | train model | EAST |
en_pp_det_sast_resnet50vd | SAST | ResNet50_vd | IC15 | 81.77 | 22.14 | PaddleOCR | yaml | train model | SAST |
en_mm_det_dbnetpp_resnet50 | DBNet++ | ResNet50 | IC15 | 81.36 | 10.66 | MMOCR | yaml | train model | DBNetpp |
en_mm_det_fcenet_resnet50 | FCENet | ResNet50 | IC15 | 83.67 | 3.34 | MMOCR | yaml | train model | FCENet |
Notice: When using the en_pp_det_psenet_resnet50vd model for inference, you need to modify the ONNX file with the following command:
```shell
python deploy/models_utils/onnx_optim/insert_pse_postprocess.py \
    --model_path=./pse_r50vd.onnx \
    --binary_thresh=0.0 \
    --scale=1.0
```
The supported text recognition models are listed below:

name | model | backbone | dataset | Acc(%) | FPS | source | dict file | config | download | reference |
---|---|---|---|---|---|---|---|---|---|---|
ch_pp_rec_OCRv4 | CRNN | MobileNetV1Enhance | / | / | / | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_PP-OCRv4_rec |
ch_pp_server_rec_v2.0 | CRNN | ResNet34 | MLT17 (ch) | 49.91 | 154.16 | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_ppocr_server_v2.0_rec |
ch_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | MLT17 (ch) | 49.91 | 408.38 | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_PP-OCRv3_rec |
ch_pp_rec_OCRv2 | CRNN | MobileNetV1Enhance | MLT17 (ch) | 44.59 | 203.34 | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_PP-OCRv2_rec |
ch_pp_mobile_rec_v2.0 | CRNN | MobileNetV3 | MLT17 (ch) | 24.59 | 167.67 | PaddleOCR | ppocr_keys_v1.txt | yaml | infer model | ch_ppocr_mobile_v2.0_rec |
en_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | MLT17 (en) | 79.79 | 917.01 | PaddleOCR | en_dict.txt | yaml | infer model | en_PP-OCRv3_rec |
en_pp_mobile_rec_number_v2.0_slim | CRNN | MobileNetV3 | / | / | / | PaddleOCR | en_dict.txt | yaml | infer model | en_number_mobile_slim_v2.0_rec |
en_pp_mobile_rec_number_v2.0 | CRNN | MobileNetV3 | / | / | / | PaddleOCR | en_dict.txt | yaml | infer model | en_number_mobile_v2.0_rec |
korean_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | korean_dict.txt | yaml | infer model | korean_PP-OCRv3_rec |
japan_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | japan_dict.txt | yaml | infer model | japan_PP-OCRv3_rec |
chinese_cht_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | chinese_cht_dict.txt | yaml | infer model | chinese_cht_PP-OCRv3_rec |
te_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | te_dict.txt | yaml | infer model | te_PP-OCRv3_rec |
ka_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | ka_dict.txt | yaml | infer model | ka_PP-OCRv3_rec |
ta_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | ta_dict.txt | yaml | infer model | ta_PP-OCRv3_rec |
latin_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | latin_dict.txt | yaml | infer model | latin_PP-OCRv3_rec |
arabic_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | arabic_dict.txt | yaml | infer model | arabic_PP-OCRv3_rec |
cyrillic_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | cyrillic_dict.txt | yaml | infer model | cyrillic_PP-OCRv3_rec |
devanagari_pp_rec_OCRv3 | SVTR | MobileNetV1Enhance | / | / | / | PaddleOCR | devanagari_dict.txt | yaml | infer model | devanagari_PP-OCRv3_rec |
en_pp_rec_crnn_resnet34vd | CRNN | ResNet34_vd | IC15 | 66.35 | 420.80 | PaddleOCR | ic15_dict.txt | yaml | infer model | CRNN |
en_pp_rec_rosetta_resnet34vd | Rosetta | ResNet34_vd | IC15 | 64.28 | 552.40 | PaddleOCR | ic15_dict.txt | yaml | infer model | Rosetta |
en_pp_rec_vitstr_vitstr | ViTSTR | ViTSTR | IC15 | 68.42 | 364.67 | PaddleOCR | EN_symbol_dict.txt | yaml | train model | ViTSTR |
en_mm_rec_nrtr_resnet31 | NRTR | ResNet31 | IC15 | 67.26 | 32.63 | MMOCR | english_digits_symbols.txt | yaml | train model | NRTR |
en_mm_rec_satrn_shallowcnn | SATRN | ShallowCNN | IC15 | 73.52 | 32.14 | MMOCR | english_digits_symbols.txt | yaml | train model | SATRN |
The supported text direction classification models are listed below:

name | model | dataset | Acc(%) | FPS | source | config | download | reference |
---|---|---|---|---|---|---|---|---|
ch_pp_mobile_cls_v2.0 | MobileNetV3 | / | / | / | PaddleOCR | yaml | infer model | ch_ppocr_mobile_v2.0_cls |
```mermaid
graph LR;
    A[ThirdParty models] -- xx2onnx --> B[ONNX] -- converter_lite --> C[MindIR];
    C -- input --> D[infer.py] -- outputs --> E[eval_rec.py/eval_det.py];
    H[images] -- input --> D[infer.py];
```
Let's take `ch_pp_det_OCRv4` from the Third-Party Model Support List as an example to introduce the inference method:
In the Third-Party Model Support List, `infer model` indicates a model file ready for inference, while `train model` indicates a training checkpoint that must first be converted to an inference model.

For an `infer model`, like `ch_pp_det_OCRv4`, download and extract it to get the following folder:
```text
ch_PP-OCRv4_det_infer/
├── inference.pdmodel
├── inference.pdiparams
├── inference.pdiparams.info
```
For a `train model`, like `en_pp_det_psenet_resnet50vd`, download and extract it to get the following folder:
```text
det_r50_vd_pse_v2.0_train/
├── train.log
├── best_accuracy.pdopt
├── best_accuracy.states
├── best_accuracy.pdparams
```
It then needs to be converted to an inference model with the following commands:
```shell
git clone https://github.com/PaddlePaddle/PaddleOCR.git
cd PaddleOCR
python tools/export_model.py \
    -c configs/det/det_r50_vd_pse.yml \
    -o Global.pretrained_model=./det_r50_vd_pse_v2.0_train/best_accuracy \
    Global.save_inference_dir=./det_db
```
and you will get the following folder:
```text
det_db/
├── inference.pdmodel
├── inference.pdiparams
├── inference.pdiparams.info
```
Download and install the paddle2onnx tool:

```shell
pip install paddle2onnx
```

and convert the inference model into an ONNX file:
```shell
paddle2onnx \
    --model_dir det_db \
    --model_filename inference.pdmodel \
    --params_filename inference.pdiparams \
    --save_file det_db.onnx \
    --opset_version 11 \
    --input_shape_dict="{'x':[-1,3,-1,-1]}" \
    --enable_onnx_checker True
```
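Before converting further, it may be worth a quick sanity check that the exported ONNX file loads and runs. A minimal sketch, assuming the `onnxruntime` package is installed and that the input name `x` follows from the `--input_shape_dict` above:

```python
# Sanity-check the exported det_db.onnx: load it and run one dummy image.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("det_db.onnx", providers=["CPUExecutionProvider"])
print([(i.name, i.shape) for i in sess.get_inputs()])  # dynamic dims appear as symbolic names

dummy = np.random.rand(1, 3, 736, 1280).astype(np.float32)
outputs = sess.run(None, {"x": dummy})
print([o.shape for o in outputs])
```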
A brief explanation of parameters for paddle2onnx is as follows:
Parameter | Description |
---|---|
--model_dir | Configures the directory path containing the Paddle model. |
--model_filename | [Optional] Configures the file name storing the network structure, located under `--model_dir`. |
--params_filename | [Optional] Configures the file name storing the model parameters, located under `--model_dir`. |
--save_file | Specifies the directory path for saving the converted model. |
--opset_version | [Optional] Configures the OpSet version for converting to ONNX. Multiple versions, such as 7~16, are currently supported, and the default is 9. |
--input_shape_dict | Specifies the shape of the input tensor for generating a dynamic ONNX model. The format is "{'x': [N, C, H, W]}", where -1 represents dynamic shape. |
--enable_onnx_checker | [Optional] Configures whether to check the correctness of the exported ONNX model. It is recommended to enable this switch, and the default is False. |
The value of `--input_shape_dict` can be determined by opening the inference model in the Netron tool.

Learn more about paddle2onnx
The `det_db.onnx` file will be generated after the above command is executed.
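To confirm that the dynamic axes survived the export, the input shapes of the generated file can also be inspected programmatically; a minimal sketch, assuming the `onnx` Python package is installed:

```python
# Print the input tensor names and shapes of det_db.onnx
# (an alternative to opening the model in Netron).
import onnx

model = onnx.load("det_db.onnx")
for inp in model.graph.input:
    # dim_param is a symbolic name for dynamic axes; dim_value is an int otherwise
    dims = [d.dim_param or d.dim_value for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)
```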
Use the `converter_lite` tool on Ascend 310/310P to convert the ONNX file to MindIR. Create a `config.txt` that specifies the model input shape, e.g. `[1,3,736,1280]`. For a static input shape, the config is as follows:
```text
[ascend_context]
input_format=NCHW
input_shape=x:[1,3,736,1280]
```
To accept several predefined image sizes instead, enumerate them with `dynamic_dims`:

```text
[ascend_context]
input_format=NCHW
input_shape=x:[1,3,-1,-1]
dynamic_dims=[736,1280],[768,1280],[896,1280],[1024,1280]
```
Or allow a fully dynamic shape range:

```text
[acl_build_options]
input_format=NCHW
input_shape_range=x:[-1,3,-1,-1]
```
A brief explanation of the configuration file parameters is as follows:
Parameter | Attribute | Function Description | Data Type | Value Description |
---|---|---|---|---|
input_format | Optional | Specify the format of the model input | String | Optional values are "NCHW", "NHWC", "ND" |
input_shape | Optional | Specify the shape of the model input. The input_name must be the input name in the original model, arranged in order of input, separated by ";" | String | For example: "input1:[1,64,64,3];input2:[1,256,256,3]" |
dynamic_dims | Optional | Specify dynamic BatchSize and dynamic resolution parameters | String | For example: "dynamic_dims=[48,520],[48,320],[48,384]" |
Learn more about Configuration File Parameters
Run the following command:
```shell
converter_lite \
    --saveType=MINDIR \
    --fmk=ONNX \
    --optimize=ascend_oriented \
    --modelFile=det_db.onnx \
    --outputFile=det_db_lite \
    --configFile=config.txt
```
After the above command is executed, the `det_db_lite.mindir` file will be generated. A brief explanation of the `converter_lite` parameters is as follows:
Parameter | Required | Parameter Description | Value Range | Default | Remarks |
---|---|---|---|---|---|
fmk | Yes | Input model format | MINDIR, CAFFE, TFLITE, TF, ONNX | - | - |
saveType | No | Set the exported model to MINDIR or MS model format. | MINDIR, MINDIR_LITE | MINDIR | The cloud-side inference version can only infer models converted to MINDIR format |
modelFile | Yes | Input model path | - | - | - |
outputFile | Yes | Output model path. Do not add a suffix, ".mindir" suffix will be generated automatically. | - | - | - |
configFile | No | 1) Path to the quantization configuration file after training; 2) Path to the configuration file for extended functions | - | - | - |
optimize | No | Set the model optimization type for the target device. | none, general, gpu_oriented, ascend_oriented | none | - |
Learn more about converter_lite
Learn more about Model Conversion Tutorial
Perform inference using the `deploy/py_infer/infer.py` script and the `det_db_lite.mindir` model file:
```shell
python deploy/py_infer/infer.py \
    --input_images_dir=/path/to/ic15/ch4_test_images \
    --det_model_path=/path/to/mindir/det_db_lite.mindir \
    --det_model_name_or_config=ch_pp_det_OCRv4 \
    --res_save_dir=/path/to/ch_pp_det_OCRv4_results
```
After execution completes, the prediction file `det_results.txt` will be generated in the directory specified by `--res_save_dir`.
During inference, you can use the `--vis_det_save_dir` parameter to save visualized results.
Learn more about infer.py inference parameters
Evaluate the results using the following command:
```shell
python deploy/eval_utils/eval_det.py \
    --gt_path=/path/to/ic15/test_det_gt.txt \
    --pred_path=/path/to/ch_pp_det_OCRv4_results/det_results.txt
```
Let's take `ch_pp_rec_OCRv4` from the Third-Party Model Support List as an example to introduce the inference method:
In the Third-Party Model Support List, `infer model` indicates a model file ready for inference, while `train model` indicates a training checkpoint that must first be converted to an inference model.
If the model file is an `infer model`, like `ch_pp_rec_OCRv4`, download and extract it to get the following folder:
```text
ch_PP-OCRv4_rec_infer/
├── inference.pdmodel
├── inference.pdiparams
├── inference.pdiparams.info
```
If the model file is a `train model`, like `en_pp_rec_vitstr_vitstr`, download and extract it to get the following folder:
```text
rec_vitstr_none_ce_train/
├── train.log
├── best_accuracy.pdopt
├── best_accuracy.states
├── best_accuracy.pdparams
```
It then needs to be converted to an inference model with the following commands:
```shell
git clone https://github.com/PaddlePaddle/PaddleOCR.git
cd PaddleOCR
python tools/export_model.py \
    -c configs/rec/rec_vitstr_none_ce.yml \
    -o Global.pretrained_model=./rec_vitstr_none_ce_train/best_accuracy \
    Global.save_inference_dir=./rec_vitstr
```
and you will get the following folder:
```text
rec_vitstr/
├── inference.pdmodel
├── inference.pdiparams
├── inference.pdiparams.info
```
Download and install the paddle2onnx tool:

```shell
pip install paddle2onnx
```

and convert the inference model into an ONNX file:
```shell
paddle2onnx \
    --model_dir ch_PP-OCRv4_rec_infer \
    --model_filename inference.pdmodel \
    --params_filename inference.pdiparams \
    --save_file rec_crnn.onnx \
    --opset_version 11 \
    --input_shape_dict="{'x':[-1,3,48,-1]}" \
    --enable_onnx_checker True
```
The `rec_crnn.onnx` file will be generated after the above command is executed. Please refer to 3.1.2 Convert the thirdparty model to onnx file for details about `paddle2onnx`.
Use the `converter_lite` tool on Ascend 310/310P to convert the ONNX file to MindIR. Create a `config.txt` that specifies the model input shape, e.g. `[1,3,48,320]`. For a static input shape, the config is as follows:
```text
[ascend_context]
input_format=NCHW
input_shape=x:[1,3,48,320]
```
To accept several predefined [height,width] sizes instead, enumerate them with `dynamic_dims`:

```text
[ascend_context]
input_format=NCHW
input_shape=x:[1,3,-1,-1]
dynamic_dims=[48,520],[48,320],[48,384],[48,360],[48,394],[48,321],[48,336],[48,368],[48,328],[48,685],[48,347]
```
Or allow a fully dynamic shape range:

```text
[acl_build_options]
input_format=NCHW
input_shape_range=x:[-1,3,-1,-1]
```
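The `dynamic_dims` entries above enumerate [height, width] buckets at a fixed height of 48; at runtime, each text crop must be resized and padded to one of them. MindOCR's `infer.py` handles this internally; the sketch below is only an illustration of the idea, not the actual MindOCR code:

```python
# Illustrative only: map a text crop of arbitrary size onto one of the
# width buckets enumerated in dynamic_dims (all at height 48).
import math

WIDTH_BUCKETS = sorted([520, 320, 384, 360, 394, 321, 336, 368, 328, 685, 347])

def pick_shape(img_h: int, img_w: int) -> tuple[int, int]:
    """Resize to height 48 keeping aspect ratio, then pick the smallest
    bucket width that fits; wider crops fall back to the largest bucket."""
    resized_w = math.ceil(img_w * 48 / img_h)
    for w in WIDTH_BUCKETS:
        if w >= resized_w:
            return 48, w  # pad the resized crop on the right up to width w
    return 48, WIDTH_BUCKETS[-1]

print(pick_shape(32, 200))  # -> (48, 320): 200 * 48 / 32 = 300, padded to 320
```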
For a brief description of the configuration parameters, please refer to 3.1.3 Convert onnx file to Lite MindIR file.
Run the following command:
```shell
converter_lite \
    --saveType=MINDIR \
    --fmk=ONNX \
    --optimize=ascend_oriented \
    --modelFile=rec_crnn.onnx \
    --outputFile=rec_crnn_lite \
    --configFile=config.txt
```
After the above command is executed, the `rec_crnn_lite.mindir` file will be generated.
For a brief description of the `converter_lite` parameters, see the text detection example above.
Learn more about converter_lite
Learn more about Model Conversion Tutorial
According to the Third-Party Model Support List, download the ppocr_keys_v1.txt dictionary that matches `ch_pp_rec_OCRv4`.
Perform inference using the `deploy/py_infer/infer.py` script and the `rec_crnn_lite.mindir` model file:
```shell
python deploy/py_infer/infer.py \
    --input_images_dir=/path/to/mlt17_ch \
    --rec_model_path=/path/to/mindir/rec_crnn_lite.mindir \
    --rec_model_name_or_config=ch_pp_rec_OCRv4 \
    --character_dict_path=/path/to/ppocr_keys_v1.txt \
    --res_save_dir=/path/to/ch_rec_infer_results
```
After execution completes, the prediction file `rec_results.txt` will be generated in the directory specified by `--res_save_dir`.
Learn more about infer.py inference parameters
Evaluate the results using the following command:
```shell
python deploy/eval_utils/eval_rec.py \
    --gt_path=/path/to/mlt17_ch/chinese_gt.txt \
    --pred_path=/path/to/ch_rec_infer_results/rec_results.txt
```
Refer to Dataset converters for dataset preparation.
Let's take `ch_pp_mobile_cls_v2.0` from the Third-Party Model Support List as an example to introduce the inference method:
In the Third-Party Model Support List, `ch_pp_mobile_cls_v2.0` is an `infer model`, so no conversion to an inference model is needed. Download and extract it to get the following folder:
```text
ch_ppocr_mobile_v2.0_cls_infer/
├── inference.pdmodel
├── inference.pdiparams
├── inference.pdiparams.info
```
Convert the inference model into an ONNX file:
```shell
paddle2onnx \
    --model_dir ch_ppocr_mobile_v2.0_cls_infer \
    --model_filename inference.pdmodel \
    --params_filename inference.pdiparams \
    --save_file cls_mv3.onnx \
    --opset_version 11 \
    --input_shape_dict="{'x':[-1,3,-1,-1]}" \
    --enable_onnx_checker True
```
The `cls_mv3.onnx` file will be generated after the above command is executed. Please refer to 3.1.2 Convert the thirdparty model to onnx file for details about `paddle2onnx`.
Refer to 3.1.3 Convert onnx file to Lite MindIR file and create `config.txt`; here we take the dynamic shape config as an example:
```text
[acl_build_options]
input_format=NCHW
input_shape_range=x:[-1,3,-1,-1]
```
Then run the following command:
```shell
converter_lite \
    --saveType=MINDIR \
    --fmk=ONNX \
    --optimize=ascend_oriented \
    --modelFile=cls_mv3.onnx \
    --outputFile=cls_mv3_lite \
    --configFile=config.txt
```
After the above command is executed, the `cls_mv3_lite.mindir` file will be generated.
Prepare the MindIR files according to the Text Detection, Text Recognition, and Text Direction Classification sections above, and run the following command to perform end-to-end inference:
```shell
python deploy/py_infer/infer.py \
    --input_images_dir=/path/to/ic15/ch4_test_images \
    --det_model_path=/path/to/mindir/det_db_lite.mindir \
    --det_model_name_or_config=ch_pp_det_OCRv4 \
    --cls_model_path=/path/to/mindir/cls_mv3_lite.mindir \
    --cls_model_name_or_config=ch_pp_mobile_cls_v2.0 \
    --rec_model_path=/path/to/mindir/rec_crnn_lite.mindir \
    --rec_model_name_or_config=ch_pp_rec_OCRv4 \
    --character_dict_path=/path/to/ppocr_keys_v1.txt \
    --res_save_dir=/path/to/infer_results
```
For problems encountered during model conversion and inference, please refer to the FAQ for solutions.