MindOCR is an open-source toolbox for OCR development and application based on MindSpore. It helps users to train and apply the best text detection and recognition models, such as DBNet/DBNet++ and CRNN/SVTR, to fulfill image-text understanding needs.
To install the dependencies, please run:

```shell
pip install -r requirements.txt
```
Additionally, please install MindSpore (>=1.9) following the official installation instructions for the best fit for your machine.
For distributed training, please install OpenMPI 4.0.3.
| Environment | Version |
|---|---|
| MindSpore | >=1.9 |
| Python | >=3.7 |
Notes:
- If you use the MX Engine for inference, the Python version should be 3.9.
- If `scikit_image` cannot be imported, you can set the environment variable `$LD_PRELOAD` with the following command. Change `path/to` to your directory:

```shell
export LD_PRELOAD=path/to/scikit_image.libs/libgomp-d22c30c5.so.1.0.0:$LD_PRELOAD
```
Coming soon
The latest version of MindOCR can be installed as follows:
pip install git+https://github.com/mindspore-lab/mindocr.git
Note: MindOCR is currently only tested with MindSpore>=1.9 on Linux with GPU/Ascend devices.
We will take the DBNet model and the ICDAR2015 dataset as an example to illustrate how to configure the training process with a few lines of modification in the yaml file.
Please refer to DBNet readme for detailed instructions.
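As a sketch of the "few lines of modification" workflow, a config can also be tweaked programmatically with PyYAML before launching training. The fragment and the keys shown (`train.batch_size`, `train.dataset.data_dir`) are illustrative assumptions, not the exact DBNet schema; check the actual yaml files under `configs/det/dbnet/` for the real field names.

```python
import yaml  # PyYAML

# Illustrative config fragment; real field names live in
# configs/det/dbnet/*.yaml and may differ.
cfg_text = """
train:
  batch_size: 16
  dataset:
    data_dir: /path/to/icdar2015
"""

cfg = yaml.safe_load(cfg_text)
cfg["train"]["batch_size"] = 8  # e.g. shrink the batch for a single GPU

# Write the tweaked config; pass it to tools/train.py with -c.
with open("db_custom.yaml", "w") as f:
    yaml.safe_dump(cfg, f)

print(cfg["train"]["batch_size"])
```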
We will take the CRNN model and the LMDB dataset as an illustration of how to easily configure and launch the training process.
Detailed instructions can be viewed in CRNN readme.
Note:
The training pipeline is fully extendable. To train other text detection/recognition models on a new dataset, please configure the model architecture (backbone, neck, head) and the data pipeline in the yaml file, then launch the training script with:

```shell
python tools/train.py -c /path/to/yaml_config
```
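For orientation, the model section of such a yaml file typically declares the three components. The fragment below follows the general convention of the DBNet configs, but the component names are illustrative assumptions; consult the shipped configs for the exact values:

```yaml
model:
  type: det
  backbone:
    name: det_resnet50      # feature extractor
    pretrained: True
  neck:
    name: DBFPN             # feature fusion
    out_channels: 256
  head:
    name: DBHead            # prediction head
```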
MX, which is short for MindX, allows efficient model inference and deployment on Ascend devices.
MindOCR supports OCR model inference with MX Engine. Please refer to mx_infer for detailed illustrations.
Coming soon
Coming soon
For the detailed performance of the trained models, please refer to configs.
For detailed inference performance using the MX Engine, please refer to mx inference performance.
- Parameters can be excluded from weight decay via `grouping_strategy` or `no_weight_decay_params`.
- To enable dynamic loss scaling, set the `type` arg of `loss_scale` as `dynamic`. A YAML example can be viewed in `configs/rec/crnn/crnn_icdar15.yaml`.
- Arg names changed: `output_keys` -> `output_columns`, `num_keys_to_net` -> `num_columns_to_net`.
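As a hedged sketch, a dynamic loss scale section of the yaml might look like the following; the exact key names and default values are assumptions and should be checked against `configs/rec/crnn/crnn_icdar15.yaml`:

```yaml
loss_scale:
  type: dynamic        # enable dynamic loss scaling
  loss_scale_value: 512
  scale_factor: 2.0
  scale_window: 1000
```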
i) Create a new training task on the openi cloud platform.
ii) Link the dataset (e.g., ic15_mindocr) on the webpage.
iii) Add the run parameter `config` and write the yaml file path in the website UI, e.g., '/home/work/user-job-dir/V0001/configs/rec/test.yaml'.
iv) Add the run parameter `enable_modelarts` and set it to True in the website UI.
v) Fill in other blanks and launch.
The `ckpt_load_path` and `ckpt_save_dir` args are moved from `system` to `train` in the yaml file.

We appreciate all kinds of contributions, including issues and PRs, to make MindOCR better.
Please refer to CONTRIBUTING.md for the contribution guidelines. Please follow the Model Template and Guideline when contributing a model that fits the overall interface :)
This project follows the Apache License 2.0 open-source license.
If you find this project useful in your research, please consider citing:
@misc{MindSporeOCR2023,
  title={{MindSpore OCR}: MindSpore OCR Toolbox},
  author={MindSpore Team},
  howpublished={\url{https://github.com/mindspore-lab/mindocr/}},
  year={2023}
}