This tutorial uses the PP-LiteSeg model and the medical optic disc segmentation dataset as an example to introduce PaddleSeg's configuration-driven mode. If you want to learn how to use the API instead, see the PaddleSeg Advanced Tutorial.
The whole process is as follows:
Before using PaddleSeg to train an image segmentation model, users need to complete the following tasks:
Install PaddlePaddle (version >= 2.1); please refer to Quick Installation for the specific installation method. Because image segmentation models are computationally expensive, it is recommended to use PaddleSeg with the GPU version of PaddlePaddle.
Download the PaddleSeg repository:
git clone https://github.com/PaddlePaddle/PaddleSeg.git
# If the connection to GitHub is slow, you can clone from Gitee instead
git clone https://gitee.com/paddlepaddle/PaddleSeg.git
Install the PaddleSeg API library. Installing it also installs the other dependencies needed to run PaddleSeg.
pip install paddleseg
Run the following command in the PaddleSeg directory. If the prediction result appears in the PaddleSeg/output folder, the installation was successful.
Note: by default, the training, validation, and prediction commands are all executed from the root of the PaddleSeg repository.
python predict.py \
--config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml \
--model_path https://paddleseg.bj.bcebos.com/dygraph/optic_disc/pp_liteseg_optic_disc_512x512_1k/model.pdparams \
--image_path docs/images/optic_test_image.jpg \
--save_dir output/result
Our demo uses the optic disc segmentation dataset for training.
The optic disc segmentation dataset is a fundus medical image segmentation dataset containing 267 training images, 76 validation images, and 38 test images. You can download it with the following commands.
mkdir data
cd data
wget https://paddleseg.bj.bcebos.com/dataset/optic_disc_seg.zip
unzip optic_disc_seg.zip
cd ..
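On systems without wget or unzip (for example, Windows), the same download can be done from Python. The following is a minimal sketch using only the standard library; the helper name `download_and_extract` is hypothetical and is not part of PaddleSeg.

```python
import urllib.request
import zipfile
from pathlib import Path


def download_and_extract(url: str, dest_dir: str = "data") -> Path:
    """Download a zip archive into dest_dir and unpack it there.

    Hypothetical helper (not part of PaddleSeg) that mirrors the
    mkdir / wget / unzip commands above.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)      # mkdir data
    archive = dest / url.rsplit("/", 1)[-1]      # keep the remote file name
    if not archive.exists():                     # skip if already downloaded
        urllib.request.urlretrieve(url, str(archive))
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)                      # unzip into dest_dir
    return dest


# Example call (performs a network download, so it is left commented out):
# download_and_extract("https://paddleseg.bj.bcebos.com/dataset/optic_disc_seg.zip")
```

Run it from the PaddleSeg root so that the archive is unpacked under data/, the same place the shell commands use.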
The original image and segmentation result are shown below. The task is to segment the optic disc region in the fundus image.
Figure 1: Original image and segmentation result
Developers are usually most interested in how to train on their own dataset. Below we explain what needs to be prepared for a custom dataset, and how to make the corresponding changes in the configuration file once the dataset is ready.
It is recommended to organize the dataset into the following structure.
custom_dataset
|
|--images
| |--image1.jpg
| |--image2.jpg
| |--...
|
|--labels
| |--label1.png
| |--label2.png
| |--...
|
|--train.txt
|
|--val.txt
|
|--test.txt
The original 3-channel images are saved in the images directory, and the 1-channel label images are saved in the labels directory. The files train.txt, val.txt, and test.txt list the training set, validation set, and test set, respectively.
The folders do not have to be named custom_dataset, images, and labels; users can name them freely.
The files train.txt, val.txt, and test.txt do not have to be in the custom_dataset folder; their locations can be set through options in the configuration file.
The contents of train.txt and val.txt are as follows:
images/image1.jpg labels/label1.png
images/image2.jpg labels/label2.png
...
The format of the dataset we just downloaded is similar (label.txt is optional). If you need to annotate and split your dataset, please refer to the data annotation document and the dataset division document.
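For a custom dataset laid out as above, the list files can be generated with a short script. This is only a sketch under stated assumptions: it pairs each image with a PNG label that has the same file stem (for example images/a.jpg with labels/a.png), and the helper name `write_file_lists` and the 80/20 split are illustrative choices, not PaddleSeg behavior.

```python
import random
from pathlib import Path


def write_file_lists(dataset_root, image_dir="images", label_dir="labels",
                     val_ratio=0.2, seed=0):
    """Pair images with same-named PNG labels and write train.txt / val.txt.

    Hypothetical helper. Each output line is '<image path> <label path>',
    both relative to dataset_root.
    Returns (number of training pairs, number of validation pairs).
    """
    root = Path(dataset_root)
    pairs = []
    for image in sorted((root / image_dir).iterdir()):
        label = root / label_dir / (image.stem + ".png")
        if label.exists():  # skip images that have no annotation
            pairs.append(f"{image_dir}/{image.name} {label_dir}/{label.name}")
    random.Random(seed).shuffle(pairs)  # fixed seed keeps the split reproducible
    n_val = int(len(pairs) * val_ratio)
    splits = {"val.txt": pairs[:n_val], "train.txt": pairs[n_val:]}
    for name, lines in splits.items():
        (root / name).write_text("".join(line + "\n" for line in lines))
    return len(pairs) - n_val, n_val
```

Calling `write_file_lists("custom_dataset")` writes train.txt and val.txt into the dataset folder; a test.txt could be produced the same way if a held-out split is needed.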
We choose the PP-LiteSeg model for training.
PP-LiteSeg is a real-time semantic segmentation model proposed by the PaddleSeg team.
Compared to other models, PP-LiteSeg achieves a superior trade-off between accuracy and speed on the Cityscapes and CamVid datasets. Specifically, it introduces a Flexible and Lightweight Decoder (FLD) to reduce the computation overhead of previous decoders. To strengthen feature representations, it uses a Unified Attention Fusion Module (UAFM), which takes advantage of spatial and channel attention to produce a weight and then fuses the input features with that weight. Moreover, a Simple Pyramid Pooling Module (SPPM) aggregates global context at low computation cost. The architecture of PP-LiteSeg is shown in the next figure. For more information about PP-LiteSeg, please refer to the doc.
Now that we understand the principle of PP-LiteSeg, we can prepare for training. As mentioned above, PaddleSeg provides a configuration-driven mode for model training, so before training let's take a look at the configuration file. Here we take pp_liteseg_optic_disc_512x512_1k.yml as an example. The YAML configuration file specifies the model type, backbone network, training and testing settings, pre-training dataset, supporting tools (such as data augmentation), and other information.
PaddleSeg lists every tunable option in the configuration file. Users can customize the model by modifying this file (all configuration files are under the PaddleSeg/configs folder), for example the backbone network used by the model, the loss function, and the configuration of the network structure. In addition to customizing the model, data processing strategies such as resizing, normalization, and flipping can also be configured there.
Key Parameters:
-1: Among the learning rates given in the PaddleSeg configuration files, only the one in "pp_liteseg_optic_disc_512x512_1k.yml" is a single-card learning rate; all other configuration files give 4-card learning rates. If you train with a single card, the learning rate should be set to 1/4 of the original.
-2: The PaddleSeg configuration files provide a variety of loss functions: CrossEntropy Loss, BootstrappedCrossEntropy Loss, Dice Loss, BCE Loss, OhemCrossEntropyLoss, RelaxBoundaryLoss, OhemEdgeAttentionLoss, Lovasz Hinge Loss, and Lovasz Soft Loss. Users can change the loss function according to their own needs.
-3: The details of the config file are as follows.
batch_size: 4  # Number of images sent to the network per iteration. Generally, the more GPU memory your machine has, the larger batch_size can be.
iters: 1000  # Number of training iterations
train_dataset:  # Training dataset
  type: OpticDiscSeg  # The name of the training dataset class
  dataset_root: data/optic_disc_seg  # The directory where the training dataset is stored
  num_classes: 2  # Number of pixel categories
  transforms:  # Data transformation and data augmentation
    - type: Resize  # Images need to be resized before being sent to the network
      target_size: [512, 512]  # Resize the original image to 512*512 and send it to the network
    - type: RandomHorizontalFlip  # Flip the image horizontally with a certain probability
    - type: Normalize  # Normalize the image
  mode: train
val_dataset:  # Validation dataset
  type: OpticDiscSeg  # The name of the validation dataset class
  dataset_root: data/optic_disc_seg  # The directory where the validation dataset is stored
  num_classes: 2  # Number of pixel categories
  transforms:  # Data transformation and data augmentation
    - type: Resize  # Images need to be resized before being sent to the network
      target_size: [512, 512]  # Resize the original image to 512*512 and send it to the network
    - type: Normalize  # Normalize the image
  mode: val
optimizer:  # Set the type of optimizer
  type: sgd  # Use SGD (Stochastic Gradient Descent) as the optimizer
  momentum: 0.9
  weight_decay: 4.0e-5  # Weight decay, used to prevent overfitting
lr_scheduler:  # Related settings for the learning rate
  type: PolynomialDecay  # Learning rate decay strategy; 12 strategies are supported in total
  learning_rate: 0.01
  power: 0.9
  end_lr: 0
loss:  # Set the type of loss function
  types:
    - type: CrossEntropyLoss  # The type of loss function
  coef: [1, 1, 1]
  # PP-LiteSeg has 2 auxiliary losses and 1 main loss; coef gives their weights: total_loss = coef_1 * loss_1 + ... + coef_n * loss_n
model:  # Model description
  type: PPLiteSeg  # Set model name
  backbone:  # Set the backbone, including its name and pretrained weights
    type: STDC1
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/PP_STDCNet1.tar.gz
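The arithmetic behind three of the settings above (the PolynomialDecay schedule, the 1/4 rule for single-card training, and the coef-weighted loss) can be sketched in plain Python. This is an illustrative sketch, not PaddleSeg code: the function names are hypothetical, and the decay formula assumes the standard polynomial decay with the number of decay steps equal to iters.

```python
def polynomial_decay(step, total_steps, base_lr=0.01, power=0.9, end_lr=0.0):
    """Learning rate at a given step under polynomial decay.

    Assumed formula: (base_lr - end_lr) * (1 - step / total_steps) ** power + end_lr
    """
    step = min(step, total_steps)  # stay at end_lr once training is over
    return (base_lr - end_lr) * (1 - step / total_steps) ** power + end_lr


def single_card_lr(four_card_lr):
    """Scale a 4-card learning rate to 1/4 for single-card training."""
    return four_card_lr / 4


def total_loss(losses, coef):
    """Weighted sum: coef_1 * loss_1 + ... + coef_n * loss_n."""
    if len(losses) != len(coef):
        raise ValueError("need exactly one coefficient per loss term")
    return sum(c * loss for c, loss in zip(coef, losses))


# Learning rate at the start, middle, and end of a 1000-iteration run.
for it in (0, 500, 1000):
    print(it, round(polynomial_decay(it, total_steps=1000), 6))
```

With the values from the configuration file, the schedule starts at 0.01 and reaches end_lr (0) at iteration 1000.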
FAQ
Q: Which configuration items belong in the configuration file, and which belong in the command line parameters of the script?
A: Everything related to the model scheme is in the configuration file, including the data augmentation strategies applied to the original samples. Apart from the three common parameters iters, batch_size, and learning_rate, the command line parameters only configure the training process. In other words, the configuration file ultimately determines which model is used.
Once the dataset is prepared, you can point the configuration file at it by modifying the data paths before training.
Here we take the "pp_liteseg_optic_disc_512x512_1k.yml" file mentioned above as an example and explain the data configuration part.
The parameters to focus on are:
train_dataset:
  type: Dataset
  dataset_root: dataset/optic_disc_seg
  train_path: dataset/optic_disc_seg/train_list.txt
  num_classes: 2
  transforms:
    - type: Resize
      target_size: [512, 512]
    - type: RandomHorizontalFlip
    - type: Normalize
  mode: train
val_dataset:
  type: Dataset
  dataset_root: dataset/optic_disc_seg
  val_path: dataset/optic_disc_seg/val_list.txt
  num_classes: 2
  transforms:
    - type: Resize
      target_size: [512, 512]
    - type: Normalize
  mode: val
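Before launching training on a custom dataset, it can help to confirm that every path in the list files actually exists. The sketch below is a hypothetical helper, not part of PaddleSeg; it assumes each line of the list file reads `<image> <label>` with both paths relative to dataset_root, as in the train.txt example earlier.

```python
from pathlib import Path


def find_missing_files(dataset_root, list_path):
    """Return the paths named in a file list that do not exist on disk.

    Assumes each line reads '<image> <label>' with both paths relative
    to dataset_root.
    """
    root = Path(dataset_root)
    missing = []
    for line in Path(list_path).read_text().splitlines():
        for rel_path in line.split():  # image path, then label path
            if not (root / rel_path).exists():
                missing.append(rel_path)
    return missing
```

An empty result for both train_path and val_path means the dataset_root and list-file settings are consistent with each other.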
After we modify the corresponding configuration parameters, we can start training.
export CUDA_VISIBLE_DEVICES=0 # Set 1 usable card
# On Windows, run the following command instead:
# set CUDA_VISIBLE_DEVICES=0
python train.py \
--config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml \
--do_eval \
--use_vdl \
--save_interval 500 \
--save_dir output
The weights of the trained model are saved in the output directory.
output
  ├── iter_500 # The model saved at iteration 500
    ├── model.pdparams # Model parameters
    └── model.pdopt # Optimizer parameters during training
  ├── iter_1000
    ├── model.pdparams
    └── model.pdopt
  └── best_model # With --do_eval, the model is evaluated every time it is saved, and the one with the highest mIoU is kept as best_model
    └── model.pdparams
Parameter | Description | Is Required | Default |
---|---|---|---|
iters | Number of training iterations | No | The value specified in the configuration file. |
batch_size | Batch size on a single card | No | The value specified in the configuration file. |
learning_rate | Initial learning rate | No | The value specified in the configuration file. |
config | Configuration files | Yes | - |
save_dir | The root path for saving model and visualdl log files | No | output |
num_workers | The number of processes used to read data asynchronously; when it is greater than or equal to 1, child processes are started to read data | No | 0 |
use_vdl | Whether to enable visualdl to record training data | No | No |
save_interval | Number of steps between model saving | No | 1000 |
do_eval | Whether to do evaluation when saving the model, the best model will be saved according to mIoU | No | No |
log_iters | Interval steps for printing log | No | 10 |
resume_model | Path of the checkpoint to resume training from, such as: output/iter_1000 | No | None |
keep_checkpoint_max | Number of latest models saved | No | 5 |
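To see how iters, save_interval, and keep_checkpoint_max interact, the sketch below lists which iter_N folders would remain after training. It rests on assumptions that should be checked against your PaddleSeg version: a checkpoint is written every save_interval iterations and at the final iteration, and only the newest keep_checkpoint_max are kept. The function name is hypothetical.

```python
from collections import deque


def kept_checkpoints(iters, save_interval, keep_checkpoint_max=5):
    """List the iter_N folders expected to remain after training.

    Assumption: a checkpoint is written every save_interval iterations
    and at the final iteration, and only the newest keep_checkpoint_max
    are kept. best_model (written with --do_eval) is not counted here.
    """
    kept = deque(maxlen=keep_checkpoint_max)  # drops the oldest when full
    for it in range(1, iters + 1):
        if it % save_interval == 0 or it == iters:
            kept.append(f"iter_{it}")
    return list(kept)


print(kept_checkpoints(iters=1000, save_interval=500))
# ['iter_500', 'iter_1000']
```

This matches the output tree shown earlier for the quick-start command, which uses --save_interval 500 with 1000 iterations.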
Figure 3: In-depth exploration of configuration files
In PaddleSeg 2.0, PaddleSeg adopts a more unified configuration design, placing common configurations such as data, optimizer, and loss function in a single configuration file. When switching to a new network structure, you only need to change the model settings, which avoids the tedious work of re-adjusting these common parameters every time the model changes and reduces user errors.
FAQ
Q: Some common parameters appear in multiple configuration files. Which one takes precedence?
A: As shown by the serial numbers in the figure, the parameters of the No. 1 yml file override the parameters of the No. 2 yml file; that is, configuration file No. 1 takes precedence over No. 2. In addition, if a parameter that appears in the yaml file is also specified on the command line, the command line value takes precedence over the yaml file. (For example, adjust batch_size on the command line according to your machine configuration; there is no need to modify the preset yaml file in configs.)
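The precedence rule in this answer can be pictured as a layered dictionary merge. The sketch below is illustrative only: it uses a shallow merge, the function name is hypothetical, and the sample values (such as iters: 80000) are invented for the example rather than taken from any PaddleSeg configuration file.

```python
def resolve_config(lower_priority, higher_priority, cli_overrides=None):
    """Merge settings so that later sources override earlier ones."""
    merged = dict(lower_priority)
    merged.update(higher_priority)  # file No. 1 overrides file No. 2
    # Command-line values win, but only for options actually passed.
    for key, value in (cli_overrides or {}).items():
        if value is not None:
            merged[key] = value
    return merged


shared = {"batch_size": 4, "iters": 80000}  # stands in for file No. 2
specific = {"iters": 1000}                  # stands in for file No. 1
cli = {"batch_size": 2, "iters": None}      # e.g. only --batch_size 2 was passed
print(resolve_config(shared, specific, cli))
# {'batch_size': 2, 'iters': 1000}
```

Here iters comes from the higher-priority file and batch_size from the command line, while anything set only in the lower-priority file would pass through unchanged.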
Note: To use multi-card training, set the environment variable CUDA_VISIBLE_DEVICES to the cards you want to use (if it is not set, all GPUs are used by default), and start the training script with paddle.distributed.launch. Multi-card training is not available on Windows, because NCCL is not supported there:
export CUDA_VISIBLE_DEVICES=0,1,2,3 # Set 4 usable cards
python -m paddle.distributed.launch train.py \
--config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml \
--do_eval \
--use_vdl \
--save_interval 500 \
--save_dir output
# Resume training from the checkpoint saved at iteration 500
python train.py \
--config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml \
--resume_model output/iter_500 \
--do_eval \
--use_vdl \
--save_interval 500 \
--save_dir output
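When resuming, the value passed to --resume_model is usually the most recent iter_N folder. A small sketch for locating it is shown below; the helper name is hypothetical, and it only assumes the iter_N folder naming shown in the output tree above. The comparison is numeric because a plain string sort would rank iter_500 after iter_1000.

```python
import re
from pathlib import Path


def latest_checkpoint(save_dir="output"):
    """Return the iter_N folder with the largest N, or None if there is none."""
    root = Path(save_dir)
    if not root.is_dir():
        return None
    best_iter, best_path = -1, None
    for child in root.iterdir():
        match = re.fullmatch(r"iter_(\d+)", child.name)
        if match and child.is_dir() and int(match.group(1)) > best_iter:
            best_iter, best_path = int(match.group(1)), child
    return best_path
```

The returned path can then be passed to train.py as the --resume_model argument.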
When the use_vdl switch is turned on, PaddleSeg writes data from the training process into a VisualDL file, so you can view the training log in real time. Some of the recorded items are only written when the do_eval switch is turned on.
Use the following command to start VisualDL to view the log
# The following command starts a service on 127.0.0.1 that can be viewed in a web browser; the actual IP address can be specified through the --host parameter
visualdl --logdir output/
Enter the suggested URL in the browser. The result looks as follows:
Figure 4: VDL effect demonstration
After training is completed, you can use the evaluation script val.py to evaluate the model. Assuming the number of iterations (iters) is 1000 and the save interval is 500, the model is saved twice during training. There are therefore 2 regularly saved models plus the best model best_model, for a total of 3 models. You can specify the model file you want to evaluate through model_path.
python val.py \
--config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml \
--model_path output/iter_1000/model.pdparams
If you want to perform multi-scale flip evaluation, turn it on by passing --aug_eval, then pass the scale information via --scales. --flip_horizontal turns on horizontal flip, and --flip_vertical turns on vertical flip. An example is as follows:
python val.py \
--config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml \
--model_path output/iter_1000/model.pdparams \
--aug_eval \
--scales 0.75 1.0 1.25 \
--flip_horizontal
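The command above evaluates each image under several scale and flip combinations. The sketch below enumerates those combinations for one image; it illustrates the general test-time augmentation idea under assumptions (rounding to the nearest pixel, and the unflipped image always included), and the function name is hypothetical rather than a PaddleSeg API.

```python
def augmented_variants(height, width, scales, flip_horizontal=False,
                       flip_vertical=False):
    """Enumerate (scaled_height, scaled_width, hflip, vflip) combinations."""
    flips = [(False, False)]  # the unflipped image is always evaluated
    if flip_horizontal:
        flips.append((True, False))
    if flip_vertical:
        flips.append((False, True))
        if flip_horizontal:
            flips.append((True, True))
    variants = []
    for scale in scales:
        h = int(height * scale + 0.5)  # round to the nearest pixel
        w = int(width * scale + 0.5)
        for hflip, vflip in flips:
            variants.append((h, w, hflip, vflip))
    return variants


for variant in augmented_variants(512, 512, [0.75, 1.0, 1.25], flip_horizontal=True):
    print(variant)
```

For a 512x512 image this yields six variants (three scales, each with and without a horizontal flip), so evaluation time grows roughly in proportion to the number of variants.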
If you want to perform sliding window evaluation, turn it on by passing --is_slide, pass the window size via --crop_size, and pass the step size via --stride. An example is as follows:
python val.py \
--config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml \
--model_path output/iter_1000/model.pdparams \
--is_slide \
--crop_size 256 256 \
--stride 128 128
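To make crop_size and stride concrete, the sketch below computes the window positions for one image. It is a minimal illustration of a common sliding-window layout, in which the last window in each direction is shifted back to stay inside the image; the function name is hypothetical and the exact layout used by PaddleSeg should be checked in its source.

```python
import math


def sliding_windows(image_h, image_w, crop_h, crop_w, stride_h, stride_w):
    """Return (top, left, bottom, right) boxes that tile the image."""
    rows = max(math.ceil((image_h - crop_h) / stride_h), 0) + 1
    cols = max(math.ceil((image_w - crop_w) / stride_w), 0) + 1
    boxes = []
    for r in range(rows):
        for c in range(cols):
            bottom = min(r * stride_h + crop_h, image_h)
            right = min(c * stride_w + crop_w, image_w)
            # Shift the last window back so it keeps the full crop size.
            top = max(bottom - crop_h, 0)
            left = max(right - crop_w, 0)
            boxes.append((top, left, bottom, right))
    return boxes


boxes = sliding_windows(512, 512, 256, 256, 128, 128)
print(len(boxes), boxes[0], boxes[-1])
# 9 (0, 0, 256, 256) (256, 256, 512, 512)
```

With a 256x256 window and a stride of 128, neighboring windows overlap by half, and predictions in the overlapping regions are typically averaged.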
In image segmentation, model quality is mainly judged by three indicators: accuracy (acc), mean intersection over union (mIoU), and the Kappa coefficient.
When the evaluation script finishes, it prints a log like the following.
...
2021-01-13 16:41:29 [INFO] Start evaluating (total_samples=76, total_iters=76)...
76/76 [==============================] - 2s 30ms/step - batch_cost: 0.0268 - reader cost: 1.7656e-
2021-01-13 16:41:31 [INFO] [EVAL] #Images=76 mIoU=0.8526 Acc=0.9942 Kappa=0.8283
2021-01-13 16:41:31 [INFO] [EVAL] Class IoU:
[0.9941 0.7112]
2021-01-13 16:41:31 [INFO] [EVAL] Class Acc:
[0.9959 0.8886]
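The three indicators can all be derived from a confusion matrix. The sketch below uses the standard definitions in plain Python; it is not PaddleSeg's implementation, the function name is hypothetical, and it assumes every class appears at least once so that no denominator is zero.

```python
def segmentation_metrics(confusion):
    """Compute Acc, per-class IoU, mIoU and Kappa from a confusion matrix.

    confusion[i][j] = number of pixels of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    true_totals = [sum(row) for row in confusion]
    pred_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    correct = [confusion[k][k] for k in range(n)]

    acc = sum(correct) / total
    # IoU = TP / (TP + FP + FN) = TP / (true + predicted - TP)
    class_iou = [
        correct[k] / (true_totals[k] + pred_totals[k] - correct[k])
        for k in range(n)
    ]
    miou = sum(class_iou) / n
    # Kappa compares observed agreement with agreement expected by chance.
    expected = sum(t * p for t, p in zip(true_totals, pred_totals)) / total ** 2
    kappa = (acc - expected) / (1 - expected)
    return {"acc": acc, "class_iou": class_iou, "miou": miou, "kappa": kappa}


# Toy 2-class example: rows are ground truth, columns are predictions.
print(segmentation_metrics([[50, 10], [5, 35]]))
```

The same relationship is visible in the log above: the two Class IoU values, 0.9941 and 0.7112, average to about 0.8526, which is the reported mIoU.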
In addition to analyzing the mIoU, Acc, and Kappa indicators of the model, we can also inspect the segmentation results on specific samples and draw ideas for further optimization from the bad cases.
The predict.py script is used to visualize predictions. The command format is as follows:
python predict.py \
--config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml \
--model_path output/iter_1000/model.pdparams \
--image_path data/optic_disc_seg/JPEGImages/H0003.jpg \
--save_dir output/result
Here, image_path can also be a directory, in which case all images in the directory are predicted and the visualization results are saved.
Similarly, you can use --aug_pred to turn on multi-scale flip prediction, and --is_slide to turn on sliding window prediction.
We select one image to view; the result is shown below. We can see the difference between the model's segmentation and the original annotation, which suggests optimization ideas, such as whether the segmentation boundary can be regularized.
Figure 5: Prediction effect display
To facilitate industrial-level deployment, PaddleSeg provides a one-step dynamic-to-static export function, which converts the trained dynamic graph model file into a static graph form.
python export.py \
--config configs/quick_start/pp_liteseg_optic_disc_512x512_1k.yml \
--model_path output/iter_1000/model.pdparams
Parameter | Description | Is Required | Default |
---|---|---|---|
config | Configuration file | Yes | - |
save_dir | The root path for saving model and visualdl log files | No | output |
model_path | Path of pretrained model parameters | No | The value specified in the configuration file. |
- Result Files
output
├── deploy.yaml # Deployment related configuration files
├── model.pdiparams # Static graph model parameters
├── model.pdiparams.info # Additional parameter information, generally don’t need attention
└── model.pdmodel # Static graph model files
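A quick way to confirm that the export succeeded is to check for the files listed above. The sketch below is a hypothetical helper (not part of PaddleSeg) that reports which of the expected files are absent; model.pdiparams.info is left out of the check because the listing marks it as something that generally needs no attention.

```python
from pathlib import Path

# File names taken from the result listing above.
EXPECTED_EXPORT_FILES = ("deploy.yaml", "model.pdiparams", "model.pdmodel")


def missing_export_files(export_dir="output"):
    """Return the expected export files that are absent from export_dir."""
    root = Path(export_dir)
    return [name for name in EXPECTED_EXPORT_FILES if not (root / name).is_file()]
```

An empty list means the static graph model and its deploy.yaml are in place and can be passed to the deployment tools.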
- PaddleSeg currently supports the following deployment methods:
Platform | Library | Tutorial |
---|---|---|
Python | Paddle prediction library | e.g. |
C++ | Paddle prediction library | e.g. |
Mobile | PaddleLite | e.g. |
Serving | HubServing | Coming soon |
Front-end | PaddleJS | e.g. |
# Run the following command; an image named H0003.png will be generated in the output folder
python deploy/python/infer.py \
--config output/deploy.yaml \
--image_path data/optic_disc_seg/JPEGImages/H0003.jpg \
--save_dir output
Parameter | Description | Is Required | Default |
---|---|---|---|
config | Configuration file generated when exporting the model, instead of the configuration file in the configs directory | Yes | - |
image_path | The path or directory of the test image. | Yes | - |
use_trt | Whether to enable TensorRT to accelerate prediction. | No | No |
use_int8 | Whether to run in int8 mode when starting TensorRT prediction. | No | No |
batch_size | Batch size on a single card. | No | The value specified in the configuration file. |
save_dir | The directory of prediction results. | No | output |
with_argmax | Perform argmax operation on the prediction results. | No | No |
PaddleSeg
  ├── configs # Configuration file folder
  ├── paddleseg # Core code for training and deployment
    ├── core # Start model training, evaluation and prediction interface
    ├── cvlibs # The Config class is defined in this folder. It saves all hyperparameters such as dataset, model configuration, backbone network, loss function, etc.
      ├── callbacks.py
      └── ...
    ├── datasets # Data formats supported by PaddleSeg, including ADE, Cityscapes and other formats
      ├── ade.py
      ├── cityscapes.py
      └── ...
    ├── models # This folder contains the various parts of the PaddleSeg network
      ├── backbone # The backbone networks used by PaddleSeg
        ├── hrnet.py
        ├── resnet_vd.py
        └── ...
      ├── layers # Some components, such as the attention mechanism
        ├── activation.py
        ├── attention.py
        └── ...
      ├── losses # This folder contains the loss functions used by PaddleSeg
        ├── dice_loss.py
        ├── lovasz_loss.py
        └── ...
      ├── ann.py # An algorithm model supported by PaddleSeg, here the ANN algorithm
      ├── deeplab.py # An algorithm model supported by PaddleSeg, here the DeepLab algorithm
      ├── unet.py # An algorithm model supported by PaddleSeg, here the UNet algorithm
      └── ...
    ├── transforms # Data preprocessing operations, including various data augmentation strategies
      ├── functional.py
      └── transforms.py
    └── utils
      ├── config_check.py
      ├── visualize.py
      └── ...
  ├── train.py # The training entry file, which handles argument parsing, how training is started, and the resources prepared for training
  ├── predict.py # Prediction file
  └── ...
PaddleSeg and other development kits in various fields provide top-level solutions for real industrial practice. Some domestic teams have used PaddleSeg to achieve good results in international competitions, which shows that the results delivered by the development kits are state of the art.