Local Inference deployment

1. Description

This document describes deploying segmentation models on the server side using the Python interface for Paddle Inference. With a certain configuration and a small amount of code, you can integrate the model into your own service to complete the task of image segmentation.

Paddle Inference's official website document introduces the steps to deploy the model, various API interfaces, examples, etc. You can use it according to your actual needs .

2. Preliminary preparation

Please use the Model Export Tool to export your model, or click to download our Sample Model for testing.

Then prepare a test image for the test effect, we provide a image in the validation set of cityscapes for demonstration effect, if Your model is trained using another dataset, please prepare test images by yourself.

3. Prediction

Enter the following command in the terminal to make predictions:

python deploy/python/infer.py --config /path/to/deploy.yaml --image_path # if you set input_shape in export, you need to pass in images with the same shape as input_shape or pass in shape-changing transforms in deploy.yaml.

The parameter descriptions are as follows:

Parameter name	Purpose	Required option	Default value
config	The configuration file generated when exporting the model, not the configuration file in the configs directory	Yes	-
image_path	The path or directory of the predicted image	is	-
use_cpu	Whether to use X86 CPU prediction, the default is to use GPU prediction	No	No
use_trt	Whether to enable TensorRT to speed up prediction	No	No
use_int8	Whether to run in int8 mode when starting TensorRT prediction	no	no
use_mkldnn	Whether to enable MKLDNN for accelerated prediction	No	No
batch_size	single card batch size	no	specified value in configuration file
save_dir	Directory where prediction results are saved	no	output
with_argmax	Perform argmax operation on the prediction result	No	No

Test examples and predicted results are as follows

Notice

When using quantitative model prediction, both TensorRT prediction and int8 prediction need to be turned on to have the acceleration effect
To use TensorRT, you need to use the Paddle library that supports the TRT function. Please refer to Appendix to download the corresponding PaddlePaddle installation package, or refer to source code compilation to compile it yourself.

2.9 KiB Raw Permalink Blame History

Local Inference deployment

1. Description

2. Preliminary preparation

3. Prediction

2.9 KiB

Raw Permalink Blame History