This fork includes configurations for fine-tuning PaddleOCR on license plate data, along with an extension for MLflow logging.
These instructions are based on the guide provided here.
- Open the cloned repository and install the environment with Poetry:
cd PaddleOCR/
poetry install
- Download a pretrained model:
# Download the pre-trained model for en_PP-OCRv3 (the weights can also be found in the models.dev bucket on MinIO)
wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv3/english/en_PP-OCRv3_rec_train.tar
# Decompress model parameters
cd pretrain_models
tar -xf en_PP-OCRv3_rec_train.tar && rm -rf en_PP-OCRv3_rec_train.tar
cd ..
- Ensure your dataset is formatted for PaddleOCR and move it to the datasets/ directory:
mv /path/to/paddleocr_dataset ./datasets
- Fine-tune the model:
poetry run python3 tools/train.py -c configs/rec/PP-OCRv3/slplates_finetuning.yml -o Global.pretrained_model=pretrain_models/en_PP-OCRv3_rec_train/best_accuracy
- Export the fine-tuned model to an inference model and specify a custom destination directory (e.g., ./slplates_inference_model):
poetry run python3 tools/export_model.py -c configs/rec/PP-OCRv3/slplates_finetuning.yml -o Global.pretrained_model=output/normal_finetuned_slplates_paddleocr/best_model/model Global.save_inference_dir=./slplates_inference_model

To enable MLflow logging during fine-tuning, add the following to your configuration file:
mlflow:
  mlflow_log_every_n_iter: <n_iter>  # Log training metrics every <n_iter> iterations (evaluation metrics are always logged)
  name: <name_of_the_run>  # Name of the MLflow run (e.g., "lp_finetuned")
  log_only_best_model: <true/false>  # Whether to log only the best model based on its accuracy
  pbtxt_configs:  # Configuration for generating .pbtxt files to log models in Triton inference server format
    max_batch_size: <n_samples>  # Maximum batch size for inference
    dynamic_batching: <true/false>  # Enable or disable dynamic batching for Triton

Additionally, create a .env file by copying .env.example and filling in the correct values.
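
Before starting a run, it can help to confirm that every variable listed in .env.example has a value in .env. The snippet below is a minimal sketch of such a check using only the Python standard library. It is not part of this fork: the helper names `read_env_keys` and `missing_env_keys` are hypothetical, and it assumes both files use plain KEY=VALUE lines with optional `#` comments.

```python
from pathlib import Path


def read_env_keys(path):
    """Return {key: value} for KEY=VALUE lines, skipping blank lines and comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_env_keys(example_path=".env.example", env_path=".env"):
    """List keys from the example file that are absent or empty in the .env file."""
    expected = read_env_keys(example_path)
    actual = read_env_keys(env_path)
    return [key for key in expected if not actual.get(key)]
```

Calling `missing_env_keys()` from the repository root returns an empty list when every variable named in .env.example has a non-empty value in .env. Otherwise it returns the names that still need to be filled in.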
This project is released under Apache License Version 2.0.
In addition to the modifications described in the header of the original files, the following changes have been made:
- The setup.py and MANIFEST.in files have been removed, as the project has switched to Poetry for dependency management and packaging.
- The README.md has been heavily revised to reflect the changes and the usage for license plate fine-tuning.
