[GSoC] Add digit and text recognition samples. #17675
Users may want to compare different algorithms/approaches on their tasks.
Hi, I kept the original digits.cpp as digits_SVM.cpp and added the proposed sample as digits_LeNet.cpp.
I reduced the original font size; the previous one was so large that it hurt the display.
@zihaomu, thank you very much! I've tested your code and it works great! Can you please also update text_detection.py to use the same model for detection and add the OCR part?
@vpisarev Thank you for your reply.
It seems ready to merge to me. May I ask you to squash all the commits into one?
5f7eef9 to 397ba2d
Hi, this is my GSoC project to add digit and text recognition samples.
Status Update:
A detailed tutorial on how to use the OCR models and how to train your own OCR model has been added to doc/tutorials/dnn/dnn_OCR/dnn_OCR.markdown.
1. digit recognition
Take a live image from the camera, use connected component analysis to detect candidate regions, one per digit, and classify each region with LeNet.

On CPU only (i5-8300), it reaches 12 FPS.
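The detection step described above can be illustrated with a minimal, pure-Python sketch. It is not the code from this PR: the sample itself is C++ and presumably relies on OpenCV's connected component routines, whereas here the labeling is a hand-written flood fill. The names `find_digit_candidates`, `classify_regions`, `min_area`, and the `classifier` callable (standing in for the LeNet forward pass) are hypothetical.

```python
from collections import deque


def find_digit_candidates(binary, min_area=4):
    """Return bounding boxes (x, y, w, h) of 4-connected foreground blobs.

    `binary` is a list of rows holding 0 (background) or 1 (foreground).
    Blobs smaller than `min_area` pixels are dropped as noise.
    """
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected component.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, bottom, left, right, area = r, r, c, c, 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                boxes.append((left, top, right - left + 1, bottom - top + 1))
    return boxes


def classify_regions(binary, boxes, classifier):
    """Crop each candidate box and pass it to `classifier` (a LeNet stand-in)."""
    results = []
    for x, y, w, h in boxes:
        patch = [row[x:x + w] for row in binary[y:y + h]]
        results.append(((x, y, w, h), classifier(patch)))
    return results
```

In a real pipeline the camera frame would first be thresholded to a binary image, and each patch would be resized to the classifier's input size before inference.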
2. scene text recognition
My laptop environment: CPU i5-8300, GPU 1050, Ubuntu 18.
Take a live image from the camera and use EAST as the text detector. After getting the detector output, crop each bounding box and feed it to the text recognizer, which is based on VGG Net. Finally, print the result near the box. With the GPU, this reaches around 9 FPS.
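The detect, crop, recognize flow above can be sketched as follows. This is an illustrative outline under simplifying assumptions, not the sample's implementation: boxes are treated as axis-aligned `(x, y, w, h)` tuples (EAST can also produce rotated boxes, which this sketch ignores), and `detector` and `recognizer` are hypothetical callables standing in for the EAST and VGG-based models.

```python
def clamp_box(box, width, height):
    """Clip an (x, y, w, h) box to the image; return None if nothing remains."""
    x, y, w, h = box
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(width, x + w), min(height, y + h)
    if x1 <= x0 or y1 <= y0:
        return None
    return (x0, y0, x1 - x0, y1 - y0)


def read_text(image, detector, recognizer):
    """Run `detector`, crop every valid box, and recognize each crop.

    `image` is a list of pixel rows. Returns (box, text) pairs so the
    caller can draw each result near its box.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    results = []
    for box in detector(image):
        clamped = clamp_box(box, width, height)
        if clamped is None:
            continue  # box lies entirely outside the frame
        x, y, w, h = clamped
        crop = [row[x:x + w] for row in image[y:y + h]]
        results.append((clamped, recognizer(crop)))
    return results
```

Clamping before cropping matters because a detector can return boxes that extend past the frame edge, and an empty crop would otherwise reach the recognizer.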
After loading the models into OpenCV, I tested the performance of the text recognition model on different datasets. The results are given in the following table:
These pre-trained models can be found here: https://drive.google.com/drive/folders/1cTbQ3nuZG-EKWak6emD_s8_hHXWz7lAr?usp=sharing. The FPS in the table measures only the text recognition model on my computer and does not include the text detection model.
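One way to time the recognition stage on its own, excluding detection, is sketched below. How the FPS figures in this PR were actually measured is not stated, so this is only an assumed approach; `measure_fps`, `warmup`, and the `recognizer` callable are hypothetical names.

```python
import time


def measure_fps(recognizer, crops, warmup=1):
    """Return recognizer throughput in crops per second.

    Detection is assumed to have run beforehand: `crops` are ready-made
    inputs, so only the recognizer calls fall inside the timed region.
    """
    # Untimed warm-up calls, so one-off setup cost does not skew the result.
    for crop in crops[:warmup]:
        recognizer(crop)
    start = time.perf_counter()
    for crop in crops:
        recognizer(crop)
    elapsed = time.perf_counter() - start
    return len(crops) / elapsed if elapsed > 0 else float("inf")
```

Averaging over many crops gives a steadier figure than timing a single call.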
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.