The solution I developed begins by determining how much preprocessing the image needs in order to be robust to lighting conditions. It evaluates the mean lightness of the image, then decides whether the lightness needs to be increased via CLAHE (Contrast Limited Adaptive Histogram Equalization) and, if so, by how much.
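As an illustration only (this is not the project's code), the lightness check could be sketched as below. The target lightness of 128, the clip-limit range of 1.0 to 4.0, the linear mapping between them, and the function names `choose_clip_limit` and `preprocess` are all assumptions made for the example. The OpenCV calls are standard ones, and `cv2` is imported lazily so the decision helper has no dependencies.

```python
def choose_clip_limit(mean_lightness, target=128.0, max_clip=4.0):
    """Map a mean lightness in [0, 255] to a CLAHE clip limit.

    Returns None when the image is already bright enough (no CLAHE needed).
    The target and the linear mapping are illustrative assumptions.
    """
    if mean_lightness >= target:
        return None
    # Fraction of the target that is missing, between 0 and 1.
    deficit = (target - mean_lightness) / target
    # Darker images get a stronger clip limit, from 1.0 up to max_clip.
    return 1.0 + (max_clip - 1.0) * deficit


def preprocess(bgr_image):
    """Apply CLAHE to the lightness channel only when the image is dark."""
    import cv2  # imported here so choose_clip_limit works without OpenCV

    lab = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2LAB)
    lightness, a, b = cv2.split(lab)
    clip = choose_clip_limit(float(lightness.mean()))
    if clip is not None:
        clahe = cv2.createCLAHE(clipLimit=clip, tileGridSize=(8, 8))
        lightness = clahe.apply(lightness)
    return cv2.cvtColor(cv2.merge((lightness, a, b)), cv2.COLOR_LAB2BGR)
```

Working in the LAB color space keeps the enhancement on the lightness channel, so the colors of the image are left largely unchanged.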
From there, the image is segmented into candidate regions in an attempt to isolate all the text regions. Additional filtering then removes guaranteed false positives based on properties such as aspect ratio, and non-maximal suppression eliminates redundant samples.
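The two filtering steps can be sketched in plain Python. This is a minimal example under stated assumptions, not the project's implementation: boxes are assumed to be `(x, y, width, height, score)` tuples, and the aspect-ratio bounds and IoU threshold are placeholder values chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h, ...) boxes."""
    ax, ay, aw, ah = a[:4]
    bx, by, bw, bh = b[:4]
    overlap_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    overlap_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    intersection = overlap_w * overlap_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union else 0.0


def filter_by_aspect(boxes, min_ratio=0.2, max_ratio=1.2):
    """Keep boxes whose width/height ratio is plausible for a digit."""
    return [b for b in boxes if b[3] > 0 and min_ratio <= b[2] / b[3] <= max_ratio]


def non_max_suppression(boxes, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes that overlap it heavily."""
    kept = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) <= iou_threshold for k in kept):
            kept.append(box)
    return kept


candidates = [
    (10, 10, 20, 40, 0.9),   # plausible digit
    (12, 11, 20, 40, 0.8),   # near-duplicate of the first box
    (100, 10, 20, 40, 0.7),  # second digit
    (0, 0, 200, 10, 0.95),   # wide strip, rejected by aspect ratio
]
regions = non_max_suppression(filter_by_aspect(candidates))
```

Running the aspect-ratio filter first keeps the cheap rejection ahead of the pairwise overlap checks.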
The last major step is classification. First, each region is judged to either contain text or not, using both a CNN for text discrimination and a Stroke-Width Transform filter. The remaining regions are then classified into digits, and their positions within the image are retained so they can be sorted later.
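One way the two text checks might be combined is sketched below. Everything specific here is an assumption for illustration: that a region must pass both checks, that the CNN yields a text probability, that the stroke-width check compares the spread of stroke widths to their mean, and the threshold values themselves. The stroke widths are taken as given; computing the Stroke-Width Transform is not shown.

```python
from statistics import mean, pstdev


def stroke_width_is_consistent(stroke_widths, max_variation=0.5):
    """Text strokes tend to have a similar width, so a low spread suggests text."""
    if not stroke_widths:
        return False
    average = mean(stroke_widths)
    if average <= 0:
        return False
    # Coefficient of variation: standard deviation relative to the mean.
    return pstdev(stroke_widths) / average <= max_variation


def is_text_region(cnn_text_probability, stroke_widths, min_probability=0.5):
    """Accept a region only if both the CNN score and the stroke check agree."""
    return (
        cnn_text_probability >= min_probability
        and stroke_width_is_consistent(stroke_widths)
    )
```

For example, `is_text_region(0.9, [3, 3, 4, 3, 4])` accepts the region, while a region with erratic stroke widths such as `[1, 9, 2, 12, 1]` is rejected even when the CNN score is high.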
The regions are then sorted and the digits present in the image are returned in order.
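A minimal sketch of this final step, assuming the digits sit on a single horizontal line and each detection is an `(x, digit)` pair (both assumptions for the example; `read_digits` is a hypothetical name):

```python
def read_digits(detections):
    """Order detections left to right and join their digits into a string."""
    ordered = sorted(detections, key=lambda d: d[0])
    return "".join(str(digit) for _, digit in ordered)


sequence = read_digits([(120, 3), (40, 1), (80, 2)])
```

Digits arranged vertically or across multiple lines would need a different sort key, for instance grouping by row before sorting by column.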
Code for this project is not available. Providing it publicly would be in violation of Georgia Institute of Technology's plagiarism rules.
The algorithm was able to correctly identify the sequence of digits in all images shown.
Copyright © 2022 Skyler Horn - All Rights Reserved.