
TextRecognize giving inconsistent performance

Posted 9 years ago

Hello,

I am trying to run TextRecognize on some color JPEG images. Each image contains a large white label with 8 large typed characters. There may be colored objects outside the white label (sorry in advance, but I can't provide any images). I find that TextRecognize does one of three things: it returns the correct string, returns nothing, or returns junk. I am looking for ideas to improve the results, and I suspect this function only works well with text on a uniform background. Any clues or suggestions would be appreciated. Thanks

POSTED BY: david p
2 Replies

Dear David,

you might want to look at LocalAdaptiveBinarize, and in particular the first example in the Applications section of its documentation. There is also this example. Another important factor is whether the scanned page is properly aligned: if the text is slightly distorted or rotated, OCR becomes unreliable. It has been reported that changing the alpha channel might also help.

I also found this discussion helpful.

You are saying that the background is not uniform. If LocalAdaptiveBinarize does not help, you might want to use RemoveBackground to preprocess the images before passing them to TextRecognize.
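
A minimal sketch of the preprocessing suggested above, in Wolfram Language. The file name "label.jpg" is hypothetical, and the neighborhood range 25 is only a starting guess that would need tuning for the actual images. Whether any of these steps improves recognition depends on the images, so the raw result is returned alongside for comparison.

```mathematica
(* "label.jpg" is a hypothetical file name; replace it with a real image *)
img = Import["label.jpg"];

(* Convert to grayscale so colored objects outside the label matter less *)
gray = ColorConvert[img, "Grayscale"];

(* Threshold each pixel against its local neighborhood;
   the range 25 is an assumed value, not a recommended one *)
bin = LocalAdaptiveBinarize[gray, 25];

(* Alternative: strip the background, then flatten the resulting
   alpha channel onto white before running OCR *)
noBg = RemoveAlphaChannel[RemoveBackground[img], White];

(* Compare OCR on the raw image with OCR on each preprocessed variant *)
{TextRecognize[img], TextRecognize[bin], TextRecognize[noBg]}
```

If the label occupies a known region of the image, cropping to that region first (for example with ImageTrim) before binarizing may reduce the junk output further, though that is untested here.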

As Sean said, an image that has a similar problem would help a lot.

Cheers,

Marco

POSTED BY: Marco Thiel

What kind of preprocessing are you doing on the image?

It's very hard to say what kind of preprocessing you might want to do without an example at all. Maybe you can find an example online that is somewhat similar to what you are doing.

POSTED BY: Sean Clarke