5 Replies
9 Total Likes
How to segment characters from images with complicated backgrounds?

Posted 10 years ago
The key limitation of most OCR software is handling images that are not from printed material, so I've trained my OCR engine to read text in different fonts and with imperfect character faces. Unfortunately, the text we wish to capture is often on cards (ID, business, credit, etc.) whose characters are quite difficult to segment from the background. This is due to coloration and lighting, as well as challenges arising from the 3D structure of some embossed characters. The preprocessing stage in my program first smooths the cropped image with a Gaussian blur, then performs Sobel edge detection, and finally applies an adaptive threshold. As you can see, the results are not great:
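For concreteness, a minimal sketch of that preprocessing chain in Mathematica (the variable `img` is assumed to be the cropped card region; `GradientFilter` is used here as a stand-in for a Sobel operator, and the radius and neighborhood parameters are illustrative guesses, not tuned values):

```
(* assumed: img is the cropped card image *)
smoothed = GaussianFilter[img, 2];                 (* Gaussian blur to suppress texture noise *)
edges = ImageAdjust@GradientFilter[smoothed, 1];   (* gradient magnitude, Sobel-like edge response *)
LocalAdaptiveBinarize[edges, 16]                   (* adaptive threshold over local neighborhoods *)
```

On images with strong shadows and highlights, each of these steps tends to amplify the background clutter, which is consistent with the poor results shown above.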

Example Input from a card:


Another example Input from a card:


So the question is: when preprocessing such images with non-uniform backgrounds for OCR, how does one segment the foreground text?

Most of the difficult images I've encountered are taken in sunlight against a non-uniform background, where the sun casts strong shadows and creates bright spots in the image. Some possibilities I've been trying are SURF features and Harris corners, to no avail...

Any suggestions and code would be much appreciated, and I can post more sample inputs if needed! This paper may help: Credit Card Processing Using Cell Phone Images
POSTED BY: Michael Sollami
First, note that cleaning the background and OCR are different things; if one fails, it does not mean the other is bad. So let's split the problem into two parts:

1) OCR: Optical Character Recognition choices and levels of difficulty
2) Cleaning the background

I will use a different image:
i = Import[""]

-------------- METHOD 1 --------------

Create a mask with double Binarize 
a = .55; b = .68; f[x_, a_, b_] := a < x < b;
mask = RegionBinarize[i, Binarize[i, f[#[[1]], a, b] && f[#[[2]], a, b] && f[#[[3]], a, b] &], .21]

Extract the digits 
ImageMultiply[i, Blur[Erosion[Dilation[mask, 4], 2], 10]]

-------------- METHOD 2 --------------

Create a mask with ClusteringComponents
cc = ClusteringComponents[i, 30];
mask = Table[Map[KroneckerDelta[k, #] &, cc, {2}], {k, 25, 30}] // Total // Colorize // Binarize // DeleteSmallComponents

Extract the digits
ImageMultiply[i, Blur[mask, 10]] // ImageAdjust

POSTED BY: Vitaliy Kaurov
Looking online, there doesn't seem to be much information available about how they do this; however, they appear to be doing a couple of things that make the problem much easier. First they detect where the edge of the card is in the image. Since the numbers sit in a uniform location on the card, they then know exactly where each group of four digits is. They may also use a projective transform in case the card is rotated a bit. The font used for the digits appears to be equally spaced, so in theory, if you have an accurate boundary of the card (which they do), you know the location of each digit individually and can segment the problem. This would make the recognition problem much easier.
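As a sketch of the rectification step: assuming the four card corners have already been located (the `corners` coordinates below are hypothetical placeholders, and the target size is based on the standard ID-1 card aspect ratio), a projective transform can map the card onto an axis-aligned rectangle before slicing out the digit cells:

```
(* corners: hypothetical detected card corners in pixel coordinates,
   ordered to match the target rectangle's corners *)
corners = {{120, 80}, {980, 95}, {990, 620}, {110, 600}};
{w, h} = {856, 540};  (* ID-1 card is ~85.6 x 54 mm; here at ~10 px/mm *)
tf = Last@FindGeometricTransform[{{0, 0}, {w, 0}, {w, h}, {0, h}}, corners,
    TransformationClass -> "Perspective"];
rectified = ImagePerspectiveTransformation[img, tf,
    DataRange -> Full, PlotRange -> {{0, w}, {0, h}}]
```

Once rectified, fixed crops at the known digit positions give one small image per digit, which sidesteps segmentation entirely.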

So let's say I know the location of the numbers. I think this might be enough to try OCR already, but if I want to improve things further, I would try to estimate the background of the card. The function Inpaint has a number of algorithms for this:

First I would use my knowledge of where the characters are to make a mask. I've hand-drawn one from your first example, but a program would do a much better job. I've also adjusted the original image a bit since it was rather dark, but this doesn't seem to strongly affect how well the method below works.

Let's just call the second image here "mask". Run the original image through the "FastMarching" algorithm to get an estimate of the background:
background = Inpaint[img, ColorNegate@mask, Method -> "FastMarching"]

There's an estimate of the background values. I'm sure we could do better, but this will probably work fine. Now that we have the background estimated, we can look at the difference between the original image and the background:
diff = ImageAdd[background, ColorNegate@img]
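From here, one way to hand the result to OCR (an untested sketch; the threshold, speck size, and polarity would all need tuning per image, since `TextRecognize` generally expects dark glyphs on a light background) is to binarize the difference image and remove speckle first:

```
(* assumed: diff is the background-subtracted image from above *)
bin = Binarize[ImageAdjust@diff];                   (* threshold the deviation image *)
clean = DeleteSmallComponents[ColorNegate@bin, 50]; (* drop components under 50 px *)
TextRecognize[ColorNegate@clean]                    (* dark digits on light background *)
```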
POSTED BY: Sean Clarke
Thanks Vitaliy!  TextRecognize on the final image still doesn't work, and my custom OCR gives the wrong answer too (9000 1239 5576 9000).
POSTED BY: Michael Sollami
What about running your custom OCR not on the final image but on the masks?
POSTED BY: Vitaliy Kaurov
By the way there is a discussion of a similar sort for a very tough case:

Cleaning mildew from old documents using Mathematica
POSTED BY: Vitaliy Kaurov