Group Abstract

Message Boards

WOLFRAM COMMUNITY

7.3K Views

5 Replies

9 Total Likes

View groups...

Follow this post

Share this post:

GROUPS:

Image Processing Mathematica

How to segment characters from images with complicated backgrounds?

Michael Sollami

Michael Sollami, Mustbin Inc.

Posted 12 years ago

The key limitation in most OCR software is handling images that are not from printed material and so I've trained my ocr engine for reading text from images with different fonts and imperfect character faces. Unfortunately, most times the text we wish to capture is on things such as cards (id, business, credit, etc..) that have characters that are quite difficult to segment from the background. This is due to coloration and lighting as well as challenges that arise due to the 3D structures of some characters. The preprocessing stage in my program consists of first smoothing the cropped image with gaussian blur, then performing edge detection using sobel, and using an adaptive threshold. As you can see the results are not great: Example Input from a card: Output: Another example Input from a card: Output: So the question is, when trying to preprocess a such images with non-uniform backgrounds for OCR, how does one segment the foreground text? Most of the difficult images I've encountered are taken under sunlight with a non-uniform background, where the sun casts strong shadows and makes bright spots in the image. Some possiblities I've been trying are using surf data and Harris corners, to no avail... Any suggestions and code would be much apreciated, and I can post more sample inputs if needed! This paper may help: Credit Card Processing Using Cell Phone Images

POSTED BY: Michael Sollami

5 Replies

Sort By:

Vitaliy Kaurov

Vitaliy Kaurov, WOLFRAM Research

Posted 12 years ago

By the way there is a discussion of a similar sort for a very tough case: Cleaning mildew from old documents using Mathematica

POSTED BY: Vitaliy Kaurov

Vitaliy Kaurov

Vitaliy Kaurov, WOLFRAM Research

Posted 12 years ago

What about running your custom OCR not on the final image but on the masks?

POSTED BY: Vitaliy Kaurov

Vitaliy Kaurov

Vitaliy Kaurov, WOLFRAM Research

Posted 12 years ago

First thing to note, cleaning background and OCR are different things. If one fails, does not mean the other is bad. So lets split it into two parts. 1) OCR: Optical Character Recognition choices and levels of difficulty: Domesticated: Use standard libraries which you already know Adventurous: Develop your own OCR take a look at Markus example of custom OCR for musical notation in this course: Serious and Not-So-Serious Image Processing Applications. This is a good option because All credit cards use the same font and size for numbers. Brave: Look into the future - Ef?cient Learning of Deep Boltzmann Machines or Restricted Boltzmann Machines in R 2) Cleaning background I will use a different image: i = Import["http://goo.gl/0uK5uV"] -------------- METHOD 1 -------------- Create a mask with double Binarize a = .55; b = .68; f[x_, a_, b_] := a < x < b; mask = RegionBinarize[i, Binarize[i, f[#[[1]], a, b] && f[#[[2]], a, b] && f[#[[3]], a, b] &], .21] Extract the digits ImageMultiply[i, Blur[Erosion[Dilation[mask, 4], 2], 10]] -------------- METHOD 2 -------------- Create a mask with ClusteringComponents cc = ClusteringComponents[i, 30]; maks = Table[Map[KroneckerDelta[k, #] &, cc, {2}], {k, 25, 30}] // Total // Colorize // Binarize // DeleteSmallComponents Extract the digits ImageMultiply[i, Blur[mask, 10]] // ImageAdjust

POSTED BY: Vitaliy Kaurov

Michael Sollami

Michael Sollami, Mustbin Inc.

Posted 12 years ago

Thanks Vitaliy! TextRecognize on the final image still doesn't work and my custom ocr gives the wrong answer too (9000 1239 5576 9000).

POSTED BY: Michael Sollami

Sean Clarke

Sean Clarke, Wolfram Research

Posted 12 years ago

Looking online at card.io, there doesn't seem to be much information available about how they do this, however, they seem to be doing a couple of things that would make the problem much easier. They first detect where the edge of the card is in the image. Since the numbers are placed in a uniform location on the card, they then know exactly where each group of four letters are. Maybe they use a projective transform in case the card is rotated a bit or something. The font used for the letters appears to be equally spaced, so in theory, if you have an accurate boundary of the card (which they do), you know location of each number individually and can segment the problem. This would make the recognition problem much eacher . So let's say I know the location of the numbers. I think this might be enough to try OCR already, but if I want to improve things further, I would try to estimate the background of the card. The function Inpaint has a number of algorithms for this: First I would use my knowledge of where the characters are to make a Mask. I've hand drawn one from your first example, but I think a program would do a much better job. I've adjusted the original image a bit since it was kinda dark, but this doesn't really seem to strongly affect how well the method below works. Let's just call the second image here "mask". Run the original image thru a "FastMarching" algorithm to get an estimation of the background: background = Inpaint[img, ColorNegate@mask, Method -> "FastMarching"] There's an estimation of the background values. I'm sure we could do better, but this will probably work fine. Now that we have the background estimated, we can look at the diffrence between the original image and the background: diff = ImageAdd[background, ColorNegate@img]

POSTED BY: Sean Clarke

Reply to this discussion

Reply Preview

Attachments

Remove Add a file to this post

Follow this discussion

or Discard

Feedback