DeleteBorder

DeleteBorder increases the size of an image I'm working on by a factor of 7. This seems strange for a function that is supposed to remove image material. Any ideas why this is happening or, more importantly, how to prevent it? Thanks for any insight. Cheers, Scott

POSTED BY: Scott Guthery
13 Replies
Posted 9 years ago

Scott, I'd rather not speculate, so could you post the code and the image you're working with?

POSTED BY: Eric Rimbey

Hello, Eric ...

Here's the code ...

pageImage = Import[inputFile];
pageImage=DeleteBorderComponents[pageImage];

I've attached the image.

Before DeleteBorderComponents I see 750,132 pixels. After DeleteBorderComponents I see 5,659,236.

Thanks in advance for any insight.

Cheers, Scott

POSTED BY: Scott Guthery
Posted 9 years ago

Scott, the jpg you attached seems to have dimensions 1844x3069, which is the size you report for the post DeleteBorderComponents image. Maybe you posted the wrong starting image? Or maybe you just miscalculated the size of the starting image?

POSTED BY: Eric Rimbey

The image is the correct one. The numbers I noted are the number of black pixels in what comes out of ImageData:

pageImage = Import[inputFile];
pageImage=DeleteBorderComponents[pageImage];
binPage = MorphologicalBinarize[pageImage];
Pixels = ImageData[binPage];

In other words, DeleteBorderComponents turns the entire image black (1844 x 3069 = 5659236). Without DeleteBorderComponents there are 750132 black pixels in the image.
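
For anyone who wants to reproduce this kind of count, here is a minimal sketch. It assumes a hypothetical 3x4 binary image in place of the scanned page, and the names blackCount and totalCount are only for illustration:

```wolfram
(* Hypothetical 3x4 binary test image standing in for the page: 0 = black, 1 = white *)
binPage = Image[{{0, 1, 1, 0}, {1, 1, 0, 0}, {1, 1, 1, 1}}, "Bit"];

(* ImageData returns the pixel values as a nested list of rows *)
pixels = ImageData[binPage];

(* Count the black pixels, i.e. the zeros at level 2 of the nested list *)
blackCount = Count[pixels, 0, {2}]

(* Total number of pixels, width times height *)
totalCount = Times @@ ImageDimensions[binPage]
```

If blackCount equals totalCount, as it did for the page above, the whole image has gone black.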

Cheers, Scott

POSTED BY: Scott Guthery
Posted 9 years ago

Oh, I completely misunderstood.

Hmm, I tried several manipulations of the original image before attempting DeleteBorderComponents (changing the color space, ColorNegate, etc.), and every time the result is a completely black image. DeleteBorderComponents doesn't seem to have any options for setting the threshold for what's considered background, which many other image manipulation functions do, so that's kind of weird. This is the first time I've ever tried using DeleteBorderComponents, so I'd have to do some further experimentation before I can speculate on a cause for this behavior with this image. I hope someone else in the community has suggestions. Sorry that I couldn't be more helpful.

POSTED BY: Eric Rimbey

DeleteBorderComponents expects to be run on a binary image, so you typically will want to run a segmentation function first to map the color or greyscale image to binary, and then perform the DeleteBorderComponents operation. Also worth noting is that when dealing with black on a white page you should negate the colors to turn the text into the components and the page into the background. I've attached the results for the following code:

im = Import["jqa_v12_p01.jpg"];
binary=MorphologicalBinarize[ColorNegate[im]];
deleted=DeleteBorderComponents[binary];

You can see the segmented text in the binary image (with the colors reversed to make the text white), and then the same image with the regions that touch the border removed. It looks like the important part for your example will be to then fiddle with different methods for binarizing the image such that not too much of the text itself is removed when deleting the border components. Plus, MorphologicalBinarize seems to render the text mostly unreadable, which is probably not what you wanted.
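
One way to compare binarization methods side by side is sketched below. This is only an illustration: the synthetic page (a gray rectangle for "text" plus a smudge touching the left border) and the threshold and radius values are assumptions, not tuned for the real scan, and the Association syntax needs version 10 or later.

```wolfram
(* Hypothetical stand-in for a scanned page: light background,
   a dark "text" block in the middle, a dark smudge on the left border *)
data = ConstantArray[0.9, {60, 100}];
data[[25 ;; 35, 40 ;; 60]] = 0.2;  (* text, not touching any border *)
data[[20 ;; 40, 1 ;; 10]] = 0.1;   (* smudge, touching the left border *)
page = Image[data];

(* Negate so that ink becomes foreground (white) and paper becomes background *)
negated = ColorNegate[page];

(* Several candidate binarizations; the parameter values are guesses *)
candidates = <|
   "Binarize, automatic threshold" -> Binarize[negated],
   "Binarize, threshold 0.5" -> Binarize[negated, 0.5],
   "LocalAdaptiveBinarize, radius 15" -> LocalAdaptiveBinarize[negated, 15],
   "MorphologicalBinarize" -> MorphologicalBinarize[negated]
   |>;

(* Remove border components from each candidate *)
cleaned = DeleteBorderComponents /@ candidates;

(* Number of foreground pixels that survive for each method *)
Total[ImageData[#], 2] & /@ cleaned
```

Comparing the surviving pixel counts (or simply looking at the images in cleaned) gives a rough idea of which method keeps the most text on a given page.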

Hopefully this helps clarify why DeleteBorderComponents didn't behave like you expected.

POSTED BY: Matthew Sottile
Posted 9 years ago

Nice explanation, Matthew!

POSTED BY: Eric Rimbey

Thanks much, Matthew. Greatly appreciated. And thanks to you too, Eric, for your patience and attention to the matter.

Cheers, Scott

POSTED BY: Scott Guthery

Hi all, here comes just a little remark. I am aware that I am not actually answering the original question, but I am just assuming that "cleaning" such an image is the ultimate goal here (thank you for this interesting and challenging task!). So this is what I tried:

A sample of the original image:

[image]

I end up with:

[image]

I started by partitioning the image, working on the parts. One of the resulting advantages is that the successive processes can then be parallelized. Basically I found NonlocalMeansFilter most helpful (but it takes hours to run!). My code:

ClearAll["Global`*"]
img0 = ColorConvert["<your image>", "Grayscale"];
partImgs0 = ImagePartition[img0, {200}];
partImgs1 = ParallelMap[ColorToneMapping@*ImageAdjust, partImgs0, {2}];
partImgs2 = ParallelMap[NonlocalMeansFilter[#, 6] &, partImgs1, {2}];
partImgs3 = ParallelMap[ImageAdjust[#, {2, -.2, 1.5}] &, partImgs2, {2}];
img1 = ImageAssemble[partImgs3]

(* dirty trick: drawing black vertical lines next to left/right borders
to prevent 'DeleteBorderComponents' from swallowing whole words *)
img1data = ImageData[img1];
Table[(img1data[[n, 10]] = 0; img1data[[n, -10]] = 0), {n, 1, Length[img1data]}];
img1a = Image[img1data]
img2 = DeleteBorderComponents[img1a]
img3 = ColorNegate@DeleteSmallComponents[img2, 100]

I am sure there still is room for lots of improvement!

Regards -- Henrik

POSTED BY: Henrik Schachner

Henrik ...

Thanks much for your comments and your code. Apologies for the delay in extending my thanks; I had to upgrade from 8 to 10 to understand the code. (Wolfram should credit you with the sale!) Cleaning followed by word extraction is indeed the objective. There is a tsunami of handwritten material being digitized. The time and expense of existing transcription methodologies coupled with declining population skills in cursive mean (IMHO, of course) we have to find new ways of searching this material. In passing, I'm curious about the @* notation in line 4, in particular the *. Is this just function composition of some sort?

Thanks again.

Cheers, Scott

POSTED BY: Scott Guthery

Hi Scott! Thanks for your nice reply! I hope that my use of @* was not the only reason you made the upgrade. You are right: @* means "composition", but this is not essential here; f@*g is equivalent to f[g[#]]&. But upgrading from v8 to v10 makes a lot of sense anyway! One major innovation is the introduction of Association, which is an improvement when working with large amounts of data ("There is a tsunami of handwritten material being digitized").
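
A small sketch of the equivalence, assuming version 10 or later for the @* operator; square and addOne are hypothetical helper functions used only for illustration:

```wolfram
(* Two hypothetical helper functions *)
square = #^2 &;
addOne = # + 1 &;

(* Operator form: the right-hand function is applied first *)
viaOperator = (square @* addOne)[3]

(* Full form of the same composition *)
viaComposition = Composition[square, addOne][3]

(* Equivalent pure function, which also works in older versions *)
viaPureFunction = square[addOne[#]] &[3]
```

All three expressions evaluate square[addOne[3]], which is 16.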

Regards -- Henrik

POSTED BY: Henrik Schachner

With reiterated thanks to Henrik for his code, I find that one must be careful with ImagePartition, as applying a function to the parts and reassembling them may give an image that differs from the result of applying the same function to the whole (unpartitioned) image.

Here's the unpartitioned code:

pageImage = ColorConvert[pageImage, "Grayscale"];
pageImage = ImageAdjust[pageImage]; (* in the partitioned code, {2} is ParallelMap's level specification, not an ImageAdjust argument *)
pageImage = ColorToneMapping[pageImage];
pageImage = ImageAdjust[pageImage, {2, -.2, 1.5}];
pageImage = ColorNegate[Binarize[pageImage]];
Print[pageImage];

And here's the partitioned code (via Henrik):

pageImage = ColorConvert[pageImage, "Grayscale"];
partImgs0 = ImagePartition[pageImage, {200}];
partImgs1 = ParallelMap[ColorToneMapping@*ImageAdjust, partImgs0, {2}];
partImgs3 = ParallelMap[ImageAdjust[#, {2, -.2, 1.5}] &, partImgs1, {2}];
pageImage = ImageAssemble[partImgs3];
pageImage = ColorNegate[Binarize[pageImage]];
Print[pageImage];

See attached for original image and the partitioned and unpartitioned results.

All just FYI.

Cheers, Scott

POSTED BY: Scott Guthery

Hi Scott,

yes, I observed as well that some tiles of the partition may look "different", but typically this occurs only when tiles contain (nearly) no "black ink", so I was not very concerned about it. My motivation was to enhance the readability of the handwriting, not to make the document as such look "nicer". My (maybe naive) idea was that the dynamic range of the image can be stretched much more by working on small partitions instead of the whole image. I would like to show a simple example of what I mean: Let's generate a test image:

ClearAll["Global`*"]
gcd = ColorData["GrayTones", "Image"];
lne = Graphics[{Thickness[.05], Black, Line[{{0, .5}, {1, .5}}]}];
img = Image @ Graphics[List @@@ {gcd, lne}, PlotRange -> {{0, 1}, {0.4, .6}}, ImageSize -> 600];
imgpart = First@ImagePartition[#, {150, 120}] &@img;

Then adjusting the whole image looks like:

[image]

the same done on tiles gives:

[image]

The fact that with ImagePartition (i.e. working on independent tiles) one can use parallelization is just a nice side effect.
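
The effect can also be checked numerically. The following is a minimal sketch, assuming a plain 600x120 gray ramp (a hypothetical stand-in for the test image above) and the default ImageAdjust, which rescales the levels to cover the range 0 to 1:

```wolfram
(* Hypothetical test image: a 600x120 horizontal gray ramp from 0 to 1 *)
ramp = Image[Table[(x - 1)/599., {y, 120}, {x, 600}]];

(* Adjusting the whole image: the levels already span 0 to 1 *)
whole = ImageAdjust[ramp];

(* Adjusting each 150x120 tile independently, then reassembling *)
tiles = ImagePartition[ramp, {150, 120}];
adjustedTiles = Map[ImageAdjust, tiles, {2}];
tiled = ImageAssemble[adjustedTiles];

(* Minimum and maximum gray level of each tile, before and after *)
minMax = {Min[#], Max[#]} &[ImageData[#]] &;
before = Map[minMax, tiles, {2}]
after = Map[minMax, adjustedTiles, {2}]
```

Before adjustment each tile covers only about a quarter of the gray range; afterwards every tile is stretched to the full range, which is why the assembled result differs from the adjusted whole image.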

Regards -- Henrik

POSTED BY: Henrik Schachner