WOLFRAM COMMUNITY

Recognize box letters

Kotaro Okazaki, FTI

Posted 7 years ago

Can TextRecognize read the box letters in the attached images? Unfortunately, TextRecognize in V11.0.1 does not recognize them correctly, so I try to help it along with a few other Wolfram Language functions.

1. One-line image (attachment: img1.jpg)

TextRecognize cannot recognize the original image.

TextRecognize[img1, Language -> "English"]

After removing the outer border with ImageCrop, TextRecognize recognizes part of the text, but it misreads the gap between box letters as "l".

img3 = ImageCrop[img1, {1170, 140}]

TextRecognize[img3, Language -> "English"]
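Given a {width, height} size, ImageCrop keeps the centered region of that size, so it trims the border evenly on opposite sides. A minimal sketch on a synthetic stand-in image (the 1200x160 size is an assumption for illustration, not the real size of img1):

```wolfram
(* Hypothetical stand-in for img1: a plain white 1200x160 image. *)
demo = ConstantImage[1, {1200, 160}];

(* Keep the centered 1170x140 region: 15 px trimmed left and right,
   10 px trimmed top and bottom. *)
cropped = ImageCrop[demo, {1170, 140}];
ImageDimensions[cropped]  (* {1170, 140} *)
```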

So I crop each box out of the image instead. First, I use EdgeDetect and ImageLines to find the box boundaries.

lines = ImageLines[EdgeDetect[img1, 9], 0.1];
HighlightImage[img1, Line /@ lines]

Next, I find the coordinates where the lines cross each other. "buf" is a margin in pixels that moves each crop boundary slightly inside the box, so the border itself is left out.

line = 1; char = 7; buf = 7;
row = (Take[lines, 2*line] // Sort);
rowpart = {#[[1]] + buf, #[[2]] - buf} & /@ 
   Partition[row[[All, 1, 2]], 2];
col = (Take[lines, {2*line + 1, Length[lines]}] // Sort);
colpart = {#[[1]] + buf, #[[2]] - buf} & /@ 
   Partition[col[[All, 1, 1]], 2];
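The code above relies on ImageLines returning the horizontal lines first. As a hedged alternative, the lines can be split by orientation before pairing them up. The sketch below uses made-up endpoint pairs for two boxes rather than real ImageLines output, and horizontalQ and inset are hypothetical helper names:

```wolfram
(* Made-up endpoint pairs {{x1, y1}, {x2, y2}}: 2 horizontal lines and
   4 vertical lines (left and right edge of two boxes), in mixed order. *)
lines = {{{0, 130}, {300, 130}}, {{150, 0}, {150, 140}},
   {{0, 10}, {300, 10}}, {{20, 0}, {20, 140}},
   {{280, 0}, {280, 140}}, {{170, 0}, {170, 140}}};
buf = 7;

(* A line is horizontal if it spans more in x than in y. *)
horizontalQ[{{x1_, y1_}, {x2_, y2_}}] := Abs[x2 - x1] > Abs[y2 - y1];

(* Sorted y positions of horizontal lines, x positions of vertical lines. *)
ys = Sort[Select[lines, horizontalQ][[All, 1, 2]]];
xs = Sort[Select[lines, ! horizontalQ[#] &][[All, 1, 1]]];

(* Pair consecutive boundaries and move each pair inward by buf. *)
inset[{lo_, hi_}] := {lo + buf, hi - buf};
rowpart = inset /@ Partition[ys, 2]  (* {{17, 123}} *)
colpart = inset /@ Partition[xs, 2]  (* {{27, 143}, {177, 273}} *)
```

This does not depend on the order in which the lines are returned, at the cost of assuming that every boundary is close to horizontal or vertical.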

Now I can crop each box from the image.

imglist = Table[
   ImageTake[img1, rowpart[[i]], colpart[[j]]],
   {i, 1, line}, {j, 1, char}];
ImageResize[#, 70] & /@ imglist[[1]]
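One detail that may be worth checking: ImageTake counts rows from the top of the image, whereas image coordinates measure y from the bottom. For a grid that is vertically centered, both readings select nearly the same rows, which may be why no conversion is needed here; that is an assumption about the attached images. A minimal sketch of an explicit conversion on a synthetic image, where yToRows is a hypothetical helper:

```wolfram
(* Synthetic image, 60 wide and 100 high: top 40 rows white, rest black. *)
h = 100;
demo = Image[Table[If[r <= 40, 1., 0.], {r, h}, {c, 60}]];

(* Hypothetical helper: turn a y-interval measured from the bottom edge
   into a row range counted from the top edge, as ImageTake expects. *)
yToRows[{ylo_, yhi_}, height_] := {height - yhi + 1, height - ylo};

rows = yToRows[{60, 100}, h]  (* {1, 40} *)
top = ImageTake[demo, rows, {1, 60}];
ImageDimensions[top]          (* {60, 40} *)
```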

I assemble all the box images into a single row.

imgtake = ImageAssemble[Flatten[imglist]]

TextRecognize now recognizes the text perfectly.

TextRecognize[imgtake, Language -> "English"]

2. Two-line image (attachment: img2.jpg)

TextRecognize cannot recognize the original image.

TextRecognize[img2, Language -> "English"]

After removing the outer border with ImageCrop, TextRecognize recognizes almost all of the text.

img4 = ImageCrop[img2, {1165, 305}]

TextRecognize[img4, Language -> "English"]

To get a perfect result, I again crop each box from the image.

lines = ImageLines[EdgeDetect[img2, 9], 0.1];
HighlightImage[img2, Line /@ lines]

I find the coordinates where the lines cross each other.

line = 2; char = 7; buf = 7;
row = (Take[lines, 2*line] // Sort);
rowpart = {#[[1]] + buf, #[[2]] - buf} & /@ 
   Partition[row[[All, 1, 2]], 2];
col = (Take[lines, {2*line + 1, Length[lines]}] // Sort);
colpart = {#[[1]] + buf, #[[2]] - buf} & /@ 
   Partition[col[[All, 1, 1]], 2];

Now I can crop each box from the image.

imglist = Table[
   ImageTake[img2, rowpart[[i]], colpart[[j]]],
   {i, 1, line}, {j, 1, char}];
ImageResize[#, 70] & /@ Flatten[imglist]

I assemble all the box images. However, ImageAssemble expects all images in one row to have the same height, so I first crop every box to the smallest width and height found among them.

dims = Min /@ Transpose[ImageDimensions /@ Flatten[imglist]];
imgtake = 
 ImageAssemble[
  Flatten[imglist] /. 
   x_Image :> ImageCrop[x, dims, Padding -> Automatic]]
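The size normalization above can be tried in isolation. This sketch uses three plain synthetic images of slightly different sizes as stand-ins for the cropped boxes (the sizes are invented for illustration):

```wolfram
(* Stand-ins for cropped boxes with slightly different dimensions. *)
imgs = {ConstantImage[1, {50, 40}], ConstantImage[0.5, {48, 42}],
   ConstantImage[0, {52, 41}]};

(* Smallest width and smallest height over all images. *)
dims = Min /@ Transpose[ImageDimensions /@ imgs]  (* {48, 40} *)

(* Center-crop every image to the common size, then join them in a row. *)
cropped = ImageCrop[#, dims] & /@ imgs;
ImageDimensions[ImageAssemble[cropped]]  (* {144, 40} *)
```

Because dims is the minimum in each direction, the crop never has to enlarge an image, so no padding is actually added.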

TextRecognize now recognizes the text perfectly.

TextRecognize[imgtake, Language -> "English"]
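For reuse, the steps of both examples can be collected into one function. This is only a sketch: recognizeBoxes is a hypothetical name, and it keeps the same assumptions as the code above, namely that ImageLines returns endpoint pairs with the horizontal lines first, and that the numbers of text lines and characters are known in advance.

```wolfram
(* Hypothetical wrapper around the steps in this post.
   nRows: number of text lines, nChars: boxes per line, buf: inner margin. *)
recognizeBoxes[img_Image, nRows_Integer, nChars_Integer, buf_Integer: 7] :=
 Module[{lines, inset, rowpart, colpart, boxes, dims},
  lines = ImageLines[EdgeDetect[img, 9], 0.1];
  inset = {#[[1]] + buf, #[[2]] - buf} &;
  (* Assumes the first 2 nRows lines are the horizontal boundaries. *)
  rowpart = inset /@ Partition[Sort[Take[lines, 2 nRows]][[All, 1, 2]], 2];
  colpart = inset /@ Partition[Sort[Drop[lines, 2 nRows]][[All, 1, 1]], 2];
  boxes = Flatten[
    Table[ImageTake[img, rowpart[[i]], colpart[[j]]],
     {i, nRows}, {j, nChars}]];
  (* Crop all boxes to a common size so they can be assembled in one row. *)
  dims = Min /@ Transpose[ImageDimensions /@ boxes];
  TextRecognize[ImageAssemble[ImageCrop[#, dims] & /@ boxes],
   Language -> "English"]]
```

With the attachments loaded, the two cases above would then be recognizeBoxes[img1, 1, 7] and recognizeBoxes[img2, 2, 7].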
