Can TextRecognize read the image of boxed letters below? Unfortunately, TextRecognize in V11.0.1 does not recognize it correctly, so I tried to help it along with a few other Wolfram Language functions.
1. One-line image (Attachments: img1.jpg)
TextRecognize cannot recognize it.
TextRecognize[img1, Language -> "English"]
TextRecognize can partially recognize it once the outer border is removed with ImageCrop, but it misreads the gaps between the boxes as "l".
img3 = ImageCrop[img1, {1170, 140}]
TextRecognize[img3, Language -> "English"]
So I crop each box from the image. First I use EdgeDetect and ImageLines to find boundaries.
lines = ImageLines[EdgeDetect[img1, 9], 0.1];
HighlightImage[img1, Line /@ lines]
Next I find the coordinates at which the lines cross each other. "buf" is a margin that moves each coordinate slightly inside the box, so that the crop leaves out the border.
line = 1; char = 7; buf = 7;
row = (Take[lines, 2*line] // Sort);
rowpart = {#[[1]] + buf, #[[2]] - buf} & /@
Partition[row[[All, 1, 2]], 2];
col = (Take[lines, {2*line + 1, Length[lines]}] // Sort);
colpart = {#[[1]] + buf, #[[2]] - buf} & /@
Partition[col[[All, 1, 1]], 2];
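To make the buffer arithmetic concrete, here is a small worked example with made-up coordinates. The numbers are hypothetical and are not measured from img1; the sketch only assumes that the sorted line positions come in pairs, one pair per box, as the partitioning above expects.

```wolfram
(* Hypothetical line positions; the real values come from ImageLines *)
buf = 7;
ys = {20, 120};              (* y of the two horizontal lines *)
xs = {15, 160, 175, 320};    (* x of four vertical lines, i.e. two boxes *)

(* Pair consecutive positions, then move each end inward by buf *)
rowpart = {#[[1]] + buf, #[[2]] - buf} & /@ Partition[ys, 2]
(* {{27, 113}} *)
colpart = {#[[1]] + buf, #[[2]] - buf} & /@ Partition[xs, 2]
(* {{22, 153}, {182, 313}} *)
```

Each pair is one box interior: the lower bound moves up by buf and the upper bound moves down by buf, so a larger buf trims more of the border at the cost of a smaller crop.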
Now I can crop each box from the image.
imglist =
  Table[ImageTake[img1, rowpart[[i]], colpart[[j]]],
   {i, 1, line}, {j, 1, char}];
ImageResize[#, 70] & /@ imglist[[1]]
I assemble all the box images.
imgtake = ImageAssemble[Flatten[imglist]]
TextRecognize now recognizes it perfectly.
TextRecognize[imgtake, Language -> "English"]
2. Two-line image (Attachments: img2.jpg)
TextRecognize cannot recognize it.
TextRecognize[img2, Language -> "English"]
TextRecognize almost recognizes it once the borders are removed with ImageCrop.
img4 = ImageCrop[img2, {1165, 305}]
TextRecognize[img4, Language -> "English"]
As before, I crop each box from the image so that TextRecognize can recognize it perfectly.
lines = ImageLines[EdgeDetect[img2, 9], 0.1];
HighlightImage[img2, Line /@ lines]
Again I find the coordinates at which the lines cross each other, this time with two rows of boxes (line = 2).
line = 2; char = 7; buf = 7;
row = (Take[lines, 2*line] // Sort);
rowpart = {#[[1]] + buf, #[[2]] - buf} & /@
Partition[row[[All, 1, 2]], 2];
col = (Take[lines, {2*line + 1, Length[lines]}] // Sort);
colpart = {#[[1]] + buf, #[[2]] - buf} & /@
Partition[col[[All, 1, 1]], 2];
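The code above treats the first 2*line entries returned by ImageLines as the horizontal lines and the rest as the vertical ones. If that ordering does not hold for another image, one possible alternative is to classify each line by its direction instead. This is an assumption, not something tested on img2; the segments below are made up and horizontalQ is a hypothetical helper name.

```wolfram
(* Hypothetical segments in the {{x1, y1}, {x2, y2}} form used above *)
segs = {{{0, 20}, {500, 21}}, {{160, 0}, {161, 140}},
   {{0, 120}, {500, 119}}, {{15, 0}, {15, 140}}};

(* A segment counts as horizontal when it spans more in x than in y *)
horizontalQ[{{x1_, y1_}, {x2_, y2_}}] := Abs[x2 - x1] > Abs[y2 - y1]

(* Horizontal lines sorted by y, vertical lines sorted by x *)
rows = SortBy[Select[segs, horizontalQ], #[[1, 2]] &];
cols = SortBy[Select[segs, ! horizontalQ[#] &], #[[1, 1]] &];

rows[[All, 1, 2]]   (* {20, 120} *)
cols[[All, 1, 1]]   (* {15, 160} *)
```

Sorting explicitly by the y or x value of the first endpoint also avoids depending on how Sort orders the full coordinate lists.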
Now I can crop each box from the image.
imglist =
  Table[ImageTake[img2, rowpart[[i]], colpart[[j]]],
   {i, 1, line}, {j, 1, char}];
ImageResize[#, 70] & /@ Flatten[imglist]
I assemble all the box images. However, ImageAssemble expects the images in one row to have the same height, so I first crop every box to the smallest width and height found among them.
dims = Min /@ Transpose[ImageDimensions /@ Flatten[imglist]];
imgtake =
  ImageAssemble[
   Flatten[imglist] /.
    x_Image :> ImageCrop[x, dims, Padding -> Automatic]]
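The size matching can be tried in isolation. The following is a minimal sketch that uses two synthetic images (created with ConstantImage, purely as stand-ins for the cropped boxes) to show how the common dimensions are derived and applied.

```wolfram
(* Two synthetic images of different sizes stand in for the cropped boxes *)
boxes = {ConstantImage[0.2, {10, 12}], ConstantImage[0.8, {8, 14}]};

(* Smallest width and smallest height over all boxes *)
dims = Min /@ Transpose[ImageDimensions /@ boxes]
(* {8, 12} *)

(* Center-crop every box to that common size *)
same = ImageCrop[#, dims] & /@ boxes;
ImageDimensions /@ same
(* {{8, 12}, {8, 12}} *)

(* With equal sizes the boxes can be joined into one row *)
ImageAssemble[same]
```

Because dims holds the minimum in each direction, every crop only removes pixels, so no padding is actually added in this sketch.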
TextRecognize now recognizes it perfectly.
TextRecognize[imgtake, Language -> "English"]