Using OCR to copy Math content to notebook?

Posted 3 years ago

Transcribing mathematics from books, articles, and classroom notes into a Mathematica notebook by hand is time consuming (I am legally blind and have problems with my fingers). Yes, autocomplete in Mathematica makes typing much faster, but it is not as smart as it could be, and I would rather work smart than hard. ;-) Over ten years ago I read papers on OCR of math equations, and some looked promising. With all the advances in machine learning, I expected that by now math OCR (MOCR) would be available within Mathematica or in some other software tool I can buy. Even a tool that gets me 80% of the way there would be helpful.

Speaking of working hard, has anybody written a tutorial (or tutorials), comprehensive please, on inputting equations from textbooks etc. into Mathematica for study? I am looking for coverage of definitions that use ". . ." a lot. Math teachers who use Mathematica need to share, in one notebook, training on transcribing from a textbook to a notebook. I have found only small fragments over the past decade.

Thanks, Andrew

POSTED BY: Andrew Meit

Getting the math from a textbook into Mathematica in a computable format takes a multi-step process. The best bet is to get a textbook that is in HTML+MathML, or to get the LaTeX version from the publisher. BookShare from Benetech has a lot of accessible math books, where "accessible" sometimes means the formulas are not images or have legitimate alt text. You can then pass the alt text through Wolfram|Alpha to convert it to something computable. You can import MathML, but textbooks rarely use computable MathML and opt for presentation MathML instead. So then you might need to use something like Mathpix to convert the image into a nearly computable format, or take the Wolfram|Alpha-compatible format it gives and pass that through Wolfram|Alpha: https://docs.mathpix.com/#introduction. This process is still very manual, but there are people working on PDF parsers that attempt to solve it, like InftyReader.
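To make the presentation vs. computable MathML distinction concrete, here is a minimal Python sketch (not Mathematica code, and not part of any tool named above) that guesses which kind of MathML a fragment uses by looking at its element names. The helper `classify_mathml` and the two tag lists are hypothetical and deliberately incomplete; the assumption is that markup built from elements like `apply`, `ci`, and `cn` (what the MathML spec calls content MathML) is the computable kind, while `mrow`, `mi`, `mo`, and similar elements only describe layout.

```python
import xml.etree.ElementTree as ET

# Small, non-exhaustive samples of element names from each MathML vocabulary.
# These sets are illustrative assumptions, not a complete list from the spec.
PRESENTATION_TAGS = {"mrow", "mi", "mn", "mo", "msup", "msub", "mfrac", "msqrt"}
CONTENT_TAGS = {"apply", "ci", "cn", "plus", "minus", "times", "divide", "power", "eq"}


def local_name(tag):
    """Strip an XML namespace prefix such as {http://www.w3.org/1998/Math/MathML}."""
    return tag.rsplit("}", 1)[-1]


def classify_mathml(markup):
    """Return 'content', 'presentation', 'mixed', or 'unknown' (hypothetical helper)."""
    names = {local_name(el.tag) for el in ET.fromstring(markup).iter()}
    has_content = bool(names & CONTENT_TAGS)
    has_presentation = bool(names & PRESENTATION_TAGS)
    if has_content and has_presentation:
        return "mixed"
    if has_content:
        return "content"
    if has_presentation:
        return "presentation"
    return "unknown"


# x squared, written in each vocabulary.
presentation_sample = "<math><msup><mi>x</mi><mn>2</mn></msup></math>"
content_sample = "<math><apply><power/><ci>x</ci><cn>2</cn></apply></math>"

print(classify_mathml(presentation_sample))
print(classify_mathml(content_sample))
```

A check like this could be run over the formulas extracted from an HTML textbook to see how many would need the extra conversion step before they are usable for computation.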

POSTED BY: Kyle Keane