GROUPS: External Programs and Systems, Import and Export, Wolfram Language

Posted 2 years ago

I would like to copy a command from a book on Mathematica in PDF format (undoubtedly created with MMA) and paste it into a Notebook so that it can be executed. For example, I select this text in the PDF file (I am typing it in by hand here):

Table[RandomSample[Range[39], 7] // Sort, {10 ^ 6}];

When I paste it into a Notebook, it appears like this:

Table@RandomSample@Range@39D, 7D êê Sort, 810 ^ 6<D;

BTW: the character code for '@' maps to the glyph '[' in the Mathematica1Mono font. Using `Style[<cmd>, FontFamily -> Mathematica1Mono]` doesn't help, and neither does importing the PDF into MMA. This prevents me from experimenting and educating myself using some great online resources.

POSTED BY: James M Marks

4 Replies

Posted 2 years ago

I see Roland mentioned a snipping tool for Windows. Modern versions of macOS will recognize text in files (PDF and others) opened in Preview. If you open the PDF in Preview, you can select a text block, copy it, and paste it elsewhere.

There's also an indie Mac app called TextSniper that lets you quickly copy text from anywhere on your screen. It will even recognize text from a video, something that still seems like magic to me. TextSniper is far more convenient than the Apple-supplied text scanner in Preview. The app is listed for eight bucks. It's a good tool from a good developer.

All of these methods will produce some conversion errors.

POSTED BY: Phil Earnhardt

Posted 2 years ago

Update: As I explained in my posting about TextRecognize, PDF files saved from Mathematica contain their input text together with its rendered graphics. You can reimport the Notebook text with

Import["filepath\\file.pdf", {"PageFormattedText"}]

POSTED BY: Roland Franzius

Posted 2 years ago

Thank you, Roland, for your discussion. I tried your approach using a simple example from an online MMA book. As you can see from this small Notebook, it was not entirely successful. The problem is that TextRecognize does not understand the MMA symbol for Rule[] (and many other symbols which are unique to MMA): https://www.wolframcloud.com/obj/e3fcc77f-d8a1-4fda-bb8e-1c91b18b2ec5

When I look at the raw source of the PDF page, I see that it switches fonts hundreds of times, sometimes on a character-by-character basis. All of the fonts are actually stored in the PDF (to guarantee portability), including some MMA-private fonts. I'm in the process of writing code to parse the raw source and substitute an appropriate ASCII string whenever I encounter a symbol that uses an MMA font. I will update this discussion with the results of this project.

POSTED BY: James M Marks
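
The font-based substitution James describes might look roughly like the sketch below. It is an illustration, not his code. It assumes the PDF has already been split into {fontName, text} runs, which is the hard part and is not shown. The name `fixRun`, the font name "Courier", and the run list are hypothetical. Of the glyph table, only '@' -> '[' is stated in the thread; the other entries are inferred from the single pasted example above and may be incomplete or wrong.

```wolfram
(* Hypothetical glyph table for the Mathematica1Mono font.
   Only "@" -> "[" comes from the thread; the rest are inferred
   from the one pasted example and are assumptions. *)
glyphRules = {"@" -> "[", "D" -> "]", "8" -> "{", "<" -> "}", "ê" -> "/"};

(* Substitute glyphs only in runs set in the private font;
   text in any other font passes through unchanged. *)
fixRun[{"Mathematica1Mono", s_String}] := StringReplace[s, glyphRules];
fixRun[{_String, s_String}] := s;

(* Hypothetical parser output: {fontName, text} runs for one line. *)
runs = {
   {"Courier", "Table"}, {"Mathematica1Mono", "@"},
   {"Courier", "RandomSample"}, {"Mathematica1Mono", "@"},
   {"Courier", "Range"}, {"Mathematica1Mono", "@"},
   {"Courier", "39"}, {"Mathematica1Mono", "D"},
   {"Courier", ", 7"}, {"Mathematica1Mono", "D"},
   {"Courier", " "}, {"Mathematica1Mono", "êê"},
   {"Courier", " Sort, "}, {"Mathematica1Mono", "8"},
   {"Courier", "10 ^ 6"}, {"Mathematica1Mono", "<D"},
   {"Courier", ";"}
   };

(* Join the translated runs back into one input string. *)
repaired = StringJoin[fixRun /@ runs]
```

With these assumed runs, `repaired` is the string Table[RandomSample[Range[39], 7] // Sort, {10 ^ 6}]; from the original question. Keying the substitution on the font is what keeps the '8' in the private font (a brace) distinct from an ordinary digit; a plain character replacement on the pasted text could not tell them apart.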

Posted 2 years ago

Hi James, in Windows press Shift + Windows + "S" to call the snipping tool. Paste the snipped picture as an image into

ToExpression@TextRecognize[picture, Language -> "English"]

You have to allow internet access, and you have to wait a few minutes while the repository downloads from the server. Regards Roland

POSTED BY: Roland Franzius
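
Since OCR can misread brackets and symbols, a more cautious variant of Roland's one-liner might check the recognized string and keep it unevaluated until it has been inspected. This is a minimal sketch under stated assumptions: `picture` here is a rasterized stand-in so the example is self-contained (in practice it would be the pasted snip), and the `Missing["NotParsable", ...]` fallback is an arbitrary choice for illustration.

```wolfram
(* Stand-in for a screenshot: rasterize a short command.
   In practice `picture` would be the pasted snip. *)
picture = Rasterize[
   Style["Table[RandomSample[Range[39], 7], {3}]", 24,
    FontFamily -> "Courier"]];

(* OCR step, as in Roland's post. *)
recognized = TextRecognize[picture, Language -> "English"];

(* Parse only if the OCR result is syntactically valid input, and
   wrap it in HoldComplete so nothing is evaluated yet. *)
held = If[StringQ[recognized] && SyntaxQ[recognized],
  ToExpression[recognized, InputForm, HoldComplete],
  Missing["NotParsable", recognized]]
```

If `held` comes back as a HoldComplete expression that looks right, ReleaseHold[held] evaluates it; otherwise the Missing result carries the raw OCR text for manual correction. What TextRecognize returns depends on the image, so no particular output is claimed here.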
