
Import a web page with latex code

Posted 10 years ago

In brief: I would like to import a web page, so I write

Import["http://fabricebaudoin.wordpress.com/2012/10/25/lecture-32-the-malliavin-derivative/"]

which gives a notebook without the PNG images that the page uses to display its LaTeX code. Is there any way to load those images, or the LaTeX behind them, for example by exploring the page source?

[Updated version]

You can get all the image links using

Import["http://fabricebaudoin.wordpress.com/2012/10/25/lecture-32-the-malliavin-derivative/", "ImageLinks"]

and get all the images themselves using

Import["http://fabricebaudoin.wordpress.com/2012/10/25/lecture-32-the-malliavin-derivative/", "Images"]

You can also create a list of URL-to-image replacement rules for later use, with something like:

With[{url = 
   "http://fabricebaudoin.wordpress.com/2012/10/25/lecture-32-the-malliavin-derivative/"},
 Thread[Import[url, "ImageLinks"] -> Import[url, "Images"]]
 ]

Note that the WordPress page in question has presumably created the images of the expressions from the original LaTeX code using some WordPress plugin. The LaTeX code is contained in the image URLs themselves.

I only spent a little time on this, and it does not work in a number of cases, presumably because either some content in the LaTeX code is escaped or the LaTeX code is not being extracted properly. So I will give this to you as a starting point, and perhaps you can ferret out the bugs.

Here is the initial stab at the code. It is buggy, but some cases work.

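teXLinkRules =
  Module[{links},
   (* all image links on the page *)
   links = 
    Import["http://fabricebaudoin.wordpress.com/2012/10/25/lecture-32-the-malliavin-derivative/", "ImageLinks"];
   (* keep only the links that carry a latex= query parameter *)
   links = Select[links, StringMatchQ[#, "*?latex=*"] &];
   (* map each link to the decoded text between "?latex=" and the next "&" *)
   (# -> URLDecode[StringSplit[#, {"?latex=", "&"}][[2]]]) & /@ links
   ];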

This gives a set of rules that transform links into TeX strings (although, given the bugs, perhaps not in every case).
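If the extraction step turns out to be at fault, it can be checked on a single link in isolation. The following is only a sketch, assuming the TeX source always sits in a latex= query parameter; the helper name extractTeX and the sample link are made up for illustration and are not taken from the page.

```mathematica
(* Hypothetical helper: pull the value of the latex= parameter out of a link. *)
(* Returns Missing["NoTeX"] when the link has no such parameter. *)
extractTeX[link_String] :=
 Module[{hits},
  (* take everything after "latex=" up to the next "&", then percent-decode it *)
  hits = StringCases[link,
    "latex=" ~~ tex : (Except["&"] ..) :> URLDecode[tex], 1];
  If[hits === {}, Missing["NoTeX"], First[hits]]
  ]

(* Made-up sample link in the style of the WordPress image URLs *)
extractTeX["http://s0.wp.com/latex.php?latex=x%5E2&bg=ffffff&fg=000000&s=0"]
```

Comparing the decoded string with what the page displays should show whether the problem lies in the extraction or in the later conversion to an expression.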

Then you can turn the TeX content to Mathematica formatted expressions as follows:

ToExpression[#, TeXForm] & /@ (Last /@ teXLinkRules)

In the output you can see which cases work and which ones return $Failed.

I hope this gives you a start.

Note added: the failed cases may have to do with the presence of commas. Try removing them with StringDelete to see if that helps.
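The comma idea can be tried as a quick diagnostic along these lines. This is a sketch only: sampleTeX is a made-up stand-in for Last /@ teXLinkRules, and tryTeX is a hypothetical helper name.

```mathematica
(* Made-up sample TeX strings standing in for Last /@ teXLinkRules *)
sampleTeX = {"x^2", "f(s,t)"};

(* Hypothetical helper: delete commas, then attempt the conversion. *)
(* Any message during conversion is treated as a failure. *)
tryTeX[tex_String] :=
 Quiet@Check[ToExpression[StringDelete[tex, ","], TeXForm], $Failed]

(* Pair each original string with its result so failures are easy to spot *)
Thread[sampleTeX -> (tryTeX /@ sampleTeX)]
```

Keep in mind that deleting commas changes the meaning of the TeX (for example, it merges function arguments), so this is useful for locating the cause of a failure rather than as a final fix.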

POSTED BY: David Reiss