A faster way to load content from URLs

Posted 11 years ago
The Wolfram Language comes with a simple and useful function, URLFetch. It takes a URL as its
argument and downloads the content stored at that URL. On the Raspberry Pi this function
can be a bit slow, as can be seen below. The first evaluation is exceptionally slow because it includes
the autoloading and initialization of the required functionality:

In[1]:= URLFetch["http://www.wolfram.com"]; // AbsoluteTiming

Out[1]= {18.163791, Null}

In[2]:= URLFetch["http://www.wolfram.com"]; // AbsoluteTiming

Out[2]= {1.952269, Null}


Evaluating this URLFetch command repeatedly will give timings between 1.5 and 3.0 seconds.

It is possible to speed this up by using lower-level functions from the CURLLink package:

 Needs["HTTPClient`CURLLink`"];
 CURLInitialize[];
 
 FastURLFetch[url_] := Module[{handle, encoding, result},
   handle = CURLHandleLoad[];
   (* Set up a plain GET request for the given URL *)
   CURLOption[handle, "CURLOPT_URL", url];
   CURLOption[handle, "CURLOPT_CUSTOMREQUEST", "GET"];
   (* Collect both the response body and the headers in memory *)
   CURLOption[handle, "CURLOPT_WRITEFUNCTION", "WRITE_MEMORY"];
   CURLOption[handle, "CURLOPT_WRITEDATA", "MemoryPointer"];
   CURLOption[handle, "CURLOPT_HEADERFUNCTION", "WRITE_MEMORY"];
   CURLOption[handle, "CURLOPT_WRITEHEADER", "MemoryPointer"];
   CURLPerform[handle];
   (* Read the charset from the headers; fall back to UTF-8 if none is recognized *)
   encoding = StringCases[ToLowerCase[CURLHeaderData[handle]],
      "charset=" ~~ enc : ("utf-8" | "iso-8859-1") :> enc] /.
     {({} | {"utf-8"}) -> "UTF8", {"iso-8859-1"} -> "ISOLatin1"};
   (* Decode the raw bytes into a string, then release the handle *)
   result = FromCharacterCode[CURLRawContentData[handle], encoding];
   CURLHandleUnload[handle];
   result]
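
The charset detection inside FastURLFetch can be tried on its own, without any network access. The sketch below pulls that one expression out into a helper; the name detectEncoding and the sample header strings are made up for illustration and are not part of the CURLLink package:

```wolfram
(* detectEncoding is a hypothetical helper that isolates the charset rule
   used in FastURLFetch. It maps a raw header string to an encoding name
   that FromCharacterCode understands. *)
detectEncoding[header_String] :=
  StringCases[ToLowerCase[header],
    "charset=" ~~ enc : ("utf-8" | "iso-8859-1") :> enc] /.
   {({} | {"utf-8"}) -> "UTF8", {"iso-8859-1"} -> "ISOLatin1"}

(* Sample headers, invented for this example *)
detectEncoding["Content-Type: text/html; charset=ISO-8859-1"]  (* "ISOLatin1" *)
detectEncoding["Content-Type: text/html; charset=UTF-8"]       (* "UTF8" *)
detectEncoding["Content-Type: text/html"]                      (* "UTF8", the fallback *)
```

Note that the rule only recognizes these two charsets, and anything else falls back to UTF-8. If the header data contained more than one recognized charset declaration, the match list would have two elements and neither replacement rule would apply, so a more defensive version might take only the first match.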


This will give timings between 0.6 and 1.0 seconds, which is significantly better than URLFetch and very close to the timings seen with a command-line tool like 'wget':

In[1]:= FastURLFetch["http://www.wolfram.com"]; // AbsoluteTiming

Out[1]= {0.706224, Null}

In[2]:= FastURLFetch["http://www.wolfram.com"]; // AbsoluteTiming

Out[2]= {0.824583, Null}
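
To see the range of timings rather than single measurements, the same call can be repeated in a loop. This is only a sketch: timeRuns is a hypothetical helper name, the run count of 10 is arbitrary, and the numbers will depend on the network and the machine:

```wolfram
(* timeRuns is a hypothetical helper, not part of the original post.
   It evaluates f[url] n times and returns the wall-clock time of each
   call in seconds. *)
timeRuns[f_, url_String, n_Integer?Positive] :=
  Table[First[AbsoluteTiming[f[url];]], {n}]

(* summarize gives {fastest, slowest, average} for a list of timings *)
summarize[times_List] := {Min[times], Max[times], Mean[times]}

(* Usage, assuming FastURLFetch has been defined as above.
   Rest drops the first run, which includes the one-time loading cost. *)
(* summarize[Rest[timeRuns[URLFetch, "http://www.wolfram.com", 10]]]     *)
(* summarize[Rest[timeRuns[FastURLFetch, "http://www.wolfram.com", 10]]] *)
```
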
POSTED BY: Arnoud Buzing