Accelerating data download from Wolfram curated database?

Hello all! I have the following problem. I need to calculate geo-distances between the administrative divisions of Poland. Poland has three levels of division. It is easy to get the divisions from the curated database like this:

level1 = Entity["Country", "Poland"][EntityProperty["Country", "AdministrativeDivisions"]];
level2 = AdministrativeDivisionData[#, "Subdivisions"] & /@ level1;
level2 = Flatten[level2];

To get geo-distances for the first level I can do this:

level1$dists = Outer[GeoDistance[#1, #2, DistanceFunction -> "Center"] &, level1, level1];

The above code takes some time to evaluate, but nothing terribly long. However, the second level of division has 373 regions, so with the above construction I need to calculate 139129 geo-distances. After dropping all repeated pairs and the distances from a region to itself, there are still 69378 geo-distances to compute. Each call to GeoDistance[] takes about 5 seconds if DistanceFunction -> "Center" is used and about 0.5 - 1 second with default options. This means that the computations should take about 15 hours in the best scenario.
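The counts and the rough time estimate above can be checked with a few lines. This is only a quick sketch: the variable name n is illustrative, and the per-call times are the ones quoted above for the default options.

```wolfram
n = 373;                        (* number of second-level regions *)
n^2                             (* all ordered pairs: 139129 *)
Binomial[n, 2]                  (* unordered pairs of distinct regions: 69378 *)
Binomial[n, 2]*{0.5, 1.}/3600.  (* hours at 0.5 s and 1 s per call: about 9.6 and 19.3 *)
```

The figure of about 15 hours sits in the middle of that range, at roughly 0.75 seconds per call.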

Question: I think that most of the computation time is spent connecting to the Wolfram servers. Hence the question: can I make this computation in bulk, like a "transaction", with a single connection? Maybe I am missing something and there is a better way of solving the above problem? Any help will be greatly appreciated!

POSTED BY: Michal Ramsza
2 Replies

Thank you very much. Using the listability of GeoDistance makes it way faster. Also, it is indeed better to partition a long list into shorter lists, because otherwise one can get operation timeout errors. I have the following results.

In[2]:= level1 = Entity["Country", "Poland"]["AdministrativeDivisions"];
In[3]:= GeoDistance[level1, level1, DistanceFunction -> "Center"]; // AbsoluteTiming

Out[3]= {1.11975, Null}

So, this is fast enough for my purposes. Thank you again. Interestingly, we have

In[7]:= Attributes[GeoDistance]

Out[7]= {Protected, ReadProtected}

So, the attributes do not show that GeoDistance[] is listable: it accepts lists, but it does not have the Listable attribute.

POSTED BY: Michal Ramsza

You should take advantage of GeoDistance listability.

In[2]:= level1$dists = 
   GeoDistance[level1, level1, 
    DistanceFunction -> "Center"]; // AbsoluteTiming

Out[2]= {0.91751, Null}

It is also fast with the default options:

In[3]:= level1$dists2 = GeoDistance[level1, level1]; // AbsoluteTiming

Out[3]= {1.34659, Null}

But for a list as long as level2, you should partition it yourself and call GeoDistance on the groups. Instead of using Partition, you can also directly use the natural partition given by the first-level administrative divisions, something like:

In[4]:= level2u = AdministrativeDivisionData[#, "Subdivisions"] & /@ level1;

In[5]:= level2$dist = Outer[GeoDistance, level2u, level2u, 1]; // AbsoluteTiming

Out[5]= {288.062, Null}
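
If the natural partition is not convenient, a fixed-size partition could look roughly like the sketch below. It assumes that level2 is the flattened list from the question and that GeoDistance applied to two lists returns the matrix of pairwise distances, as the timings in this thread suggest. The names chunkSize, chunks and blocks are illustrative, and 50 is an arbitrary group size, not a tested recommendation. It has not been timed against the server.

```wolfram
chunkSize = 50;  (* hypothetical group size; adjust if timeouts occur *)
chunks = Partition[level2, UpTo[chunkSize]];  (* the last group may be shorter *)

(* Request only the blocks with j >= i, since the distance is symmetric.
   blocks[[i, k]] then holds the distances between chunks[[i]] and chunks[[i + k - 1]]. *)
blocks = Table[
   GeoDistance[chunks[[i]], chunks[[j]]],
   {i, Length[chunks]}, {j, i, Length[chunks]}];
```

Skipping the blocks below the diagonal roughly halves the number of requests; the diagonal blocks still contain each within-group pair in both orders.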