Accelerating data download from Wolfram curated database?

Hello all! I have the following problem. I need to calculate geo-distances between the administrative divisions of Poland. Poland has three levels of division. It is easy to get the divisions from the curated database like this:

level1 = Entity["Country", "Poland"][EntityProperty["Country", "AdministrativeDivisions"]];
level2 = AdministrativeDivisionData[#, "Subdivisions"] & /@ level1;
level2 = Flatten[level2];

To get the geo-distances for the first level I can do this:

level1$dists = Outer[GeoDistance[#1, #2, DistanceFunction -> "Center"] &, level1, level1];

The above code takes some time to evaluate, but nothing terribly long. However, at the second level of division there are 373 regions, so with the above construction I need to calculate 139129 geo-distances. After removing duplicate pairs and the distances from each region to itself, there are still 69378 geo-distances to compute. Each call to GeoDistance[] takes about 5 seconds if DistanceFunction -> "Center" is used and about 0.5 - 1 second with default options. This means that the computation should take about 15 hours in the best scenario.
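The counts quoted above can be checked with a few lines. This is only a sanity check of the arithmetic; the variable names are made up for illustration, and the timing line simply multiplies the number of unique pairs by the per-call times from the post.

```wolfram
n = 373;  (* number of second-level regions quoted in the post *)

orderedPairs = n^2;            (* every (i, j) combination, including i == j *)
uniquePairs = Binomial[n, 2];  (* unordered pairs of distinct regions *)

{orderedPairs, uniquePairs}    (* {139129, 69378} *)

(* Sequential running time in hours at 0.5 s and 1 s per call *)
uniquePairs*{0.5, 1.}/3600
```

With the default options this gives roughly 9.6 to 19.3 hours of sequential calls, which brackets the estimate of about 15 hours.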

Question: I think that most of the computation time is spent connecting to the Wolfram servers. Hence the question: can I make this computation in bulk, like a "transaction", with a single connection? Maybe I am missing something and there is a better way of solving the above problem? Any help will be greatly appreciated!

POSTED BY: Michal Ramsza
2 Replies
POSTED BY: Michal Ramsza

You should take advantage of the listability of GeoDistance.

In[2]:= level1$dists = 
   GeoDistance[level1, level1, 
    DistanceFunction -> "Center"]; // AbsoluteTiming

Out[2]= {0.91751, Null}

It is also fast with the default options:

In[3]:= level1$dists2 = GeoDistance[level1, level1]; // AbsoluteTiming

Out[3]= {1.34659, Null}

But for a list as long as level2, you should partition it yourself and call GeoDistance on the groups. Instead of using Partition, you can also directly use the natural partition given by the first-level administrative divisions, something like:

In[4]:= level2u = AdministrativeDivisionData[#, "Subdivisions"] & /@ level1;

In[5]:= level2$dist = Outer[GeoDistance, level2u, level2u, 1]; // AbsoluteTiming

Out[5]= {288.062, Null}
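For comparison, the Partition variant mentioned above might look like the following. This is a minimal sketch, not something timed in this thread: chunkSize is a hypothetical value chosen for illustration, level2 is the flattened list from the question, and the code assumes GeoDistance accepts two lists of entities in the same way as in the calls above.

```wolfram
(* Hypothetical chunk size; tune it to what the servers handle comfortably *)
chunkSize = 50;

(* level2 is the flattened list of second-level entities from the question; *)
(* UpTo lets the last chunk be shorter than chunkSize *)
chunks = Partition[level2, UpTo[chunkSize]];

(* One GeoDistance call per pair of chunks, mirroring the Outer construction above *)
blocks = Outer[GeoDistance, chunks, chunks, 1];
```

With 373 regions and a chunk size of 50 this makes 8 chunks and therefore 64 calls to GeoDistance, instead of one call per pair of regions. The result is an 8 x 8 array of blocks that still has to be assembled into a single matrix.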