
[GIS] Call Baidu Geocoder with SN from Wolfram Language

Posted 6 years ago

Download the notebook at the end of the thread


Abstract

This post explains in detail how to use the Wolfram Language with the Baidu Map API service for GIS-related data science projects on domestic (mainland China) data. The API service converts a given street address within mainland China into a geo position expressed as latitude and longitude.


Demo

For example, I can visualize the average dinner cost per person at a restaurant against its location with GeoBubbleChart. Without geocoding, I cannot pass street addresses directly to the plot function. The same routine is quite useful for commercial property planning in general.

[Image: GeoBubbleChart of average dinner cost per person by restaurant location]


Instruction

Start with a valid App Key (AK) for the API service, obtained as described in this document:

bdAPIkey = "7ha3**********************72g";


When you generate the App Key, you need to choose how the server verifies the GET requests you send to retrieve data. Two options are available:

  • A whitelist of IP addresses, or "0.0.0.0/0" to accept all IPs
  • SN checksum

The first method is only suitable for testing, or when you send requests from a static IP, for example for internal use or behind a reverse proxy. We are going to use the second method, which is more general.

The basic steps are:

  • Encode a specific partial URL from the query
  • Append a private key to the above result and encode again
  • Compute the MD5 checksum of the new string to generate the SN required
  • Attach the SN to the original query
  • Send this HTTP GET request to the server and retrieve the XML/JSON result
  • Parse the structured return value
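The signing steps above (encode, append the private key, encode again, hash) can be sketched as one small function. This is only a sketch: baiduSN is a hypothetical name that does not appear in the original post, and "yourAK" and "yourSK" are placeholders rather than working credentials.

```wolfram
(* Hypothetical helper, not from the original post: computes the SN for a
   request path and a list of query parameters. *)
baiduSN[path_List, params_List, sk_String] := Module[{partial},
  (* step 1: percent-encode the path and query into a partial URL *)
  partial = URLBuild[path, params];
  (* steps 2-3: append the private key, encode again, take the MD5 hex digest *)
  Hash[URLEncode[partial <> sk], "MD5", "HexString"]
]

(* placeholder credentials; replace with your own AK and SK *)
sn = baiduSN[{"/geocoder", "v2/"},
  {"address" -> "上海市上海中心", "output" -> "xml", "ak" -> "yourAK"},
  "yourSK"]
```

The result should be a 32-character hexadecimal string. Whether the server accepts it depends on the AK/SK pair being valid, which this sketch cannot check.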

The domain is always like this:

domain = "http://api.map.baidu.com";

The path and the query part of the URL are built with URLBuild, with the App Key as the last query parameter:

urlpartial = URLBuild[{"/geocoder", "v2/"},
 {"address" -> "上海市上海中心", "callback" -> "showLocation",
  "output" -> "xml", "ak" -> bdAPIkey}, CharacterEncoding -> "UTF8"]
(* "/geocoder/v2/?address=%E4%B8%8A%E6%B5%B7%E5%B8%82%E4%B8%8A%E6%B5%B7%E4%B8%AD%E5%BF%83&callback=showLocation&output=xml&ak=7ha<SAMPLE_KEY>72g" *)

For instance, Shanghai Tower, a 632 m skyscraper, is chosen as the input address. Note: this entity is curated in the Wolfram Language, and its geo position is available through an Entity[...] call.

The next step requires us to attach the private key/SK to the encoded partial URL:

urlpartial ~~ sk
(* "/geocoder/v2/?address=%E4%B8%8A%E6%B5%B7%E5%B8%82%E4%B8%8A%E6%B5%B7%E4%B8%AD%E5%BF%83&callback=showLocation&output=xml&ak=7ha<SAMPLE_KEY>72gHo<SAMPLE_KEY>HP" *)

where

sk = "Ho<SAMPLE_KEY>HP";

The signature/SN used for verification is then generated as follows (see the section "All MD5's Are Created Equal" below):

sn = Hash[URLEncode[urlpartial ~~ sk], "MD5", "HexString"]
(* "c87<MD5 HEX Digest>d8d" *)

Let's append the signature/SN to the original query. We can do this either with HTTPRequest[<URL>, <|"Query" -> {...}|>] or with URLBuild again:

fullURL = URLBuild[{"http://api.map.baidu.com", "geocoder", "v2/"},
 {"address" -> "上海市上海中心", "callback" -> "showLocation", 
  "output" -> "xml", "ak" -> bdAPIkey, "sn" -> sn}]
(* "http://api.map.baidu.com/geocoder/v2/?address=%E4%B8%8A%E6%B5%B7%E5%B8%82%E4%B8%8A%E6%B5%B7%E4%B8%AD%E5%BF%83&callback=showLocation&output=xml&ak=7ha<SAMPLE_KEY>72g&sn=c87<MD5 HEX Digest>d8d" *)

Just pass the URL string into the HTTPRequest function:

req = HTTPRequest[fullURL, <|Method -> "GET"|>]

and the resultant response, if everything goes well, is

xmlOBJ = URLExecute[req, "XML"]


Inspecting the returned XML object shows that the {lat, lon} information is available for the address above. Use the following code to extract the geo position pair from the XML with the Cases function:

SetAttributes[FindLatLonPair, HoldAll]

FindLatLonPair[xmlOBJ_] := Module[{xmllocations},
   xmllocations = 
    Cases[xmlOBJ, XMLElement["lat", __] | XMLElement["lng", __], 
     Infinity];
   Association[Sort@xmllocations /. {
      XMLElement["lng", {}, {lng_}] :> 
       Rule["Longitude", ToExpression@lng],
      XMLElement["lat", {}, {lat_}] :> 
       Rule["Latitude", ToExpression@lat]
      }
    ]
   ] /; Head[xmlOBJ] === XMLObject["Document"]
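To try the extraction pattern without an API key, one option is a hand-written XML sample. The element layout below (GeocoderSearchResponse, result, location) is an assumption about the reply format, and the coordinate values are made up for illustration; only the lat and lng element names come from the code above.

```wolfram
(* Made-up sample in an ASSUMED reply layout; values are illustrative only *)
sampleXML = ImportString[
   "<GeocoderSearchResponse><status>0</status><result><location>" <>
    "<lng>121.5</lng><lat>31.2</lat>" <>
    "</location></result></GeocoderSearchResponse>", "XML"];

(* the same guard that FindLatLonPair uses on its argument *)
Head[sampleXML] === XMLObject["Document"]

(* the same element pattern, collecting the raw lat/lng elements *)
latLng = Cases[sampleXML, XMLElement["lat" | "lng", {}, {_String}], Infinity]
```

With FindLatLonPair defined as above, the same sampleXML can be passed to it directly to check the resulting association.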

Apply this function to the XML object obtained earlier:

FindLatLonPair[xmlOBJ]


Code of the Demo

Suppose I have curated data for a list of restaurants in a region. The data include the street addresses and the average cost per customer for food and service at dinner.

Import the data (not attached with the notebook):

entitiesRaw = DeleteCases[Import["data.csv"], item_ /; item[[1]] === ""];

If you wrap all the API-call steps shown above into a function and Map that function over every street address in the imported datasheet, you get a list of valid XML objects. Extract all lat-lon pairs:

geopos = FindLatLonPair /@ (resultsXMLObj);
(* {<|"Latitude" -> n1, "Longitude" -> n2|>, <|"Latitude" -> n3, "Longitude" -> n4|>, ...} *)
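The wrapper that produces resultsXMLObj is not shown in the post. A minimal sketch of what it might look like follows. The names baiduSignedURL and baiduGeocodeXML are hypothetical, and the request parameters simply mirror the ones used in the Instruction section.

```wolfram
(* Hypothetical helpers; the original post does not show its wrapper. *)
baiduSignedURL[address_String, ak_String, sk_String] :=
 Module[{params, partial, sn},
  params = {"address" -> address, "callback" -> "showLocation",
    "output" -> "xml", "ak" -> ak};
  (* sign the encoded partial URL as in the Instruction section *)
  partial = URLBuild[{"/geocoder", "v2/"}, params];
  sn = Hash[URLEncode[partial <> sk], "MD5", "HexString"];
  (* attach the SN as the last query parameter *)
  URLBuild[{"http://api.map.baidu.com", "geocoder", "v2/"},
   Append[params, "sn" -> sn]]
 ]

(* network call; only returns useful data with a valid AK/SK pair *)
baiduGeocodeXML[address_String, ak_String, sk_String] :=
 URLExecute[HTTPRequest[baiduSignedURL[address, ak, sk]], "XML"]
```

With bdAPIkey and sk set as above, resultsXMLObj could then be built as baiduGeocodeXML[#, bdAPIkey, sk] & /@ addresses, where addresses is the list of street address strings taken from entitiesRaw. Which column holds the address depends on the data file, so that part is left open here.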

Use the following method to generate geo position -> value pairs:

bubbleChartPair = Thread[(GeoPosition[Values[#]] & /@ geopos) -> {dinnerCost1, dinnerCost2, ...}];
(* {GeoPosition[{lat, lon}] -> dinnerCost, GeoPosition[{lat, lon}] -> dinnerCost, ...} *)

Then pass them to the GeoBubbleChart function to generate a nice spatial trend graphic, for instance:

GeoBubbleChart[bubbleChartPair]
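For readers without the restaurant data, here is a toy version of the same construction. The two coordinate pairs and cost values are invented purely for illustration, and the names geoposSample, costsSample and pairs are not from the original notebook.

```wolfram
(* invented sample data in the same shape FindLatLonPair returns *)
geoposSample = {
   <|"Latitude" -> 31.23, "Longitude" -> 121.47|>,
   <|"Latitude" -> 31.20, "Longitude" -> 121.44|>};
costsSample = {120, 85};  (* made-up dinner costs *)

(* Values keeps the key order, so each position is {lat, lon} *)
pairs = Thread[(GeoPosition[Values[#]] & /@ geoposSample) -> costsSample]

GeoBubbleChart[pairs]
```

Because GeoPosition[...] does not have head List, Thread only pairs up the two outer lists, which gives one position -> value rule per restaurant.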

Some of the geo positions are offset, either because of a difference in datum (Baidu's BD-09 vs. Mathematica's default datum) or because of the reduced precision of civilian GIS data.

All MD5's Are Created Equal


Baidu's documentation on SN generation gives several code snippets that compute the MD5 hash. Mathematica produces the same results. In case you wonder, here is a check:

sn = Hash["wolfram", "MD5", "HexString"]
(* 5f7e6b1fa5f9740f66c5437b200425d8 *)

Compare this with the result from the Python.org online interactive session:

[Screenshot: MD5 digest computed in the Python.org interactive session]
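Another check that does not depend on any one tool is to compare against the published MD5 test vectors in RFC 1321. A small sketch, assuming a Wolfram Language version in which Hash accepts the "HexString" format used throughout this post:

```wolfram
(* RFC 1321 test vector for the empty string: d41d8cd98f00b204e9800998ecf8427e *)
emptyDigest = Hash["", "MD5", "HexString"]

(* RFC 1321 test vector for "abc": 900150983cd24fb0d6963f7d28e17f72 *)
abcDigest = Hash["abc", "MD5", "HexString"]
```

If both values match the digests in the comments, the hash is the standard MD5 of the string's bytes, which is what the SN check relies on.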

Location of Private Key


After you create the AK/App Key, you are redirected to this page. The private key/SK is outlined by the gold box.

[Screenshot: key management page with the private key/SK outlined in gold]

POSTED BY: Shenghui Yang

Congratulations! This post is now a Staff Pick as distinguished by a badge on your profile! Thank you, keep it coming, and consider contributing your work to The Notebook Archive!

POSTED BY: EDITORIAL BOARD