I've been trying to procedurally generate schematics of towns and cities. I turned to Mathematica to extract exemplar data to train my algorithms. I thought I'd share the data extraction process.
OpenStreetMap (OSM) has some really interesting data. For example, this area of Kunduz, Afghanistan has high-quality building polygon data, exactly what I was after. One can only conjecture why high-density LIDAR data of this area exists, or how it ended up so accessible. OSM has an export button which, provided you select a small enough region, saves an '.osm' file. Two example files from Kunduz are attached. An .osm file primarily stores two structures: nodes (single geographic points) and ways (ordered lists of nodes that form lines or polygons, tagged to reflect content type). We can easily pull these out in Mathematica to sift out our target data.
First I import a data set and cut the imported string into individual sections, one per XML tag. Then I select the node entries, save each node's coordinates as a function of its ID, and collect those IDs in 'nodes'.
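To make the node/way structure concrete, here is a minimal sketch on a made-up fragment. The IDs and coordinates are invented for illustration, and it reads the fragment with ImportString[..., "XML"], which is an alternative to the string splitting used in the rest of this post, not the method used there.

```mathematica
(* Made-up .osm fragment: three nodes and one closed way tagged as a building *)
osm = StringJoin[
   "<osm version=\"0.6\">",
   "<node id=\"1\" lat=\"36.7280\" lon=\"68.8570\"/>",
   "<node id=\"2\" lat=\"36.7280\" lon=\"68.8572\"/>",
   "<node id=\"3\" lat=\"36.7282\" lon=\"68.8572\"/>",
   "<way id=\"10\">",
   "<nd ref=\"1\"/><nd ref=\"2\"/><nd ref=\"3\"/><nd ref=\"1\"/>",
   "<tag k=\"building\" v=\"yes\"/>",
   "</way>",
   "</osm>"];

xml = ImportString[osm, "XML"];

(* Nodes: one rule per node, id string -> {lon, lat} *)
nodeRules = Cases[xml,
   XMLElement["node", attrs_, _] :>
    (("id" /. attrs) -> ToExpression[{"lon", "lat"} /. attrs]),
   Infinity];

(* Ways: the ordered node references plus the tag keys of each way *)
ways = Cases[xml,
   XMLElement["way", _, children_] :> <|
     "refs" -> Cases[children, XMLElement["nd", a_, _] :> ("ref" /. a)],
     "tags" -> Cases[children, XMLElement["tag", a_, _] :> ("k" /. a)]|>,
   Infinity];
```

The way repeats its first node reference at the end, which is how a closed polygon is stored.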
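(* Read the file as plain text ("Text" is named explicitly so a string comes back) and split it into one string per tag *)
strings = StringDrop[StringTrim[#], 1] & /@ StringSplit[Import["C:\\Users\\Me\\Downloads\\map.osm", "Text"], ">"];
(* Keep only the node entries, with the leading "node " removed *)
nodestrings = Cases[If[StringSplit[#][[1]] == "node", StringDrop[#, 5]] & /@ strings, Except[Null]];
(* Create 'nodes' on first use so that later imports can keep appending to it *)
If[! ListQ[nodes], nodes = {}];
(* Slice id, lat and lon out of each entry, then store node[id] = {lon, lat}. *)
(* This assumes self-closing node tags with lat and lon as the last two attributes. *)
(node[#[[1]]] = {#[[3]], #[[2]]}; AppendTo[nodes, #[[1]]];) & /@
(ToExpression[StringDrop[StringTrim[StringDrop[#[[1]], #[[2]]]], -#[[3]]] & /@
Transpose[{StringSplit[#][[{1, -2, -1}]], {4, 5, 5}, {1, 1, 2}}] & /@ nodestrings]);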
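The slicing is easier to follow on a single entry. This is a sketch on a hypothetical node string (made-up ID and coordinates), written with MapThread instead of the Transpose construction, but dropping the same numbers of characters.

```mathematica
(* Hypothetical node entry as it looks after the split on ">" and the removal of "node " *)
s = "id=\"1\" visible=\"true\" lat=\"36.7280\" lon=\"68.8570\"/";

(* First, second-to-last and last whitespace-separated fields *)
fields = StringSplit[s][[{1, -2, -1}]];
(* {"id=\"1\"", "lat=\"36.7280\"", "lon=\"68.8570\"/"} *)

(* Drop id=" (4 characters), lat=" and lon=" (5 each) from the front, *)
(* then the closing quote (1) or the closing quote plus slash (2) from the end *)
values = MapThread[
   StringDrop[StringDrop[#1, #2], -#3] &,
   {fields, {4, 5, 5}, {1, 1, 2}}];
(* {"1", "36.7280", "68.8570"} *)

(* Convert the three strings to numbers: id, lat, lon *)
ToExpression[values]
```

Because the fields are picked by position, a node whose tag is not self-closing (one that carries its own tags) would lose the last digit of its longitude.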
Next I select the non-node data and extract the ways, their node references, and their tags, which are stored in 'waylists' and 'taglists'.
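(* Everything that is not a node; 'strings' from the import step is reused instead of reading the file again *)
nonnode = Cases[If[StringSplit[#][[1]] != "node", #] & /@ strings, Except[Null]];
(* Group consecutive entries that share an opening word, and record those words *)
nonnode = SplitBy[nonnode, StringSplit[#][[1]] &];
openers = Parallelize[StringSplit[#][[1]] & /@ # & /@ nonnode];
(* One entry per imported file is appended, so create both lists on first use *)
If[! ListQ[waylists], waylists = {}]; If[! ListQ[taglists], taglists = {}];
(* Each way runs from its "way" group to the matching "/way" group *)
AppendTo[waylists, Parallelize[nonnode[[#[[1]] ;; #[[2]]]] & /@
Transpose[{First /@ Position[openers, "way"], First /@ Position[openers, "/way"]}]]];
(* Tag keys of each way, read from its third group (this assumes every way has at least one tag) *)
AppendTo[taglists, Cases[If[StringTake[#, 1] == "k", StringDrop[StringDrop[#, 3], -1]] & /@ #,
Except[Null]][[1]] & /@ StringSplit[#[[3]]] & /@ waylists[[-1]]];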
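To illustrate the grouping step, here is a sketch on a toy list with one way. The strings are made up, and the final StringCases line is an alternative way of reading the tag keys, not the code used above.

```mathematica
(* Toy entries for a single way, in the form they have after the split on ">" *)
toy = {"way id=\"10\"", "nd ref=\"1\"/", "nd ref=\"2\"/", "nd ref=\"1\"/",
   "tag k=\"building\" v=\"yes\"/", "/way"};

firstWord = StringSplit[#][[1]] &;

(* Consecutive entries with the same opening word end up in one group *)
groups = SplitBy[toy, firstWord];
openers = Map[firstWord, groups, {2}];
(* {{"way"}, {"nd", "nd", "nd"}, {"tag"}, {"/way"}} *)

(* Group index where the way opens and where it closes *)
start = First /@ Position[openers, "way"];
stop = First /@ Position[openers, "/way"];

(* The way itself: opener, node references, tags, closer *)
way = groups[[start[[1]] ;; stop[[1]]]];

(* Tag keys pulled from the third group with a string pattern *)
keys = Flatten[StringCases[way[[3]], "k=\"" ~~ Shortest[key__] ~~ "\"" :> key]]
```

The second group holds the node references and the third holds the tags, which is why the code above indexes ways with [[2]] and [[3]].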
Now the data can be easily queried. For example, here all ways with the building tag are converted into lists of coordinates, allowing easy visualisation of the building polygons.
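(* For each file i, look up the coordinates of every way whose tag keys include "building". *)
(* {i, 1, 1} covers only the first imported file; raise the upper limit to Length[waylists] for more. *)
buildpolys = Table[(node[#] & /@ ToExpression[StringDrop[StringDrop[#, 8], -2] & /@ #[[2]]]) & /@
waylists[[i]][[First /@ Position[If[MemberQ[#, "building"], 1] & /@ taglists[[i]], 1]]], {i, 1, 1}];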
Graphics[{Blue, Line[#] & /@ buildpolys}]
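The same lookup works for any tag key. Below is a sketch of a hypothetical helper, waysWithTag, run on toy data shaped like one entry of 'waylists' and 'taglists'. The names toyNode, toyWays and toyTags and all their contents are made up so that nothing from a real import is overwritten.

```mathematica
(* Toy stand-ins: an id -> {lon, lat} lookup, two ways, and their tag keys *)
toyNode = <|1 -> {68.8570, 36.7280}, 2 -> {68.8572, 36.7280}, 3 -> {68.8572, 36.7282}|>;
toyWays = {
   {{"way id=\"10\""},
    {"nd ref=\"1\"/", "nd ref=\"2\"/", "nd ref=\"3\"/", "nd ref=\"1\"/"},
    {"tag k=\"building\" v=\"yes\"/"}, {"/way"}},
   {{"way id=\"11\""},
    {"nd ref=\"2\"/", "nd ref=\"3\"/"},
    {"tag k=\"highway\" v=\"residential\"/"}, {"/way"}}};
toyTags = {{"building"}, {"highway"}};

(* Hypothetical helper: coordinate lists of every way whose tag keys include 'key' *)
waysWithTag[ways_, tags_, lookup_, key_String] :=
  Map[
   Function[way, lookup /@ ToExpression[StringDrop[StringDrop[way[[2]], 8], -2]]],
   Pick[ways, MemberQ[#, key] & /@ tags]];

buildings = waysWithTag[toyWays, toyTags, toyNode, "building"];
roads = waysWithTag[toyWays, toyTags, toyNode, "highway"];

Graphics[{Blue, Line[buildings], Red, Line[roads]}]
```

With real data, the lookup argument could be the node function itself, with waylists[[i]] and taglists[[i]] as the first two arguments.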
All these functions were designed to handle multiple files, allowing larger imports. See the attached notebook for an example.