The commute back home can get quite brutal when a car accident happens. On the way home, there are two fast chain restaurants and people try to take an illegal turn left. Being this a three lane road, and with heavy traffic, accidents seem to happen every 4-6 months.
I wondered how dangerous this section of the road actually was, so I contacted the GDOT (Georgia Department of Transportation) to ask them if the accident data was readily available to the public.
In less than 24 hours, the folks at GDOT pointed me to the geoTraqs Web Application, which allows you to download different features, including Crashes that have occurred in the last 3 years. The graphical interface is quite good, but wanted to be able to aggregate the number of vehicles involved in car accidents on a specific section of the road and allow to easily map and see different areas of the city.
Wanted to used a Geographics function to allow me to pick an area of interest to zoom in the analysis. This required the setting up of several functions to be able to use a locator in Manipulate to extract the coordinates of interest.
gg = GeoGraphics[
Entity["City", {"Atlanta", "Georgia", "UnitedStates"}],
GeoRange -> 10000];
tr = "DisplayFunction" /. (CoordinatesToolOptions /.
AbsoluteOptions[gg]);
trans[u_] :=
Quiet[ToExpression@
StringReplace[
ToString@#, {"lat=" -> "", "lon=" -> "", "\[Degree]" -> ""}] & /@
tr[u][[1, 1]]]
trans[Automatic] =
trans[Mean /@ (PlotRange /. AbsoluteOptions[gg])] // Quiet;
Manipulate[(xx = Evaluate[{trans[u], Quantity[r, "km"]}];
GeoGraphics[GeoDisk[trans[u], Quantity[r, "km"]],
GeoRange -> Quantity[range, "km"], ImageSize -> 500]), {{u,
Entity["City", {"Atlanta", "Georgia", "UnitedStates"}][
"Coordinates"]}, Locator}, {{r, 1, "Radius (km)"}, .75,
6}, {{range, 2, "GeoRange (km)"}, 0.5, 20, .25},
SynchronousUpdating -> False]
Data was pre-processed, scrubbed and converted to a Dataset variable, which was saved into a .wdx binary file. This type of file is platform independent so it can be used in other computers too. The car crash dataset file can be downloaded here.
NotebookDirectory[];
SetDirectory[NotebookDirectory[]];
data = Import["carcrashdataset.wdx"];
Also needed a couple of auxiliary functions for the next phase. The GeoDistance function available in Mathematica can be a wee slow, so I picked up a formula for calculating it independently. Also set up a GeoMarker function that can be scaled up based on the number of vehicles involved in the accident.
distCompile =
Compile[{lat1, long1, lat2, long2},
Abs@ArcCos[
Sin[lat1 Pi/180.] Sin[lat2 Pi/180.] +
Cos[Pi lat1/180.] Cos[Pi lat2/180.] Cos[
long2 Pi/180 - long1 Pi/180]] 6371];
myGeoMarker[coords_, radius_, scale_] :=
Tooltip[GeoMarker[coords,
Graphics[{Point[{0, 0}], Opacity[0.3], Red, Disk[]}],
"Scale" -> Scaled[(radius/scale)^.55]], radius];
The challenge was how to aggregate the accidents to those areas of interests. The GPS location data from the accidents tend to cluster around those points where it is more likely to experience a crash. Using the Graph functionality, I was able to generate a graph network and link those accidents that are close to another within a certain distance to each other (20m). By looking into the different subgraphs, I picked the vertex with the highest degree of centrality as the center of each GeoMarker.
With[{lat = xx[[1, 1]], lon = xx[[1, 2]],
radius = QuantityMagnitude[UnitConvert[xx[[2]], "km"]]},
carcrash =
data[Select[
distCompile[#Latitude, #Longitude, lat, lon] < radius &], <|
"GeoPosition" -> GeoPosition[{#Latitude, #Longitude}],
"Vehicles" -> #VEHICLES|> &]];
With[{d =
carcrash[All, GeoPositionXYZ[#GeoPosition][[1, ;; 2]] &] //
Normal},
g = AdjacencyGraph[
Boole[RegionMember[Disk[#, 20.]][d] & /@ d] -
DiagonalMatrix[ConstantArray[1, Length@d]]]];
f = With[{x = Subgraph[g, #]},
Rule[Part[VertexList[x],
First@Ordering[DegreeCentrality[x], All, Greater]],
carcrash[VertexList[x] /* Total, "Vehicles"]]] & /@
Reverse@ConnectedComponents[g];
GeoGraphics[
myGeoMarker[carcrash[#[[1]], "GeoPosition"] // Normal, #[[2]],
4000] & /@ f, ImageSize -> 1024]
There has been 38 vehicles involved in an accident at the exit of the fast food restaurants during the last three years. Maybe it is in order to place a sign reminding the drivers that no left turns are allowed.
Also, using the new Dataset functionality introduced in version 10 it is easy to manipulate the data for analysis. As an example, let's explore how the accidents are distributed based on the time of the day.
Histogram[
data[DateValue[#, "Hour"] &, "Accident Date"], Automatic, "PDF",
PlotTheme -> "Detailed",
PlotLabel -> "% Accidents by Hour - City of Atlanta"]