There is, of course, a lot more you can do. For example, we can use the following website:
http://thegamesdb.net
This allows us to crosscheck the data we have looked at before. So if we take the names list from before:
data = Import["http://pastebin.com/DG1CsVXk", "Data"];
Quiet[names = (StringSplit[#, "("] & /@ data[[2, 2, 3 ;;]][[1 ;;]])[[All, 1]]];
We can use:
smalldataset =
Quiet[{"id" ->
Flatten[StringSplit[StringSplit[#, "<id>"], "</id>"]][[1]],
"GameTitle" ->
Flatten[StringSplit[StringSplit[#, "<GameTitle>"],
"</GameTitle>"]][[2]],
If[StringContainsQ[
Flatten[StringSplit[StringSplit[#, "<ReleaseDate>"],
"</ReleaseDate>"]][[2]], "Platform"],
"ReleaseDate" -> "Missing",
"ReleaseDate" ->
Interpreter["Date"][
Flatten[StringSplit[StringSplit[#, "<ReleaseDate>"],
"</ReleaseDate>"]][[2]]]]} & /@ ((StringSplit[#,
"<Game>\n"] & @(Import[
"http://thegamesdb.net/api/GetGamesList.php?name=" <> #] &@
RandomChoice[names]))[[2 ;;]])]
to make a nice list of rules. Note that your database is much larger, so many queries on http://thegamesdb.net will return empty sets or, worse, errors. Anyway, we can then use fancy things like
TimelinePlot[Association["GameTitle" -> "ReleaseDate" /. smalldataset]]
to obtain a timeline of the release dates.
This command gives 100 games:
smalldataset =
Quiet[{"id" ->
Flatten[StringSplit[StringSplit[#, "<id>"], "</id>"]][[1]],
"GameTitle" ->
Flatten[StringSplit[StringSplit[#, "<GameTitle>"],
"</GameTitle>"]][[2]],
If[StringContainsQ[
Flatten[StringSplit[StringSplit[#, "<ReleaseDate>"],
"</ReleaseDate>"]][[2]], "Platform"],
"ReleaseDate" -> "Missing",
"ReleaseDate" ->
Interpreter["Date"][
Flatten[StringSplit[StringSplit[#, "<ReleaseDate>"],
"</ReleaseDate>"]][[2]]]]} & /@ (StringSplit[
Import["http://thegamesdb.net/api/GetGamesList.php?platform=PC"],
"<Game>\n"][[2 ;;]])]
We can again use TimelinePlot, this time keeping only the entries that have a valid date:
TimelinePlot[Association[Select["GameTitle" -> "ReleaseDate" /. smalldataset, DateObjectQ[#[[2]]] &]]]
It is much nicer when it is interactive in the notebook, but it looks like this:
That shows quite nicely how much the market has grown. It also suggests clusters of release dates.
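To put a number on that growth, one could count releases per year. This is only a minimal sketch: the rows below are made-up placeholders in the same shape as smalldataset, and the names rows and countsByYear are introduced here purely for illustration.

```mathematica
(* Made-up rows in the same shape as smalldataset; not real data *)
rows = {
   {"id" -> "1", "GameTitle" -> "Example A",
    "ReleaseDate" -> DateObject[{2000, 1, 31}]},
   {"id" -> "2", "GameTitle" -> "Example B",
    "ReleaseDate" -> DateObject[{2000, 6, 1}]},
   {"id" -> "3", "GameTitle" -> "Example C",
    "ReleaseDate" -> DateObject[{2001, 3, 15}]},
   {"id" -> "4", "GameTitle" -> "Example D",
    "ReleaseDate" -> "Missing"}};

(* Keep only real dates, extract the year, and count occurrences per year *)
countsByYear =
  KeySort[Counts[
    DateValue[#, "Year"] & /@
     Select["ReleaseDate" /. rows, DateObjectQ]]]
```

Replacing rows with smalldataset should give the same kind of tally for the imported games, and BarChart[countsByYear, ChartLabels -> Keys[countsByYear]] would then show it as a chart.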
It is very easy to make a nice, orderly dataset out of this:
Dataset[Association /@ smalldataset]
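As an aside, the repeated StringSplit calls in the commands above could probably be factored into a small helper. The following is a sketch, not the method used above: tagValue is a hypothetical name, and the sample string is a made-up record that only assumes each entry contains simple, non-nested tags such as <id> and <GameTitle>.

```mathematica
(* Hypothetical helper: returns the text between <tag> and </tag>,
   or Missing["NotAvailable"] if the tag does not occur *)
tagValue[xml_String, tag_String] :=
 First[
  StringCases[xml,
   ("<" <> tag <> ">") ~~ Shortest[v___] ~~ ("</" <> tag <> ">") :> v],
  Missing["NotAvailable"]]

(* Made-up sample record, only to illustrate the assumed layout *)
sample = "<id>1</id>\n<GameTitle>Example Game</GameTitle>\n<ReleaseDate>01/31/2000</ReleaseDate>";

(* Look up three tags that are present and one that is not *)
values = tagValue[sample, #] & /@ {"id", "GameTitle", "ReleaseDate", "Platform"}
```

Returning Missing["NotAvailable"] for an absent tag avoids the part-extraction errors that the Quiet wrapper currently hides, so entries without a release date can be filtered explicitly.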
There is certainly lots more to discover here.
Cheers,
Marco