Can you help me navigate an XML tree?

Posted 3 years ago
POSTED BY: Dave Lartigue
3 Replies

This now seems to be working!

getUserRatings[username_String] := Module[
   {urlUser, listReady, k1, games, gameIDs, id, ratings, ra},
   urlUser = 
    "https://www.boardgamegeek.com/xmlapi2/collection?username=" <> 
     username <> "&rated=1&stats=1";
   (* Poll until the root element is "items" (data) or "errors". 
      A "message" root means the request is queued, so wait and retry. *)
   listReady = 0;
   While[listReady == 0,
    k1 = Import[urlUser, "XML"];
    Which[
     k1[[2, 1]] == "items", listReady = 1,
     k1[[2, 1]] == "message", listReady = 0; Pause[8],
     k1[[2, 1]] == "errors", listReady = 1
     ];
    ];
   If[k1[[2, 1]] == "errors", Print["Invalid Username"],
    (* Keep only the items whose subtype is "boardgame". *)
    games = 
     Cases[k1, 
      XMLElement["item", {___, "subtype" -> "boardgame", ___}, ___], 
      Infinity];
    (* objectid is BGG's unique identifier for a game. *)
    gameIDs = 
     Cases[games, 
      XMLElement["item", {___, "objectid" -> id_, ___}, ___] -> id, 
      Infinity];
    (* The rating is the "value" attribute of each rating element. *)
    ratings = 
     Cases[games, XMLElement["rating", {"value" -> ra_}, ___] -> ra, 
      Infinity];
    (* One association per game, pairing IDs and ratings by position. *)
    MapThread[
     <|"gameID" -> #1, "userName" -> username, "rating" -> #2|> &,
     {gameIDs, ratings}]
    ]
   ];

The only place where I am explicitly poking into the structure of the XML is checking [[2,1]] to see if I got back an error, a "please wait", or actual data. It's fast, too. I tried it on a user with >2000 ratings and it finished almost immediately after getting the data! Thank you for your help! You saved me a lot of time and grief!
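For anyone reusing this, the [[2,1]] check can also be written as a pattern match, which has the side benefit of not throwing a Part error if Import returns $Failed. This is only a sketch: rootTag is a hypothetical helper name, and the three documents below are hand-built stand-ins for what the server sends back, not real responses.

```wolfram
(* rootTag is a hypothetical helper: it returns the tag of the root element
   of a symbolic XML document, or Missing["NotXML"] for anything else. *)
rootTag[doc_] := Replace[doc, {
    XMLObject["Document"][_, XMLElement[tag_String, ___], ___] :> tag,
    _ -> Missing["NotXML"]}];

(* Hand-built stand-ins for the three kinds of reply. *)
dataDoc = XMLObject["Document"][{}, XMLElement["items", {}, {}], {}];
queuedDoc = XMLObject["Document"][{}, XMLElement["message", {}, {"Please wait"}], {}];
errorDoc = XMLObject["Document"][{}, XMLElement["errors", {}, {}], {}];

rootTag /@ {dataDoc, queuedDoc, errorDoc, $Failed}
(* {"items", "message", "errors", Missing["NotXML"]} *)
```

For a well-formed document, rootTag[doc] gives the same string as doc[[2, 1]].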

POSTED BY: Dave Lartigue

Thanks Eric! As multiple games can share the same name, I am using <item objectid="xxxx"> to identify the game. It's a unique identifier for BGG data.
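Here is a small sketch of keying on objectid. The sample document is made up (makeItem, the IDs, names, and ratings are all hypothetical), and it only mimics the item / stats / rating nesting. Pulling the rating out of each item individually keeps every ID paired with its own rating.

```wolfram
(* makeItem is a hypothetical helper that builds one made-up <item>. *)
makeItem[id_, subtype_, name_, rating_] := XMLElement["item",
   {"objecttype" -> "thing", "objectid" -> id, "subtype" -> subtype},
   {XMLElement["name", {"sortindex" -> "1"}, {name}],
    XMLElement["stats", {}, {XMLElement["rating", {"value" -> rating}, {}]}]}];

sample = XMLObject["Document"][{}, XMLElement["items", {}, {
     makeItem["101", "boardgame", "Game A", "8"],
     makeItem["102", "boardgameexpansion", "Expansion B", "6.5"],
     makeItem["103", "boardgame", "Game C", "7"]}], {}];

boardgames = Cases[sample,
   XMLElement["item", {___, "subtype" -> "boardgame", ___}, ___], Infinity];

(* Map each objectid to the rating found inside that same item. *)
byID = Association[Cases[boardgames,
    item : XMLElement["item", {___, "objectid" -> id_, ___}, _] :>
     (id -> First[
        Cases[item, XMLElement["rating", {___, "value" -> v_, ___}, ___] :> v, Infinity],
        Missing["NoRating"]]),
    {1}]]
(* <|"101" -> "8", "103" -> "7"|> *)
```

An item with no rating element would get Missing["NoRating"] instead of shifting the remaining ratings onto the wrong IDs.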

This will definitely get me started. Many thanks!

POSTED BY: Dave Lartigue
Posted 3 years ago

There are several ways to go about this, depending on how rigid the schema is and what you want to do with the data. The simplest way, and the one with the fewest assumptions about the schema, is probably to use Cases. If r1 is your data, then

boardgames = Cases[r1, XMLElement["item", {___, "subtype" -> "boardgame", ___}, ___], Infinity]

will give you a list of all XMLElements that are boardgames.
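To see the pattern at work without hitting the server, here is a sketch on a hand-built document. Everything in it (makeItem, the IDs, names, and ratings) is made up for illustration; only the Cases call is the one from above, with r1 replaced by the sample.

```wolfram
(* makeItem is a hypothetical helper that builds one made-up <item>. *)
makeItem[id_, subtype_, name_, rating_] := XMLElement["item",
   {"objecttype" -> "thing", "objectid" -> id, "subtype" -> subtype},
   {XMLElement["name", {"sortindex" -> "1"}, {name}],
    XMLElement["stats", {}, {XMLElement["rating", {"value" -> rating}, {}]}]}];

r1 = XMLObject["Document"][{}, XMLElement["items", {}, {
     makeItem["101", "boardgame", "Game A", "8"],
     makeItem["102", "boardgameexpansion", "Expansion B", "6.5"],
     makeItem["103", "boardgame", "Game C", "7"]}], {}];

(* The ___ on either side lets "subtype" sit anywhere in the attribute list. *)
boardgames = Cases[r1,
   XMLElement["item", {___, "subtype" -> "boardgame", ___}, ___], Infinity];

Length[boardgames]
(* 2 *)
```

The item with subtype "boardgameexpansion" is skipped because the attribute value has to match "boardgame" exactly.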

Now, for each boardgame, we can play the same trick to look for the rating:

Cases[#, XMLElement["rating", ___], Infinity] & /@ boardgames

Cases produces a list, so what you now have is a list of lists. Each inner list contains the rating substructures for one boardgame and nothing else from that boardgame's structure, so the name, which I'm assuming will be important at some point, is no longer attached.
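A sketch of what that list of lists looks like, again on made-up data (makeItem and all values are hypothetical). The second call is a small variation that pulls out only the "value" attribute instead of the whole rating element.

```wolfram
(* makeItem is a hypothetical helper that builds one made-up <item>. *)
makeItem[id_, name_, rating_] := XMLElement["item",
   {"objecttype" -> "thing", "objectid" -> id, "subtype" -> "boardgame"},
   {XMLElement["name", {"sortindex" -> "1"}, {name}],
    XMLElement["stats", {}, {XMLElement["rating", {"value" -> rating}, {}]}]}];

boardgames = {makeItem["101", "Game A", "8"], makeItem["103", "Game C", "7"]};

(* One inner list of rating elements per boardgame. *)
ratingElements = Cases[#, XMLElement["rating", ___], Infinity] & /@ boardgames
(* {{XMLElement["rating", {"value" -> "8"}, {}]},
    {XMLElement["rating", {"value" -> "7"}, {}]}} *)

(* Variation: keep only the "value" attribute of each rating. *)
ratingValues = Cases[#,
     XMLElement["rating", {___, "value" -> v_, ___}, ___] :> v, Infinity] & /@ boardgames
(* {{"8"}, {"7"}} *)
```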

If you were to save each of these lists (the boardgames list and the list of ratings) in variables, you could do some further processing to pair them up. You could also put the whole thing in an Association. Let's say that boardgames is the variable holding the boardgames. Then you could do this:

ratings = AssociationMap[Cases[#, XMLElement["rating", ___], Infinity] &, boardgames]

This will be a big hairy thing. You might want just the name of each boardgame to be the key, instead of the whole boardgame structure itself. To get that, you can map yet another Cases function onto the keys:

KeyMap[Cases[#, XMLElement["name", _, name_] -> name, Infinity] &, ratings]

I imagine that all of the lists generated by Cases will be superfluous to your purpose, but I'm not making any assumptions about how many ratings or names there will be. You can further clean the data to suit your purpose.
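As one possible cleanup, here is a sketch that assumes each boardgame has exactly one name and one rating, which may not hold for your data. The sample is made up (makeItem and all values are hypothetical). First strips the extra list layer that Cases adds.

```wolfram
(* makeItem is a hypothetical helper that builds one made-up <item>. *)
makeItem[id_, name_, rating_] := XMLElement["item",
   {"objecttype" -> "thing", "objectid" -> id, "subtype" -> "boardgame"},
   {XMLElement["name", {"sortindex" -> "1"}, {name}],
    XMLElement["stats", {}, {XMLElement["rating", {"value" -> rating}, {}]}]}];

boardgames = {makeItem["101", "Game A", "8"], makeItem["103", "Game C", "7"]};

(* Keys are whole boardgame structures, values are lists of rating elements. *)
ratings = AssociationMap[Cases[#, XMLElement["rating", ___], Infinity] &, boardgames];

(* Replace each key by its single name string. *)
byName = KeyMap[
   First[Cases[#, XMLElement["name", _, {name_String}] :> name, Infinity]] &,
   ratings];

(* Replace each value by the single "value" attribute of its rating. *)
clean = Map[
   First[Cases[#, XMLElement["rating", {___, "value" -> v_, ___}, ___] :> v, Infinity]] &,
   byName]
(* <|"Game A" -> "8", "Game C" -> "7"|> *)
```

If two games share a name, the later one overwrites the earlier one in the Association, so a unique identifier makes a safer key.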

POSTED BY: Eric Rimbey