Filtering/Query Nested Data (Dataset and Association)

Posted 10 years ago

I am trying to filter a nested Dataset.

I have some data that looks like this (curated for this example):

h = Dataset@{<|"basicData" -> <|"ID" -> "1008", 
  "dateMMa" -> {2014, 10, 8, 0, 0, 0.`}|>, 
"away" -> <|"City" -> "montréal"|>, 
"home" -> <|"City" -> "toronto"|>|>, <|"basicData" -> <|"ID" -> 
   "1009", "dateMMa" -> {2014, 10, 8, 1, 0, 0.`}|>, 
"away" -> <|"City" -> "philadelphia"|>, 
"home" -> <|"City" -> "boston"|>|>}

I would like to filter this on "dateMMa" for anything earlier than {2014, 10, 8, 1, 0, 0.`} (1:00 on October 8th, 2014). This should yield only one item:

<|"basicData" -> <|"ID" -> "1008",  "dateMMa" -> {2014, 10, 8, 0, 0, 0.`}|>,  "away" -> <|"City" -> "montréal"|>,  "home" -> <|"City" -> "toronto"|>|>

This does not work:

h[[All, "basicData"]][Select[AbsoluteTime@#dateMMa > AbsoluteTime[{2014, 10, 8, 1, 0, 0.`}] &]]

Nor does this:

h[All, "basicData", Select[AbsoluteTime@#dateMMa > AbsoluteTime[{2014, 10, 8, 1, 0, 0.`}] &]]

Nor does this:

h[Select[AbsoluteTime@#dateMMa > AbsoluteTime[{2014, 10, 8, 1, 0, 0.`}] &]]

Is this type of filter/query possible?

Would there be a better way to structure this Dataset to achieve this goal?

POSTED BY: Ray Troy
3 Replies

I suggest you make use of DateObject and named parts (Keys).

Generally, I prefer to work without the Dataset wrapper, so the data below is a plain list of associations. I have also changed the first date in your sample data (to December 8th, 2014) to make the result of the filter clearer.

olddata = {<| "basicData" -> <|"ID" -> "1008",  "dateMMa" -> {2014, 12, 8, 0, 0, 0.`}|>, 
        "away" -> <|"City" -> "montréal"|>, 
        "home" -> <|"City" -> "toronto"|>|>, <|
        "basicData" -> <|"ID" -> "1009",  "dateMMa" -> {2014, 10, 8, 1, 0, 0.`}|>, 
        "away" -> <|"City" -> "philadelphia"|>, 
        "home" -> <|"City" -> "boston"|>|>};

First, put a DateObject wrapper around the dates by using the new Query syntax:

newdata = Query[All, 
   <|"basicData" -> <|KeyDrop[#[["basicData"]], "dateMMa"], 
       "dateMMa" -> DateObject[#[["basicData", "dateMMa"]]]|>, 
     KeyDrop[#, "basicData"]|> &]@olddata

The query itself is then simple by comparison and needs only Select and a DateObject to compare against:

Query[Select[#[["basicData", "dateMMa"]] <  DateObject[{2014, 11, 8}] &]]@newdata

    {<|"basicData" -> <|"ID" -> "1009",  "dateMMa" -> DateObject[{2014, 10, 8}, TimeObject[{1, 0, 0.}]]|>, 
     "away" -> <|"City" -> "philadelphia"|>, 
     "home" -> <|"City" -> "boston"|>|>}
POSTED BY: Emerson Willard

This is great, and it worked. Thank you. The #[key1, key2] notation was new to me and turned out to be the insight I needed. Using this notation, here is something that also worked and is a little more straightforward to me:

h[Select[AbsoluteTime@#["basicData", "dateMMa"] < AbsoluteTime@{2014,10,8,1,0,0} &]]
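
For anyone comparing this with the attempts in the question, here is a minimal sketch of what seems to have gone wrong there. It assumes the same sample data held as a plain list rather than a Dataset; the names games, cutoff and earlier are made up for illustration. Two things differ from the original attempts: "dateMMa" sits one level down inside "basicData", so #dateMMa on a whole row finds no such key, and the comparison used > although the goal was "earlier than".

```wolfram
(* Plain list of nested associations, same shape as the sample data in the question *)
games = {
   <|"basicData" -> <|"ID" -> "1008", "dateMMa" -> {2014, 10, 8, 0, 0, 0.}|>,
     "away" -> <|"City" -> "montréal"|>, "home" -> <|"City" -> "toronto"|>|>,
   <|"basicData" -> <|"ID" -> "1009", "dateMMa" -> {2014, 10, 8, 1, 0, 0.}|>,
     "away" -> <|"City" -> "philadelphia"|>, "home" -> <|"City" -> "boston"|>|>};

(* Illustrative name: the cutoff expressed as an absolute time *)
cutoff = AbsoluteTime[{2014, 10, 8, 1, 0, 0}];

(* "dateMMa" is not a top-level key of a row, so a one-level lookup finds nothing *)
First[games]["dateMMa"]   (* Missing["KeyAbsent", "dateMMa"] *)

(* Going through "basicData" first reaches the date list *)
First[games]["basicData"]["dateMMa"]   (* {2014, 10, 8, 0, 0, 0.} *)

(* Keep the rows whose date is strictly earlier than the cutoff *)
earlier = Select[games, AbsoluteTime[#["basicData"]["dateMMa"]] < cutoff &];

(* IDs of the rows that survive the filter *)
#["basicData"]["ID"] & /@ earlier   (* {"1008"} *)
```

The chained lookup #["basicData"]["dateMMa"] is used here only because it makes each level explicit; the #["basicData", "dateMMa"] form above does the same job on the Dataset.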

POSTED BY: Ray Troy
Posted 10 years ago

I had a very similar problem, and this was a big help. Many Thanks!

POSTED BY: John Custy