
[✓] Dataset manipulation


How do I manipulate one column of a list from within a larger dataset? For example:

families = <|
  "smith" -> <|"members" -> 26, "mean income" -> 95000, 
    "ratios" -> {<|"height" -> 1.5, "weight" -> 74|>, <|
       "height" -> 1.8, "weight" -> 83|>, <|"height" -> 1.9, 
       "weight" -> 91|>, <|"height" -> 2.1, "weight" -> 105|>}|>, 
  "jones" -> <|"members" -> 32, "mean income" -> 84000, 
    "ratios" -> {<|"height" -> 1.5, "weight" -> 73|>, <|
       "height" -> 1.7, "weight" -> 84|>, <|"height" -> 1.9, 
       "weight" -> 98|>, <|"height" -> 2.1, "weight" -> 115|>}|>|>

I can extract the table of smith family heights and weights with:

Dataset[families["smith", "ratios"]]

and I can multiply ALL of this with something like:

Dataset[2*families["smith", "ratios"]]

But how do I output a similar dataset, after multiplying only the height by 2?

POSTED BY: Martin Currie
2 years ago


This could work, although it returns plain lists and drops the keys:

Dataset[{2*#["height"], #["weight"]} & /@ families["smith", "ratios"]]

If you want to keep the dataset structure intact, this works:

Dataset[<|"height" -> 2*#["height"], "weight" -> #["weight"]|> & /@ families["smith", "ratios"]]

You can also use Query:

Query[All, {2 #["height"] &, "weight"}]@Dataset[families["smith", "ratios"]]



POSTED BY: Marco Thiel
2 years ago
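
As a small sketch of how the first two suggestions in the reply above differ, the following applies both to a shortened copy of the smith rows. The names ratios, asLists and asAssocs are made up for this illustration and do not appear in the thread.

```wolfram
(* Two rows copied from the "smith" entry in the question *)
ratios = {<|"height" -> 1.5, "weight" -> 74|>, <|"height" -> 1.8, "weight" -> 83|>};

(* First suggestion: each row becomes a plain list, so the keys are dropped *)
asLists = ({2*#["height"], #["weight"]} &) /@ ratios;

(* Second suggestion: each row is rebuilt as an association, so the keys survive *)
asAssocs = (<|"height" -> 2*#["height"], "weight" -> #["weight"]|> &) /@ ratios;

(* Only the second result can still be addressed by column name *)
Dataset[asAssocs][All, "height"]
```

The second form costs a little more typing because every key has to be repeated, which is what the later replies in this thread try to avoid.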

Your second option covers exactly what I need to do - thank you very much.

I wish it was easier to find advice like yours in the documentation...

POSTED BY: Martin Currie
2 years ago

First of all I'd like to thank Marco for his great answer to Martin's question. I am fairly new to Mathematica and really learned a lot from his examples.

Here's another suggestion, which (similar to Marco's third example) makes use of the Query function for datasets.

Dataset[families]["smith", "ratios", All, MapAt[2 # &, "height"]]

The main difference is the use of the operator form of MapAt. It is more readable, and the result keeps its structure as a list of associations, which makes it easy to run further queries on the output if desired.

Cheers, Eric

1 year ago
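
To make the operator form mentioned in the reply above more concrete, here is a minimal sketch on a single row, outside of any Dataset. The names row, doubleHeight and doubled are hypothetical, and the key is written with an explicit Key wrapper, which is a choice made for this illustration rather than the spelling used in the thread.

```wolfram
(* One row copied from the "smith" entry in the question *)
row = <|"height" -> 1.5, "weight" -> 74|>;

(* Operator form: MapAt[f, pos] is a function still waiting for its data *)
doubleHeight = MapAt[2 # &, Key["height"]];

(* Applying it changes only the value stored under "height" *)
doubled = doubleHeight[row]

(* The ordinary three-argument form gives the same result *)
doubled === MapAt[2 # &, row, Key["height"]]
```

Because the operator form is an ordinary function of one argument, it can be dropped into the last slot of a query, where it is applied to each row in turn.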

Based on Eric's solution, Query (and Dataset) actually have a concise syntax for what is essentially MapAt:

Dataset[families]["smith", "ratios", All, {"height" -> (2 # &)}]

There might be a less recursive way to write the next one, but here is one way to use this syntax with Query to multiply those fields by 2 within the full original Dataset:

Dataset[families][{"smith" -> Query[{"ratios" -> Query[All, {"height" -> (2 # &)}]}]}]

POSTED BY: Christopher Wolfram
1 year ago
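
The key -> function syntax from the reply above can also be tried on an ordinary list of associations, which makes it easy to check that untouched columns pass through. This is only a sketch: it assumes Query treats plain data the same way it treats a Dataset here, and the names ratios and scaled are invented for the example.

```wolfram
(* Two rows copied from the "smith" entry in the question *)
ratios = {<|"height" -> 1.5, "weight" -> 74|>, <|"height" -> 1.8, "weight" -> 83|>};

(* Apply 2 # & to the "height" part of every row; no other key is mentioned *)
scaled = Query[All, {"height" -> (2 # &)}][ratios];

(* Compare each column before and after the query *)
{Lookup[ratios, "height"], Lookup[scaled, "height"]}
{Lookup[ratios, "weight"], Lookup[scaled, "weight"]}
```

Wrapping the list in Dataset before applying the same query should give the equivalent result in Dataset form.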

Christopher, thank you for some very helpful advice! I am very excited about your approach to modifying the original dataset!


Just to provide another option, here is a solution that produces the same result as the recursive method you kindly posted.

MapAt[2 # &, Dataset[families], {"smith", "ratios", All, "height"}]

1 year ago
