Message Boards Message Boards

3
|
11031 Views
|
4 Replies
|
3 Total Likes
View groups...
Share
Share this post:

Apply the same function to multiple columns in Dataset using Merge

So, this may have been obvious to many of you, and in might even have been covered here or elsewhere in a way I did not find, but, since it took a while for me to figure it out and I find it quite helpful, I thought I would share.

Suppose one has a Dataset object in Mathematica. Here's a real world sample.

myds=Dataset[{<|"Unique" -> "59809AL97179", "NAIC/MFA Plan ID" -> 97179, "Parent Name (MFA)" -> "UNITEDHEALTH GRP", "State" -> "AL", "Reinsurance Payment Amount (or Not Eligible)" -> 147985.51, 
  "Individual Market Risk Adjustment Transfer Amount" -> -805865.4, "Small Group Risk Adjustment Transfer Amount" -> 2012.29|>, 
 <|"Unique" -> "80863AZ69477", "NAIC/MFA Plan ID" -> 69477, "Parent Name (MFA)" -> "ASSURANT INC GRP", "State" -> "AZ", "Reinsurance Payment Amount (or Not Eligible)" -> 1.23484246*^7, 
  "Individual Market Risk Adjustment Transfer Amount" -> 1.187858757*^7, "Small Group Risk Adjustment Transfer Amount" -> 355617.92|>, 
 <|"Unique" -> "92137CO82406", "NAIC/MFA Plan ID" -> 82406, "Parent Name (MFA)" -> "UNITEDHEALTH GRP", "State" -> "CO", "Reinsurance Payment Amount (or Not Eligible)" -> 73070.96, 
  "Individual Market Risk Adjustment Transfer Amount" -> 168907.5, "Small Group Risk Adjustment Transfer Amount" -> 0.|>, 
 <|"Unique" -> "79509CO73288", "NAIC/MFA Plan ID" -> 73288, "Parent Name (MFA)" -> "HUMANA GRP", "State" -> "CO", "Reinsurance Payment Amount (or Not Eligible)" -> 1.44166137*^6, 
  "Individual Market Risk Adjustment Transfer Amount" -> 664060.25, "Small Group Risk Adjustment Transfer Amount" -> -95294.25|>, 
 <|"Unique" -> "23307AZ95885", "NAIC/MFA Plan ID" -> 95885, "Parent Name (MFA)" -> "HUMANA GRP", "State" -> "AZ", "Reinsurance Payment Amount (or Not Eligible)" -> 676010.67, 
  "Individual Market Risk Adjustment Transfer Amount" -> 856409.75, "Small Group Risk Adjustment Transfer Amount" -> -1.94864142*^6|>, 
 <|"Unique" -> "67138CA9330300", "NAIC/MFA Plan ID" -> 9330300, "Parent Name (MFA)" -> "HEALTH NET INC GRP", "State" -> "CA", 
  "Reinsurance Payment Amount (or Not Eligible)" -> 4.187840112*^7, "Individual Market Risk Adjustment Transfer Amount" -> -1.2563585594*^8, 
  "Small Group Risk Adjustment Transfer Amount" -> -9.52563747*^6|>}]

One wants to apply the same function to some of the columns. As a commentator noted in response to an earlier version of this post, in some cases it's very simple (and documented). Here's a simple way to get the total of each column.

myds[Total, {"Reinsurance Payment Amount (or Not Eligible)", 
  "Individual Market Risk Adjustment Transfer Amount"}]

But for other functions, this won't work. Consider this attempt to obtain the minimum value in each column...

myds[Min, {"Reinsurance Payment Amount (or Not Eligible)", 
  "Individual Market Risk Adjustment Transfer Amount"}]

It does not work. Here's how one could do it using Merge. Construct a query.

q = Query[
  Merge[#,Min] &, {"Reinsurance Payment Amount (or Not Eligible)", 
   "Individual Market Risk Adjustment Transfer Amount", 
   "Small Group Risk Adjustment Transfer Amount"}]

Now just apply the query to the dataset.

tots=q[myds]

Obviously, this could all be done as a one-liner or using different syntax. I've broken it up here for clarity. And my thanks to a commenter for wisely pointing out the simple way of writing the code when one is attempting to obtain a Total.

POSTED BY: Seth Chandler
4 Replies
Posted 9 years ago

A more compact construct as given by in the documentation of Dataset is:

myds[Total, {"Reinsurance Payment Amount (or Not Eligible)", 
  "Individual Market Risk Adjustment Transfer Amount", 
  "Small Group Risk Adjustment Transfer Amount"}]
POSTED BY: Michael Helmle

Also, this seems to work:

myds[Min, #] & /@ {"Reinsurance Payment Amount (or Not Eligible)", 
  "Individual Market Risk Adjustment Transfer Amount"}

By this I mean that it gives the minimum of the 2 columns specified, equivalently

Map[myds[Min, #] &, {"Reinsurance Payment Amount (or Not Eligible)", 
  "Individual Market Risk Adjustment Transfer Amount"}]

HTH.

POSTED BY: Aeyoss Antelope

I wish someone could explain to me why your original construction with Min[] did not work. My only guess is that it is related to the difference between these:

Options[Min]
Options[Total]

The second one gives Method -> Automatic. I cannot see how this matters. But I can hope for (eventual) enlightenment.

POSTED BY: Aeyoss Antelope

You may also use the operator form of Merge which compacts the Dataset query a bit.

myds[Merge[Min], 
 {"Reinsurance Payment Amount (or Not Eligible)", "Individual Market Risk Adjustment Transfer Amount"}]

Hope this helps.

POSTED BY: Edmund Robinson
Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard

Group Abstract Group Abstract