Message Boards

Question about applying a function to dataset values

The attached notebook explains a problem I'm grappling with: I want to programmatically apply an evaluated expression, including a function and named variables, to all values in a dataset. Any advice on how to achieve the desired result shown at the end of the notebook would be much appreciated. Thanks in anticipation...

POSTED BY: Ian Williams
5 Replies
Posted 1 month ago

Okay, I'll try to explain. Yes, it has to do with the evaluation sequence. Let's consider a simpler example. We want to construct the body of a Function ahead of defining the actual function, so something like this:

expression = Slot[1] + Slot[2]

At some later point, we want to construct an actual Function. In your case the function takes parameters, but that doesn't really matter: even this simple case demonstrates the problem.

function = Function[Null, expression]
(* This is a verbose form, but it is equivalent to Function[expression] and expression & *)

And now we want to apply our function:

function[a, b]

but the result is

#1 + #2

You can see what happened with Trace:

function[a, b] // Trace
(*
{{function,expression&},
 (expression&)[a,b],
 expression,#1+#2}
*)

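The Function symbol has the HoldAll attribute, so expression is not evaluated when the Function is constructed. When the function is applied, the arguments are substituted for the slots in the held body first, and at that point the body is just the symbol expression, which contains no slots. Only afterwards is expression evaluated to #1 + #2, which is too late for a and b to be filled in.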
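
One way to see this directly is the small sketch below. The Hold wrapper is added here purely for illustration and is not part of the original code; it freezes the body at the moment the arguments are bound.

```wolfram
Attributes[Function]
(* {HoldAll, Protected} *)

expression = Slot[1] + Slot[2];

(* Hold shows what Function sees when it binds a and b:
   the bare symbol expression, which contains no slots to fill *)
Function[Null, Hold[expression]][a, b]
(* Hold[expression] *)
```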

Now, in this simple case, we could actually address this with Evaluate:

function2 = Function[Null, Evaluate[expression]];
function2[a, b]
(* a + b *)

But Evaluate is not a solution to all such problems, because it only works when it appears directly as an argument of the holding function. If it's buried deeper, it doesn't force evaluation until the outer levels have been evaluated. Specifically, you wouldn't be able to add Evaluate directly to this

ds[All,Association[#,expression]&] 

like this

ds[All, Association[#, Evaluate[expression]] &]

because it's buried too deep. There is probably some way to unravel all of your computations to get this to work, but I just don't think it's worth trying. My suggestion decomposes the problem more clearly and would be easier to update for new scenarios (in my opinion).
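
For reference, a common way to inject an already-evaluated expression into a held body at any depth is With. The following is a minimal sketch on the simple example above, not on the dataset from the notebook; the list wrapper is an assumption that stands in for a deeper structure such as Association[...].

```wolfram
expression = Slot[1] + Slot[2];

(* Evaluate is nested inside List, so Function's HoldAll still holds it *)
Function[Null, {Evaluate[expression]}][a, b]
(* {#1 + #2} *)

(* With evaluates expression first, then substitutes the result
   into the held body before the Function is ever applied *)
With[{body = expression}, Function[Null, {body}]][a, b]
(* {a + b} *)
```

Whether this carries over cleanly to the ds[All, ...] call depends on details of the notebook that are not shown here.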

POSTED BY: Eric Rimbey

Agreed. Thanks again for the detailed explanation. Much appreciated.

POSTED BY: Ian Williams
Posted 1 month ago

I would go about it like this (there are probably refinements, but this is a working starting point):

  • Create a function to map keys to types:

    MapThread[(keyType[#1] = #2) &, {keys, types}]
    
  • An alternative to your toType function (I simplified it a bit, but if any of this is wrong you can figure out how to adjust it):

    typedVal[type_String, ""] := Missing["Empty string", type];
    typedVal[type_String?(StringEndsQ[{"DP", "SF"}]), val_String] := ToExpression[val];
    typedVal[type_String?(StringEndsQ[{"SCI", "U"}]), val_String] := Interpreter["Number"][val];
    typedVal[type_String, val_] := val
    

    You don't need to name the first argument, since its name is never referenced on the right-hand side (except in the first form, which is probably overkill anyway).

  • Create a function to map the original pairs to pairs with typed values:

    toTypedPair[key_String, val_] := key -> typedVal[keyType[key], val]
    
  • Apply it to ds:

    ds[All, Association@*KeyValueMap[toTypedPair]]
    
POSTED BY: Eric Rimbey
Posted 1 month ago

Fixed

POSTED BY: Eric Rimbey

Thanks Eric. I've made some modifications to your code and got it working - see the final section of the attached notebook. I would, however, still like to understand why using the variable 'expression' in my original notebook failed, whereas pasting the evaluated form of 'expression' gave the desired result. I suspect it has to do with the evaluation process. Since using variables for keys in Slot is a useful thing to be able to do, I'd like to understand how it works (or, in this case, doesn't). Thanks again, Ian

POSTED BY: Ian Williams