Computed Column on Dataset

Posted 3 years ago

Can someone tell me how to write a Query that adds some columns, computed from several other columns in a Dataset?

POSTED BY: Pablo Gil
6 Replies

A simple way:

data = Dataset[{<|"a" -> 1, "b" -> 2|>, <|"a" -> 3, "b" -> 4|>}]
data[All, <|"f1" -> #a^2 + #b, "f2" -> N[Sin[#b + #a]]|> &]

If you want to append the columns to the existing ones:

data[All, <|#, "f1" -> #a^2 + #b, "f2" -> N[Sin[#b + #a]]|> &]
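
To check what the appending form returns, here is a minimal sketch that converts the result back to ordinary expressions with Normal. The variable name appended is only illustrative and is not part of the original answer.

```wolfram
data = Dataset[{<|"a" -> 1, "b" -> 2|>, <|"a" -> 3, "b" -> 4|>}];

(* Append the computed columns to every row; # splices in the existing keys *)
appended = data[All, <|#, "f1" -> #a^2 + #b, "f2" -> N[Sin[#b + #a]]|> &];

(* Normal strips the Dataset wrapper and returns a list of associations *)
rows = Normal[appended];

(* Column names of the first row *)
Keys[First[rows]]
(* {"a", "b", "f1", "f2"} *)
```

The original keys come first because the row association # is spliced in before the new rules.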
POSTED BY: Rodrigo Murta
Posted 3 years ago

OK, I was wrong: I was handling a List of associations as if it were a Dataset. For a List you have to use a Query function, and what I have is a List, not a Dataset as I said in my question.

data1 = {<|"a" -> 1, "b" -> 2|>, <|"a" -> 3, "b" -> 4|>};

Now, appending two new computed columns to the List:

Query[All, {"a", "b"} /* <| "a" -> #a &, "b" -> #b &, 
    "f1" -> #a ^2 + #b &, "f2" -> N[Sin[#b + #a]] & |>]@data1

many thanks Rodrigo
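
For a plain List of associations, Query is one option but not the only one. As a sketch of an alternative (not taken from the replies in this thread), Map with Join should give the same rows, assuming the data1 defined above:

```wolfram
data1 = {<|"a" -> 1, "b" -> 2|>, <|"a" -> 3, "b" -> 4|>};

(* Map a row-level function over the list; Join merges the new keys into each row *)
result = Map[Join[#, <|"f1" -> #a^2 + #b, "f2" -> N[Sin[#b + #a]]|>] &, data1]
(* {<|"a" -> 1, "b" -> 2, "f1" -> 3, "f2" -> 0.14112|>,
    <|"a" -> 3, "b" -> 4, "f1" -> 13, "f2" -> 0.656987|>} *)
```

Unlike the Query versions, this form does not first restrict each row to the keys "a" and "b", so any other keys in a row are kept as well.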

POSTED BY: Pablo Gil
Posted 3 years ago

Hi Pablo,

Appending columns (which @Rodrigo mentioned) can also be done with Query.

Query[All, {"a", "b"} /* (<|#, "f1" -> #a^2 + #b, "f2" -> N[Sin[#b + #a]]|> &)]@data1

(* {<|"a" -> 1, "b" -> 2, "f1" -> 3, "f2" -> 0.14112|>, 
    <|"a" -> 3, "b" -> 4, "f1" -> 13, "f2" -> 0.656987|>} *)
POSTED BY: Rohit Namjoshi
Posted 3 years ago

I'm confused: I get an error the first time I execute this on a Dataset, but if I execute In[3] again there is no error.

In[1]:= data = {<|"a" -> 1, "b" -> 2|>, <|"a" -> 3, "b" -> 4|>}
In[2]:= ds = Dataset[data]
In[3]:= Query[All, {"a", "b"} /* <| #&, "f1" -> #a ^2 + #b &, "f2" -> N[Sin[#b + #a]] & |>]@ds

However, the same query applied to the plain List works fine:

In[1]:= data = {<|"a" -> 1, "b" -> 2|>, <|"a" -> 3, "b" -> 4|>}
In[2]:= ds = Dataset[data]
In[3]:= Query[All, {"a", "b"} /* <| #&, "f1" -> #a ^2 + #b &, "f2" -> N[Sin[#b + #a]] & |>]@data

This is what motivated my comment above.

POSTED BY: Pablo Gil
Posted 3 years ago

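The pure function has to be defined correctly. Because & binds more loosely than ->, an entry such as "f1" -> #a^2 + #b & is parsed as a function that returns a rule, not as a rule whose value is a function. Wrap the whole association in parentheses and apply a single & to it: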

data = {<|"a" -> 1, "b" -> 2|>, <|"a" -> 3, "b" -> 4|>}
ds = Dataset[data];

Query on Dataset

Query[All, {"a", "b"} /* (<|#, "f1" -> #a^2 + #b, "f2" -> N[Sin[#b + #a]]|> &)]@ds

Query on List of Associations

Query[All, {"a", "b"} /* (<|#, "f1" -> #a^2 + #b, "f2" -> N[Sin[#b + #a]]|> &)]@data

In both cases the Query expression is identical, so it can be defined once and applied to either:

qFun = Query[All, {"a", "b"} /* (<|#, "f1" -> #a^2 + #b, "f2" -> N[Sin[#b + #a]]|> &)]

qFun@ds
qFun@data
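
A minimal sketch of why the parentheses matter, using Head to show how each form is parsed. The name rowFun is only illustrative.

```wolfram
(* Unparenthesized: & captures the whole rule, so the head is Function *)
h1 = Head["f1" -> #a^2 + #b &]
(* Function *)

(* Parenthesized body: a Rule whose right-hand side is a pure function *)
h2 = Head["f1" -> (#a^2 + #b &)]
(* Rule *)

(* One & around the whole association gives a single row-level function *)
rowFun = <|#, "f1" -> #a^2 + #b|> &;
rowFun[<|"a" -> 1, "b" -> 2|>]
(* <|"a" -> 1, "b" -> 2, "f1" -> 3|> *)
```

With the unparenthesized entries, the association literal contains functions rather than rules, which is not the row-level function the query needs.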
POSTED BY: Rohit Namjoshi
Posted 3 years ago

OK, it works fine. Thank you, Rohit.

POSTED BY: Pablo Gil