
How to modify a Dataset?

Posted 7 years ago

Hello all, I have the following problem. I have a Dataset with three columns. Now I want to add a fourth column that combines the second and third columns (e.g. store the sum of the two numbers from the second and third columns in the fourth column). I tried a lot but had no success. Maybe it is very simple, but I do not see how to do it. Can anyone give me a hint?

Greetings from Germany

Mike

Dear Mike,

let's generate a data set:

dataset = RandomReal[1, {10, 3}]

This is one way:

{#[[1]], #[[2]], #[[3]], #[[2]] + #[[3]]} & /@ dataset

This is another:

Flatten[{#, #[[2]] + #[[3]]}] & /@ dataset

If you prefer procedural programming you can use

Table[Flatten[{dataset[[i]], dataset[[i, 2]] + dataset[[i, 3]]}], {i, 1, Length[dataset]}]

This one works, too:

Transpose[Append[Transpose[dataset] , dataset[[All, 2]] + dataset[[All, 3]]]]
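
As a quick sanity check, here is a minimal sketch on a small made-up matrix (the names data, viaMap and viaColumns are only illustrative, not part of the original example). It suggests that the row-wise and the column-wise approach give the same result:

```wolfram
(* A small fixed matrix instead of random data, so the result is easy to check by eye *)
data = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};

(* Row-wise: append the sum of elements 2 and 3 to each row *)
viaMap = Append[#, #[[2]] + #[[3]]] & /@ data;

(* Column-wise: add columns 2 and 3 as vectors, then attach the result as a new column *)
viaColumns = Transpose[Append[Transpose[data], data[[All, 2]] + data[[All, 3]]]];

viaMap == viaColumns  (* True *)
viaMap                (* {{1, 2, 3, 5}, {4, 5, 6, 11}, {7, 8, 9, 17}} *)
```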

Cheers,

Marco

Marco, I believe that Michael is asking about a Version 10 Dataset

http://reference.wolfram.com/language/ref/Dataset.html

rather than a simple rectangular array of data.

Dear David,

I am sorry. You are right. I should have read the question more carefully. My bad!

Here's the answer for a "Dataset" dataset ...

dataset = Dataset[Table[<|"a" -> RandomReal[], "b" -> RandomReal[], "c" -> RandomReal[]|>, {i, 1, 10}]]

Then you can do

dataset2 = Append[#, "d" -> #["b"] + #["c"]] & /@ dataset

This one is a bit shorter

dataset2 = Append[#, "d" -> #b + #c] & /@ dataset

Alternatively, you can use the Join command

dataset2 = Join[#, <|"d" -> #b + #c|>] & /@ dataset

I guess that it can also be done using procedural programming, e.g.

dataset2 = Dataset[Table[Normal[Join[dataset[[i]], <|"d" -> dataset[[i, 2]] + dataset[[i, 3]]|>]], {i, 1, Length[dataset]}]]

but this version is neither fast, nor elegant nor readable.
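
Another variant that may be worth trying is the Dataset query syntax itself. The following is only a sketch on assumed sample data with fixed values (the names ds and ds2 are illustrative): the function is applied to every row, and placing # inside <| ... |> merges the existing row with the new key.

```wolfram
(* Assumed sample data with fixed values so the result is predictable *)
ds = Dataset[{<|"a" -> 1, "b" -> 2, "c" -> 3|>, <|"a" -> 4, "b" -> 5, "c" -> 6|>}];

(* Apply a function to every row; the old row and the new key are merged into one association *)
ds2 = ds[All, <|#, "d" -> #b + #c|> &];

Normal[ds2]
(* {<|"a" -> 1, "b" -> 2, "c" -> 3, "d" -> 5|>, <|"a" -> 4, "b" -> 5, "c" -> 6, "d" -> 11|>} *)
```

Compared with mapping Append over the rows, this keeps the whole operation inside a single query on the Dataset.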

Cheers, M.

Thank you very much, David! When I see the solution, it is really not too complicated, but I was not able to manage it by myself :-( . The documentation for Dataset in V10 is really not very instructive, so it is great to get quick help here :-)

Greetings from Germany

Mike

Thanks Marco for the nice quick tutorial!

I just noticed that this can also be easily done with pattern matching:

Normal[dataset] /. x_Association :>  Join[x , <|{"d" -> x[[2]] + x[[3]]}|>]

That would then cover all three programming paradigms, I guess.
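
Note that the pattern-matching version works on Normal[dataset], so its result is a plain list of associations. A minimal sketch, assuming small fixed sample data (ds and rows are illustrative names), of wrapping the result back into a Dataset:

```wolfram
(* Assumed sample data with fixed values *)
ds = Dataset[{<|"a" -> 1, "b" -> 2, "c" -> 3|>, <|"a" -> 4, "b" -> 5, "c" -> 6|>}];

(* Replace every row association by a copy with the extra key "d" *)
rows = Normal[ds] /. x_Association :> Append[x, "d" -> x["b"] + x["c"]];

Head[rows]           (* List *)
Head[Dataset[rows]]  (* Dataset *)
```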

Cheers,

M.

Posted 6 years ago

Suppose I want to test whether the value in column A is in some other list or set, and then append the result of the test (True or False) as a separate column to the dataset. I am used to doing this in SQL, but not sure how best to do this using Datasets in Mathematica 10.

Get a data set

Clear[daS1]
daS1 = Dataset[Table[<|"a" -> o, "b" -> RandomChoice[Characters["caitlin ramsey"]]|>, {o, 12}]]

and another one

Clear[daS2]
daS2 = Dataset[Table[<|"c" -> o, "d" -> Characters["caitlin"][[o]]|>, {o, StringLength["caitlin"]}]]

to create a third one using approaches mentioned in this discussion already

daS1 /. x_Association :> Join[x, <|{"Q" -> If[Intersection[{x[[2]]}, Normal[Query[All, "d"]@daS2]] != {}, True, False]}|>]

To point out a best way to do it, one needs some criterion (what counts as good, what counts as bad) and at least two ways to get the job done, so that the decision is non-trivial ...
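
For comparison, here is a second way, sketched with MemberQ on assumed fixed sample data in the same shape as daS1 and daS2 (the names left, right, allowed and result are illustrative only). Since MemberQ already returns True or False, the If wrapper is not needed:

```wolfram
(* Assumed fixed sample data in the shape of daS1 and daS2 *)
left = Dataset[{<|"a" -> 1, "b" -> "c"|>, <|"a" -> 2, "b" -> "m"|>, <|"a" -> 3, "b" -> "t"|>}];
right = Dataset[{<|"c" -> 1, "d" -> "c"|>, <|"c" -> 2, "d" -> "a"|>, <|"c" -> 3, "d" -> "t"|>}];

(* Pull the lookup column out once as a plain list *)
allowed = Normal[right[All, "d"]];

(* Add a column "Q" that records whether the value in "b" occurs in the lookup list *)
result = left[All, <|#, "Q" -> MemberQ[allowed, #b]|> &];

Normal[result[All, "Q"]]  (* {True, False, True} *)
```

Extracting the lookup column once, outside the row function, avoids re-querying the second dataset for every row.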

Posted 6 years ago

I think that will be quite sufficient, Udo, thank you. In this case, I was mainly interested in the syntax to incorporate the comparison test into the dataset modification. No requirement for optimality here -- I could have said "reasonably efficient" rather than "best" to express the problem -- and thank you for considering that detail!
