Does the Apply function work on a dataset?

Posted 6 years ago

I am quite new to Mathematica so I hope my question isn't too basic. I would like to apply a function to the rows of a dataset where selected columns are the input to the function. I can do this with a list but I am eventually going to deal with a large file with a lot of columns and would like to choose the columns based on the header. I never got that far because I cannot get Apply to work on more than one row of a dataset.

Example: CREATE A SIMPLE FUNCTION

In[1]:= simplefunc[a_, b_, c_] := (a + b)*c

In[2]:= simplefunc[1, 2, 3]

Out[2]= 9

IMPORT A LIST (MATRIX)

In[4]:= noheader = Import["D:/test/noheader.csv"]

Out[4]= {{1, 2, 3}, {2, 3, 4}, {3, 4, 5}, {4, 5, 6}}

APPLY THE FUNCTION TO THE FIRST ROW - IT WORKS

In[13]:= simplefunc @@ noheader[[1]]

Out[13]= 9

APPLY THE FUNCTION TO ALL ROWS - IT WORKS

In[5]:= simplefunc @@@ noheader

Out[5]= {9,20,35,54}

USE SEMANTIC IMPORT TO CREATE A DATASET WITH HEADERS

In[6]:= yesheader = SemanticImport["D:/test/yesheader.csv"]

Out[6]=

a   b   c
1   2   3
2   3   4
3   4   5
4   5   6

2 levels | 4 elements | 12 elements total

APPLY THE FUNCTION TO THE FIRST ROW -- IT WORKS

In[9]:= simplefunc @@ yesheader[[1]]

Out[9]= 9

APPLY THE FUNCTION TO THE ENTIRE DATASET - IT FAILS

In[14]:= simplefunc @@@ yesheader

Out[14]= Failure[Dataset, <|MessageTemplate :> Dataset::invqf, MessageParameters -> <|Symbol -> Apply, Head -> Apply, Arguments -> {simplefunc, TypeSystem`Vector[TypeSystem`Struct[{a, b, c}, {TypeSystem`Atom[Integer], TypeSystem`Atom[Integer], TypeSystem`Atom[Integer]}], 4], {1}}|>|>]

I have tried other approaches with no success. Does Apply simply not work with datasets? Is there a way to get to my final goal of applying a function to selected columns by header names?

Thank you!

Ed

6 Replies

The trick is that Dataset[] is Apply[], in a way. It is a computational structure that applies functions to Associations in a highly unusual (but pretty much regular) way. See the Dataset[] documentation and work through the examples there. It takes a long time, but the notation and system are sufficiently unusual that there isn't any other way to learn them.
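
As a minimal sketch of what that looks like for the question above: instead of using @@@, query the dataset with a function that receives each row as an Association. This assumes the column headers were imported as the string keys "a", "b" and "c"; the small `ds` below is a hand-built stand-in for the imported file, not the original data source.

```mathematica
simplefunc[a_, b_, c_] := (a + b)*c

(* Hand-built stand-in for SemanticImport["D:/test/yesheader.csv"];
   assumes the headers become the string keys "a", "b", "c" *)
ds = Dataset[{
    <|"a" -> 1, "b" -> 2, "c" -> 3|>,
    <|"a" -> 2, "b" -> 3, "c" -> 4|>,
    <|"a" -> 3, "b" -> 4, "c" -> 5|>,
    <|"a" -> 4, "b" -> 5, "c" -> 6|>}];

(* All = every row; the pure function is applied to each row (an Association),
   so #a, #b, #c pick the columns by header name *)
result = ds[All, simplefunc[#a, #b, #c] &];

(* Normal converts the resulting Dataset back to an ordinary list *)
Normal[result]
(* {9, 20, 35, 54} *)
```

For headers that are not valid symbol names (for example, ones containing spaces), the longer form #["column name"] should work in place of #a.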

First, though, see attached file DatasetAndApply.nb for examples. It gives a hint of what is going on.
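
If you would rather keep using @@@, one possible workaround is to convert the dataset back to ordinary expressions first and pick the columns by name with Lookup. This is only a sketch under the same assumptions as the question (string keys "a", "b", "c"); `ds`, `rows` and `cols` are illustrative names.

```mathematica
simplefunc[a_, b_, c_] := (a + b)*c

(* Hand-built stand-in for the imported dataset *)
ds = Dataset[{
    <|"a" -> 1, "b" -> 2, "c" -> 3|>,
    <|"a" -> 2, "b" -> 3, "c" -> 4|>,
    <|"a" -> 3, "b" -> 4, "c" -> 5|>,
    <|"a" -> 4, "b" -> 5, "c" -> 6|>}];

rows = Normal[ds];        (* a plain list of Associations *)
cols = {"a", "b", "c"};   (* the headers to feed to the function, in order *)

(* Extract the chosen columns from each row, then apply as with the plain matrix *)
simplefunc @@@ (Lookup[#, cols] & /@ rows)
(* {9, 20, 35, 54} *)
```

The result is an ordinary list rather than a Dataset.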

Attachments:
Posted 2 months ago

How do I download the DatasetAndApply.nb?

We are trying to locate the attachment file DatasetAndApply.nb, but in the meantime, if @Bill Lewis currently has it, could you please attach it to a new comment?

Posted 2 months ago

I have noticed that many old posts have missing attachments. It would be great if they could be found and restored. For example, in this post, the response by Miguel Olivo-V is supposed to have an attachment, but the link is missing.

Here is DatasetAndApply.nb. I have also attached Dataset_Tutorial.nb, which is another attempt at describing what Dataset does.

The current presentation of Dataset[] as a mere display convenience ignores most of what Dataset can do. It is excellent at rapid processing of large amounts of data, in some cases at least processing data that List[] could not fit into memory.

Fantastic tutorial, thanks!
