
Does the Apply function work on a dataset?

Posted 9 years ago

I am quite new to Mathematica so I hope my question isn't too basic. I would like to apply a function to the rows of a dataset where selected columns are the input to the function. I can do this with a list but I am eventually going to deal with a large file with a lot of columns and would like to choose the columns based on the header. I never got that far because I cannot get Apply to work on more than one row of a dataset.

Example: CREATE A SIMPLE FUNCTION

In[1]:= simplefunc[a_, b_, c_] := (a + b)*c

In[2]:= simplefunc[1, 2, 3]

Out[2]= 9

IMPORT A LIST (MATRIX)

In[4]:= noheader = Import["D:/test/noheader.csv"]

Out[4]= {{1, 2, 3}, {2, 3, 4}, {3, 4, 5}, {4, 5, 6}}

APPLY THE FUNCTION TO THE FIRST ROW - IT WORKS

In[13]:= simplefunc @@ noheader[[1]]

Out[13]= 9

APPLY THE FUNCTION TO ALL ROWS - IT WORKS

In[5]:= simplefunc @@@ noheader

Out[5]= {9,20,35,54}

USE SEMANTIC IMPORT TO CREATE A DATASET WITH HEADERS

In[6]:= yesheader = SemanticImport["D:/test/yesheader.csv"]

Out[6]= 

a   b   c
1   2   3
2   3   4
3   4   5
4   5   6

2 levels | 4 elements | 12 elements total

APPLY THE FUNCTION TO THE FIRST ROW -- IT WORKS

In[9]:= simplefunc @@ yesheader[[1]]

Out[9]= 9

APPLY THE FUNCTION TO THE ENTIRE DATASET - IT FAILS

In[14]:= simplefunc @@@ yesheader

Out[14]= Failure[Dataset, <|MessageTemplate :> Dataset::invqf, MessageParameters -> <|Symbol -> Apply, Head -> Apply, Arguments -> {simplefunc, TypeSystem`Vector[TypeSystem`Struct[{a, b, c}, {TypeSystem`Atom[Integer], TypeSystem`Atom[Integer], TypeSystem`Atom[Integer]}], 4], {1}}|>|>]

I have tried other approaches with no success. Does Apply simply not work with datasets? Is there a way to get to my final goal of applying a function to selected columns by header names?

Thank you!

Ed

POSTED BY: Edward Roberts
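
For the stated goal of choosing input columns by header name, one possible route is to convert the dataset to a list of Associations with Normal and pull the wanted values with Lookup. This is only a sketch: the inline Dataset stands in for the SemanticImport result, the string column keys "a", "b", "c" are an assumption about what the import produces, and the names cols and results are hypothetical.

```mathematica
simplefunc[a_, b_, c_] := (a + b)*c

(* Assumed stand-in for SemanticImport["D:/test/yesheader.csv"] *)
yesheader = Dataset[{
    <|"a" -> 1, "b" -> 2, "c" -> 3|>,
    <|"a" -> 2, "b" -> 3, "c" -> 4|>,
    <|"a" -> 3, "b" -> 4, "c" -> 5|>,
    <|"a" -> 4, "b" -> 5, "c" -> 6|>}];

(* Hypothetical list of header names to feed to the function, in order *)
cols = {"a", "b", "c"};

(* Normal gives a list of Associations; Lookup extracts the chosen values per row *)
results = (simplefunc @@ Lookup[#, cols]) & /@ Normal[yesheader]
(* {9, 20, 35, 54} *)
```

Because cols is an ordinary list of strings, the same pattern should carry over to a wide file where only a few named columns are needed.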
6 Replies

Fantastic tutorial, thanks!

The trick is that Dataset[] is, in a way, Apply[]. It is a computational structure that applies functions to Associations in a highly unusual (but quite regular) way. See the Dataset[] documentation and work through the examples. It takes a long time, but the notation and the system are sufficiently unusual that there isn't any other way.
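
As a minimal sketch of that idea (not the contents of the attached notebook): a Dataset can be queried with a function in the second position, and that function is applied to each row. Each row is an Association, so named slots such as #a select values by header. The inline Dataset below is an assumed stand-in for the imported file, with string column keys.

```mathematica
simplefunc[a_, b_, c_] := (a + b)*c

(* Assumed stand-in for the SemanticImport result in the question *)
yesheader = Dataset[{
    <|"a" -> 1, "b" -> 2, "c" -> 3|>,
    <|"a" -> 2, "b" -> 3, "c" -> 4|>,
    <|"a" -> 3, "b" -> 4, "c" -> 5|>,
    <|"a" -> 4, "b" -> 5, "c" -> 6|>}];

(* All selects every row; the pure function receives each row as an Association *)
perRow = yesheader[All, simplefunc[#a, #b, #c] &];

(* Normal strips the Dataset wrapper and returns an ordinary list *)
Normal[perRow]
(* {9, 20, 35, 54} *)
```

This matches the result of simplefunc @@@ noheader on the headerless list, while the columns are picked by name instead of by position.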

First, though, see attached file DatasetAndApply.nb for examples. It gives a hint of what is going on.

Attachments:
POSTED BY: Bill Lewis

How do I download the DatasetAndApply.nb?

POSTED BY: Gianluigi Salvi

We are trying to locate the attachment file DatasetAndApply.nb, but in the meantime, if @Bill Lewis currently has it, could you please attach it to a new comment?

POSTED BY: Moderation Team
Posted 2 years ago

I have noticed that many old posts have missing attachments. It would be great if they could be found and restored. For example, in this post the response by Miguel Olivo-V is supposed to have an attachment, but the link is missing.

POSTED BY: Rohit Namjoshi

Here is DatasetAndApply.nb. I have also attached Dataset_Tutorial.nb, which is another attempt at describing what Dataset does.

The current presentation of Dataset[] as a mere display convenience ignores most of what Dataset can do. It is excellent at rapid processing of large amounts of data, and in at least some cases it can process data that would not fit into memory as a List[].

POSTED BY: Bill Lewis