Group Abstract

Message Boards

10.8K Views

4 Replies

0 Total Likes

View groups...

Follow this post

Share this post:

GROUPS:

Data Science External Programs and Systems Import and Export Wolfram Language

Posted 7 years ago

I would like to import a large excel spreadsheet with many columns into Mathematica. The first row of the spreadsheet contains the column headings and the remaining rows contain the data. After importing I first separate the column headings and data into separate variables and would like to convert the data to a dataset. Here is an example of what I have tried. These steps simulate of importing the spreadsheet and separating into columns and data cols = {"col1", "col2", "col3", "col4", "col5"}; dat = RandomReal[{0, 1}, {10, 5}]; I now need to create an association to use to map the data to the columns but here is where I am having trouble. The best I am able to come up with is to generate assoc = Table[cols[[k]] -> #[[k]], {k, 1, 5}] This results in some error messages, but produces {"col1" -> 1, "col2" -> #1[[2]], "col3" -> #1[[3]], "col4" -> #1[[4]], "col5" -> #1[[5]]} which is nearly the result desired. I can then copy that result, paste it into a new cell, and edit that cell by hand, changing the first element of the list, applying an association function, and applying a function to get the desired function to apply to the data assoc1 = Association[{"col1" -> #[[1]], "col2" -> #1[[2]], "col3" -> #1[[3]], "col4" -> #1[[4]], "col5" -> #1[[5]]}] & Map[assoc1,dat] With this association I can then generate the dataset. Is there any way of generating this function without the necessity of the hand editing step?

POSTED BY: Mike Luntz

4 Replies

Sort By:

Posted 6 years ago

Marcin, If `condition1` is a `Dataset` then `Normal[condition1]` should be a list of 85 lists each of length 26. You want each of the 85 to be assigned the label `1`. One way to do this is `Thread[Normal[condition1] -> 1]]`. Do the same for `condition2` with label `2`. Passing both results to `Classify` should work. Rohit

POSTED BY: Rohit Namjoshi

Posted 6 years ago

To get help have a look in Mathematica help of XLSX. A follow up question: I imported Excel file as Rohit suggested and wanted to use it in Classify function. I can import Excel file as "Data" or "Dataset". Each row contains data for each patient, as Rohit has shown for a table of customers. I have two Excel sheets each for Condition1 and Condition2, both with same 26 columns structure for about 85 patients. My problem is that Classify[ Condition1->1, Condition2->2] uses each table as if this was one patient, so as if I have two patients for training and not 85. I tried to use Partition function to force reading each row as one patient bit without success. Partition[ Condition1, 26] or Partition[ Condition1, {26,1}] gives me an error of wrong dimension, which I cannot correct. Do I have to read Excel file row after row, our there is a more clever way?

POSTED BY: Marcin Balcerzyk

Posted 7 years ago

Many thanks Rohit. I was not aware that you could import directly to a dataset. That solves my problem. Is that capability documented anywhere? By the way, when I imported the file with your suggested command I got an error message stating "Import::hdrs: The value 1 for the option "HeaderLines" is outside of the data range." In spite of the error message the file apparently imported correctly.

POSTED BY: Mike Luntz

Posted 7 years ago

Not sure if this is exactly what you are after, but you can `Import` Excel directly as a `Dataset`. dataset = Import["~/Downloads/Times_Square_Hotels.xlsx", "Dataset", HeaderLines -> 1]

POSTED BY: Rohit Namjoshi

Reply to this discussion

Reply Preview

Attachments

Remove Add a file to this post

Follow this discussion

or Discard

Feedback