Message Boards Message Boards


Automate converting an imported excel spreadsheet to a dataset?

Posted 2 months ago
2 Replies
0 Total Likes

I would like to import a large excel spreadsheet with many columns into Mathematica. The first row of the spreadsheet contains the column headings and the remaining rows contain the data. After importing I first separate the column headings and data into separate variables and would like to convert the data to a dataset. Here is an example of what I have tried.

These steps simulate of importing the spreadsheet and separating into columns and data

cols = {"col1", "col2", "col3", "col4", "col5"};
dat = RandomReal[{0, 1}, {10, 5}];

I now need to create an association to use to map the data to the columns but here is where I am having trouble. The best I am able to come up with is to generate

assoc = Table[cols[[k]] -> #[[k]], {k, 1, 5}]

This results in some error messages, but produces

{"col1" -> 1, "col2" -> #1[[2]], "col3" -> #1[[3]], "col4" -> #1[[4]],
  "col5" -> #1[[5]]}

which is nearly the result desired. I can then copy that result, paste it into a new cell, and edit that cell by hand, changing the first element of the list, applying an association function, and applying a function to get the desired function to apply to the data

assoc1 = Association[{"col1" -> #[[1]], "col2" -> #1[[2]], 
    "col3" -> #1[[3]], "col4" -> #1[[4]], "col5" -> #1[[5]]}] &

With this association I can then generate the dataset. Is there any way of generating this function without the necessity of the hand editing step?

2 Replies
Posted 2 months ago

Not sure if this is exactly what you are after, but you can Import Excel directly as a Dataset.

dataset = Import["~/Downloads/Times_Square_Hotels.xlsx", "Dataset",  HeaderLines -> 1]

enter image description here

Posted 2 months ago

Many thanks Rohit. I was not aware that you could import directly to a dataset. That solves my problem.

Is that capability documented anywhere? By the way, when I imported the file with your suggested command I got an error message stating

"Import::hdrs: The value 1 for the option "HeaderLines" is outside of the data range."

In spite of the error message the file apparently imported correctly.

Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
or Discard

Group Abstract Group Abstract