Message Boards Message Boards

0
|
1399 Views
|
7 Replies
|
4 Total Likes
View groups...
Share
Share this post:

How do I merge rows with same row name in a dataset?

Posted 1 year ago

I have a file with a similar structure to figure 1 below, how do I combine the data with the same row names (cylinder type) to look like figure 2 such that every data with the same row name are summed together? enter image description here

POSTED BY: Tobi Babatunde
7 Replies
Posted 1 year ago

For simplicity, let's say your dataset looks like this:

origData =
 Dataset[
  {<|"Thing" -> "A", "Type" -> "4 cyl", "2019" -> 1, "2020" -> 4|>,
   <|"Type" -> "6 cyl", "2019" -> 1, "2020" -> 2|>,
   <|"Thing" -> "B", "Type" -> "4 cyl", "2019" -> 3, "2020" -> 4|>,
   <|"Type" -> "6 cyl", "2019" -> 0, "2020" -> 2|>,
   <|"Thing" -> "C", "Type" -> "4 cyl", "2019" -> 2, "2020" -> 3|>,
   <|"Type" -> "6 cyl", "2019" -> 6, "2020" -> 4|>}]

Then maybe you want something like this:

origData[GroupBy["Type"], Total][All, {"2019", "2020"}]

enter image description here

POSTED BY: Eric Rimbey
Posted 1 year ago

Remembered that the operations can be combined:

origData[GroupBy["Type"], Total, {"2019", "2020"}]
POSTED BY: Eric Rimbey
Posted 1 year ago

Okay, well I'd suggest you add column headers to the first two columns, say "Category" and "Fuel". Then export it as csv. We don't want the title, so if you can select specifically what to export, just don't select the first two rows. If you can't do that before the export, then just delete the first two lines of the resulting file. What we want is a file that looks like this (just showing a few lines):

Category,Fuel,2018,2019,2020,2021 Combines vehicles,Petrol,1964,1735,1523,1406 ,Diesel,14904,13044,11506,10316 ,Paraffin,0,0,0,0

We don't really need the first column either, but it'll work either way.

Now we need to import as a Dataset (replace pathToFile with the actual path):

theData = Import[pathToFile, "Dataset", HeaderLines -> 1]

You should now have a Dataset. Now do the totalling:

theData[GroupBy["Fuel"], Total, {"2019", "2020"}]

enter image description here

POSTED BY: Eric Rimbey

Thanks for the reply Eric, however, it's difficult to carry out this operation on the dataset I have because it's not an association list. I tried to convert the lists of list to an association but the repeated row names it isn't making it any easier.

POSTED BY: Tobi Babatunde
Posted 1 year ago

Show us the format you have. Your picture looks like a spreadsheet. That should be easy to export as csv. That in turn would be easy to import as Dataset.

POSTED BY: Eric Rimbey

Hi Eric, I've attached the excel file. Thanks.

Attachments:
POSTED BY: Tobi Babatunde

Thanks a lot for the insight, this was very helpful.

POSTED BY: Tobi Babatunde
Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard

Group Abstract Group Abstract