Build a structured data set from an Excel file

Posted 4 years ago

Hello!

I'm trying to process an Excel data set (attached) about fishing, with 27 heterogeneous variables... It seemed natural to use SemanticImport for that.

Note that I first want to split the data according to the character variable "nom_sp" (the name of the species caught).

ClearAll["Global`*"];

Place = NotebookDirectory[];
SetDirectory[FileNameJoin[{Place, "Données"}]];
(*Brut=Flatten[Import["Peches2005.xlsx"],1];
EspècesP=DeleteDuplicates@Rest@Flatten@Take[Transpose@Brut,{-5}];
CodesVar=First@Brut;
Donnees=Dataset[Brut];
Trouve=GroupBy[Donnees,#[[Flatten@Position[CodesVar,VarInteret]]]&]*)

Donnees = SemanticImport["Peches2005.xlsx"]

VarInteret = "nom_sp";
Trouve = GroupBy[Donnees, VarInteret]

This seems to work fine, except that the dates are not imported correctly: I get "DateObject[{2002, 8, 14, 0, 0, 0.}, \"\"Instant\"\", \"\"Gregorian\"\", 1.]", for instance. Next, I try to extract the data associated with trout, coded "TRF".

Cherche = "TRF";
EspècesP = DeleteDuplicates@Donnees[All, VarInteret];
CodesTrouve = Flatten@Position[EspècesP, Cherche];
LePoisson = Trouve[CodesTrouve];
Print@%

The result looks correct (except for the dates and times), but I cannot extract a column or a row from this Dataset object:

LePoisson[All, "Total"]
LePoisson[4]

Neither of these works.

What should I do to obtain a proper dataset, with proper dates and times?

Claude

Attachments:
POSTED BY: Claude Mante
2 Replies

Short and effective!

Thanks, Rohit!

POSTED BY: Claude Mante
Posted 4 years ago

Hi Claude,

(* Import the first sheet as a Dataset, using the first row as column headers *)
dataset = Import["~/Downloads/Peches2005.xlsx", {"Dataset", 1}, HeaderLines -> 1]

(* Select rows that have "TRF" in the "EC1" column *)
dataset[Select[#EC1 == "TRF" &]]

(* Extract the "Total" column for "TRF" *)
dataset[Select[#EC1 == "TRF" &], "Total"]
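If you would rather keep the GroupBy approach from the question, note that a grouped Dataset is keyed by the grouping value, so a group can be taken by its key rather than by position. Here is a minimal sketch on made-up rows: the values and the second species code "CHA" are invented for illustration, and only the column names "nom_sp" and "Total" come from the question.

```wolfram
(* Toy data standing in for the imported sheet; values are hypothetical *)
toy = Dataset[{
    <|"nom_sp" -> "TRF", "Total" -> 12|>,
    <|"nom_sp" -> "CHA", "Total" -> 5|>,
    <|"nom_sp" -> "TRF", "Total" -> 7|>}];

(* Group the rows by species; the result is keyed by the species value *)
groups = toy[GroupBy["nom_sp"]];

(* Take one group by its key, then one column of that group *)
trf = groups["TRF"];
Normal[groups["TRF", All, "Total"]]
(* {12, 7} *)
```

Taking the group by key avoids the Position lookup, which returns an index into the list of distinct species rather than anything the grouped Dataset understands.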

The dates look correctly parsed to me: Import converts them to DateObject. You can format a DateObject as a string very flexibly with DateString; see its documentation for the available formats.

DateString[Now, "ISODateTime"]
(* 2020-01-22T11:11:53 *)
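As a small illustration of that flexibility, DateString also accepts a list of date elements and literal separators. The sketch below uses the date quoted in the question; the particular layouts are only examples, not something the data requires.

```wolfram
(* The date from the question, built as a DateObject *)
d = DateObject[{2002, 8, 14, 0, 0, 0.}];

(* Date only, assembled from named elements and literal separators *)
DateString[d, {"Year", "-", "Month", "-", "Day"}]
(* 2002-08-14 *)

(* Time of day only *)
DateString[d, {"Hour", ":", "Minute"}]
(* 00:00 *)
```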
POSTED BY: Rohit Namjoshi