Message Boards Message Boards

Quickly clean up missing values from hundreds of XLS files?

I have hundreds of XLS files that need to be aggregated, but I find that each file has more or less missing values or error messages. Is there any way to quickly clean up these missing values or error messages? Because I still don't know how to clean up a lot of data, so the following code only extract the data in different files.

(*   Get a list of files   *)
dir = "D:\\Data";
fileList = FileNames["*", dir];
For[i = 1, i = Length[fileList], i++,
 If[i == 1,
      rawdata = Import[fileList[[i]], {"XLS", "Data"}][[1]];    (*i=1 olny *)
     ,
  {
   tempData = Rest@Import[fileList[[i]], {"XLS", "Data"}][[1]];        (*i != 1*)
   rawdata = AppendTo[rawdata, tempData];
   }
  ]
 ]
POSTED BY: Tsai Ming-Chou
4 Replies

Tsai,

You should use Map to Import all your files instead of For

dir = "D:\\Data";
fileList = FileNames["*", dir];
data =  Map[Import[#, {"XLS", "Data"}]&,fileList];

You can then flatten the list one level if you want it in one list instead of a list of lists (I do not know how you will be using it).

What do you mean by "Clean up" data? For example if a value is "15#" should that be deleted or become "15" or something else? What should happen to blank data? Do you want to make them all the Mathematica symbol Missing[]?

Regards,

Neil

POSTED BY: Neil Singer
Attachments:

In the attachment, the field names are 01 to 24, representing hourly measurement data. I want to mark the contents of these fields as the value of "blank data" uniformly marked as "NA" and delete characters (for example, "15#" is marked as 15). Since the data is aggregated and will calculate on a time-by-time basis, data from different files should flatten into a significant data set for subsequent matrix operations.

data = Flatten[Map[Import[#, {"XLS", "Data"}] &, fileList], 2];
POSTED BY: Tsai Ming-Chou

Thank you very much for your demonstration. I will follow this idea to practice.

POSTED BY: Tsai Ming-Chou
Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard

Group Abstract Group Abstract