Group Abstract Group Abstract

Message Boards Message Boards

0
|
6.3K Views
|
7 Replies
|
5 Total Likes
View groups...
Share
Share this post:

Issues Importing Dates and Times from Excel

Posted 2 years ago

When I import this simple file from Excel, the dates and times are coming in incorrectly (dates are way off). In the Excel file (attached) the dates and times are in different columns (not uncommon) so I would like to import and combine in Mathematica as a single date object for plotting. Since Excel is so commonly used, I would like to see better tools for Importing data for use in Mathematica. I left a few blanks and missing data columns for some help in detecting bad data (also must be common).

Thanks for any help, I know this is probably a beginner question. Roger

POSTED BY: Roger Leventhal
7 Replies

Thank you again for your very clear and helpful explanation. You should write a book or a manual for people trying to learn how to use Mathematica.

POSTED BY: Roger Leventhal

Ok, one last beginner question (I wish there was a basic questions forum). I applied the changes above (thank you) and the data looks correct (I replaced the missing or bad data with "Missing[]" as online guidance suggests). But when I try and plot the data (using DateListPlot) with the date object on the x-axis and one of the data columns (say column 3 (sysBP) as y-axis values it doesn't work. I tried to transpose the data to create a {date, value} list and that works for all numerical data - but doesn't seem to work when one column is a date object. This seems so basic, import data with a date stamp and plot against another column of readings. I have read the online guidance but its pretty thin for date object import and plotting. I appreciate any advice on this basic ask.

POSTED BY: Roger Leventhal
Posted 2 years ago

Let's start from this point:

BPr1= NormalizeDateTime/@BProws

Let's get your plot first, and then later I'll explain some of the issues with your code.

DateListPlot[BPr1[[All, {1, 3}]]]

enter image description here

DateListPlot can work with a couple different forms, but we have date-value pairs in our data, so we just need extract them and pass them to DateListPlot. This is the bit that extracts the pairs:

BPr1[[All, {1, 3}]]

You'll want to become familiar with the various forms for Part (double brackets are just syntactic sugar for Part). BPr1 is 2-dimensional, so we can use two index specifications separated by a comma. So, BPr1[[1,3]] would be the third element in the first row, for example. But Part can also take a list for an index specification, and it will fetch according to each index in the list at that level. And then there is the special form of All, which is like a list that includes every index at that level. So, BPr1[[All, {1, 3}]] is saying "fetch from all rows the 1st and 3rd elements".

Okay, now let's critique your code.

  • TableForm

    bp = Import["/Users/ericr/WolframWorkspace/MmaCommunity/BP_readings_mathematica_test.xlsx", {"Data", 1}, HeaderLines -> 4] // TableForm
    

    Use TableForm for presentation, but don't pollute your data with it. What you have in bp at this point is a TableForm expression with your data inside of it. This will make all future processing of your data difficult, as you'll need to work around the TableForm head. Now, you can't see an explicit TableForm in your output, so what am I talking about? You should become familiar with FullForm. Everything in Mathematica is an expression of a very specific syntactic form. We have syntactic sugar on the input side so we don't have to deal with that standard from all of the time. We have presentation functionality in the notebook interface to make the output look nice. But under the covers, every expression is actually in a very specific form, and it's rather verbose. To force the UI to show you this form, use FullForm. So, try this

    FullForm[bp]
    

    You should see the TableForm at the head of this expression. The standard thing to do if you want to see your data in TableForm is to first assign the variable and then display the variable. So, something like this:

    bp = <...the import goes here...>;
    TableForm[bp]
    
  • Mutation doesn't occur automatically

    Let's look at this:

    BPr1/."  --"->Missing[]/.""->Missing[]
    

    Quick side note: you can use a list with ReplaceAll to apply multiple replacement rules:

    BPr1 /. {"  --" -> Missing[], "" -> Missing[]}
    

    Now, this will evaluate fine and produce an output, but you haven't saved that result to a variable, so it's difficult to use it later. I might be reading too much into your code, but it looks to me like you're assuming that BPr1 will now have the "cleaned" data in it, but it won't. BPr1 didn't change at all. You can mutate BPr1 directly or use a different variable:

    BPr1 = BPr1 /. {"  --" -> Missing[], "" -> Missing[]}
    (* Or *)
    BPr1Clean = BPr1 /. {"  --" -> Missing[], "" -> Missing[]}
    

    I tend to use the latter, as it's often useful to have a sort of record of what I did. I sometimes need to start over from a particular spot.

  • Pay attention to the forms used in the documentation

    Now let's look at this:

    DateListPlot[{datetime},{sysBP}]
    

    That doesn't really match any of the patterns in the documentation. datetime and sysBP are already lists, but you've wrapped them in another layer of List. DateListPlot doesn't know what to do with that.

    Similarly with:

    Transpose[{{datetime},{sysBP}}]
    

    What you probably wanted here was

    Transpose[{datetime, sysBP}]
    

    And this would have worked just fine:

    DateListPlot[Transpose[{datetime, sysBP}]]
    
POSTED BY: Eric Rimbey

Roger: A bit late but hope the following helps.

SetDirectory[$UserDocumentsDirectory];
data = Transpose[
   Delete[Transpose
     [ReplaceAll[
      Import["BP_readings_mathematica_test.xlsx", 
       {"Data", 2},
        HeaderLines -> 1, 
       "EmptyField" -> Missing[]],
      {"  --" -> Missing[], 
       " -" -> Missing[], {Missing[], Missing[], Missing[], Missing[],
          Missing[], Missing[], Missing[]} -> Nothing}]] ,
    3]
   ];
bprange[s_, d_] := Which[
   And[LessEqual[s, 120], LessEqual[d, 80]], 
   RGBColor[0.65, 0.81, 0.22],
   And[Between[s, {121, 129}], LessEqual[d, 80]], RGBColor[1, 0.93, 0],
   Or[Between[s, {130, 139}], Between[d, {81 , 89}]], 
   RGBColor[1, 0.71, 0],
   Or[Between[s, {140, 179}], Between[d, {90 , 119}]], 
   RGBColor[0.73, 0.23, 0.01],
   Or[GreaterEqual[s, 180], GreaterEqual[d, 120]], 
   RGBColor[0.6, 0.03, 0.07]
   ];
createBPseries[{date_, time_, systolic_, diastolic_, pulse_, 
    comment_}] := {
   Tooltip[
    Style[
     {
      DateObject[
       Join[DateValue[date, {"Year", "Month", "Day"}],
        DateValue[time, {"Hour", "Minute", "Second"}]
        ]
       ],
      Interval[{systolic, diastolic}]
      }, bprange[systolic, diastolic]],
    comment]};
{women1839, men1839, women4059, men4059, women60, 
   men60} = {{110, 68}, {119, 70}, {122, 74}, {124, 77}, {139, 
    68}, {133, 69}};
fixeddata = createBPseries /@ data;
DateListPlot[Flatten[fixeddata], 
 Joined -> False,
  PlotRange -> 
  MinMax[Flatten[Transpose[{data[[All, 3]], data[[All, 4]]}]]] + {-5, 
    5}, 
 AspectRatio -> 1/GoldenRatio,
 IntervalMarkers -> "Bars", IntervalMarkersStyle -> Thick,
 GridLines -> {None, men4059}
 ]

Changing {"Data", 2}, to {"Data", 1}, gets the first sheet instead of he second sheet of your spreadsheet.

enter image description here

POSTED BY: Hans Michel

Thanks for the suggestions, It worked . Is NormalizeDateTime a built in function? (I searched but it didnt come up) or a user defined function? (I was told to use lower case for user defined functions which is why I ask).

How did it know to join the first two columns with the date join command (is that default?). I was looking for a Part command and it didn't seem to be needed. And what does "rest" do...

And is it easy enough to change GMT to EST with a command? How would one do that?

and yes, Excel can be difficult or a mess to deal with . Unfortunately, it is commonly used for data collection so would be nice if Mathematica could anticipate the idiosyncrasies of Excel just because its so common. I will try saving as a cvs file and see if that works as well.,

thanks so much for your help, Roger

POSTED BY: Roger Leventhal
Posted 2 years ago

Is NormalizeDateTime a built in function?

No. I showed how I defined it.

I was told to use lower case for user defined functions

Yeah, this is common advice, but it's silly. The system will alert you if you've tried to override a built-in function, so there's no real danger, and I like maintaining the uniform capitalization.

How did it know to join the first two columns with the date join command?

Because I used pattern matching. It's too much to go into as part of this answer. It sounds like you should run through some of the tutorials for how to program in Mathematica.

And is it easy enough to change GMT to EST with a command?

Look into TimeZoneConvert.

POSTED BY: Eric Rimbey
Posted 2 years ago

It looks to me like those times (second column) are actually dates. Maybe in Excel choosing to format something as a time also causes the internal representation to become a date. I don't know what magic Excel is doing under the covers. I would either figure out how to force those dates into actual times, if that's possible, or just avoid importing the xlsx file directly. It might be safer to export from Excel into csv, and then import the csv.

But, you could also just fix the data once you have it in Mathematica.

rawBP = Import[pathToFile, {"Data", 1}, HeaderLines -> 1];
BProws = DeleteCases[rawBP, {"" ...}];

We now have non-empty rows. Let's define a function to combine the dates and "times":

NormalizeDateTime[{date_, time_, rest___}] := 
  {DateObject[Join[DateValue[date, {"Year", "Month", "Day"}], DateValue[time, {"Hour", "Minute", "Second"}]]], rest}

And apply it to the data:

NormalizeDateTime /@ BProws
POSTED BY: Eric Rimbey
Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard