DateObject isn't Listable, is there a faster way?

I come across this a lot. I optimize the importing of a file, but I'm stuck parsing timestamps. I have to map DateObject across these timestamps. Mapping is slower than WL functions optimized with a Listable attribute. Is there a better way to handle timestamps?

I thought I could speed it up by specifying $DateStringFormat in Block, but that didn't help. Any other ideas?

Block[{$DateStringFormat = {"Month", "/", "Day", "/", "Year", " ", "Hour12", ":", "Minute", ":", "Second", " ", "AMPM"}},
   DateObject /@ listOfTimestamps
];

This is taking 10s when the import takes 0.1s.

POSTED BY: Eric Smith
7 Replies

Eric: Please try the following:

Clear[osplit, opm, dlist];
dlist = Table[
   Block[{$DateStringFormat = {"Month", "/", "Day", "/", "Year", " ", 
       "Hour12", ":", "Minute", ":", "Second", " ", "AMPM"}}, 
    DatePlus[DateString[], RandomReal[{-1000, 0}]]], {10000}];
osplit[x_] := Module[{}, StringSplit[x, {"/", " ", ":"}] ];
opm[y_] := Module[{}, 
   ToExpression[
    If[Last[y] == "PM", 
     Permute[Most[ReplacePart[y, 4 -> ToExpression[y[[4]]] + 12]], 
      Cycles[{{3, 1, 2}}]], Permute[Most[y], Cycles[{{3, 1, 2}}]]]]];

This generates 10,000 samples that match your format, with both "AM" and "PM". The functions that follow operate on that string format and aim to get each timestamp into a WL DateList-like format {yyyy, mm, dd, hh, mm, ss}.

DateObject /@ 
  Map[opm[#] &, Map[osplit[#] &, dlist]]  // AbsoluteTiming
(* {0.854198,{Fri 7 Dec 2018 09:40:06GMT-5., ...*)

The functions can be improved and combined for further efficiency. Would Block or With be faster than Module?

Please note that the AM/PM logic is simplified to just add 12 if PM; the real rule is a bit more complicated than that. Tweaking it should not add much more time.
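For reference, the full 12-hour rule is small: 12 AM maps to hour 0, 12 PM stays at 12, and every other PM hour gains 12. A minimal sketch, assuming the hour has already been converted to an integer (hour24 is a hypothetical helper name, not something from the code above):

```wolfram
(* Hypothetical helper: convert a 12-hour clock value plus "AM"/"PM" to a 24-hour value. *)
(* Mod[h, 12] sends 12 to 0, so 12 AM becomes 0 and 12 PM becomes 0 + 12 = 12. *)
hour24[h_Integer, ampm_String] := Mod[h, 12] + If[ampm === "PM", 12, 0]

hour24[12, "AM"]  (* midnight *)
hour24[12, "PM"]  (* noon *)
hour24[9, "PM"]   (* an ordinary evening hour *)
```

Working with Mod avoids the nested If tests for the two special cases at 12.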

POSTED BY: Hans Michel

Eric,

What is the format of your timestamp?

Also, how many are you importing? I tried this on 5000 values and it's fairly fast:

In[105]:= bar = Table[DateString[], 5000];

In[110]:= AbsoluteTiming[DateObject /@ bar;]

Out[110]= {1.34714, Null}

Regards,

Neil

POSTED BY: Neil Singer

Thanks for the correction. I also found that StringReplace works well too, about the same performance:

convertString[dateTime_]:=With[{ns=NumberString},
    StringReplace[{m:ns~~"/"~~d:ns~~"/"~~y:ns~~" "~~h:ns~~":"~~min:ns~~":"~~s:ns~~" "~~ampm:WordCharacter..:>
    "{"<>y<>","<>m<>","<>d<>","<>h<>Switch[ampm,"AM","","PM","+12"]<>","<>min<>","<>s<>"}"}]
    [dateTime]//ToExpression//DateObject
]

I optimized this some: Switch is a tiny bit faster than If. And inserting the commas as strings with StringJoin, instead of using StringRiffle, gets the 20,000 timestamps converted in less than 0.6s.

POSTED BY: Eric Smith

Eric:

Thank you for trying out the code. I hope you saw the note that the first version did not properly handle the conversion to 24-hour format. Try the following, where the final function is given the Listable attribute. Also feel free to rename the functions for better readability.

Clear[osplit, opm, pmto24, ldo, dlist];
dlist = Table[
   Block[{$DateStringFormat = {"Month", "/", "Day", "/", "Year", " ", 
       "Hour12", ":", "Minute", ":", "Second", " ", "AMPM"}}, 
    DatePlus[DateString[], RandomReal[{-1000, 0}]]], {10000}];
osplit[x_] := Module[{}, StringSplit[x, {"/", " ", ":"}] ];
pmto24[z_] := 
  Module[{}, 
   If[Last[z] == "PM"  &&  First[z]  =!= "12", 
    ToExpression[First[z]] + 12 , 
    If[ First[z] == "12" && Last[z] == "AM", 0, 
     ToExpression[First[z]]]]];
opm[y_] := Module[{}, 
   ToExpression[
    Permute[Most[ReplacePart[y, 4 -> pmto24[{y[[4]], Last[y]}]]], 
     Cycles[{{3, 1, 2}}]]]];
ldo[s_String] := DateObject[opm[osplit[s]]] 
SetAttributes[ldo, Listable]
ldo@dlist // AbsoluteTiming

If the listable version is slower, then you can use the style of the previous post and just use Map. Putting the string into a DateList-like format helps, but it is not the whole source of the timing issues. I use Permute to move the year from 3rd to 1st position, but there are other functions that could be faster.
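One candidate replacement for Permute is plain Part indexing with an explicit position list. This is only a sketch of the reordering step on a made-up sample value; whether it is actually faster on your data would need to be measured with AbsoluteTiming:

```wolfram
(* Sample split timestamp in {month, day, year, hour, minute, second} order (made-up value). *)
parts = {"12", "07", "2018", "09", "40", "06"};

(* Reordering as done in opm: the cycle moves position 3 -> 1, 1 -> 2, 2 -> 3. *)
viaPermute = Permute[parts, Cycles[{{3, 1, 2}}]];

(* Same reordering with Part: pick year, month, day, then the remaining fields. *)
viaPart = parts[[{3, 1, 2, 4, 5, 6}]];

viaPermute === viaPart
```

Both forms put the year first, so either can feed ToExpression and DateObject as in opm.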

POSTED BY: Hans Michel

Genius! So not having the string in DateList form is the slowdown. It's only taking 0.8 seconds to do 20,000. Thank you!

POSTED BY: Eric Smith

Thanks for replying @Neil Singer and @Rohit Namjoshi,

These are large files from our lab equipment. The one I'm looking at has 20,000 rows of data. Just for an apples-to-apples comparison, I converted the first 5000 lines. It takes 2.5s on a 2017 MacBook Pro (15" i7), so comparable to your timings.

Is there an easy way to compile a mapped DateObject function? Passing a list of strings to the new compiler isn't obvious to me. And it feels like this could be faster considering the import times.

POSTED BY: Eric Smith
Posted 6 years ago

Eric,

I get close to the same timing as Neil for his example. To simulate your format:

randomDateTime[years_, n_] := 
 With[{range = {UnixTime[] - years*365*24*60*60, UnixTime[]}}, 
  FromUnixTime /@ RandomInteger[range, n]]

dateStringFormat = {"Month", "/", "Day", "/", "Year", " ", "Hour12", ":", "Minute", ":", "Second", " ", "AMPM"};    

timeStamps = DateString[#, dateStringFormat] & /@ randomDateTime[5, 5000];

Block[{$DateStringFormat = dateStringFormat}, DateObject /@ timeStamps;] // AbsoluteTiming

(* {1.92389, Null} *)

POSTED BY: Rohit Namjoshi