DateObject isn't Listable, is there a faster way?

I come across this a lot. I optimize the importing of a file, but I'm stuck parsing timestamps. I have to map DateObject across these timestamps. Mapping is slower than WL functions optimized with a Listable attribute. Is there a better way to handle timestamps?

I thought I could speed it up by specifying $DateStringFormat in Block, but that didn't help. Any other ideas?

Block[{$DateStringFormat = {"Month", "/", "Day", "/", "Year", " ", "Hour12", ":", "Minute", ":", "Second", " ", "AMPM"}},
   DateObject /@ listOfTimestamps
];

This is taking 10s when the import takes 0.1s.

POSTED BY: Eric Smith
7 Replies

Thanks for the correction. I also found that StringReplace works well too, about the same performance:

convertString[dateTime_]:=With[{ns=NumberString},
    StringReplace[{m:ns~~"/"~~d:ns~~"/"~~y:ns~~" "~~h:ns~~":"~~min:ns~~":"~~s:ns~~" "~~ampm:WordCharacter..:>
    "{"<>y<>","<>m<>","<>d<>","<>h<>Switch[ampm,"AM","","PM","+12"]<>","<>min<>","<>s<>"}"}]
    [dateTime]//ToExpression//DateObject
]

I optimized this a bit: Switch is a tiny bit faster than If, and inserting the commas as strings with StringJoin instead of using StringRiffle gets the 20,000 timestamps converted in less than 0.6s.

POSTED BY: Eric Smith
POSTED BY: Hans Michel

Genius! So not having the string in DateList form is the slowdown. It's only taking 0.8 seconds to do 20,000. Thank you!
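
Since DateObject is quick once it receives numeric date lists, the string work can also be done on whole columns rather than one timestamp at a time. The following is only a sketch under assumptions: the two sample timestamps and all variable names are made up for illustration, the input is assumed to be strictly in "M/D/Y h:m:s AM|PM" form with integer seconds, and no timing has been measured for it.

```wolfram
(* Hypothetical sample data in the thread's "M/D/Y h:m:s AM|PM" layout *)
stamps = {"12/07/2018 09:40:06 AM", "01/15/2019 12:00:00 PM"};

(* StringSplit accepts a list of strings, so one call splits every row *)
parts = StringSplit[stamps, {"/", " ", ":"}];

(* Convert the six numeric fields of every row to integers *)
nums = Map[FromDigits, parts[[All, 1 ;; 6]], {2}];

(* 1 where the row is PM, 0 where it is AM *)
pm = Boole[# === "PM"] & /@ parts[[All, 7]];

(* 12-hour to 24-hour on the whole hour column; Mod and Plus are Listable *)
hours = Mod[nums[[All, 4]], 12] + 12*pm;

(* Reorder the columns into {year, month, day, hour, minute, second} *)
dateLists = Transpose[{nums[[All, 3]], nums[[All, 1]], nums[[All, 2]],
    hours, nums[[All, 5]], nums[[All, 6]]}];
(* dateLists is {{2018, 12, 7, 9, 40, 6}, {2019, 1, 15, 12, 0, 0}} *)

dates = DateObject /@ dateLists;
```

The final DateObject map is still per element, but by then each element is already a numeric list, which is the case reported as fast in this thread. Using FromDigits instead of ToExpression also avoids evaluating text read from a file.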

POSTED BY: Eric Smith

Eric: Please try the following:

Clear[osplit, opm, dlist];
dlist = Table[
   Block[{$DateStringFormat = {"Month", "/", "Day", "/", "Year", " ", 
       "Hour12", ":", "Minute", ":", "Second", " ", "AMPM"}}, 
    DatePlus[DateString[], RandomReal[{-1000, 0}]]], {10000}];
osplit[x_] := Module[{}, StringSplit[x, {"/", " ", ":"}] ];
opm[y_] := Module[{}, 
   ToExpression[
    If[Last[y] == "PM", 
     Permute[Most[ReplacePart[y, 4 -> ToExpression[y[[4]]] + 12]], 
      Cycles[{{3, 1, 2}}]], Permute[Most[y], Cycles[{{3, 1, 2}}]]]]];

This generates 10,000 samples that match your format, with both "AM" and "PM". The subsequent functions operate on that string format, with the aim of getting each entry into WL DateList-like form {yyyy, mm, dd, hh, mm, ss}.

DateObject /@ 
  Map[opm[#] &, Map[osplit[#] &, dlist]]  // AbsoluteTiming
(* {0.854198,{Fri 7 Dec 2018 09:40:06GMT-5., ...*)

The functions can be improved and combined to get even more efficiency. Will Block or With be faster than Module?

Please note that the AM/PM logic is simplified to just add 12 if PM. The real rule is a bit more complicated than that, but tweaking it should not add much more time.
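
For reference, the extra cases are the two 12 o'clock hours: 12 AM is hour 0, and 12 PM stays at hour 12 rather than becoming 24. A minimal sketch of that rule follows; to24Hour is a hypothetical helper name, not something used in the code above.

```wolfram
(* Hypothetical helper: map a 12-hour clock reading to a 24-hour value. *)
(* Mod sends 12 to 0, so 12 AM -> 0 and 12 PM -> 0 + 12 = 12. *)
to24Hour[hour_Integer, ampm_String] := Mod[hour, 12] + If[ampm === "PM", 12, 0]

to24Hour[12, "AM"]  (* 0 *)
to24Hour[9, "AM"]   (* 9 *)
to24Hour[12, "PM"]  (* 12 *)
to24Hour[9, "PM"]   (* 21 *)
```

In opm this would replace the `ToExpression[y[[4]]] + 12` step, and it would need to be applied in the AM branch as well so that 12 AM maps to 0.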

POSTED BY: Hans Michel

Thanks for replying @Neil Singer and @Rohit Namjoshi,

These are large files from our lab equipment. The one I'm looking at has 20,000 rows of data. Just for an apples-to-apples comparison, I converted the first 5000 lines. It takes 2.5s on a 2017 MacBook Pro (15" i7), so comparable to your timings.

Is there an easy way to compile a mapped DateObject function? Passing a list of strings to the new compiler isn't obvious to me. And it feels like this could be faster considering the import times.

POSTED BY: Eric Smith
Posted 6 years ago

Eric,

I get close to the same timing as Neil for his example. To simulate your format:

randomDateTime[years_, n_] := 
 With[{range = {UnixTime[] - years*365*24*60*60, UnixTime[]}}, 
  FromUnixTime /@ RandomInteger[range, n]]

dateStringFormat = {"Month", "/", "Day", "/", "Year", " ", "Hour12", ":", "Minute", ":", "Second", " ", "AMPM"};    

timeStamps = DateString[#, dateStringFormat] & /@ randomDateTime[5, 5000];

Block[{$DateStringFormat = dateStringFormat}, DateObject /@ timeStamps;] // AbsoluteTiming

(* {1.92389, Null} *)
POSTED BY: Rohit Namjoshi

Eric,

What is the format of your timestamp?

Also, how many are you importing? I tried this on 5000 values and it's fairly fast:

In[105]:= bar = Table[DateString[], 5000];

In[110]:= AbsoluteTiming[DateObject /@ bar;]

Out[110]= {1.34714, Null}

Regards,

Neil

POSTED BY: Neil Singer