DateObject operations seem slow. Any thoughts? Also some remarks on memory.

Is there any reason for this time discrepancy?

RepeatedTiming[DateObject[{2012, 1, 2}] - DateObject[{2012, 1, 1}]]
(*{0.00075, Quantity[1, "Days"]} *)

RepeatedTiming[UnitConvert[Quantity[AbsoluteTime@DateObject[{2012, 1, 2}] - AbsoluteTime@DateObject[{2012, 1, 1}], "Seconds"], "Days"]] 
(*{0.00022, Quantity[1, "Days"]}*)

Even going through conversion after conversion, we still get about 3 times the speed of doing it directly.
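
For instance, the fast path can be wrapped in a small helper (a minimal sketch; dateDifferenceDays is a made-up name, and it assumes whole-day differences in the default calendar and time zone are sufficient):

(* Hypothetical helper: difference between two DateObjects in days,
   going through AbsoluteTime instead of DateObject subtraction *)
dateDifferenceDays[a_DateObject, b_DateObject] :=
 Quantity[(AbsoluteTime[a] - AbsoluteTime[b])/86400, "Days"]

dateDifferenceDays[DateObject[{2012, 1, 2}], DateObject[{2012, 1, 1}]]
(* Quantity[1, "Days"] *)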

Unsurprisingly, less convoluted expressions outperform the "standard" approach by an even wider margin:

RepeatedTiming[b = DateObject[{2000, 1, 1}];] 
(*{6.26*10^-6, Null}*)

RepeatedTiming[b - Quantity[1, "Days"];] 
(*{0.00315, Null}*)

RepeatedTiming[DateObject[AbsoluteTime[b] - 24*3600];] 
(*{0.0000277, Null}*)

RepeatedTiming[DateObject[AbsoluteTime[b] - QuantityMagnitude[UnitConvert[Quantity[1, "Days"], "Seconds"]]];] 
(*{0.00022, Null}*)

It looks very difficult to find anything slower than going through DateObject arithmetic. And yet, the workarounds still deliver a DateObject at the end.
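
The workaround is easy to package (again a sketch; dateShift is a made-up name, and it assumes the offset can be converted to seconds):

(* Hypothetical helper: shift a DateObject by a Quantity via AbsoluteTime *)
dateShift[d_DateObject, q_Quantity] :=
 DateObject[AbsoluteTime[d] + QuantityMagnitude[UnitConvert[q, "Seconds"]]]

dateShift[DateObject[{2000, 1, 1}], -Quantity[1, "Days"]]
(* the instant 1999-12-31 00:00:00 as a DateObject *)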

Another interesting example:

RepeatedTiming[data1 = Table[{DateObject[{2000, 1, 1, 0, 0, 0}] + Quantity[i, "Days"], i^2}, {i, 1, 1000}];]
(*{2.842, Null}*)

RepeatedTiming[
 data2 = Table[{3155673600 + i*24*3600, i^2}, {i, 1, 1000}];
 data3 = {DateObject[#[[1]]], #[[2]]} & /@ data2;]
(*{0.020, Null}*)

data1 == data3
(*True*)

That is, the long way around is two orders of magnitude faster.
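
The long way is easy to generalize (a sketch; fastDateTable is a made-up name, assuming an origin date and a fixed step in seconds):

(* Hypothetical helper: build a date/value table through AbsoluteTime *)
fastDateTable[origin_DateObject, stepSeconds_, f_, n_] :=
 With[{t0 = AbsoluteTime[origin]},
  Table[{DateObject[t0 + i stepSeconds], f[i]}, {i, n}]]

fastDateTable[DateObject[{2000, 1, 1, 0, 0, 0}], 24*3600, #^2 &, 1000] == data3
(* True, matching data1 and data3 above *)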

(As a side note, TimeSeriesMapThread, which maps a two-argument function of time and value, can be very practical for operating on the value depending on its date; see the sketch below.)
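
For instance (a minimal sketch; the doubling rule is made up for illustration, and DateObject[#1] is used so the code works whether the time arrives as a number or as a DateObject):

t0 = AbsoluteTime[{2012, 1, 1}];
ts = TimeSeries[Table[{t0 + i*24*3600, N[i]}, {i, 0, 5}]];
(* double each value whose date falls on an even day of the month *)
TimeSeriesMapThread[
 If[EvenQ[DateValue[DateObject[#1], "Day"]], 2 #2, #2] &, ts]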

On the memory front:

I know that DateObject is not an atom, and hence there's little magic available underneath. But would it be possible to store the information inside DateObject in a less memory-hungry data structure?

When working with dates, it's typical to handle millions or even billions of records, so having speed- and memory-optimized DateObject-related functions is fundamental.

In[84]:= ByteCount[DateObject[{2000, 1, 1}]] 
Out[84]= 160 

In[85]:= ByteCount[AbsoluteTime[DateObject[{2000, 1, 1}]]] 
Out[85]= 16

That is a tenfold difference.

The same pattern shows up in other cases:

In[98]:= ByteCount[DateObject[{2012, 1, 1, 12, 5, 5.1}]]
Out[98]= 488

In[97]:= ByteCount[AbsoluteTime@DateObject[{2012, 1, 1, 12, 5, 5.1}]]
Out[97]= 88

Not 10x this time, but still more than 5x.

Obviously, the more specifications we add, the worse it gets:

In[103]:= ByteCount[DateObject[{2012, 1, 1}]]
Out[103]= 160

In[102]:= ByteCount[DateObject[{2012, 1, 1}, TimeZone -> 8]]
Out[102]= 240

In[101]:= ByteCount[DateObject[{2012, 1, 1}, TimeZone -> "America/New_York"]]
Out[101]= 280
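
Combining the speed and memory observations, one bulk workaround is to keep dates as a packed array of absolute times and to materialize DateObjects only at the boundaries (a minimal sketch, assuming plain absolute times are an acceptable internal representation):

(* one million dates stored as packed machine reals *)
times = Developer`ToPackedArray@N@Table[3155673600 + i*24*3600, {i, 10^6}];
ByteCount[times]
(* roughly 8 bytes per date, instead of hundreds *)

DateObject[times[[123]]]
(* convert only when a DateObject is actually needed *)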

So, my closing questions are:

-> What explains the speed differences observed above?

-> Is any memory optimisation possible, besides working with absolute times instead of dates?

POSTED BY: Pedro Fonseca