Puzzled by FindClusters on a set of dates with a custom DistanceFunction

Posted 4 years ago
Here is a set of dates:

    events = {{2014, 12, 14, 15, 26, 20.}, {2014, 12, 14, 15, 38, 31.}, {2014, 12, 14, 15, 41, 14.}, {2014, 12, 19, 11, 55, 11.}, {2014, 12, 19, 11, 55, 47.}}

Now I would like to find clusters of dates by using a custom DistanceFunction based on DateDifference. Here is the code, with some Print statements in it to show which dates are passed to the DistanceFunction and the result of each application:

    FindClusters[events, DistanceFunction -> (With[{num = Abs@QuantityMagnitude[DateDifference[Floor@#1, Floor@#2]]}, {Print[Floor@#1, " ", Floor@#2, " ", num]}; N@num] &)]

Note that the Floors are there to make sure that the dates have, for example, integer years, months and days, as it seems that FindClusters numericalizes the data before sending it to the DistanceFunction (which is itself slightly annoying and perhaps a bug in my opinion...).

Executing this generates the following errors:

    DateDifference::twoarg: Argument {2014,13,19,19,45,44} is not a time unit or a list of time units, nor can it be interpreted as a date. >>
    DateDifference::date: Expression {Gregorian,{2013.,11.,17.,6.,36.,28.}} cannot be interpreted as a date specification. >>
    FindClusters::xnum: A non-numeric, negative, or complex dissimilarity value was computed; dissimilarities must be non-negative and real valued. >>

And the Print statements give the following. Note the very peculiar dates that are being used (month 13, hour 25, minute 60), none of which are in events:

    {2014,12,14,15,26,20} {2014,12,14,15,38,31} 0.00846065
    {2014,12,14,15,26,20} {2014,12,14,15,41,14} 0.0103472
    {2014,12,14,15,26,20} {2014,12,19,11,55,11} 4.85337
    {2014,12,14,15,26,20} {2014,12,19,11,55,47} 4.85378
    {2014,12,14,15,38,31} {2014,12,14,15,41,14} 0.00188657
    {2014,12,14,15,38,31} {2014,12,19,11,55,11} 4.84491
    {2014,12,14,15,38,31} {2014,12,19,11,55,47} 4.84532
    {2014,12,14,15,41,14} {2014,12,19,11,55,11} 4.84302
    {2014,12,14,15,41,14} {2014,12,19,11,55,47} 4.84344
    {2014,12,19,11,55,11} {2014,12,19,11,55,47} 0.000416667
    {2014,12,16,5,49,35} {2013,10,14,1,36,29} 428.176
    {2014,12,16,5,49,35} {2014,12,15,3,60,24} 1.07582
    {2014,12,16,5,49,35} {2013,12,19,25,40,39} 361.173
    {2014,12,16,5,49,35} {2013,12,16,12,39,30} 364.715
    {2013,10,14,1,36,29} {2014,12,15,3,60,24} 427.1
    {2013,10,14,1,36,29} {2013,12,19,25,40,39} 67.0029
    {2013,10,14,1,36,29} {2013,12,16,12,39,30} 63.4604
    {2014,12,15,3,60,24} {2013,12,19,25,40,39} 360.097
    {2014,12,15,3,60,24} {2013,12,16,12,39,30} 363.64
    {2013,12,19,25,40,39} {2013,12,16,12,39,30} 3.54247
    {2014,10,12,4,59,31} {2013,11,14,1,39,30} 332.139
    {2014,10,12,4,59,31} {2014,12,17,8,49,33} 66.1597
    {2014,10,12,4,59,31} {2014,10,15,14,48,47} 3.40921
    {2014,10,12,4,59,31} {2014,12,18,10,52,14} 67.2449
    {2013,11,14,1,39,30} {2014,12,17,8,49,33} 398.299
    {2013,11,14,1,39,30} {2014,10,15,14,48,47} 335.548
    {2013,11,14,1,39,30} {2014,12,18,10,52,14} 399.384
    {2014,12,17,8,49,33} {2014,10,15,14,48,47} 62.7505
    {2014,12,17,8,49,33} {2014,12,18,10,52,14} 1.0852
    {2014,10,15,14,48,47} {2014,12,18,10,52,14} 63.8357
    {2013,11,17,6,36,28} {2014,13,19,19,45,44} Abs[QuantityMagnitude[DateDifference[{2013,11,17,6,36,28},{2014,13,19,19,45,44}]]]

Does anyone have any insight into this behavior/failure?
2 Replies
Posted 4 years ago
If you are dealing with dates, then it's better to work with DateObjects:

    events = DateObject /@ {{2014, 12, 14, 15, 26, 20.}, {2014, 12, 14, 15, 38, 31.}, {2014, 12, 14, 15, 41, 14.}, {2014, 12, 19, 11, 55, 11.}, {2014, 12, 19, 11, 55, 47.}};

Then your function doesn't appear to need the Floor operators:

    FindClusters[events, DistanceFunction -> (With[{num = Abs@QuantityMagnitude[DateDifference[#1, #2]]}, {Print[#1, " ", #2, " ", num]}; N@num] &)]

This evaluates cleanly for me. The older notation of using lists for dates is ambiguous, since a date list could just as well be a plain vector of numbers. Does this do what you want?
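Another way to sidestep the ambiguity of date lists, not taken from either post above, is to reduce each date to a single number before clustering, so that no custom DistanceFunction is needed at all. The following is only a sketch under that assumption: it uses AbsoluteTime to turn each date list into seconds, and the `{e1, e2, ...} -> {v1, v2, ...}` form of FindClusters so that the clusters are reported as the original date lists. The variable name `times` is introduced here purely for illustration. Distances then come out in seconds rather than the days that DateDifference reports, which may change how the default method splits the data.

```wolfram
(* Date lists copied from the question *)
events = {{2014, 12, 14, 15, 26, 20.}, {2014, 12, 14, 15, 38, 31.},
   {2014, 12, 14, 15, 41, 14.}, {2014, 12, 19, 11, 55, 11.},
   {2014, 12, 19, 11, 55, 47.}};

(* Assumption: clustering on elapsed seconds is acceptable here. *)
(* AbsoluteTime maps each date list to one real number of seconds. *)
times = AbsoluteTime /@ events;

(* Cluster on the scalar times, but return the matching date lists. *)
FindClusters[times -> events]
```

Because every element is now an ordinary real number, any intermediate values FindClusters might construct internally are still valid inputs to the default distance, which is not the case for a list such as {2014, 13, 19, 19, 45, 44} handed to DateDifference.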