# Most efficient way to count occurences in a list?

Posted 2 months ago
415 Views
|
7 Replies
|
6 Total Likes
|
 What would be the most efficient way to count occurences in a list L of elements that return True for a given boolean function f? Something like Length[Select[L,f]] works but I'm not sure if it is efficient. The function Count does the trick for patterns but I can't see a way to use it for a function.
7 Replies
Sort By:
Posted 2 months ago
 Fold seems to be quite a bit faster than Select or Count In[1]:= n = 10^6; L = Table[RandomInteger[{0, 9}], {i, n}]; Timing[Length[Select[L, # > 5 &]]] Timing[Count[L, x_ /; x > 5]] Timing[Fold[#1 + Boole[#2 > 5] &, 0, L]] Out[3]= {0.374402, 399567} Out[4]= {0.343202, 399567} Out[5]= {0.0312002, 399567} 
Posted 2 months ago
 This can be done ~5 times faster by using UnitStep: L=RandomInteger[{0,9},1000000]; RepeatedTiming[Count[L,x_?(#>5&)]] bool[x_]:=x>5 RepeatedTiming[Count[bool/@L,True]] RepeatedTiming[Count[L,x_/;x>5]] RepeatedTiming[Count[Thread[L>5],True]] RepeatedTiming[Count[UnitStep[5-L],0]] RepeatedTiming[Fold[#1+Boole[#2>5]&,0,L]] RepeatedTiming[Total[Boole[#>5]&/@L]] RepeatedTiming[Total[1-UnitStep[5-L]]] RepeatedTiming[Length[L] - Total[UnitStep[5 - L]]] gives me: {0.474,400314} {0.4834,400314} {0.314,400314} {0.26,400314} {0.038,400314} {0.0301,400314} {0.022,400314} {0.0067,400314} {0.0045, 400314} 
Posted 2 months ago
 May be this helps L = Table[RandomInteger[{0, 9}], {i, 50}] Count[L, x_ /; x > 5] or ( with the L from above ) bool[x_] := x > 5 Count[bool /@ L, x_ /; x == True] 
Posted 2 months ago
 For short, concise and general, try CountsBy.If your data is numeric, and the condition is also numeric, try my BoolEval package. http://szhorvat.net/mathematica/BoolEval In[199]:= arr = RandomInteger[1000, 10000000]; In[201]:= CountsBy[arr, # > 100 &] // Timing Out[201]= {5.6509, <|True -> 8991315, False -> 1008685|>} In[202]:= << BoolEval In[203]:= BoolCount[arr > 100] // Timing Out[203]= {0.356869, 8991315} To count the ones whose square is less than 100, one would use BoolCount[arr^2 < 100]BoolEval provides a readable and maintainable syntax to the same type of computation that Sander showed.
Posted 2 months ago
 Thanks! That's a nice way to transform it to a pattern. I did a quick test with the Timing function and was a bit surprised to see that the one with Select was a bit quicker: In[1]:= bool[x_] := x > 5; n = 10^6; L = Table[RandomInteger[{0, 9}], {i, n}]; Timing[Length[Select[L, bool]]] Timing[Count[bool /@ L, x_ /; x == True]] Out[1]= {1.03125, 400526} Out[2]= {1.57813, 400526} But that is probably because the length function will have a shorter list to work with?
 This variant is even faster: Total[Boole[# > 5] & /@ L] `