I'm happy I could hook you up with some food for thought (or research). I'm afraid I can't help you with the questions you raise, since computer science is definitely not my field of expertise. The Wolfram language is probably a hard nut to crack in terms of performance prediction. So much is automated and an expression may or may not be passed on to more or less optimized implementations in the background, whether that's a different algorithm to solve a mathematical problem, or compilation, or parallelization, or maybe just the use of sparse / packed array structures or floats instead of integers... There's just so much that could happen, and while it's all traceable, it's not something the language communicates to a user who doesn't ask for it. Much of the performance tuning advice you'll get is based largely on experience.
By the way, here's an impressively simple example to highlight the role of duck typing. Just use more iterations if your computer is too fast to notice the difference.
AbsoluteTiming[Do[Exp[2], 10^6];]
{0.411852, Null}
AbsoluteTiming[Do[Exp[2.0], 10^6];]
{0.198113, Null}
So far, so good, and you're probably not surprised, but now look at these guys (let's ignore that they also gather the results in lists):
AbsoluteTiming[Array[Exp[2.0] &, 10^6];]
{0.0573083, Null}
AbsoluteTiming[Table[Exp[2.0], 10^6];]
{0.023209, Null}
AbsoluteTiming[ConstantArray[Exp[2.0], 10^6];]
{0.00207215, Null}
Relative speed is a bit different if you use RandomInteger[]
instead of 2
and RandomReal[]
instead of 2.0
(and obviously ConstantArray
will not do the same thing in the random case), but the trend still holds, which means that something similar will happen for different sorts of real-life datasets as well.
... and this, essentially, is the kind of example that gives loops in Wolfram a bad name. Things get much worse once Append
enters the stage, and generally speaking, people who use Do
indiscriminantly often also like Append
a lot. You can imagine the near-magical speed-up going from Do[Append[...]...]
to Map
or Table
, and the resulting shouts of "I'll never Do
again!".
Here's a scary Append
example. Let's say we'd like to find all primes up to 10^6. I'd consider the first example the intuitive Wolfram-esque approach, I'm glad to see it is in fact the fastest one:
AbsoluteTiming[Select[Range[10^6],PrimeQ];]
{0.40464, Null}
AbsoluteTiming[Reap[Do[If[PrimeQ[ii],Sow[ii]],{ii,1,10^6}]][[2,1]];]
{0.651222, Null}
AbsoluteTiming[list={};Do[If[PrimeQ[ii],AppendTo[list,ii]],{ii,1,10^6}];list;]
{15.224, Null}
... even these not-so-straight-forward solutions are faster than Do-Append
:
AbsoluteTiming[Select[Prime[Range[10^6]],#<10^6&];]
{3.01896, Null}
AbsoluteTiming[Pick[Range[10^6],PrimeQ[Range[10^6]]];]
{0.517066, Null}
AbsoluteTiming[ii=Prime[1];Reap[While[ii<10^6,Sow[ii];ii=NextPrime[ii];]][[2,1]];]
{1.11314, Null}