Functional programming does not, in general, result in more efficient code. In fact, the opposite is true:
https://en.wikipedia.org/wiki/Functional_programming#Efficiency_issues
What you may be referring to is that functional style is a better fit for the Wolfram Language specifically. When there is a direct functional equivalent for a procedural construct in Mathematica, the functional version will typically be faster. One reason for this is that the functional way to solve a task is often higher level and conceptually closer to the task itself. The procedural way often requires spelling out specific details. There are often multiple ways to do this and not all of them are equally efficient. If the problem is expressed at a higher level, that gives Mathematica more leeway to choose an efficient way to do the computation.
Example:
Sum the elements of an array.
The procedural way is something like this, in pseudocode:
sum = 0
i = 0
while i < length(array):
    sum = sum + array[i]
    i = i + 1
This is a very specific way to solve the summing problem. Spelling out all the steps constrains the order of operations. This can't easily be parallelized automatically.
However, if there is a single command to sum a list, that can have a more efficient multi-core implementation.
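To make the contrast concrete, here is a minimal sketch in Python (chosen only because the pseudocode above is already Python-like; it is not Wolfram Language code). The names `loop_sum`, `builtin_sum` and `data` are made up for illustration. Note that Python's built-in `sum` is not a multi-core implementation; the point is only that a single summing command leaves the "how" to the runtime, whereas the loop fixes every step.

```python
def loop_sum(array):
    """Low-level version: every step of the summation is spelled out."""
    total = 0
    i = 0
    while i < len(array):
        total = total + array[i]
        i = i + 1
    return total


def builtin_sum(array):
    """High-level version: one call; the runtime chooses how to do the work."""
    return sum(array)


data = list(range(1, 101))  # hypothetical sample input: 1, 2, ..., 100

# Both versions compute the same value; only the level of description differs.
assert loop_sum(data) == builtin_sum(data)
```

In Mathematica the corresponding single command would be Total.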
More generally, if we say that we want to reduce the array (a Fold operation) with the + operator, knowing that + is associative and that evaluating + has no side effects, the problem can be straightforwardly and automatically parallelized. Associativity tells us that the grouping of the operations does not matter, so we are not constrained to strictly sequential, left-to-right evaluation: partial results can be computed independently and combined afterwards.
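The idea can be sketched as follows, again in Python rather than the Wolfram Language. This is only an illustration under stated assumptions, not how Mathematica implements anything: `chunked_reduce` is a hypothetical helper, `identity` is assumed to be a true identity element for `op`, and `op` is assumed to be associative and free of side effects. Threads are used just to show that the chunks are independent; in CPython they will not give a real speedup for this kind of work.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def chunked_reduce(op, identity, array, n_chunks=4):
    """Reduce `array` with an associative `op` by reducing chunks independently.

    Assumes `op` is associative, has no side effects, and that `identity`
    is an identity element for `op` (e.g. 0 for +).
    """
    size = max(1, -(-len(array) // n_chunks))  # ceiling division, at least 1
    chunks = [array[i:i + size] for i in range(0, len(array), size)]

    # Each chunk is reduced on its own; no chunk depends on another.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda c: reduce(op, c, identity), chunks))

    # Combine the partial results, keeping them in their original order.
    return reduce(op, partials, identity)


total = chunked_reduce(operator.add, 0, list(range(1, 101)))
assert total == sum(range(1, 101))
```

Because the partial results are combined in their original order, this relies only on associativity, not on commutativity, so it would also work for an operation such as string concatenation.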
Here's a related discussion:
https://mathematica.stackexchange.com/questions/134609/why-should-i-avoid-the-for-loop-in-mathematica
This is not about functional vs procedural, but about why it is better to avoid For in Mathematica. However, many of the things mentioned there illustrate the disadvantages of procedural approaches, or rather low-level approaches.
You might notice that I am not so much talking about functional vs procedural here, but about higher- or lower-level ways to express a concept. In inherently slow interpreted languages, such as the Wolfram Language, a higher-level expression typically performs better than a lower-level one, simply because a low-level expression of an algorithm involves more (slow) interpreted steps, while the high-level expression can make use of highly optimized atomic operations. (In a fast low-level language, the opposite is often true.)
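The same effect can be observed in other interpreted languages. The following is a rough measurement sketch in Python, not a benchmark of Mathematica; `interpreted_sum` and `sample` are hypothetical names, and the actual timings depend on the machine and interpreter, so none are quoted here.

```python
import timeit


def interpreted_sum(array):
    """Low-level expression: one interpreted addition per element."""
    total = 0
    for x in array:
        total += x
    return total


sample = list(range(100_000))  # hypothetical test input

# Time 20 repetitions of each approach on the same input.
loop_time = timeit.timeit(lambda: interpreted_sum(sample), number=20)
builtin_time = timeit.timeit(lambda: sum(sample), number=20)

print(f"explicit loop: {loop_time:.4f} s")
print(f"built-in sum:  {builtin_time:.4f} s")
```

The built-in performs the whole summation inside a single optimized operation, while the loop pays the interpreter's overhead on every iteration.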
I'd say that one reason why functional code is better in such languages is that functional code naturally produces a higher-level expression. Additionally, as Matthew said, it is often closer to the way one thinks.