I came across a curious phenomenon in my work recently. Using Mathematica, I am trying to fit a parametric distribution to truncated data in which each observation also carries a weight. Maximum likelihood handles this neatly. However, I tried two different ways of doing it in Mathematica. The first:

d = WeightedData[rawdat[[All, 2]], rawdat[[All, 1]]]

Timing[EstimatedDistribution[d, TruncatedDistribution[{0, 150000}, LogNormalDistribution[\[Mu], \[Sigma]]], {\[Mu], \[Sigma]}]]

LogLikelihood[%[[2]], d]

Let's say this gives me: mu = 9.86564 and sigma = 0.905846, and the log-likelihood has a value of -11.1108 at the solution. However, this took 14 sec.

Or I do this:

Timing[NMaximize[{
  (1/Total[rawdat[[All, 1]]]) *
   Total[rawdat[[All, 1]].Log[PDF[TruncatedDistribution[{0, 150000}, LogNormalDistribution[\[Mu], \[Sigma]]], #] & /@ rawdat[[All, 2]]]],
  \[Sigma] > 0}, {\[Mu], \[Sigma]}]]
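
Written out, the objective handed to NMaximize is the weight-normalized log-likelihood. This is only a sketch of what the expression computes, assuming column 1 of rawdat holds the weights w_i and column 2 the observations x_i, and writing T = 150000 for the upper truncation point (the symbols w_i, x_i, T, f and F are my notation, not Mathematica's):

```latex
\ell(\mu,\sigma)
  = \frac{1}{\sum_i w_i} \sum_i w_i \,
    \log \frac{f(x_i;\mu,\sigma)}{F(T;\mu,\sigma)},
  \qquad 0 < x_i \le T,
\]
\[
f(x;\mu,\sigma) = \frac{1}{x\,\sigma\sqrt{2\pi}}
    \exp\!\left(-\frac{(\log x-\mu)^2}{2\sigma^2}\right),
  \qquad
F(T;\mu,\sigma) = \Phi\!\left(\frac{\log T-\mu}{\sigma}\right)
```

The denominator is just F(T) because the lognormal CDF is zero at the lower truncation point 0. Since the sum is divided by the total weight, the value is a per-unit-weight log-likelihood.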

Now the answer is: mu = 9.86567 and sigma = 0.905853, though the log-likelihood has the same value at the solution as before. Most importantly, this took only 2 sec.

What gives? Why are the solutions *slightly* different, and why is EstimatedDistribution[] so much slower? Is it that the WeightedData[] function is harder for Mathematica to deal with? Does anyone here know?

Markus

PS: This was a sub-sample (n = 250) of the full data that I am using. With the full data, the difference in calculation time is 5+ minutes vs. 50 seconds.