Dear all, I have some sample data about the frequency of losses due to credit card fraud and I'd like to fit a discrete probability distribution to it for modeling and simulation purposes. The data is as follows:
data={0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 9, 12, 12, 13, 15, 15}
I first fit a Poisson distribution and then checked the goodness of fit:
estimatedpoisson = EstimatedDistribution[data, PoissonDistribution[\[Mu]]];
\[ScriptCapitalP] = DistributionFitTest[data, estimatedpoisson, "HypothesisTestData"];
\[ScriptCapitalP]["TestDataTable", All]
The p-value is very close to 0, indicating a poor fit. I have also tried fitting both a binomial and a negative binomial distribution, since the data exhibits overdispersion (the variance is about 4 times the mean), but both fits are still quite poor. Any other suggestions? Many thanks in advance, Ruben
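For anyone wanting to reproduce the overdispersion figure mentioned above, here is a minimal sketch. It assumes `data` is the list defined earlier in this post. The helper name `dispersionIndex` is hypothetical and was not part of the original code. For a Poisson sample the ratio should be close to 1.

```mathematica
(* Hypothetical helper: ratio of sample variance to sample mean. *)
(* A value well above 1 suggests overdispersion relative to a Poisson. *)
dispersionIndex[counts_List] := Variance[counts]/Mean[counts]

(* Toy check: {0, 2, 4} has mean 2 and sample variance 4, so the ratio is 2 *)
dispersionIndex[{0, 2, 4}]

(* Applied to the sample from this post (assumes data is already defined) *)
N[dispersionIndex[data]]
```

Note that `Variance` uses the unbiased (n - 1) estimator, so the ratio for a small sample differs slightly from one based on the population variance.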