Using multiple data fields as input to FindDistribution[ ]

Posted 1 year ago

Hi, in my attached notebook I accessed four fields (SepalLength, SepalWidth, PetalLength, PetalWidth) from the Fisher's Irises data repository, intending to use the FindDistribution[] function to obtain a multinormal distribution. However, FindDistribution[] appears to have difficulty with the multiple fields.

Since I am new to probability in Mathematica, I am most likely doing something incorrectly. I would certainly appreciate any help.

Thanks,
Mitch Sandlin

POSTED BY: Mitchell Sandlin
2 Replies
Posted 1 year ago

If you just want to estimate the parameters of a multivariate normal with the Iris data, then the following will work:

(* Symbolic mean vector and symmetric covariance matrix to be estimated *)
m = {m1, m2, m3, m4};
cov = {{v11, v12, v13, v14},
       {v12, v22, v23, v24},
       {v13, v23, v33, v34},
       {v14, v24, v34, v44}};
(* data is the list of associations from the question; Values[data] gives the numeric rows *)
FindDistributionParameters[Values[data], MultinormalDistribution[m, cov]]
(* {m1 -> 5.84333, m2 -> 3.05733, m3 -> 3.758, m4 -> 1.19933,
    v12 -> -0.0421511, v13 -> 1.26582, v14 -> 0.512829, v23 -> -0.327459,
    v24 -> -0.120828, v34 -> 1.28697, v11 -> 0.681122, v22 -> 0.188713,
    v33 -> 3.0955, v44 -> 0.577133} *)
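
If the fitted distribution object is more convenient than a list of rules, a variation along these lines may help. This is only a sketch: the original notebook is not shown, so it assumes the same data can be loaded from ExampleData, and the name numeric is introduced here purely for illustration. The last two lines compare the fit with the sample mean and with the sample covariance rescaled by (n - 1)/n, since the default maximum likelihood estimate of the covariance divides by n rather than n - 1.

```wolfram
(* Assumption: load the iris measurements from ExampleData because the
   notebook's own data is not available; columns 1-4 are the numeric fields *)
numeric = ExampleData[{"Statistics", "FisherIris"}][[All, 1 ;; 4]];

(* Same symbolic parameters as in the reply above *)
m = {m1, m2, m3, m4};
cov = {{v11, v12, v13, v14},
       {v12, v22, v23, v24},
       {v13, v23, v33, v34},
       {v14, v24, v34, v44}};

(* EstimatedDistribution returns the fitted distribution itself *)
dist = EstimatedDistribution[numeric, MultinormalDistribution[m, cov]];

(* Both differences should be close to zero, up to numerical error *)
n = Length[numeric];
Mean[dist] - Mean[numeric]
Covariance[dist] - (n - 1)/n Covariance[numeric]
```

The resulting dist can then be passed to functions such as PDF or RandomVariate.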
POSTED BY: Jim Baldwin

From the "Details and Options" section of the documentation for FindDistribution:

The data must be a list of possible outcomes from a univariate distribution.

So there are a couple of problems:

  1. Your code is passing a list of associations for data rather than a list of numbers.
  2. FindDistribution can only handle univariate data.
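
A minimal sketch of working around both points, one field at a time, might look like the following. The original notebook is not shown, so the list of associations is rebuilt here from ExampleData as a stand-in, and the names keys, rows, columns, and fits are hypothetical. If the fields held Quantity values rather than plain numbers, QuantityMagnitude would probably need to be applied first.

```wolfram
(* Assumption: rebuild a list of associations resembling the question's data *)
keys = {"SepalLength", "SepalWidth", "PetalLength", "PetalWidth"};
rows = ExampleData[{"Statistics", "FisherIris"}][[All, 1 ;; 4]];
data = AssociationThread[keys, #] & /@ rows;

(* Problem 1: turn the associations into plain lists of numbers, one per field *)
columns = AssociationThread[keys, Transpose[Values[data]]];

(* Problem 2: FindDistribution is univariate, so apply it to each field separately *)
fits = FindDistribution /@ columns
```

This yields one univariate fit per field, so it says nothing about the correlations between fields; for a joint fit, a multivariate family has to be named explicitly, as in the other reply.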
POSTED BY: Rohit Namjoshi