Performance measures for FindClusters?

POSTED BY: Tsai Ming-Chou

At the moment, the clustering metrics are all internal and are used to optimize hyperparameters. We plan to expose them, and if there is interest, all the better.

For the time being, and keeping in mind that this code might change in the future, you can use the internal function directly:

data = RandomReal[1, {1000, 2}];
clusters = FindClusters[data];

ClusterValidation = MachineLearning`PackageScope`ClusterValidation;

Some criteria that measure the "goodness" of a clustering are reversed, so that for every measure, the lower the better:

Table[
 Last@ClusterValidation[type, "" -> {"", clusters}],
 {type,
  {"StandardDeviation", "RSquared", "Dunn", "CalinskiHarabasz", "Silhouette"}}
 ]

(* {0.337803, 373.229, -0.00908816, -684.364, -0.380691} *)
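Because the function above is internal and may change, it can be useful to have an independent value to compare against. Below is a minimal sketch of the textbook mean silhouette, computed directly from the output of FindClusters. The name meanSilhouette is a hypothetical helper, not a built-in, and the sketch assumes Euclidean distance and at least two clusters. Whether the internal "Silhouette" metric uses the same distance and conventions is an assumption that has not been checked here.

```wolfram
(* meanSilhouette is a hypothetical helper, not a built-in function. *)
(* Assumptions: Euclidean distance, at least two non-empty clusters. *)
meanSilhouette[clusters_List] /; Length[clusters] >= 2 :=
 Module[{pts, labels, dist, k},
  pts = Join @@ clusters;
  (* label every point with the index of the cluster it came from *)
  labels = Join @@ MapIndexed[ConstantArray[First[#2], Length[#1]] &, clusters];
  dist = DistanceMatrix[pts];
  k = Length[clusters];
  Mean@Table[
    Module[{own = labels[[i]], size, a, b},
     size = Count[labels, own];
     If[size == 1,
      0., (* common convention: a singleton cluster scores 0 *)
      (* a: mean distance to the other points of the same cluster *)
      a = Total[Pick[dist[[i]], labels, own]]/(size - 1);
      (* b: smallest mean distance to the points of any other cluster *)
      b = Min@Table[Mean[Pick[dist[[i]], labels, c]],
         {c, DeleteCases[Range[k], own]}];
      (b - a)/Max[a, b]]],
    {i, Length[pts]}]]
```

With the variables defined earlier, -meanSilhouette[clusters] is the quantity to set next to the "Silhouette" entry of the table, since the reported measures are reversed so that lower is better. The sketch builds the full distance matrix, so it is only practical for data sets of moderate size.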