
Looking for dendrogram / hierarchical clustering functionality


I am looking for a package that can do hierarchical clustering and plot publication-quality dendrograms.

The following should be supported at a minimum:

  • Provide a (symbolic) representation of dendrograms.
  • Operations on dendrograms: cut at a certain level (to obtain clusters or sub-dendrograms), split into a given number of clusters (or sub-dendrograms), truncate at a certain level, convert to/from the "merges matrix" representation that MATLAB and numerous other packages use.
  • Cluster the given data using a specified method and return a (symbolic) dendrogram.
  • Take a precomputed dissimilarity matrix and return a (symbolic) dendrogram
  • Visualize a dendrogram with flexible styling options. In particular, specify labels, format labels (rotate them!), highlight groups with custom styles, specify aspect ratio, specify orientation.
  • Visualization must use predictable coordinates so that the result can be easily combined with other graphics.
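
To make the "merges matrix" and "cut at a certain level" items concrete, here is a minimal sketch in Python (not Wolfram Language, and not taken from any existing package). It assumes the MATLAB `linkage` convention: each row is (a, b, height), leaves are numbered 1..n, and the cluster created by row i gets the index n + i. The function names `merges_to_tree`, `leaves` and `cut` are hypothetical, and the nested-tuple tree is just one possible symbolic representation.

```python
# Sketch only: nested tuples stand in for a "symbolic" dendrogram.
# A leaf is an int; an inner node is (left, right, height).

def merges_to_tree(merges):
    """Convert a merges matrix (MATLAB-style, 1-based) to a nested tree."""
    n = len(merges) + 1                      # number of leaves
    nodes = {i: i for i in range(1, n + 1)}  # leaves represent themselves
    for i, (a, b, height) in enumerate(merges, start=1):
        nodes[n + i] = (nodes[a], nodes[b], height)  # row i creates cluster n + i
    return nodes[n + len(merges)]            # the last merge is the root


def leaves(tree):
    """List the leaf labels of a (sub-)dendrogram, left to right."""
    if isinstance(tree, int):
        return [tree]
    left, right, _ = tree
    return leaves(left) + leaves(right)


def cut(tree, height):
    """Cut at a given height and return the resulting clusters.

    Assumes merge heights never decrease from a child to its parent.
    """
    if isinstance(tree, int) or tree[2] <= height:
        return [leaves(tree)]
    left, right, _ = tree
    return cut(left, height) + cut(right, height)


# Four leaves: 1 and 2 merge at 0.5, 3 and 4 at 1.0, then both groups at 2.0.
merges = [(1, 2, 0.5), (3, 4, 1.0), (5, 6, 2.0)]
tree = merges_to_tree(merges)
print(tree)            # ((1, 2, 0.5), (3, 4, 1.0), 2.0)
print(cut(tree, 1.5))  # [[1, 2], [3, 4]]
```

Splitting into a given number of clusters, truncation, and the reverse conversion would be similar small operations on the same structure, which is why a proper symbolic representation matters to me.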

The HierarchicalClustering built-in package comes close, but it's not there yet: cluster highlighting is unusably ugly, and there aren't enough operations on symbolic dendrograms. The package also appears to be abandoned: it conflicts with built-ins, and many of its options are highlighted in red.

There's the built-in Dendrogram, but I find it barely usable in practice. It doesn't have a practically usable symbolic representation of dendrograms, and the styling options are limited. Compared to the package, it seems like one step forward and two steps back.

Does anyone know of alternatives? MATLAB has much better dendrogram functionality. It can even optimize the leaf node ordering.

POSTED BY: Szabolcs Horvát
2 months ago

If such a package does not yet exist, it would be a nice project for anyone willing to take it up :-)

POSTED BY: Szabolcs Horvát
2 months ago
