Hi Robert, thanks for the input. The code I use is as follows:
SetDirectory[NotebookDirectory[]];
fns = FileNames["*.jpg"];
imgs = Import /@ fns;
imgs;
ClusteringTree[imgs]
EMD refers to the Earth Mover's Distance, also known as the Wasserstein distance, and yes, I did lift the code directly from the example page. I find the EMD concept easier to explain in general terms.
distances =
  Table[ImageDistance[imgs[[i]], imgs[[j]],
    DistanceFunction -> "EarthMoverDistance"], {i, Length[imgs]}, {j,
    i + 1, Length[imgs]}];
With[{mtemp = PadLeft[#, Length[imgs]] & /@ distances},
distmatrix = mtemp + Transpose[mtemp]];
NumberForm[distmatrix, 3]
adjmatrix =
1 - Unitize[
Threshold[distmatrix, Quantile[Flatten[distances], 1/3]]];
GraphPlot[adjmatrix,
  VertexShapeFunction -> (Inset[imgs[[#2]], #, Center, .5] &),
  SelfLoopStyle -> None, Method -> "SpringEmbedding", ImageSize -> 500]
The Iris dataset contains only nominal, discrete, or continuous variables. Can we mix images with other types of data for clustering? My main objective is to evaluate whether there is a pattern that separates terrestrial from epiphytic orchids (based on their appearance). I also have data on their pollinator insects. It would be great if I could combine all of them in a single cluster analysis.
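To make the question concrete, here is a rough sketch of the kind of combination I have in mind. Everything in it is an assumption for illustration only: imgDist is a made-up stand-in for distmatrix, the habitat and pollinator lists are hypothetical, mismatch is a helper I invented, and the weights are arbitrary. The idea is to rescale the image distances to the range 0 to 1, turn each nominal variable into a 0/1 mismatch matrix, and take a weighted average of the matrices.

```mathematica
(* Made-up symmetric image distances for 4 orchids, standing in for distmatrix *)
imgDist = {{0, 0.2, 0.7, 0.9}, {0.2, 0, 0.6, 0.8},
   {0.7, 0.6, 0, 0.3}, {0.9, 0.8, 0.3, 0}};

(* Hypothetical nominal data for the same 4 orchids, in the same order *)
habitat = {"terrestrial", "terrestrial", "epiphytic", "epiphytic"};
pollinator = {"bee", "bee", "moth", "bee"};

(* Hypothetical helper: 0 where two labels agree, 1 where they differ *)
mismatch[labels_List] := Outer[Boole[#1 =!= #2] &, labels, labels];

(* Rescale the image distances to [0, 1] so the parts are comparable *)
imgScaled = imgDist/Max[imgDist];

(* Weighted average of the three matrices; the weights are arbitrary *)
weights = {0.5, 0.25, 0.25};
combined = weights . {imgScaled, mismatch[habitat], mismatch[pollinator]};

MatrixForm[combined]
```

If something like this is reasonable, I assume the combined matrix could replace distmatrix in the thresholding and GraphPlot steps above, but I have not checked whether it can be passed to ClusteringTree, and I do not know how the weights should be chosen.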