I posted a notebook that uses the resource function PhylogeneticTreePlot
to create a dendrogram of several hundred full sequences of SARS-CoV-2 genomes.
https://community.wolfram.com/groups/-/m/t/1961461
This uses, among other things, the Wolfram Data Repository item containing said sequences.
On a related note, a recently published article uses similar methods for genome sequence classification:
Gurjit S. Randhawa, Maximillian P. M. Soltysiak, Hadi El Roz, Camila P. E. de Souza, Kathleen A. Hill, Lila Kari. Machine learning using intrinsic genomic signatures for rapid classification of novel pathogens: COVID-19 case study. PLoS ONE 15(4):e0232391, April 24, 2020, 24 pages.
doi: 10.1371/journal.pone.0232391
Here is a direct link to the paper:
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0232391
I will also note here, as I do in today's Community post, that I first learned about the Chaos Game Representation from a 2016 talk given by Lila Kari at the University of Western Ontario. Clearly it was a great talk from my perspective: I've been using the Chaos Game Representation ever since, in projects on genome identification and even authorship identification.