Nice post, Anton! When it comes to interpretable models, I'm always a big fan of traditional statistical analysis!
One central point of interest for characters and fonts (not just CJK) is the topological relationship between strokes. It is related to, but goes beyond, properties in image space. There are neural network approaches to this topic, such as FontRNN. I guess they capture that topology nicely, so I wonder how we could "extract" that information, and what we could say by applying "traditional" statistical analysis to it. If we can say something meaningful from that, we gain insight, and those networks become interpretable deep learning models.