I think I understand the internals of an "Empirical" DataDistribution (dd) with one exception: I don't know what the element at dd[[2,3]], which is False, signifies. Can anyone tell me?
Here's the backstory:
I'm modeling conditional categorical distributions as nested Associations. For my purposes this implementation seems to be much faster than using the built-in CategoricalDistribution. The extensive functionality that exists for Associations provides the tools to do a variety of manipulations and do them quickly.
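For concreteness, here is a minimal sketch of the nested-Association idea. The category names, probabilities, and variable names (`cond`, `prior`, `marginal`) are made up for illustration and are not taken from my actual model.

```wolfram
(* Hypothetical data: P(weather | season) as a nested Association *)
cond = <|
   "summer" -> <|"sun" -> 7/10, "rain" -> 3/10|>,
   "winter" -> <|"sun" -> 1/5, "rain" -> 4/5|>
   |>;

(* Hypothetical prior P(season) *)
prior = <|"summer" -> 1/2, "winter" -> 1/2|>;

(* Look up a single conditional probability *)
cond["summer"]["rain"]   (* 3/10 *)

(* Marginal P(weather): scale each conditional by its prior weight,
   then add the scaled values key by key *)
marginal = Merge[
   KeyValueMap[Function[{season, p}, Map[p*# &, cond[season]]], prior],
   Total]
(* <|"sun" -> 9/20, "rain" -> 11/20|> *)
```

Everything here is ordinary Association manipulation (`Map`, `KeyValueMap`, `Merge`), which is the kind of operation I mean by "extensive functionality".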
I don't need DataDistributions but I'm toying with the idea of using them as containers. The "weights" are then categorical distributions represented as Associations (or perhaps even DataDistributions themselves). I have no problem setting this up and I can easily convert back and forth between DataDistributions and Associations by dipping into the internals of DataDistribution. But I'm curious as to what I might be getting involved with.
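For the simple case of numeric categories with numeric weights, the round trip can also be sketched with documented functions only, as an alternative to reading the internals. This is an assumption-laden illustration, not the container setup described above: the Association `a` is hypothetical, and it does not cover weights that are themselves Associations.

```wolfram
(* Hypothetical categorical distribution over the numeric categories 1, 2, 3 *)
a = <|1 -> 1/4, 2 -> 1/2, 3 -> 1/4|>;

(* Association -> DataDistribution, using the weighted form
   EmpiricalDistribution[{w1, w2, ...} -> {x1, x2, ...}] *)
dd = EmpiricalDistribution[Values[a] -> Keys[a]];

(* DataDistribution -> Association, by evaluating the PDF at each category *)
back = AssociationMap[PDF[dd, #] &, Keys[a]]
```

This is slower than pulling the values straight out of the object, but it does not depend on the undocumented layout.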
BTW, I know it's dangerous to rely on the internal structure of a built-in object. I've done that in the past and lived to regret it. Many years ago I relied on the internal structure of InterpolatingFunction to implement functionality that was absent at the time. If I remember correctly, the functionality was eventually added, but not before the internals were changed.
It looks like dd[[2,3]] is where the bandwidth is stored for a "SmoothKernel" DataDistribution. So False makes sense for an "Empirical" DataDistribution, which has no bandwidth.
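A quick way to compare the two cases side by side is sketched below. The sample `data` is hypothetical, and the position `[[2, 3]]` is undocumented internal structure, so what it holds may differ between versions.

```wolfram
data = {1, 2, 2, 3};  (* hypothetical sample *)

emp = EmpiricalDistribution[data];
skd = SmoothKernelDistribution[data];

(* Same undocumented position in both objects *)
emp[[2, 3]]
skd[[2, 3]]

(* Full internal form of each object, for a complete comparison *)
InputForm[emp]
InputForm[skd]
```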