Message Boards Message Boards

[?] Solve problem with PDF and discrete variates?

Posted 6 years ago

Can you help me understand why these two plots differ? It seems like in first case variate suddenly stopped being discrete

d1 = TransformedDistribution[x + x, x\[Distributed]DiscreteUniformDistribution[{1,6}]];
DiscretePlot[PDF[d1, {a}], {a,0,15}]

d2 = TransformedDistribution[x + x, {x\[Distributed]DiscreteUniformDistribution[{1,6}], y\[Distributed]DiscreteUniformDistribution[{1,6}]}];
DiscretePlot[PDF[d2, {b}], {b,0,15}]

Thank you.

POSTED BY: Michael Kilburn
8 Replies

About problem #2 -- pretty sure it is a bug in Mathematica (I assume it is Wolfram backend). After many hours spent banging my head against the desk I found a workaround -- to "fix" distribution generated by TransformedDistribution. Like this:

p := DiscreteUniformDistribution[{1,6}]

fixd[dist_] := Module[{x,k,min,max}, 
  (* unfortunately Quantile returns -inf for TransformedDistribution, another bug? *)
  {min,max} := {0, Quantile[dist, {0, 1}][[2]]};
  EmpiricalDistribution[ Table[Probability[x==k, x\[Distributed]dist], {k,min,max}]->Table[k, {k, min, max}] ]]

y := fixd[ TransformedDistribution[a + b, {a\[Distributed]p, b\[Distributed]p}] ]
w := fixd[ TransformedDistribution[c + d, {c\[Distributed]y, d\[Distributed]p}] ]

pdf1 = PDF[y];
DiscretePlot[pdf1[s1], {s1, 0, 20}, ExtentSize -> 0.8]

pdf2 = PDF[w];
DiscretePlot[pdf2[s2], {s2, 0, 20}, ExtentSize -> 0.8]

The only problem is detection of min/max values for given variate ([2,40] range used in fixd). Looking for a way to do it automatically. If you know how to do it -- please help me.

Problem #1 is still a mystery, but it looks like a bug to me.

P.S. Unfortunately for any non-trivial transformation PDF[fixd[transformed_distr[...]]] takes way too much time -- so, at the end it was useless

POSTED BY: Michael Kilburn

Does this look like a bug to anyone? d1 can be fed into Plot function and it shows uninterrupted line suggesting Wolfram treats it as continuous distribution.

Here is another problem (maybe related) -- here it looks like TransformedDistribution can't apply to itself:

p = DiscreteUniformDistribution[{1,6}];
y = TransformedDistribution[a + b, {a \[Distributed] p, b\[Distributed]p}];
w = TransformedDistribution[c + d, {c \[Distributed] y, d \[Distributed]p}];

pdf1 = PDF[y]
DiscretePlot[pdf1[s1], {s1, 0, 20}, ExtentSize -> 0.8]

pdf2 = PDF[w]
DiscretePlot[pdf2[s2], {s2, 0, 20}, ExtentSize -> 0.8]

pdf2 refuses to be evaluated to anything meaningful (to me).

POSTED BY: Michael Kilburn

For the first problem, use PDF[d1, a] (without the curly brackets around a). Then DiscretePlot[] gives the following output, which looks right to me:

Plot of PDF of twice a discrete uniform variate

For #2, what version of Mathematica are you using? In my copy (which is 11.0.1.0), Here's what I get when running your last two lines:

Plot of iteratively applying TransformedDistribution[]

Thank you for helping me with case #1. Do you reckon it is a bug or I am missing some important distinction between a and {a} in this context?

About case #2 -- I am using Wolfram Programming Lab (free subscription), no idea about backend it uses and version. I need to run few simple calcs involving throwing dices -- ended up spending my evenings for more than a week, fighting weird syntax issues and digging in help pages. :)

Here is what I get when running second part: enter image description here

POSTED BY: Michael Kilburn

Thank you for helping me with case #1. Do you reckon it is a bug or I am missing some important distinction between a and {a} in this context?

You should avoid using {a} in this context, since this is supposed to output a list containing the value of the PDF at a, rather than just outputting the value at a. But there does seem to be a bug; when I evaluate

PDF[TransformedDistribution[x + x, x \[Distributed] DiscreteUniformDistribution[{1, 6}]], {3}]

the output is {1/6}, rather than {0}, which is what it should be. And indeed

PDF[TransformedDistribution[x + x, x \[Distributed] DiscreteUniformDistribution[{1, 6}]], 3]

correctly evaluates to 0.

In general, it seems like there's something screwy about the internal representation of PDFs of discrete distributions. For example,

PDF[DiscreteUniformDistribution[{1, 6}]]

should be a pure function which evaluates to $1/6$ on the integers 1, 2, 3, 4, 5, 6 and to 0 on everything else, but instead it is

Pure function

which will happily evaluate to $1/6$ when you plug in $1.1$ (or, presumably, any other real number between 1 and 6), but PDF[DiscreteUniformDistribution[{1, 6}], 1.1] correctly evaluates to 0.

As for your second issue, I have no idea what the problem is. There's obviously some difference between the Mathematica back end and the Wolfram Programming Lab back end, but again the issue seems to be with discrete distributions. Replacing DiscreteUniformDistribution[{1,6}] with UniformDistribution[{1,6}] in your code produces entirely reasonable output even in Wolfram Programming Lab. So it may well be the case that both of your issues have the same root cause, whatever it is.

Conclusion: there are bugs in the Mathematica/Wolfram implementation of (PDFs of) discrete probability distributions.

I reported this bug, maybe they'll fix it.

The whole domain is unusable to me. You can calculate relatively complex things if you type all parameters, but the moment you want to build few functions (to use them in later calculations) -- it all falls apart. For example in code below I wanted to build a function that will give me a distribution of sum of N independent variates with same distribution. Alas, it doesn't work for N > 1. Ok, I came up with a "fix" -- but that fix takes inordinate computation time if I use something more complex than uniform distribution -- most non-trivial computations simply time out after 1 minute.

(*simpled := DiscreteUniformDistribution[{1,6}]*)
simpled := BinomialDistribution[10,0.3]

transd := Module[{m,d},
  TransformedDistribution[Max[m - d, 0], {m\[Distributed]simpled, d\[Distributed]simpled}]
]

sumd[n_Integer, dist_] := Module[{v,i,j,distlst}, 
  distlst = Table[v[j] \[Distributed] dist,{j,n}];
  TransformedDistribution[Sum[v[i],{i,n}],distlst]
]

fixd[dist_] := Module[{x,k,min,max}, 
  (* unfortunately Quantile returns -inf for TransformedDistribution, so we'll use 0 for min (ok for our calcs) *)
  {min,max} = {0, Quantile[dist, {0, 1}][[2]]};
  EmpiricalDistribution[ Table[Probability[x==k, x\[Distributed]dist], {k,min,max}]->Table[k, {k, min, max}] ]
]

d0 := sumd[2, simpled]
PDF[d0]

d1 := sumd[1, transd]
PDF[d1]

(* this one is broken *)
d2 := sumd[2, transd]
PDF[d2]

d3 := sumd[2, fixd[transd]]
PDF[d3]

(* this one works but very slow with non-trivial simpled *)
d4 := sumd[6, fixd[transd]]
PDF[d4]

... and you can't use Set (as opposed to SetDelayed) with sumd because Table[...] refuses to evaluate with unknown parameter.

POSTED BY: Michael Kilburn

I agree that there seem to be some issues with PDF[], but part of the issue is that computing explicit PDFs of nontrivial distributions is not so easy.

The distributions themselves can be worked with in various ways, even if PDF[] doesn't give you what you want. For example, here are some computations of the mean of d4, the expectation of $e^{-x}$ when $x$ is distributed as d4, and a histogram of a million random samples from d4:

Computations with derived distribution

It is correct, but I must point out that actual amount of underlying computations is minuscule -- we are talking about probability mass function of discrete variable in a very small range ([0, 100] tops). That distribution is defined by array of values no longer than 100 elements -- transforming it (e.g. calculating distribution of Max[x1 - x2, 0]) is trivial convolution of two short arrays, modern CPU can do thousands of these per second.

I suspect Wolfram Lab backend spends all this time in logic that attempts to comprehend and simplify symbolic representation of related functions, but it seems that it is very inefficient. Probably because the same thing PDF is used for both discrete and continuous distributions. Probably a reason behind these bugs too.

POSTED BY: Michael Kilburn
Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard

Group Abstract Group Abstract