Message Boards

WOLFRAM COMMUNITY

4505 Views

8 Replies

2 Total Likes

View groups...

Follow this post

Share this post:

GROUPS:

[?] Solve problem with PDF and discrete variates?

Michael Kilburn

Posted 6 years ago

Can you help me understand why these two plots differ? It seems like in first case variate suddenly stopped being discrete d1 = TransformedDistribution[x + x, x\[Distributed]DiscreteUniformDistribution[{1,6}]]; DiscretePlot[PDF[d1, {a}], {a,0,15}] d2 = TransformedDistribution[x + x, {x\[Distributed]DiscreteUniformDistribution[{1,6}], y\[Distributed]DiscreteUniformDistribution[{1,6}]}]; DiscretePlot[PDF[d2, {b}], {b,0,15}] Thank you.

Can you help me understand why these two plots differ? It seems like in first case variate suddenly stopped being discrete

d1 = TransformedDistribution[x + x, x\[Distributed]DiscreteUniformDistribution[{1,6}]];
DiscretePlot[PDF[d1, {a}], {a,0,15}]

d2 = TransformedDistribution[x + x, {x\[Distributed]DiscreteUniformDistribution[{1,6}], y\[Distributed]DiscreteUniformDistribution[{1,6}]}];
DiscretePlot[PDF[d2, {b}], {b,0,15}]

Thank you.

POSTED BY: Michael Kilburn

8 Replies

Sort By:

Michael Kilburn

Posted 6 years ago

About problem #2 -- pretty sure it is a bug in Mathematica (I assume it is Wolfram backend). After many hours spent banging my head against the desk I found a workaround -- to "fix" distribution generated by `TransformedDistribution`. Like this: p := DiscreteUniformDistribution[{1,6}] fixd[dist_] := Module[{x,k,min,max}, (* unfortunately Quantile returns -inf for TransformedDistribution, another bug? *) {min,max} := {0, Quantile[dist, {0, 1}][[2]]}; EmpiricalDistribution[ Table[Probability[x==k, x\[Distributed]dist], {k,min,max}]->Table[k, {k, min, max}] ]] y := fixd[ TransformedDistribution[a + b, {a\[Distributed]p, b\[Distributed]p}] ] w := fixd[ TransformedDistribution[c + d, {c\[Distributed]y, d\[Distributed]p}] ] pdf1 = PDF[y]; DiscretePlot[pdf1[s1], {s1, 0, 20}, ExtentSize -> 0.8] pdf2 = PDF[w]; DiscretePlot[pdf2[s2], {s2, 0, 20}, ExtentSize -> 0.8] The only problem is detection of min/max values for given variate ([2,40] range used in `fixd`). Looking for a way to do it automatically. If you know how to do it -- please help me. Problem #1 is still a mystery, but it looks like a bug to me. P.S. Unfortunately for any non-trivial transformation PDF[fixd[transformed_distr[...]]] takes way too much time -- so, at the end it was useless

POSTED BY: Michael Kilburn

Michael Kilburn

Posted 6 years ago

Does this look like a bug to anyone? d1 can be fed into Plot function and it shows uninterrupted line suggesting Wolfram treats it as continuous distribution. Here is another problem (maybe related) -- here it looks like TransformedDistribution can't apply to itself: p = DiscreteUniformDistribution[{1,6}]; y = TransformedDistribution[a + b, {a \[Distributed] p, b\[Distributed]p}]; w = TransformedDistribution[c + d, {c \[Distributed] y, d \[Distributed]p}]; pdf1 = PDF[y] DiscretePlot[pdf1[s1], {s1, 0, 20}, ExtentSize -> 0.8] pdf2 = PDF[w] DiscretePlot[pdf2[s2], {s2, 0, 20}, ExtentSize -> 0.8] `pdf2` refuses to be evaluated to anything meaningful (to me).

POSTED BY: Michael Kilburn

Clayton Shonkwiler

Clayton Shonkwiler, Colorado State University

Posted 6 years ago

For the first problem, use `PDF[d1, a]` (without the curly brackets around `a`). Then `DiscretePlot[]` gives the following output, which looks right to me: For #2, what version of Mathematica are you using? In my copy (which is 11.0.1.0), Here's what I get when running your last two lines:

POSTED BY: Clayton Shonkwiler

Michael Kilburn

Posted 6 years ago

Thank you for helping me with case #1. Do you reckon it is a bug or I am missing some important distinction between `a` and `{a}` in this context? About case #2 -- I am using Wolfram Programming Lab (free subscription), no idea about backend it uses and version. I need to run few simple calcs involving throwing dices -- ended up spending my evenings for more than a week, fighting weird syntax issues and digging in help pages. :) Here is what I get when running second part:

POSTED BY: Michael Kilburn

Clayton Shonkwiler

Clayton Shonkwiler, Colorado State University

Posted 6 years ago

Thank you for helping me with case #1. Do you reckon it is a bug or I am missing some important distinction between a and {a} in this context? You should avoid using `{a}` in this context, since this is supposed to output a list containing the value of the PDF at `a`, rather than just outputting the value at `a`. But there does seem to be a bug; when I evaluate PDF[TransformedDistribution[x + x, x \[Distributed] DiscreteUniformDistribution[{1, 6}]], {3}] the output is `{1/6}`, rather than `{0}`, which is what it should be. And indeed PDF[TransformedDistribution[x + x, x \[Distributed] DiscreteUniformDistribution[{1, 6}]], 3] correctly evaluates to 0. In general, it seems like there's something screwy about the internal representation of PDFs of discrete distributions. For example, PDF[DiscreteUniformDistribution[{1, 6}]] should be a pure function which evaluates to $1/6$ on the integers 1, 2, 3, 4, 5, 6 and to 0 on everything else, but instead it is which will happily evaluate to $1/6$ when you plug in $1.1$ (or, presumably, any other real number between 1 and 6), but `PDF[DiscreteUniformDistribution[{1, 6}], 1.1]` correctly evaluates to 0. As for your second issue, I have no idea what the problem is. There's obviously some difference between the Mathematica back end and the Wolfram Programming Lab back end, but again the issue seems to be with discrete distributions. Replacing `DiscreteUniformDistribution[{1,6}]` with `UniformDistribution[{1,6}]` in your code produces entirely reasonable output even in Wolfram Programming Lab. So it may well be the case that both of your issues have the same root cause, whatever it is. Conclusion: there are bugs in the Mathematica/Wolfram implementation of (PDFs of) discrete probability distributions.

POSTED BY: Clayton Shonkwiler

Michael Kilburn

Posted 6 years ago

I reported this bug, maybe they'll fix it. The whole domain is unusable to me. You can calculate relatively complex things if you type all parameters, but the moment you want to build few functions (to use them in later calculations) -- it all falls apart. For example in code below I wanted to build a function that will give me a distribution of sum of N independent variates with same distribution. Alas, it doesn't work for N > 1. Ok, I came up with a "fix" -- but that fix takes inordinate computation time if I use something more complex than uniform distribution -- most non-trivial computations simply time out after 1 minute. (simpled := DiscreteUniformDistribution[{1,6}]) simpled := BinomialDistribution[10,0.3] transd := Module[{m,d}, TransformedDistribution[Max[m - d, 0], {m\[Distributed]simpled, d\[Distributed]simpled}] ] sumd[n_Integer, dist_] := Module[{v,i,j,distlst}, distlst = Table[v[j] \[Distributed] dist,{j,n}]; TransformedDistribution[Sum[v[i],{i,n}],distlst] ] fixd[dist_] := Module[{x,k,min,max}, (* unfortunately Quantile returns -inf for TransformedDistribution, so we'll use 0 for min (ok for our calcs) ) {min,max} = {0, Quantile[dist, {0, 1}][[2]]}; EmpiricalDistribution[ Table[Probability[x==k, x\[Distributed]dist], {k,min,max}]->Table[k, {k, min, max}] ] ] d0 := sumd[2, simpled] PDF[d0] d1 := sumd[1, transd] PDF[d1] ( this one is broken ) d2 := sumd[2, transd] PDF[d2] d3 := sumd[2, fixd[transd]] PDF[d3] ( this one works but very slow with non-trivial simpled *) d4 := sumd[6, fixd[transd]] PDF[d4] ... and you can't use `Set` (as opposed to `SetDelayed`) with `sumd` because Table[...] refuses to evaluate with unknown parameter.

I reported this bug, maybe they'll fix it.

The whole domain is unusable to me. You can calculate relatively complex things if you type all parameters, but the moment you want to build few functions (to use them in later calculations) -- it all falls apart. For example in code below I wanted to build a function that will give me a distribution of sum of N independent variates with same distribution. Alas, it doesn't work for N > 1. Ok, I came up with a "fix" -- but that fix takes inordinate computation time if I use something more complex than uniform distribution -- most non-trivial computations simply time out after 1 minute.

(*simpled := DiscreteUniformDistribution[{1,6}]*)
simpled := BinomialDistribution[10,0.3]

transd := Module[{m,d},
  TransformedDistribution[Max[m - d, 0], {m\[Distributed]simpled, d\[Distributed]simpled}]
]

sumd[n_Integer, dist_] := Module[{v,i,j,distlst}, 
  distlst = Table[v[j] \[Distributed] dist,{j,n}];
  TransformedDistribution[Sum[v[i],{i,n}],distlst]
]

fixd[dist_] := Module[{x,k,min,max}, 
  (* unfortunately Quantile returns -inf for TransformedDistribution, so we'll use 0 for min (ok for our calcs) *)
  {min,max} = {0, Quantile[dist, {0, 1}][[2]]};
  EmpiricalDistribution[ Table[Probability[x==k, x\[Distributed]dist], {k,min,max}]->Table[k, {k, min, max}] ]
]

d0 := sumd[2, simpled]
PDF[d0]

d1 := sumd[1, transd]
PDF[d1]

(* this one is broken *)
d2 := sumd[2, transd]
PDF[d2]

d3 := sumd[2, fixd[transd]]
PDF[d3]

(* this one works but very slow with non-trivial simpled *)
d4 := sumd[6, fixd[transd]]
PDF[d4]

... and you can't use Set (as opposed to SetDelayed) with sumd because Table[...] refuses to evaluate with unknown parameter.

POSTED BY: Michael Kilburn

Clayton Shonkwiler

Clayton Shonkwiler, Colorado State University

Posted 6 years ago

I agree that there seem to be some issues with `PDF[]`, but part of the issue is that computing explicit PDFs of nontrivial distributions is not so easy. The distributions themselves can be worked with in various ways, even if `PDF[]` doesn't give you what you want. For example, here are some computations of the mean of `d4`, the expectation of $e^{-x}$ when $x$ is distributed as `d4`, and a histogram of a million random samples from `d4`:

POSTED BY: Clayton Shonkwiler

Michael Kilburn

Posted 6 years ago

It is correct, but I must point out that actual amount of underlying computations is minuscule -- we are talking about probability mass function of discrete variable in a very small range ([0, 100] tops). That distribution is defined by array of values no longer than 100 elements -- transforming it (e.g. calculating distribution of Max[x1 - x2, 0]) is trivial convolution of two short arrays, modern CPU can do thousands of these per second. I suspect Wolfram Lab backend spends all this time in logic that attempts to comprehend and simplify symbolic representation of related functions, but it seems that it is very inefficient. Probably because the same thing PDF is used for both discrete and continuous distributions. Probably a reason behind these bugs too.

POSTED BY: Michael Kilburn

Reply to this discussion

Reply Preview

Attachments

Remove Add a file to this post

Follow this discussion

or Discard

Group Abstract

Feedback