WOLFRAM COMMUNITY

GROUPS:

Data Science Engineering Mathematics Graphics and Visualization Wolfram Language Statistics and Probability

Why Does ProbabilityScalePlot Display Censored Observations?

Michael Cushing

Posted 10 years ago

I am attempting to reproduce a Weibull probability plot from Wayne Nelson's Applied Life Data Analysis text (pp. 147-9). I enter failure data for 16 field windings thus:

    windingdata = {31.7, 39.2, 57.5, 65, 65.8, 70, 75, 75, 87.5, 88.3, 94.2, 101.7, 105.8, 109.2, 110, 130};

Seven of the observations are failures and nine are censored observations. In the list below, each failure is denoted by 0 and each right-censored observation by 1:

    censorlist = {0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1};

I use EventData to assemble these observations:

    evdata = EventData[windingdata, censorlist]

I generate the probability plot thus:

    ProbabilityScalePlot[evdata, "Weibull", Method -> {"ReferenceLineMethod" -> "Fit"}]

The resulting plot includes points for the censored observations in addition to points for the failures; this is not correct. Although the censored observations should not be plotted, they cannot simply be deleted: they are needed to determine the plotting positions for the failure points. The reference page for ProbabilityScalePlot indicates that it accepts EventData.

Any help would be appreciated.

Michael Cushing
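The point about plotting positions can be illustrated with a rough manual sketch. It assumes product-limit (Kaplan-Meier type) plotting positions, which is one common convention for multiply censored data; it is not necessarily the method Nelson uses on those pages, nor what ProbabilityScalePlot does internally. The names nUnits, reverseRank, factors, survival, failurePoints and weibullPoints are made up for this illustration.

```wolfram
(* Data from the question; the times are already sorted in ascending order *)
windingdata = {31.7, 39.2, 57.5, 65, 65.8, 70, 75, 75, 87.5, 88.3, 94.2,
   101.7, 105.8, 109.2, 110, 130};
censorlist = {0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1};

(* Reverse rank = number of units still at risk just before each time *)
nUnits = Length[windingdata];
reverseRank = Range[nUnits, 1, -1];

(* Assumed product-limit step: (r - 1)/r at a failure, 1 at a censored time.
   Censored units therefore affect the positions without being plotted. *)
factors = MapThread[If[#2 == 0, (#1 - 1)/#1, 1] &, {reverseRank, censorlist}];
survival = Rest[FoldList[Times, 1, factors]];

(* Keep {time, F} pairs for the failures only *)
failurePoints = Pick[Transpose[{windingdata, 1 - survival}], censorlist, 0];

(* Weibull paper coordinates: Log[t] against Log[-Log[1 - F]] *)
weibullPoints = N[{Log[#1], Log[-Log[1 - #2]]} & @@@ failurePoints];

ListPlot[weibullPoints, AxesLabel -> {"Log[t]", "Log[-Log[1 - F]]"}]
```

Only the seven failures appear in the plot, while all sixteen units enter through the reverse ranks. If the points fall roughly on a line, something like Fit[weibullPoints, {1, x}, x] gives a least-squares line whose slope estimates the Weibull shape parameter.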

POSTED BY: Michael Cushing

Raspi Rascal

Raspi Rascal, novato, contributor, pseudo-wannabe (not even tryhard)

Posted 5 years ago

Interesting, thanks for sharing. Also interesting: a Google search on `probabilityscaleplot site:community.wolfram.com` gives only 3 different hits as of 2021-03-19, which could mean that not many people use this function.

Related to your problem, I also cannot confirm that `ProbabilityScalePlot[]` works, or works well, with `WeightedData[]` as its argument. I completed a textbook problem on the theoretical and empirical distribution of the sum of five dice rolls, including various plots, and they all looked correct; the official solutions manual confirms my solution. As a finishing touch I converted the given empirical table to weighted data (a legitimate and very common technique in probability and statistics problems, and one I have used successfully many times). But when I pass this "data" to the function, the graph shows wrong points and a wrong slope, and seemingly has nothing to do with my input.

I am not calling it a bug, because there is of course some chance that I am doing something wrong, such as wrong syntax or wrong usage (user error). If I am wrong, I don't mind; I have already moved on. Unless an identified Wolfram developer asks me for the full problem, solution, and my work, I will leave it at that. I am busy with other things (even if it's "only" watching tennistv), it's not my job, and I have other things to do than help Wolfram correct their product (sorry for my poor attitude lol).

Whoever reads this post, take it as a warning: the `ProbabilityScalePlot[WeightedData[]]` idiom might not produce the desired output, so double-check that the plot really shows the correct graph. Personally, I don't care whether that idiom produces the wrong output (and I believe only very few people in the world have tried it). I am posting this warning only to get the info out, so that others become aware. I made a note in my .nb file (my problem and solution are written in German), so I am over it. But somebody at Wolfram should care.

@Wolfram developers (whoever is responsible for the statistics functions): you have all the info you need to investigate. Take the sum of 5 dice rolls, generate a pseudo-empirical distribution (a tally), convert it to weighted data, call ProbabilityScalePlot, and question what you see as the result! Ah well, here is a code snippet for the developer to study:

    n = 2048;
    lis1 = {0, 0, 4, 4, 29, 35, 41, 66, 122, 145, 165, 214, 216, 191, 205, 162, 144, 115, 89, 43, 31, 14, 6, 3, 3, 1, 0};
    lis2 = Range[4.5, 31.5, 1];
    Total[lis1] == n; (* True *)
    d\[ScriptCapitalD] = DataDistribution["Histogram", {lis1*1/(n*1), lis2}, 1, n];
    data = Transpose@{Range[5, 30], Most@lis1};
    data2 = Transpose@{Range[7, 29], Take[lis1, {3, 25}]};
    wdata = WeightedData[data[[All, 1]], data[[All, 2]]];
    wdata2 = WeightedData[data2[[All, 1]], data2[[All, 2]]];
    rvdata = RandomVariate[d\[ScriptCapitalD], 10^3];
    ProbabilityScalePlot[{wdata, rvdata, wdata2}, PlotLegends -> {"wdata", "rvdata", "wdata2"}]

When you roll 5 dice, take their sum, and examine the probability distribution of the sum, the rvdata graph shows the correct plot. The plots of wdata and wdata2 have to be a joke. No offense. Now let me continue to enjoy my tennis, bye :P
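One way to cross-check the weighted-data plot, sketched here under the assumption that the integer weights are plain counts, is to expand the tally into raw repeated observations and plot both forms side by side. The name expanded is introduced only for this illustration; the rest comes from the snippet in the reply.

```wolfram
(* Tally from the reply: counts for the sums 5, 6, ..., 31 of five dice *)
lis1 = {0, 0, 4, 4, 29, 35, 41, 66, 122, 145, 165, 214, 216, 191, 205,
   162, 144, 115, 89, 43, 31, 14, 6, 3, 3, 1, 0};
data = Transpose@{Range[5, 30], Most@lis1};
wdata = WeightedData[data[[All, 1]], data[[All, 2]]];

(* Expand each {value, count} pair into count copies of value *)
expanded = Flatten[Table[#1, {#2}] & @@@ data];

(* Sanity checks: the sample size and the mean should agree with the tally *)
Length[expanded] == Total[data[[All, 2]]]
Mean[wdata] == Mean[expanded]

(* If the weights are treated as counts, the two point sets should coincide *)
ProbabilityScalePlot[{wdata, expanded}, PlotLegends -> {"wdata", "expanded"}]
```

If the two sets of points differ visibly, that would suggest the weights are not being interpreted as frequencies in the plot, which is worth checking before relying on the weighted form.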

POSTED BY: Raspi Rascal
