Message Boards Message Boards

GROUPS:

On Annoying A Statistician, xkcd-style

Posted 1 month ago
678 Views
|
6 Replies
|
13 Total Likes
|

The webcomic xkcd featured an interesting comic for March 1, 2019:

xkcd 2118: Normal Distribution

Rarely would I be accused of taking a joke (too) seriously, but I found this intriguing enough to try doing this in Mathematica.

That is, what should the lower and upper limits be such that the vertical area is equal to $\frac12$ (recalling that the area under a PDF should be $1$)?

Let's first define the PDF of the normal distribution (i.e. the usual bell curve):

pdf[x_] = PDF[NormalDistribution[], x]
   E^(-(x^2/2))/Sqrt[2 π]

As in the comic, I will make the arbitrary and capricious choice of centering the area of interest at

pdf[0]/2
   1/(2 Sqrt[2 π])

One thing to do is to find the abscissas corresponding to the upper and lower limits; that is, if we let the height of the area be 2 h, find the value of x such that pdf[x] == 1/(2 Sqrt[2 π]) - h for the lower limit, and the value of x such that pdf[x] == 1/(2 Sqrt[2 π]) + h for the upper limit. Due to the symmetry of the Gaussian function, it suffices to give the restriction x > 0:

{xm[h_], xp[h_]} = Assuming[-(1/(2 Sqrt[2 π])) < h < 1/(2 Sqrt[2 π]), 
                            Simplify[(x /. First[Solve[pdf[x] == 1/(2 Sqrt[2 π]) + #
                                                       && x > 0, x]]) & /@ {-h, h}]]
   {Sqrt[2] Sqrt[-Log[1/2 - h Sqrt[2 π]]], Sqrt[2] Sqrt[-Log[1/2 + h Sqrt[2 π]]]}

where I have taken the liberty to assign the results to functions for later use.

At this point, one might wish to construct a Manipulate[] object to assist visualization, like so:

Manipulate[Plot[{Min[pdf[x], 1/(2 Sqrt[2 π]) - h], 
                 Min[pdf[x], 1/(2 Sqrt[2 π]) + h], pdf[x]}, {x, -4, 4}, 
                Exclusions -> None, Filling -> {2 -> {{1}, Opacity[0.6, Pink]}}, 
                PlotLabel -> StringForm["h=`1`, Area=`2`", h, 
                                        Quiet[2 NIntegrate[
                                              Min[pdf[x], 1/(2 Sqrt[2 π]) + h] - 
                                              Min[pdf[x], 1/(2 Sqrt[2 π]) - h],
                                              {x, 0, xm[h]}]]], 
                PlotStyle -> {None, None, ColorData[97, 1]}],
            {h, 0, 1/(2 Sqrt[2 π])}]

Manipulate for normal distribution

where I used Min[] to clip the Gaussian function at appropriate limits.

Note that in the example I gave, a value of $h\approx 0.105$ gives an area that is nearly equal to $\frac12$. To solve for the right value of $h$, let us get an explicit expression for the integral:

area = Assuming[0 < h < 1/(2 Sqrt[2 π]), 
                Simplify[Integrate[Min[pdf[x], 1/(2 Sqrt[2 π]) + h] - 
                                   Min[pdf[x], 1/(2 Sqrt[2 π]) - h],
                                   {x, -xm[h], xm[h]}]]]

which yields a complicated expression involving the error function Erf[]. This can now be plugged into Solve[]:

hs = h /. First[Solve[area == 1/2 && 0 < h < 1/(2 Sqrt[2 π]), h]];

which yields a complicated Root[] object that nevertheless can be easily evaluated numerically:

N[hs, 20]
   0.10508624993771281943

We also find that

2 hs/(pdf[0]) 100 // N
   52.6824

that is, the height of the area being considered covers $\approx 52.7\%$ of the whole height of the bell curve.

Let's show a hopefully more understandable way to solve for $h$. The area of interest can also be determined through a sequence of subtractions, which can be depicted pictorially:

how to compute the area

To do this, we need the expression for the area under a symmetric portion of the bell curve:

Assuming[a > 0, Integrate[pdf[x], {x, -a, a}]]
   Erf[a/Sqrt[2]]

Then, the necessary area expression is

area2 = (Erf[xm[h]/Sqrt[2]] - (2 xm[h]) (1/(2 Sqrt[2 π]) - h)) -
        (Erf[xp[h]/Sqrt[2]] - (2 xp[h]) (1/(2 Sqrt[2 π]) + h));

from which one can solve for the necessary value of $h$:

h /. First[Solve[area2 == 1/2 && 0 < h < 1/(2 Sqrt[2 π]), h]]

which should again yield the same Root[] expression.

POSTED BY: J. M.
Answer
6 Replies

Fun fact: 52.7% of mathematicians use the term "abscissas" and the other half use "abscissae".

By the way, the prior xkcd, contrasting differentiating with integrating, was sent around at work. It elicited a few reactions (my own take being the author basically nailed it).

https://www.xkcd.com/2117

enter image description here

presumably because on of the steps was "Install Mathematica"

No, it is the aptness of the flowchart.

enter image description here - Congratulations! This post is now a Staff Pick as distinguished by a badge on your profile! Thank you, keep it coming, and consider contributing your work to the The Notebook Archive!

Consider me annoyed (but impressed).

Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard

Group Abstract Group Abstract