Calculation of variance of weighted data

The WDC example (/Scope/Data: "Find the variance of WeightedData") folds the intermediate result (the mean) into the final result and shows only a single line, so I cannot backtrack from it.

Inspired by that WDC example, could anyone please show how the $\frac{8800}{23}$ below was calculated (just the starting point, i.e. which definition is used)? Feel free to use two lines: one for the numeric mean, one for the variance using that numeric mean.

In[1]:= data = {-30, 10, 10, 10, 10, 10, 10, 10, 20, 20};(* sample data *)
{Mean[data], Variance[data]}(* bias-corrected sample variance*)

Out[2]= {8, 1760/9}

In[3]:= edis = EmpiricalDistribution[data];(* population *)
{Mean[edis], Variance[edis]}(* population variance *)

Out[4]= {8, 176}

In[5]:= wdata = WeightedData[{-30, 10, 20}, {1/10, 7/10, 2/10}];
wedis = EmpiricalDistribution[wdata];
{Mean[wedis], Variance[wedis]}(* okay,as expected *)

Out[7]= {8, 176}

In[8]:= {Mean[wdata], Variance[wdata]}(* which formula/definition used here, why? *)

Out[8]= {8, 8800/23}

I cannot figure it out, thank you! Best wishes.

POSTED BY: Raspi Rascal
Posted 4 years ago

@Raspi Rascal

Yes, I am enjoying this discussion. And for me, it is timely. My job is requiring me to do more and more statistics.

Keep up the good work.

POSTED BY: Mike Besso
POSTED BY: Raspi Rascal
Posted 4 years ago

@MichaelHelmle did give you enough information to determine the answer for WeightedData, and the formula is readily available on a wiki page (and elsewhere).

Using Michael's notation the weighted mean is given by

$$\bar{v}=\frac{\sum _{i=1}^n v_i w_i}{\sum _{i=1}^n w_i}$$

And the weighted variance is given by

$$\frac{\sum _{i=1}^n w_i (v_i-\bar{v})^2}{V_1-\frac{V_2}{V_1}}$$

where

$$V_i=\sum_{j=1}^n w_j^i$$

The weights are assumed to be known and not random variables.
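As a worked check, plugging the weights $w = (\tfrac{1}{10}, \tfrac{7}{10}, \tfrac{2}{10})$ and values $v = (-30, 10, 20)$ from the question into the two formulas above appears to reproduce the reported numbers. This is only arithmetic on the formulas as written here, not a statement about Mathematica's internals:

$$V_1 = \tfrac{1}{10}+\tfrac{7}{10}+\tfrac{2}{10} = 1, \qquad V_2 = \tfrac{1}{100}+\tfrac{49}{100}+\tfrac{4}{100} = \tfrac{27}{50}$$

$$\bar{v} = \frac{\tfrac{1}{10}(-30)+\tfrac{7}{10}(10)+\tfrac{2}{10}(20)}{1} = 8$$

$$\sum_{i=1}^3 w_i (v_i-\bar{v})^2 = \tfrac{1}{10}(-38)^2+\tfrac{7}{10}(2)^2+\tfrac{2}{10}(12)^2 = 176$$

$$\frac{176}{V_1-\frac{V_2}{V_1}} = \frac{176}{1-\tfrac{27}{50}} = \frac{176}{\tfrac{23}{50}} = \frac{8800}{23}$$

The denominator $V_1 - V_2/V_1$ does not change the result when all weights are rescaled by a common factor, so the integer weights $(1, 7, 2)$ give the same $\frac{8800}{23}$. That is also why it differs from the $\frac{1760}{9}$ of the unweighted list, whose denominator is $n-1 = 9$.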

POSTED BY: Jim Baldwin
Posted 4 years ago

Raspi:

Thank you for the great question. Though I am not sure I fully understand your question, I think I might understand your frustration.

The documentation for Variance states that Variance takes either a Distribution or a List as its argument.

The head of a WeightedData is WeightedData. While one of the properties of WeightedData is an EmpiricalPDF, I do not think that a Probability Density Function is the same thing as a Distribution.

So, my best guess is that there is an undocumented use of Variance that takes a WeightedData argument.

It would be great if someone from Wolfram jumped in to help us better understand.

POSTED BY: Mike Besso
Posted 4 years ago

Since Mathematica can do many calculations symbolically, you can build a small symbolic weighted data list and use this:

wdata2 = WeightedData[Array[v, 3], Array[w, 3]]

From this you get the formulas by which Mean and Variance are calculated:

Mean[wdata2]
Variance[wdata2]
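
For a numeric cross-check of the symbolic output, here is a minimal sketch that writes the two steps out by hand. It assumes the variance is the weighted sum of squared deviations divided by $V_1 - V_2/V_1$, where $V_k$ is the sum of the $k$-th powers of the weights. The names `vals`, `wts`, `wmean` and `wvar` are my own choices for illustration, not anything taken from the documentation.

```wolfram
(* Hypothetical names; values and weights are the ones from the question *)
vals = {-30, 10, 20};
wts = {1/10, 7/10, 2/10};

(* Line 1: weighted mean = Sum[w v]/Sum[w] *)
wmean = (wts . vals)/Total[wts]   (* 8 *)

(* Line 2: weighted squared deviations over V1 - V2/V1 (assumed formula) *)
wvar = (wts . ((vals - wmean)^2))/(Total[wts] - Total[wts^2]/Total[wts])   (* 8800/23 *)
```

Both values agree with the `Out[8]` shown in the question, which suggests (but does not prove) that this is the definition being applied to WeightedData.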
POSTED BY: Michael Helmle

Not a question today. I answered it on my own and took these notes, hopefully someone agrees or disagrees or finds it useful.

POSTED BY: Raspi Rascal

Thanks, but that's exactly the point I was making. Your output shows the end result, not the definition. I cannot verify where the output comes from. You will not find such an output as the literal definition of the variance in any reference book, textbook, or elsewhere.

I shouldn't give up here.

The topic is relevant in practice. Knowing what you're doing is relevant in practice ... because weighted data is everywhere (including introtexts) and the calculation of variance is everywhere too. So if you're not sure what you're doing, you will end up producing the wrong variance value for your problem. Thanks to the documentation of Mathematica :P

POSTED BY: Raspi Rascal