Message Boards Message Boards

1
|
11945 Views
|
8 Replies
|
5 Total Likes
View groups...
Share
Share this post:

How to implement a formal expectation operator over an unknown distribution

Posted 11 years ago
I need to do some simplification of an expression involving averages over a stochastic variable (in order to verify a long analytical calculation). The easiest way to do that, I figured, were if I could implement an operator which would basically be short-hand for the averaging procedure, with all the appropriate properties. Then of course this operator would be present in the final expression, which is fine, and would enable me to compare easily with my own calculations. So assuming I use x for the stochastic variable, I tried defining av using
av[y_ + z_] := av[y] + av[z] 
av[c_ y_] := c av[y] /; FreeQ[c, x]?
av[c_] := c /; FreeQ[c, x] 
Then when I write
D[av[x y], y]
I get
?av[x]
which is fine, but when I write
D[av[Exp[-x y]], y]
I get
-E^(-x y) y 
instead of
-y av[Exp[-x y]]
as I want; i.e., the av is removed somehow.
I tried using UpValue for teaching Mathematica that it could interchange differentiation and av, but apparently that is not the problem. I might be going about this entirely the wrong way, but I'd be grateful for any input. Note the builtin Expectation function does not accomplish it either.
POSTED BY: Sune Jespersen
8 Replies
I think I got it:
In[1]:= av /: D[av[f___], x_] := av[D[f, x]]
In[2]:= av[y_ + z_] := av[y] + av[z]
In[3]:= av[c_ y_] := c av[y] /; FreeQ[c, x]
In[4]:= av[c_] := c /; FreeQ[c, x]
In[5]:= D[av[x y], x]
Out[5]= y
In[6]:= D[av[Exp[-x y]], x]
Out[6]= -y av[E^(-x y)]
Sune
POSTED BY: Sune Jespersen
Well, turns out there's still a problem:
In[9]:= D[Log[av[Exp[-b x]]], b]
Out[9]= -((E^(-b x) x)/av[E^(-b x)])

instead of
-(av[(E^(-b x) x)]/av[E^(-b x)])

What's going on?
Sune
POSTED BY: Sune Jespersen
Posted 11 years ago
I have a guess as to what's going on.

Your definition of derivative for av disobeys the chain rule. I think Mathematica is first applying the chain rule, then your rule.

We can 'hack' our way around this (kind of ugly)
 Unprotect[D];
 ClearAttributes[D, Protected]
 
 (* avoid Mathematica's chain rule *)
 D[f_[HoldPattern[av][g_]], x_] := Module[{u},
   av[D[g, x]] (D[f[u], u] /. u -> av[g])
 ]
 
 In[1]:= D[Log[av[Exp[-b x]]], b]
Out[1]= -(av[E^(-b x) x]/av[E^(-b x)])
POSTED BY: Greg Hurst
My first thought would be to use a fake derivative - call it inertD.
Then you can probably make it behave however you want w.r.t. av, and at the end replace it with D if you need to
POSTED BY: Peter Barendse
Hi chip.


Thanks for your reply. I think you're right that it has to do with the chain rule, but I don't exactly understand it. When applying the chain rule, it still needs to apply the derivative to the av operator, and then my rule would apply, wouldn't it?

I guess your hack could be formulated as a upvalue also, which if so probably would be better practice.

@Peter: I didn't want to introduce a new operator, because I would like to take advantage of functions such as Series, which call D.

S
POSTED BY: Sune Jespersen
Posted 11 years ago
When applying the chain rule, it still needs to apply the derivative to the av operator, and then my rule would apply, wouldn't it?
When applying the chain rule, Mathematica would only need to evaluate D[av(u), u], which is 1 (df/dx = df/du du/dx).


Now that I look at it again, I think both answers (yours and Mathematica's) are correct. In fact, can't we use both results to conclude av(x) is the identity function?
Your way (special D rule): D[Log[av[f(x)]], x] == D[av[f(x)], x] / av[f(x)] == av[D[f(x), x]] / av[f(x)] == av[f'(x)] / av[f(x)].
Mathematica (chain rule): D[Log[av[f(x)]], x] == D[av[f(x)], x] / av[f(x)] == (D[av(u), u] /. u -> f(x)) D[f(x), x] / av[f(x)] == 1 * f'(x) / av[f(x)].

This means av[f'(x)] == f'(x), and letting f(x) == x^2/2 gives us av(x) == x?
POSTED BY: Greg Hurst
No, it's definitely not the identity operator. That would be a special case. It's important to remember that it is an operator, not a function. One example could be
av[f[x]]=Integrate[f[x] p[x], {x,-Infinity,Infinity}]
where the p could be the Gaussian distribution for example. (Note that it is strictly speaking incorrect to write it in this way, because on the left-hand side, x is a stochastic variable, whereas it is the integration variable on the right-hand side.)


So the differential operator does not act on it, but commutes with it. I still don't understand why Mathematica does not apply my rule, specifically in your second line of code (Mathematica (chain rule), after the first == ... that's exactly the pattern to which my rule applies, and the chain rule has already been applied once at this stage.
POSTED BY: Sune Jespersen
Just to avoid misunderstanding, I should add that when applying the differential operator, it should not be with respect to the stochastic variable, but with respect to some parametric variable characterizing for example a family of stochastic variables. To be concrete, one could consider diffusion where x(t) is the position as a function of time, i.e. a stochastic variable. Then we could consider for example the average position av[x(t)], or the average velocity av[D[x(t),t]]=D[av[x(t)],t]. 
POSTED BY: Sune Jespersen
Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard

Group Abstract Group Abstract