Message Boards Message Boards

Optimizing the generation of data from custom probability distribution?

Posted 2 years ago

Basically I need to do a loop that has a big number of iterations so I am trying to make the code as fast as possible.

these are my initializations

eps = 8.85*10^(-12);
q = 80;
m = 50;
L = 2;
nparticles = 5000;
ngrid = 500;
vth = 2;
u = 1;

dx = L/ngrid;

dist = ProbabilityDistribution[
   1./(2.*Pi*Power[vth, 2.])^(0.5)*Exp[-(v - u)^2./(2.*vth^2.)] + 
    1./(2.*Pi*Power[vth, 2])^(0.5)*
     Exp[-(v + u)^2./(2. vth^2)], {v, -Infinity, Infinity}, 
   Method -> "Normalize"];

And this For loop is one of the places in the code that I would like to improve. (I tried ParallelDo and it takes much longer)

\[Rho] = Table[q*nparticles/(dx*ngrid), ngrid + 1]; // AbsoluteTiming
\[Rho][[ngrid + 1]] = 0;

v = RandomVariate[dist, nparticles];
x = RandomVariate[UniformDistribution[{0, L}], nparticles];
r = 0;

xgrid = Range[0, L, dx] // N;


For[i = 1, i <= nparticles, i++,
   Do[If[Abs[x[[i]] - xgrid[[j]]] <= 
      dx, \[Rho][[j]] = \[Rho][[j]] - 
       q*(1 - (Abs[x[[i]] - xgrid[[j]]])/dx)/dx; r += 1, 0] , {j, 
     ngrid + 1}]]; // AbsoluteTiming
\[Rho][[1]] = \[Rho][[1]] + \[Rho][[ngrid + 1]];

Is there some way to make this faster?

as I side note I would really apreciate if somebody can explain to me why I need to restart the kernel each time I run the dist function

POSTED BY: Diogo Carvalho
3 Replies
Posted 2 years ago

I changed the For loop for a compile

rho = Compile[{{xp, _Real, 1}, {xg, _Real, 
    1}, {carga, _Real}, {np, _Real}, {ng, _Real}, {dx, _Real}},
  n = ng + 1;
  \[Rho] = Table[carga*np/(dx*ng), n];
  \[Rho][[ng + 1]] = 0;

  Do[If[Abs[xp[[i]] - xg[[j]]] <= 
     dx, \[Rho][[j]] = \[Rho][[j]] - 
      carga*(1 - (Abs[xp[[i]] - xg[[j]]])/dx)/dx; r += 1, 0] , {i, 1, 
    np}, {j, 1, ng + 1}];
  \[Rho][[1]] = \[Rho][[1]] + \[Rho][[ng + 1]];]

Is it possible to optimize even further?

POSTED BY: Diogo Carvalho
Posted 2 years ago

Hi Diogo,

The uncompiled version takes ~2.5 sec on my laptop. How much faster do you need it to be? What value of nparticles do you intend to use? Parallelizing will not show much improvement unless it is large.

why I need to restart the kernel each time I run the dist function

The problem is that the definition of dist depends on v so after evaluating this

v = RandomVariate[dist, nparticles];

the definition of dist uses that value of v. Use a different symbol name to avoid that.

POSTED BY: Rohit Namjoshi
Posted 2 years ago

Hi In my code I have 2 For loops like that and will have a sow and reap algorthim (haven't done that part yet) to build a Table where I storage a list (very simillar to [Rho]) in each iteration. And I think I will need a big number of iteratios. So I am trying to make the code as fast as possible. (if you have a fast solution to implement this storage algorithm I am also interested)

While I was experimenting with nparticles=10 000 and ngrid=1000 that For loop took 15 secs. That was unaceptable. With the compile version it takes around 1.5 secs. So if there are any more good solutions to optimize the code I am very interested. As I think these numbers for nparticles and ngrid are as high as I am interested to do and they might be too high I am trying to make a functional code for them.

Thanks for the information about the dist function.

POSTED BY: Diogo Carvalho
Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard

Group Abstract Group Abstract