
Better approach to compute a large tree of posterior distributions?

Posted 8 years ago

So I've gotten access to a large computing cluster, and I'm reworking some old MMA code to (1) exploit the parallel computing capabilities of the cluster, and (2) compute a large dataset of a certain system's dynamics for subsequent study.

Specifically, for any input I need to compute and store 2^T probability distributions, each of which is a list (vector of probabilities) of the same length. Previously, I've imagined the nodes {"start", "0", "1", "00", "01", "10", "11", ...} of a binary tree of depth T each being associated with a distribution, and I'd like to work with this framework unless it's too memory demanding. Finally, each probability list depends only on its "parent" probability list for computation. So, if B is the function that computes a child from its parent, then pAssc["0101"] = B[1, pAssc["010"]].
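To make the indexing concrete, here is a minimal serial sketch of the recurrence for a small depth. The update `B` below is a made-up placeholder (it only reweights and renormalizes), and `p0`, `child`, the depth `T = 3`, and the vector length 4 are illustrative assumptions, not the real system.

```mathematica
(* Placeholder update: NOT the real B, just a map from a distribution to a distribution *)
B[r_, p_List] := Normalize[p*Range[Length[p]]^r + 1/10, Total];

T = 3;                                  (* small depth, for illustration only *)
p0 = ConstantArray[1/4, 4];             (* assumed root distribution *)
pAssc = <|"start" -> p0|>;

(* Hypothetical helper: key of child r; children of "start" are "0" and "1" *)
child[key_, r_] := If[key === "start", "", key] <> ToString[r];

(* Fill the tree one level at a time; each node needs only its parent *)
level = {"start"};
Do[
  level = Flatten@Table[
     With[{c = child[k, r]}, pAssc[c] = B[r, pAssc[k]]; c],
     {k, level}, {r, 0, 1}],
  {T}];

pAssc["010"] === B[0, pAssc["01"]]      (* the parent/child relation holds *)
Length[pAssc]                           (* counts every level, including "start" *)
```

Keeping every level in one association means holding 2^(T+1) - 1 vectors at once. For vectors of n machine reals that is roughly 8 n 2^(T+1) bytes, which is the number to check against the available memory.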

To utilize the parallel processors on the cluster, my current idea is essentially to maintain a "queue" of nodes, and to export each node's distribution as soon as it's no longer needed for subsequent computation. Something like:

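`(* Assumes B[r,p], a root distribution p0 (placeholder name), and the depth T are defined *)
DistributeDefinitions[B];
queue={"0"->B[0,p0],"1"->B[1,p0]}; (* plain list of key->distribution rules used as a FIFO queue *)
While[queue=!={},
{key,p}=List@@First[queue]; queue=Rest[queue]; (* dequeue the front node *)
PutAppend[key->p,"file.txt"]; (* export it; after this only its children need p *)
If[StringLength[key]<T,
children=WaitAll[Table[ParallelSubmit[{r,p},B[r,p]],{r,0,1}]]; (* both children in parallel *)
queue=Join[queue,Thread[{key<>"0",key<>"1"}->children]]]
]`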
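Because every node depends only on its parent, the same traversal can probably also be organized one level at a time, which avoids sharing a queue between kernels. This is a rough sketch, not a tested cluster setup. It assumes `B` and `T` are as described in the post; the root distribution `p0`, the helper `children`, and the `level-d.m` file names are placeholders of mine.

```mathematica
(* Assumes B[r, p], p0 and T are already defined; file names are placeholders *)
children[key_ -> p_] := Table[(key <> ToString[r]) -> B[r, p], {r, 0, 1}];
DistributeDefinitions[B, children];

level = {"" -> p0};                     (* root keyed by the empty string here *)
Do[
  (* every parent in the current level is expanded on some parallel kernel *)
  level = Flatten[ParallelMap[children, level], 1];
  (* write the finished level to disk; the previous level is no longer referenced *)
  Put[level, "level-" <> ToString[d] <> ".m"],
  {d, 1, T}];
```

With this layout only one level (2^d vectors at depth d) is in memory at a time, and a stored level can be read back with `Association[Get["level-3.m"]]` and queried by key.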

I want to ask if anyone has better ideas on how to handle this kind of situation. The documentation on the Queue package is scant, and the many considerations around making parallel computation work, as well as how to actually work with (i.e. query, compute on, etc.) the resulting dataset once it's generated, all remain ominous uncertainties.

So, Mathematica wizards, any clever ideas? Well-known frameworks? Etc? Any and all ideas much appreciated!

POSTED BY: Crescent Curtis