Message Boards Message Boards

3
|
9117 Views
|
3 Replies
|
7 Total Likes
View groups...
Share
Share this post:

ParallelTable without parallelism restricted by outermost iterator

Posted 11 years ago
A few months ago, a Mathematica user asked me about ParallelTable's behavior with respect to an input expression like this ---
ParallelTable[expr, {i, 2}, {j, 10}]
--- on, for example, a four-core machine with four configured subkernels (and a license subkernel limit of at least four). As documented at ref/ParallelTable under Properties & Relations, the parallelism of ParallelTable is restricted by the outermost iterator. In a case like the one just described, only two subkernels would be used and some speed-up would be foregone.

At the time, I provided this user with the outline of a workaround that preserved the dimensionality of the resulting list, but it was still a bit sub-par in several ways; e.g. the example I provided broke down if any inner iterator depended on an outer one. This weekend, I finally took a moment to implement the workaround properly, and I thought it could be a useful thing to share.

Here is the alternative function I wrote, which I've checked for correctness pretty thoroughly. If you find any misbehaviors or inconsistencies, please post a reply with the details. Thanks!
 ClearAll@altParallelTable
 SetAttributes[altParallelTable, HoldAll]
 altParallelTable[expr_, Longest@iters__List, opts___] :=
  Module[{getIterName, indexedElement, emptyIterPos},
   SetAttributes[getIterName, HoldAll];
   getIterName@{i_Symbol, __} := SymbolName@Unevaluated@i;
   getIterName@{i_Unique, __} := SymbolName@i;
   Fold[
    Insert[#, {}, #2] &,
   GatherBy[
     With[
      {namedIters =
        Replace[
         Hold@iters, {x_} :>
          With[{s = Unique[]}, {s, x} /; True],
         {1}
         ]},
      With[
       {compoundIndex = Unique[],
        localIndices =
         ToExpression[
          ToString@
           ReleaseHold@
            {getIterName /@ namedIters},
          InputForm,
          Unevaluated
          ]},
       ParallelTable[
        Module[localIndices,
         localIndices = compoundIndex;
         indexedElement[Evaluate@localIndices, expr]
         ],
        {compoundIndex,
         Flatten[
          With[
           {compoundValues =
             Table @@ Prepend[namedIters, localIndices]},
           emptyIterPos =
            Position[compoundValues, {}, Infinity];
           compoundValues
           ],
          Length@Hold@iters - 1
          ]},
        opts
        ]
       ]
      ],
     Table[
      With[{i = i}, #[[1, i]] &],
      {i, Length@Hold@iters - 1}
      ]
     ] /. indexedElement[_, elt_] :> elt,
   emptyIterPos
   ]
  ]

An example usage & comparison:
 In[164]:= ClearAll@f
 
 In[165]:= seqRes =
    Table[
     Pause@.01;
     f[$KernelID, i + j + k + l + m + n + o],
     {i, 1},
     {j, 2 i},
     {2},
    {k, -1, i + j, 1},
    {l, k, 2 k},
    {m, {2, 3, 5}},
    {n, {-1, 1}},
    {o, n, 0, 1/2}
    ]; // AbsoluteTiming

Out[165]= {2.956792, Null}

In[166]:= parRes =
   ParallelTable[
    Pause@.01;
    f[$KernelID, i + j + k + l + m + n + o],
    {i, 1},
    {j, 2 i},
    {2},
    {k, -1, i + j, 1},
    {l, k, 2 k},
    {m, {2, 3, 5}},
    {n, {-1, 1}},
    {o, n, 0, 1/2}
    ]; // AbsoluteTiming

Out[166]= {2.984495, Null}

In[167]:= altRes =
   altParallelTable[
    Pause@.01;
    f[$KernelID, i + j + k + l + m + n + o],
    {i, 1},
    {j, 2 i},
    {2},
    {k, -1, i + j, 1},
    {l, k, 2 k},
    {m, {2, 3, 5}},
    {n, {-1, 1}},
    {o, n, 0, 1/2}
    ]; // AbsoluteTiming

Out[167]= {1.589101, Null}

In[168]:= $KernelCount

Out[168]= 2

In[169]:= (seqRes /. f[_Integer, r_] :> r) ===
(parRes /.
   f[_Integer, r_] :> r) ===
(altRes /. f[_Integer, r_] :> r)

Out[169]= True
POSTED BY: William Rummler
3 Replies
William's code is a very detailed instruction about advanced parallelism with queueing. I realize that William's function creates a list of hold expression, then ParallelTable function picks them up and handles the computation as a regular list. I like this idea because not only it shows a solution to the question from that user but explains how ParallelSubmit works, of which the documentation needs some solid code like this to understand deeper. 

I put a print function in his code at this location: 
ParallelTable[
   Module[localIndices,
    localIndices = compoundIndex;

    Print[compoundIndex];
    indexedElement[Evaluate@localIndices, expr]] ...
This Print function shows everything in the compoundIndex during the parallel computation process. The lists printed are the corresponding indices before I throw them into the summation. 

Mathematica has this function called ParallelSubmit, which creates a list of hold expression until one calls WaitAll[...]. William's code is lightweight version of this build -in function. I can use the build in function across the same indexed variables.  If you read the list of vectors on the left hand side in the following screenshot (ignore the "1" at the beginning), you would find the indices are identical to the boxed expressions on the right hand side. This result can also quickly verify that William's code works fine.  
Table[
ParallelSubmit[{i, j, k, l, m, n, o}, i + j + k + l + m + n + o],
  {i,1}, {j, 2 i}, {2}, {k, -1, i + j, 1}, {l, k, 2 k}, {m, {2, 3, 5}}, {n, {-1, 1}}, {o, n, 0, 1/2}]

POSTED BY: Shenghui Yang
Posted 11 years ago
I'm a bit late following up here... Shenghui's note about the equivalence of altParallelTable's results with those of Table/ParallelSubmit/WaitAll is a good one. It's also worth pointing out that ParallelSubmit may be slower in most cases, specifically when the computation to be parallelized takes a nontrivial amount of time. WaitAll basically distributes one EvaluationObject at a time as each kernel becomes available, which will be slower than parallelization using the coarser granularity control available through the Method mechanism of ParallelTable. (For instance, the EvaluationObject/WaitAll approach is about 50% slower than altParallelTable for the example above, using two kernels, ~2.25 seconds vs. ~1.5 seconds.) See Properties & Relations at ref/ParallelSubmit for more details.
POSTED BY: William Rummler
Posted 11 years ago
Cool stuff! Is this a better way to write altParallelTable? It gives the same results for the given example, but I think it's easier to read.
 ClearAll@altParallelTable2
 SetAttributes[altParallelTable2, HoldAll]
 
 altParallelTable2[expr_, Longest@iters__List, opts___] :=
 Module[{i = 1,
    flatResults =
     With[{symbols = Cases[{iters}[[All, 1]], _Symbol]},
      ParallelTable[
       expr /. Thread[symbols -> values], {values,
       Flatten[Table[symbols, iters], Length@{iters} - 1]}]]},
  Table[flatResults[[i++]], iters]]
POSTED BY: Michael Hale
Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard

Group Abstract Group Abstract