A few months ago, a Mathematica user asked me about ParallelTable's behavior with an input expression like this:
ParallelTable[expr, {i, 2}, {j, 10}]
on, for example, a four-core machine with four configured subkernels (and a license subkernel limit of at least four). As documented at ref/ParallelTable under Properties & Relations, ParallelTable parallelizes only over the outermost iterator. In a case like the one just described, only two subkernels would be used, leaving the other two idle and forgoing part of the possible speed-up.
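A quick way to see this restriction for yourself is to record $KernelID for every element of the table. This is only a small sketch: it assumes subkernels can be launched on your machine, and the name ids is just for illustration.

```wolfram
(* Launch the configured subkernels if none are running yet. *)
If[$KernelCount == 0, LaunchKernels[]];

(* Record which subkernel evaluates each element of a 2 x 10 table. *)
ids = ParallelTable[$KernelID, {i, 2}, {j, 10}];

(* Work is split over the outermost iterator only, so each row comes
   from a single subkernel and at most two distinct IDs can appear. *)
Union@Flatten@ids
```

Whatever $KernelCount is, the final list should never contain more than two kernel IDs for this input.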
At the time, I gave this user the outline of a workaround that preserved the dimensionality of the resulting list, but it fell short in several ways; for example, it broke down if any inner iterator depended on an outer one. This weekend, I finally took a moment to implement the workaround properly, and I thought it would be useful to share.
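To give a rough idea of what a simple workaround of this kind can look like, here is a minimal sketch for independent iterators only. It is a simplified illustration rather than the exact outline mentioned above; f is an undefined placeholder for the real expression, and pairs, flat and res are illustrative names. The index pairs are flattened into a single iterator, the work is parallelized over that, and the shape is restored with Partition.

```wolfram
(* f is an undefined placeholder standing in for the real expression. *)
ClearAll@f

(* All (i, j) pairs as one flat list, so that a single outermost
   iterator carries all 2 x 10 = 20 units of work. *)
pairs = Tuples[{Range@2, Range@10}];

(* Parallelize over the flat list of pairs. *)
flat = ParallelTable[f @@ ij, {ij, pairs}];

(* Restore the original 2 x 10 shape. *)
res = Partition[flat, 10];

(* Compare with the sequential result. *)
res === Table[f[i, j], {i, 2}, {j, 10}]
```

This approach cannot handle an iterator such as {j, 2 i}: Tuples needs every range to be known up front, and Partition assumes a rectangular result.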
Here is the alternative function I wrote, which I've checked for correctness pretty thoroughly. If you find any misbehavior or inconsistencies, please post a reply with the details. Thanks!
ClearAll@altParallelTable
SetAttributes[altParallelTable, HoldAll]
altParallelTable[expr_, Longest@iters__List, opts___] :=
 Module[{getIterName, indexedElement, emptyIterPos},
  SetAttributes[getIterName, HoldAll];
  getIterName@{i_Symbol, __} := SymbolName@Unevaluated@i;
  getIterName@{i_Unique, __} := SymbolName@i;
  Fold[
   Insert[#, {}, #2] &,
   GatherBy[
     With[
      {namedIters =
        Replace[
         Hold@iters, {x_} :>
          With[{s = Unique[]}, {s, x} /; True],
         {1}
         ]},
      With[
       {compoundIndex = Unique[],
        localIndices =
         ToExpression[
          ToString@
           ReleaseHold@
            {getIterName /@ namedIters},
          InputForm,
          Unevaluated
          ]},
       ParallelTable[
        Module[localIndices,
         localIndices = compoundIndex;
         indexedElement[Evaluate@localIndices, expr]
         ],
        {compoundIndex,
         Flatten[
          With[
           {compoundValues =
             Table @@ Prepend[namedIters, localIndices]},
           emptyIterPos =
            Position[compoundValues, {}, Infinity];
           compoundValues
           ],
          Length@Hold@iters - 1
          ]},
        opts
        ]
       ]
      ],
     Table[
      With[{i = i}, #[[1, i]] &],
      {i, Length@Hold@iters - 1}
      ]
     ] /. indexedElement[_, elt_] :> elt,
   emptyIterPos
   ]
  ]
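The core idea of the function can be seen in a much smaller, sequential sketch. This is a simplified illustration, not a replacement for the code above: Map stands in for ParallelTable, f is an undefined placeholder, and tuples, tagged and regrouped are illustrative names. Each result is tagged with its index values, the flat list is computed in one pass, and GatherBy rebuilds the nesting from the tags.

```wolfram
ClearAll@f

(* Index tuples for a ragged table in which j depends on i. *)
tuples = Flatten[Table[{i, j}, {i, 3}, {j, i}], 1];

(* Tag every result with its indices; Map stands in for ParallelTable. *)
tagged = {#, f @@ #} & /@ tuples;

(* Regroup by the outer index, then keep only the results. *)
regrouped = Map[Last, GatherBy[tagged, #[[1, 1]] &], {2}];

(* Compare with the sequential result. *)
regrouped === Table[f[i, j], {i, 3}, {j, i}]
```

Empty sub-tables leave no elements behind in the flattened list, so regrouping alone cannot recover them. That is why the full function records their positions in emptyIterPos and reinserts {} at the end with Fold and Insert.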
Example usage and comparison:
In[164]:= ClearAll@f
In[165]:= seqRes =
Table[
Pause@.01;
f[$KernelID, i + j + k + l + m + n + o],
{i, 1},
{j, 2 i},
{2},
{k, -1, i + j, 1},
{l, k, 2 k},
{m, {2, 3, 5}},
{n, {-1, 1}},
{o, n, 0, 1/2}
]; // AbsoluteTiming
Out[165]= {2.956792, Null}
In[166]:= parRes =
ParallelTable[
Pause@.01;
f[$KernelID, i + j + k + l + m + n + o],
{i, 1},
{j, 2 i},
{2},
{k, -1, i + j, 1},
{l, k, 2 k},
{m, {2, 3, 5}},
{n, {-1, 1}},
{o, n, 0, 1/2}
]; // AbsoluteTiming
Out[166]= {2.984495, Null}
In[167]:= altRes =
altParallelTable[
Pause@.01;
f[$KernelID, i + j + k + l + m + n + o],
{i, 1},
{j, 2 i},
{2},
{k, -1, i + j, 1},
{l, k, 2 k},
{m, {2, 3, 5}},
{n, {-1, 1}},
{o, n, 0, 1/2}
]; // AbsoluteTiming
Out[167]= {1.589101, Null}
In[168]:= $KernelCount
Out[168]= 2
In[169]:= (seqRes /. f[_Integer, r_] :> r) ===
(parRes /.
f[_Integer, r_] :> r) ===
(altRes /. f[_Integer, r_] :> r)
Out[169]= True
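Because each element carries the $KernelID of the subkernel that computed it, the same results can also be used to check how the work was distributed. The helper below is a hypothetical convenience (the name kernelsUsed is not part of the code above), shown on a small self-contained input that, like the example, has a single outermost iteration.

```wolfram
ClearAll[f, kernelsUsed]

(* Hypothetical helper: distinct subkernel IDs recorded in a result. *)
kernelsUsed[res_] := Union@Cases[res, f[id_Integer, _] :> id, Infinity]

(* With only one outermost iteration, ParallelTable has a single unit
   of work to hand out, so only one subkernel ID should be recorded. *)
kernelsUsed@ParallelTable[f[$KernelID, i + j], {i, 1}, {j, 8}]
```

In the session above, kernelsUsed@parRes and kernelsUsed@altRes can be compared in the same way, which should be consistent with the timings: ParallelTable takes about as long as Table, while altParallelTable takes roughly half as long on two subkernels.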