Group Abstract

Message Boards

WOLFRAM COMMUNITY

10.5K Views

2 Replies

7 Total Likes

View groups...

Follow this post

Share this post:

GROUPS:

Wondering why I am not getting Parallel speed up on Cases[] and similar..

W. Craig Carter

W. Craig Carter, MIT

Posted 14 years ago

I have some large lists that I need to prune. I expected that this should parallize quite well, but it doesn't. Is there someone out there who can explain why? Example: In[21]:= $KernelCount Out[21]= 8 rn = RandomInteger[{-1000, 1000}, 5^10]; Timing[Parallelize[Select[rn, Positive]]][[1]] Timing[Cases[rn, a_ /; (a > 0 )]][[1]] Timing[Parallelize[Cases[rn, a_ /; (a > 0 )]]][[1]] Timing[ParallelCombine[(Cases[#, a_ /; a > 0] &), rn, List];][[1]] Out[28]= 4.728799 Out[29]= 5.137632 Out[30]= 5.653225 Out[31]= 5.012263 I would have guessed that ParallelDeleteCases, ParallelPosition, ParallelSelect, etc would have been natural externsions of ParallelMap...

POSTED BY: W. Craig Carter

2 Replies

Sort By:

Patrick Scheibe

Patrick Scheibe, University Leipzig

Posted 14 years ago

The problem with Parallelize and friends is that they suggest parallelization is easy. Just surround your code by Parallelize and that's it. It not! If it were as simple as that, then this would be built in per default. I will not go into details here but let me say a few words to your problem: The underlying task of your problem is very simple, it's just to look whether a number is greater zero. This is problematic, because then the overhead of the parallel calculation is likely to be greater then the benefit you are gaining. Let's make an example. I create a list like you but I use a length which is divisible by 8 so we can manually distribute 8 smaller lists to the parallel subkernels LaunchKernels[]; rn = RandomInteger[{-1000, 1000}, 8^7]; First let's see what we get without parallelization AbsoluteTiming[Select[rn, Positive]] // First (* 0.375135 ) Now that we know this, we can "estimate" what we expect if we divide the work by 8 and run it in parallel. For this, I partition the list into 8 sublists and use ParallelMap. Afterwards, I flatten the resulting sublists into one big result Flatten[ParallelMap[Select[#, Positive] &, Partition[rn, 8^6]]] // AbsoluteTiming//First ( 0.432450 ) Looks like we lost some time and indeed there are several points which we haven't considered: we have to split the input into smaller tasks for the parallel subkernels we have to transfer the input for each subkernel to its memory we have to transfer the result back to the main kernel we have to build up the final result from all the smaller parts And now you see instantly the basics of what you have to consider. Those points made the parallel approach obviously slower than a simple non-parallel call. Let's consider a second theoretical example: you want to find the roots of a complicated function numerically. The function has parameters and you want to find the roots for different, say 8 parameter sets. Let's assume the root-finding is very slow, say .5 seconds and now reconsider the situation: The splitting of the input is very cheap, because you are not splitting up a large list, you only have 8 parameter sets. Transfering a parameter set (lets say 5 parameters) to a subkernel costs you nothing compared to transfering a list with 8^6 entries The result is the root of the function, one number which is again very fast if it is transfered back to the main kernel Building up the result list of 8 roots is again not even worth mentioning. What's left is, that the .5 seconds of calculation are done in parallel and your overall calculation time drops from 0.58= 4 seconds to basically 0.5 seconds. For your current problem, I would consider a different kind of parallelization which you might not be aware of because this is already built in. The thing I mean is that many functions are Listable which means you give a list as input and they act on each element. Positive is one of those functions. The good thing is, that when you supply a list of numerical values to Positive it's calculated in parallel. The only thing you have to do is to reorder your calculation. You compute first which numbers in your list are positive and Pick then the matching out. AbsoluteTiming[Pick[rn, Positive[rn]]] // First (* 0.117210 *)

POSTED BY: Patrick Scheibe

Shenghui Yang

Shenghui Yang, WOLFRAM

Posted 13 years ago

Further notation, this build-in parallel or multi-threading technology does not take the license limit in general. Say you have 8 threads but only 4 kernels allowed via LaunchKernels[], you can still use the 8 threads when you use this type of function.

POSTED BY: Shenghui Yang

Reply to this discussion

Reply Preview

Attachments

Remove Add a file to this post

Follow this discussion

or Discard

Feedback