The problem with Parallelize and friends is that they suggest parallelization is easy. Just surround your code by Parallelize and that's it. It not! If it were as simple as that, then this would be built in per default. I will not go into details here but let me say a few words to your problem:

The underlying task of your problem is very simple, it's just to look whether a number is greater zero. This is problematic, because then the overhead of the parallel calculation is likely to be greater then the benefit you are gaining. Let's make an example. I create a list like you but I use a length which is divisible by 8 so we can manually distribute 8 smaller lists to the parallel subkernels

LaunchKernels[];

rn = RandomInteger[{-1000, 1000}, 8^7];

First let's see what we get without parallelization

AbsoluteTiming[Select[rn, Positive]] // First (* 0.375135 *)

Now that we know this, we can "estimate" what we expect if we divide the work by 8 and run it in parallel. For this, I partition the list into 8 sublists and use ParallelMap. Afterwards, I flatten the resulting sublists into one big result

Flatten[ParallelMap[Select[#, Positive] &, Partition[rn, 8^6]]] // AbsoluteTiming//First (* 0.432450 *)

Looks like we lost some time and indeed there are several points which we haven't considered:

- we have to split the input into smaller tasks for the parallel subkernels
- we have to transfer the input for each subkernel to its memory
- we have to transfer the result back to the main kernel
- we have to build up the final result from all the smaller parts

And now you see instantly the basics of what you have to consider. Those points made the parallel approach obviously slower than a simple non-parallel call. Let's consider a second theoretical example: you want to find the roots of a complicated function numerically. The function has parameters and you want to find the roots for different, say 8 parameter sets. Let's assume the root-finding is very slow, say .5 seconds and now reconsider the situation:

- The splitting of the input is very cheap, because you are not splitting up a large list, you only have 8 parameter sets.
- Transfering a parameter set (lets say 5 parameters) to a subkernel costs you nothing compared to transfering a list with 8^6 entries
- The result is the root of the function, one number which is again very fast if it is transfered back to the main kernel
- Building up the result list of 8 roots is again not even worth mentioning.

What's left is, that the .5 seconds of calculation are done in parallel and your overall calculation time drops from 0.5*8= 4 seconds to basically 0.5 seconds.

For your current problem, I would consider a different kind of parallelization which you might not be aware of because

*this is already built in*. The thing I mean is that many functions are Listable which means you give a list as input and they act on each element. Positive is one of those functions. The good thing is, that when you supply a list of numerical values to Positive it's calculated in parallel. The only thing you have to do is to reorder your calculation. You compute first which numbers in your list are positive and Pick then the matching out.

AbsoluteTiming[Pick[rn, Positive[rn]]] // First (* 0.117210 *)