Group Abstract

Message Boards

WOLFRAM COMMUNITY

0

|

7.9K Views

|

14 Replies

|

0 Total Likes

View groups...

Follow this post

Share

Share this post:

GROUPS:

Mathematica

Gathering from large set of random numbers

Al Guy

Posted 10 years ago

How can I generate a very large (billions) list of sublists consisting of randomly permuted integers from 1 to N and then gather them in subsublists where all elements differ only by one unit? I have especially hard time with the last operation. I guess it has something to do with functional programming. Will appreciate any help.

POSTED BY: Al Guy

14 Replies

Sort By:

0

Okkes Dulgerci

Posted 10 years ago

t = Table[Sort[RandomSample[Range[12], 7]], {10}]
{{1, 3, 4, 7, 9, 11, 12}, {2, 4, 6, 7, 8, 10, 11}, {2, 4, 5, 7, 9, 11,
   12}, {3, 5, 6, 8, 9, 11, 12}, {2, 3, 4, 7, 9, 10, 12}, {1, 2, 4, 8,
   9, 10, 12}, {2, 3, 4, 5, 8, 9, 10}, {2, 3, 5, 6, 7, 8, 9}, {1, 3, 
  6, 7, 9, 10, 11}, {1, 2, 3, 6, 7, 10, 11}}
Table[Split[t[[i]], #2 - #1 == 1 &], {i, Length@t}]
{{{1}, {3, 4}, {7}, {9}, {11, 12}}, {{2}, {4}, {6, 7, 8}, {10, 
   11}}, {{2}, {4, 5}, {7}, {9}, {11, 12}}, {{3}, {5, 6}, {8, 9}, {11,
    12}}, {{2, 3, 4}, {7}, {9, 10}, {12}}, {{1, 2}, {4}, {8, 9, 
   10}, {12}}, {{2, 3, 4, 5}, {8, 9, 10}}, {{2, 3}, {5, 6, 7, 8, 
   9}}, {{1}, {3}, {6, 7}, {9, 10, 11}}, {{1, 2, 3}, {6, 7}, {10, 
   11}}}

I am not sure this is what you want..

POSTED BY: Okkes Dulgerci

0

Al Guy

Posted 10 years ago

`Split` is a problem here. It's up in a few comments.

POSTED BY: Al Guy

0

Bill Simpson

Posted 10 years ago

No idea whether this is faster or slower

In[1]:= t = Table[Sort[RandomSample[Range[12], 7], Less], {10}]

Out[1]= {{2, 3, 4, 6, 7, 11, 12}, {3, 4, 6, 8, 9, 11, 12}, {1, 2, 3, 4, 8, 9, 10}, {1, 2, 5, 6, 8, 9, 12},
   {2, 7, 8, 9, 10, 11, 12}, {1, 2, 4, 5, 6, 7, 9}, {2, 3, 4, 6, 7, 8, 12}, {3, 4, 7, 8, 9, 11, 12},
   {1, 2, 3, 5, 6, 9, 11}, {2, 4, 5, 6, 7, 8, 11}}

In[2]:= one = Table[1, {7 - 1}];
Map[1 + Total[Unitize[Rest[#] - Most[#] - one]] &, t]

Out[3]= {3, 4, 2, 4, 2, 3, 3, 3, 4, 3}

POSTED BY: Bill Simpson

0

Al Guy

Posted 10 years ago

This is great! Faster 20 times! I need one more order of magnitude and that's it.

POSTED BY: Al Guy

0

Bill Simpson

Posted 10 years ago

Perhaps this is what you want to learn next.

POSTED BY: Bill Simpson

0

Al Guy

Posted 10 years ago

Just to elaborate a little: t = Table[Sort[RandomSample[ll, 7], Less], {10^5}] How can I write a function (I assume with `Map`) so that it will return the number of non-consecutive neighbor discontinuities in each sublist, say 12356710 has two discontinuities? The above written `Split` is slow. I think there should be other faster way without drastically reorganizing the list.

POSTED BY: Al Guy

0

Al Guy

Posted 10 years ago

I think the problem is Split function, it takes a lot of time to execute. I was trying to put something like `CountDistnictBy` but i don't know how to supply the test function that would count when the neighbors' difference is more than 2.

POSTED BY: Al Guy

0

Al Guy

Posted 10 years ago

I think the compilation wasn't good. Here is what CompilePrint returned:

No argument
       8 Integer registers
       2 Real registers
       1 Tensor register
       Underflow checking off
       Overflow checking off
       Integer overflow checking on
       RuntimeAttributes -> {}

       I0 = 0
       I7 = 1000000
       I1 = 12
       I6 = 7
       I3 = 1
       Result = I4

1   V17 = MainEvaluate[ Function[{}, k = 0][ ]]
2   V17 = MainEvaluate[ Function[{}, ll = Range[12]][ ]]
3   V17 = MainEvaluate[ Function[{}, n = 7][ ]]
4   I5 = I7
5   I4 = I0
6   goto 8
7   R1 = MainEvaluate[ Function[{}, k = k + \
Length[Split[Sort[RandomSample[ll, n]], #1 - #2 == -1 & ]]][ ]]
8   if[ ++ I4 <= I5] goto 7
9   I4 = MainEvaluate[ Function[{}, k][ ]]
10  Return

POSTED BY: Al Guy

0

Bill Simpson

Posted 10 years ago

Good catch. Thank you for pointing out my mistake. I didn't take time to check that because I (incorrectly) assumed all those steps looked simple enough that they would compile without complaint. Perhaps you can experiment with small changes to this until you isolate what is causing the compile to hook back to the Mathematica evaluator and then see if you can think of a simple way to rewrite that. If I had to bet then I would expect it is the user defined function being given to Split that isn't compiling. Replacing that with some very simple iterative code might fix the problem. That then might give the 3-10x increase in speed. I would also watch for the size of the total being accumulated in k because it looks like it might overflow with larger exponents.

POSTED BY: Bill Simpson

0

Bill Simpson

Posted 10 years ago

Thank you. That helps. You are doing lots of small integer operations. My first thought is to compare your code with a compiled version of your code. I quit the kernel, evaluate x=0 to get the kernel loaded and then I evaluate your function. In[1]:= x = 0 Out[1]= 0 In[2]:= AbsoluteTiming[ k = 0; ll = Range[12]; n = 7; For[i = 1, i <= 10^6, i++, k += Length[Split[Sort[RandomSample[ll, n], Less], #1 - #2 == -1 &]] ]; k] Out[2]= {19.4494, 3500839} Now I quit the kernel, evaluate x=0 and then I evaluate the compiled version. In[1]:= x = 0 Out[1]= 0 In[2]:= cfun = Compile[{}, k = 0; ll = Range[12]; n = 7; Do[ k += Length[Split[Sort[RandomSample[ll, n]], #1 - #2 == -1 &]] , {10^6}]; k]; AbsoluteTiming[cfun[]] Out[3]= {15.3916, 3500458} So that is 19.4 seconds versus 15.4 seconds. That isn't enough of an increase. Perhaps someone else can think of another way to increase this even more.

POSTED BY: Bill Simpson

0

Al Guy

Posted 10 years ago

So far I wrote this code. But it is remarkably inefficient, I can't get beyond 10^7. Any way to accelerate? ll = Range[12]; k = 0; n = 7; AbsoluteTiming@(For[i = 1, i <= 10^9, i++, k += Length[ Split[Sort[RandomSample[ll, n], Less], #1 - #2 == -1 &]]; ];) k

POSTED BY: Al Guy

0

Bill Simpson

Posted 10 years ago

I suspect you might have a better chance of getting useful replies if you were to manually create a small example list of permuted integers for a small N, a set small enough that it is easy understand, but just large enough that it will show each of the kinds of cases that might arise. Then manually show what the result should be after you perform your operation on these sets and explain why. It might not be precisely clear to some readers exactly what "gather them in subsublists where all elements differ only by one unit" means and without that understanding it might be difficult to offer the code idea you are looking for.

POSTED BY: Bill Simpson

0

Simon Cadrin, autodidacte

Posted 10 years ago

POSTED BY: Simon Cadrin

0

Al Guy

Posted 10 years ago

Not sure what it has to do with my question.

POSTED BY: Al Guy

Reply to this discussion

Reply Preview

Attachments

Remove Add a file to this post

Follow this discussion

or Discard

Feedback