WOLFRAM COMMUNITY

GROUPS:

Mathematica Tuning and Debugging

Can this simple list matching task be done faster?

Otto Linsuain, Westinghouse Electric Company

Posted 10 years ago

Hello,

I have a 2D list of about 13K rows by 10 columns containing data on test results:

Dimensions[list1]
{12946, 10}

Only the first two columns are important for what I am trying to do. The first column has only 50 unique string entries (test case names):

Dimensions[Union[Part[list1, All, 1]]]
{50}

The second column contains real numbers (let's say times). Even if the second column is considered, list1 has only about 10K unique entries; some test cases have two data points at the same time (say, at different positions).

I have another list of about 31K rows and only 3 columns:

Dimensions[list2]
{31154, 3}

The first column of list2 contains the same 50 different entries (test case names) as in list1. If the first two columns of list2 are taken, then all the cases and times found in the first two columns of list1 are contained (but in list2 they are not repeated):

Complement[Part[list1, All, {1, 2}], Part[list2, All, {1, 2}]]
{}

The third column of list2 contains data on an additional parameter that was not captured in list1.

What I need to do is to construct a column (a 1D list) containing the value in the third column of list2 that corresponds to a given row (as determined by the values in the first two columns) of list1. Basically, I want to add an eleventh column to my original list1. Notice that occasionally the same value needs to be extracted twice (because of the repetitions in list1).

I was able to do this by creating a function that matches the values and then mapping that function onto the first two columns of list1, but it takes long to execute (I need to do this several times):

myfind = Function[x, Part[Select[list2, Drop[#, -1] == x &, 1], 1, 3]];
AbsoluteTiming[Map[myfind, Take[list1, All, 2]];]
{139.517903, Null}

I have the impression there must be a faster way to do this, but I cannot think of one.

Thanks in advance,
OL.

POSTED BY: Otto Linsuain

Ted Ersek

Posted 3 years ago

First I make list1, list2 with the structure you describe.

strings1=Table[StringJoin@RandomChoice[Characters["abcdefghijklmnopqrstuvwxyz"],4],12946];
times1=RandomReal[{0,500},12946];
list1=Transpose[{strings1,times1}];
strings2=Table[StringJoin@RandomChoice[Characters["abcdefghijklmnopqrstuvwxyz"],4],31000-12946];
times2=RandomReal[{0,500},31000-12946];
list2=RandomSample[Join[Transpose[{strings2,times2}],list1],31000];
list2=Transpose[Join[Transpose[list2],{RandomInteger[{1,1000000},31000]}]];

Then I think the following does what you want in less than 0.15 seconds.

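(* Store the third column of each list2 row under its {name, time} key. *)
Scan[(fun[Most[#]]=Last[#])&,list2];
(* Look up each list1 row by its first two columns only. *)
newList=fun/@list1[[All,{1,2}]];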
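
The same build-a-lookup-table-once idea can also be sketched with an Association (available from Mathematica 10 on) instead of definitions attached to a symbol. This is only an illustration, not part of the original answer: the toy data and the names lookup, newColumn, and list1Extended are made up, and the approach assumes the reals in the second column are exactly identical in both lists, as the Complement check in the question suggests.

```wolfram
(* Toy data with the same shape as in the question; the values are invented. *)
list1 = {{"caseA", 1.0, "x"}, {"caseA", 1.0, "y"}, {"caseB", 2.5, "z"}};
list2 = {{"caseB", 2.5, 30}, {"caseA", 1.0, 10}, {"caseA", 4.0, 20}};

(* Build the table once: {name, time} -> third column of list2. *)
lookup = AssociationThread[list2[[All, {1, 2}]] -> list2[[All, 3]]];

(* Look up every row of list1 by its first two columns.
   Key[...] makes sure each {name, time} pair is treated as a single key. *)
newColumn = Map[lookup[Key[#]] &, list1[[All, {1, 2}]]];

(* Append the looked-up values to list1 as an extra column. *)
list1Extended = Join[list1, List /@ newColumn, 2];
```

Either way, the gain comes from replacing a linear Select scan of list2 for every row of list1 with a single hashed lookup per row.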

POSTED BY: Ted Ersek
