# Can this simple list matching task be done faster?

Posted 8 years ago
Hello,

I have a 2D list of about 13K rows by 10 columns containing data on test results:

    Dimensions[list1]
    {12946, 10}

Only the first two columns matter for what I am trying to do. The first column has only 50 unique string entries (test case names):

    Dimensions[Union[Part[list1, All, 1]]]
    {50}

The second column contains real numbers (let's say times). Even when the second column is taken into account, list1 has only about 10K unique entries; some test cases have two data points at the same time (say, at different positions).

I have another list of about 31K rows and only 3 columns:

    Dimensions[list2]
    {31154, 3}

The first column of list2 contains the same 50 test case names as list1. Every (case, time) pair found in the first two columns of list1 also appears in the first two columns of list2, where the pairs are not repeated:

    Complement[Part[list1, All, {1, 2}], Part[list2, All, {1, 2}]]
    {}

The third column of list2 contains data on an additional parameter that was not captured in list1.

What I need is a column (a 1D list) that holds, for each row of list1, the value from the third column of list2 whose first two columns match that row. Basically, I want to add an eleventh column to my original list1. Note that occasionally the same value has to be extracted twice, because of the repetitions in list1.

I was able to do this by writing a function that finds the matching row and mapping it over the first two columns of list1, but it takes a long time to execute (and I need to do this several times):

    myfind = Function[x, Part[Select[list2, Drop[#, -1] == x &, 1], 1, 3]];
    AbsoluteTiming[Map[myfind, Take[list1, All, 2]];]
    {139.517903, Null}

I have the impression there must be a faster way to do this, but I cannot think of one.

Thanks in advance,
OL.
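First I make list1 and list2 with the structure you describe (only the two key columns of list1 are generated here):

    strings1 = Table[StringJoin@RandomChoice[Characters["abcdefghijklmnopqrstuvwxyz"], 4], 12946];
    times1 = RandomReal[{0, 500}, 12946];
    list1 = Transpose[{strings1, times1}];
    strings2 = Table[StringJoin@RandomChoice[Characters["abcdefghijklmnopqrstuvwxyz"], 4], 31000 - 12946];
    times2 = RandomReal[{0, 500}, 31000 - 12946];
    list2 = RandomSample[Join[Transpose[{strings2, times2}], list1], 31000];
    list2 = Transpose[Join[Transpose[list2], {RandomInteger[{1, 1000000}, 31000]}]];

Then I think the following does what you want in less than 0.15 seconds:

    Scan[(fun[Most[#]] = #) &, list2];
    newList = fun /@ list1;

The Scan makes one definition of fun per row of list2, keyed by that row's first two columns, so each lookup afterward is a direct match on a stored definition instead of a search through list2 with Select. Each element of newList is the whole matching row of list2; its last element is the third-column value you are after.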
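For the real data, where list1 has 10 columns, the same lookup idea can be written so that it returns only the third-column value and appends it as a new column. The following is a minimal sketch rather than the code from the reply above: it swaps the `fun` definitions for an Association (which assumes Mathematica 10 or later), the tiny `list1` and `list2` are made-up stand-ins, and the names `lookup`, `newColumn`, and `list1Extended` are hypothetical. It also assumes the times are stored identically in both lists, since the keys must match exactly.

```mathematica
(* Made-up stand-in data: list1 rows start with {name, time, ...},
   list2 rows are {name, time, extraParameter}. *)
list1 = {{"a", 1.0, "x"}, {"b", 2.0, "y"}, {"a", 1.0, "z"}};
list2 = {{"a", 1.0, 10}, {"b", 2.0, 20}, {"c", 3.0, 30}};

(* Build the lookup once: {name, time} -> third-column value of list2. *)
lookup = AssociationThread[list2[[All, {1, 2}]] -> list2[[All, 3]]];

(* One lookup per row of list1, using only its first two columns.
   Repeated {name, time} pairs simply return the same value again. *)
newColumn = Lookup[lookup, list1[[All, {1, 2}]]];

(* Append the looked-up value to each row of list1 as an extra column. *)
list1Extended = MapThread[Append, {list1, newColumn}];
```

With the original `fun` definitions, the equivalent would be something like `Last /@ (fun /@ list1[[All, {1, 2}]])`, since `fun` stores whole rows of list2.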