How to search each element of a long list in a longer one?

I have a long list (about 12000 entries) where each entry is of the form {Name_i, Time_i} (I call this list1 below), so its dimensions are roughly {12000, 2}. The names are strings and the times are real numbers.

I need to look up these entries in a longer list (40000 entries) whose entries have the form {Name_i, Time_i, Temp_i} (I call this list2 below). I want the temperature corresponding to each entry of the first list. I have managed to do this using the function:

myfind = Function[x, Part[Select[list2, Drop[#, -1] == x &, 1], 1, 3]];
AbsoluteTiming[mytemps = Map[myfind, list1];]

A small detail that makes this a bit harder is that the entries in list1 are not all unique: the same name and time can repeat a few times in the list, and each repeat needs to be kept in the position where it appears. The longer list (list2) has no duplicates, and each entry of list1 has exactly one corresponding entry in list2.

This works, but it is very slow. I have improved performance a bit by first throwing away everything from the second list that I do not need:

AbsoluteTiming[list2short = Select[list2, MemberQ[list1, Drop[#, -1]] &];]

This takes about 20 seconds, but helps make the previous operation faster. Even after this 'clean-up', the first operation takes about 45 seconds. Does anyone know of a way to speed up this kind of operation? Perhaps I should be using some database approach rather than a purely list-based approach. I tried Sow and Reap constructs, and got them to work after a few tries, but that seemed even slower.

Thanks,

OL.

POSTED BY: Otto Linsuain
3 Replies

Problem solved! For those who are interested, I found a pretty fast way to do this!

First define a function that maps the first two values of each entry of list2 (the name and time) to the last value (i.e., Temp_i):

AbsoluteTiming[Table[TempFunc[Drop[x, -1]] = x[[-1]], {x,list2}];]

Then map this function on the first list (list1):

AbsoluteTiming[tempout = Map[TempFunc, list1];]

I tried this with list1 having almost 13000 entries and list2 a bit over 31000. The first statement took about 0.13 s and the second one about 0.018 s. I did not find Association or JoinAcross documented in my version. I am using Mathematica 9.
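A minimal sketch of the two steps above on toy data. The three-row lists are invented for illustration and stand in for the real list1 and list2, and Scan with Most/Last is used in place of Table with Drop, which should be equivalent here:

```mathematica
(* Toy stand-ins for the real data; all values are made up *)
list2 = {{"a", 1.0, 20.5}, {"b", 2.0, 21.0}, {"c", 3.0, 19.5}};
list1 = {{"b", 2.0}, {"a", 1.0}, {"b", 2.0}};  (* repeats allowed, order matters *)

ClearAll[TempFunc];
(* One definition per row of list2: TempFunc[{name, time}] = temp *)
Scan[(TempFunc[Most[#]] = Last[#]) &, list2];

(* Look up each entry of list1, keeping order and duplicates *)
tempout = Map[TempFunc, list1]
(* {21., 20.5, 21.} *)
```

As far as I know, definitions without patterns like these are found by hashing rather than by scanning list2, which would explain the speed-up over Select. One caveat: the match is on the exact {name, time} pair, so the times must be identical numbers in both lists, whereas the == comparison in the original myfind tolerates tiny rounding differences.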

Thanks,

OL.

POSTED BY: Otto Linsuain

Thanks, Gianluca. I am not very familiar with those functions, but I will look them up.

I just thought that I could also try using the longer list to define a function (probably a cruder way to do what you suggest). Then map that function on the first list.
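A related way to turn the longer list into a lookup, sketched here with a Dispatch table of rules instead of function definitions. This is an alternative technique, not the method used elsewhere in this thread, and the toy lists are invented for illustration:

```mathematica
(* Toy stand-ins for the real data; all values are made up *)
list2 = {{"a", 1.0, 20.5}, {"b", 2.0, 21.0}, {"c", 3.0, 19.5}};
list1 = {{"b", 2.0}, {"a", 1.0}, {"b", 2.0}};

(* Build rules {name, time} -> temp and optimize them for lookup *)
rules = Dispatch[Thread[list2[[All, {1, 2}]] -> list2[[All, 3]]]];

(* Apply the rules to each entry of list1 (level 1 only) *)
temps = Replace[list1, rules, {1}]
(* {21., 20.5, 21.} *)
```

Any entry of list1 with no match in list2 would be left unchanged by Replace, so it may be worth checking the result for leftover pairs.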

Thanks again,

OL.

POSTED BY: Otto Linsuain

I would consider giving Associations a try: if you rephrase your data pairs or triples as

Association["name"->namei, "time"->timei, "temperature"->temp]

then you can use fast functions such as JoinAcross. Sorry I don't have time to look into this problem any further.
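A minimal sketch of an Association-based lookup, assuming Mathematica 10 or later. It uses a single association keyed by the {name, time} pair rather than one association per row with JoinAcross, so it is a simpler variant of the suggestion above, and the toy lists are invented for illustration:

```mathematica
(* Toy stand-ins for the real data; all values are made up *)
list2 = {{"a", 1.0, 20.5}, {"b", 2.0, 21.0}, {"c", 3.0, 19.5}};
list1 = {{"b", 2.0}, {"a", 1.0}, {"b", 2.0}};

(* Keys are the {name, time} pairs, values are the temperatures *)
assoc = AssociationThread[list2[[All, {1, 2}]] -> list2[[All, 3]]];

(* Key[...] makes explicit that the whole pair is a single key *)
temps = assoc[Key[#]] & /@ list1
(* {21., 20.5, 21.} *)
```

A pair that is missing from list2 would come back as a Missing["KeyAbsent", ...] expression, which makes unmatched entries easy to spot.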

POSTED BY: Gianluca Gorni