Deleting a subset of elements from a set

Posted 12 years ago
Given two lists, for example
data = {"cat", "dog", "cat", "bird", "bird", "fish"};
sub = {"cat", "fish"};
I would like a function that removes every element of data that also appears in sub:
DeleteSubset[data_, sub_] := DeleteCases[data, x_ /; MemberQ[sub, x]]

In[1]:= DeleteSubset[data, sub]
Out[1]= {"dog", "bird", "bird"}
Are there other ways to do this?
POSTED BY: Vitaliy Kaurov
10 Replies
Chad, I am assuming that the result you desire is {"dog", "cat", "bird", "bird"}
In[1]:= Fold[DeleteCases[#1, #2, {1}, 1] &, data, sub]
Out[1]= {"dog", "cat", "bird", "bird"}
Putting another "cat" in the sub list, we get
data = {"cat", "dog", "cat", "bird", "bird", "fish"};
sub = {"cat", "cat", "fish"};

In[2]:= Fold[DeleteCases[#1, #2, {1}, 1] &, data, sub]
Out[2]= {"dog", "bird", "bird"}
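The Fold approach above can be wrapped as a reusable function. This is only a minimal sketch: the name deleteOnce is hypothetical, and wrapping each element in Verbatim is an added assumption so that elements of sub are matched literally rather than as patterns.

```mathematica
(* Hypothetical helper: remove one occurrence from data per element of sub. *)
(* Verbatim is an addition here; it forces literal matching of each element. *)
deleteOnce[data_List, sub_List] :=
  Fold[DeleteCases[#1, Verbatim[#2], {1}, 1] &, data, sub]

data = {"cat", "dog", "cat", "bird", "bird", "fish"};

r1 = deleteOnce[data, {"cat", "fish"}]
(* {"dog", "cat", "bird", "bird"} *)

r2 = deleteOnce[data, {"cat", "cat", "fish"}]
(* {"dog", "bird", "bird"} *)
```

Each step deletes at most one match at level 1, so the first occurrences go first and the order of the remaining elements is unchanged.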
In[1]:= data /. Thread[sub -> Sequence[]]
Out[1]= {"dog", "bird", "bird"}
We have this same question in SE.
See this link.
POSTED BY: Rodrigo Murta
Chad, here is a very crude attempt (choosing to remove "cat" in order of occurrence)
ReplacePart[data, Thread[Map[Take[Position[data, #], Min[Count[sub, #], Count[data, #]]] &, sub] -> Sequence[]]]

(* Out[5]= {"dog", "cat", "bird", "bird"} *)
POSTED BY: Ilian Gachevski
Is there also a simple way to take into account how many times each element occurs in each list?

Using Vitaliy's list above, I would want the function to return

 {"cat", "dog", "bird", "bird"}

because only one "cat" is present in sub.
POSTED BY: Chad Knutson
data /. (# -> Sequence[] & /@ sub)
is another option, and it is slightly faster if you use it many times. If sub has many items, it might be worth creating a dispatch table with Dispatch to speed it up even more.

To give you a rundown of the options:
data={"cat","dog","cat","bird","bird","fish"};
sub={"cat","fish"};
AbsoluteTiming[Do[DeleteCases[data,x_/;MemberQ[sub,x]],{10^5}];]
AbsoluteTiming[Do[DeleteCases[data,Alternatives@@sub],{10^5}];]
AbsoluteTiming[Do[Select[data,FreeQ[#,Alternatives@@sub]&],{10^5}];]
rules=(#->Sequence[]&/@sub);
AbsoluteTiming[Do[data/.rules,{10^5}];]
giving:
{0.995983,Null}
{0.279152,Null}
{1.359880,Null}
{0.192315,Null}
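For the Dispatch idea mentioned above, a minimal sketch could look like the following. The variable names dispatched and result are just for illustration, and no timing is claimed here; any benefit should only be expected when sub is large.

```mathematica
data = {"cat", "dog", "cat", "bird", "bird", "fish"};
sub = {"cat", "fish"};

(* Build the replacement rules once and wrap them in a dispatch table. *)
dispatched = Dispatch[# -> Sequence[] & /@ sub];

(* Apply the table; matched elements are replaced by an empty sequence. *)
result = data /. dispatched
(* {"dog", "bird", "bird"} *)
```

Note that /. replaces matches at every level of the expression, so this is meant for flat lists like the one here.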
Regards,

Sander
POSTED BY: Sander Huisman
Select[data, FreeQ[#, Alternatives @@ sub] &]
POSTED BY: Sam Carrettie
DeleteSubset[data_, sub_] := DeleteCases[data, Alternatives @@ sub]

In[1]:= DeleteSubset[data, sub]
Out[1]= {"dog", "bird", "bird"}
POSTED BY: Daniel Lichtblau
Complement[data,sub]
Also look at the related functions Union and Intersection.
POSTED BY: Sander Huisman
Sander, I thought of that, but it does not work the same way and gives a different result. Complement removes duplicates: in this case you get a single "bird", while the original function preserves both.
In[1]:= Complement[data, sub]
Out[1]= {"bird", "dog"}
Compare to
In[2]:= DeleteSubset[data, sub]
Out[2]= {"dog", "bird", "bird"}
It also reorders the elements. I would like the order and multiplicity of the remaining elements to stay the same.
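One way to keep both order and duplicates is a membership lookup instead of a set operation. The following is a sketch only: the name deleteKeepingOrder is hypothetical, and using an Association for the lookup is an assumption made for the case where sub is long.

```mathematica
(* Hypothetical helper: drop every element of data that occurs in sub, *)
(* keeping the order and multiplicity of everything else. *)
deleteKeepingOrder[data_List, sub_List] :=
  With[{lookup = Association[Thread[sub -> True]]},
    Select[data, ! KeyExistsQ[lookup, #] &]
  ]

data = {"cat", "dog", "cat", "bird", "bird", "fish"};
sub = {"cat", "fish"};

kept = deleteKeepingOrder[data, sub]
(* {"dog", "bird", "bird"} *)

sorted = Complement[data, sub]
(* {"bird", "dog"} *)
```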
POSTED BY: Vitaliy Kaurov