What is the "best" or "Wolfram canonical" way to remove elements in a list according to a condition?

Posted 1 month ago

I have tripletList, a list of triplets of integers:

tripletList = {{1, 2, 5}, {1, 2, 6}, {2, 1, 4}, {2, 2, 3}, {2, 2, 4}};

I wish to delete triplets whose second part is 1. In other words, I wish to operate on tripletList to obtain this result:

{{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}

I realize this is a somewhat nebulous, subjective question, but... what is the "best" or "Wolfram canonical" way to accomplish this? I can think of four approaches:

(* (1) *)
DeleteCases[tripletList, _?(#[[2]] == 1 &)] // AbsoluteTiming

(* (2) *)
Replace[tripletList,
    element_ /; (element[[2]] == 1) -> Nothing, {1}] // AbsoluteTiming

(* (3) *)
Delete[#, Position[#, {_, 1, _}]] &@tripletList // AbsoluteTiming

(* (4) *)
DeleteCases[tripletList, {_, 1, _}] // AbsoluteTiming

(* OUTPUT from (1), (2), (3), and (4), respectively: *)
(* {0.0000147, {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}} *)
(* {0.0000125, {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}} *)
(* {9.2*10^(-6), {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}} *)
(* {3.8*10^(-6), {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}} *)

Clearly, approach (4) is the fastest, being about four times faster than approach (1). Is it fair to say that Wolfram is more or less built on the pattern-matching paradigm?

POSTED BY: Andrew D
5 Replies

If timing matters most, then the size of the array and whether it is packed make a difference.

tripletList = RandomInteger[5, {10000, 3}]; (* packed, large-ish *)

Discard[tripletList, #[[2]] === 1 &]; // RepeatedTiming
DeleteCases[tripletList, _?(#[[2]] == 1 &)]; // RepeatedTiming
Replace[tripletList, 
   element_ /; (element[[2]] == 1) -> 
    Nothing, {1}]; // RepeatedTiming
Delete[#, Position[#, {_, 1, _}]] &@tripletList; // RepeatedTiming
DeleteCases[tripletList, {_, 1, _}]; // RepeatedTiming
Pick[tripletList, tripletList[[All, 2]] - 1 // Unitize, 
   1]; // RepeatedTiming
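(* Timings, in the order of the six calls above:
{0.00316529, Null}    Discard
{0.00347198, Null}    DeleteCases with pattern test
{0.00272056, Null}    Replace with Condition
{0.00134858, Null}    Delete with Position
{0.000994357, Null}   DeleteCases with {_, 1, _}
{0.00007601, Null}    Pick with Unitize
*)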

tripletList = Developer`FromPackedArray@RandomInteger[5, {10000, 3}]; (* unpacked, large-ish *)

Discard[tripletList, #[[2]] === 1 &]; // RepeatedTiming
DeleteCases[tripletList, _?(#[[2]] == 1 &)]; // RepeatedTiming
Replace[tripletList, 
   element_ /; (element[[2]] == 1) -> 
    Nothing, {1}]; // RepeatedTiming
Delete[#, Position[#, {_, 1, _}]] &@tripletList; // RepeatedTiming
DeleteCases[tripletList, {_, 1, _}]; // RepeatedTiming
Pick[tripletList, tripletList[[All, 2]] - 1 // Unitize, 
   1]; // RepeatedTiming
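(* Timings, in the order of the six calls above:
{0.00252688, Null}    Discard
{0.00283443, Null}    DeleteCases with pattern test
{0.00247282, Null}    Replace with Condition
{0.00132409, Null}    Delete with Position
{0.000371479, Null}   DeleteCases with {_, 1, _}
{0.000183832, Null}   Pick with Unitize
*)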
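
To see which of the two cases above applies to your own data, you can check the packing status directly. This is only a minimal sketch: the names packed and unpacked are placeholders introduced here for illustration, and it assumes the same kind of random integer data as in the timings above.

```wolfram
(* RandomInteger returns a packed array of machine integers *)
packed = RandomInteger[5, {10000, 3}];

(* Developer`FromPackedArray gives an ordinary (unpacked) list with the same values *)
unpacked = Developer`FromPackedArray[packed];

Developer`PackedArrayQ[packed]    (* True *)
Developer`PackedArrayQ[unpacked]  (* False *)

(* The vectorized Pick approach and the pattern-based DeleteCases approach
   can be compared on either input to confirm they keep the same triplets *)
Pick[packed, Unitize[packed[[All, 2]] - 1], 1] === DeleteCases[packed, {_, 1, _}]
```

Pick with Unitize works on whole columns at once, which is presumably why it benefits most from packed input in the timings above.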
POSTED BY: Michael Rogers

Without question it is option (4), DeleteCases[tripletList, {_, 1, _}]. I don't even care about the timing results (although I see it's the fastest); it's simply the cleanest and clearest choice.

POSTED BY: Jason Biggs

How does Discard[tripletList,#[[2]]===1&] perform for you?

POSTED BY: Arben Kalziqi
Posted 1 month ago

"How does Discard[tripletList, #[[2]] === 1 &] perform for you?"

Unfortunately, I'm running version 14.0, and Discard was introduced in 14.2. Fortunately, I will be upgrading soon.

Thank you for pointing it out, though -- Discard looks incredibly useful -- especially in that it is complementary to Select. I'm curious why it wasn't introduced earlier, though (Select and DeleteCases were introduced in 1.0 and 2.0, respectively). Maybe Discard was considered unnecessary because instead of discarding some things from a list, you can always just select the other things?

Also, maybe the introduction of Discard in 14.2 is related to the introduction of Tabular, also in 14.2.
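
For versions before 14.2, the "select the other things" idea can be sketched as follows. Note that discardSketch is a hypothetical helper defined here purely for illustration, not a built-in function, and it assumes that discarding means dropping exactly those elements for which the criterion gives True.

```wolfram
tripletList = {{1, 2, 5}, {1, 2, 6}, {2, 1, 4}, {2, 2, 3}, {2, 2, 4}};

(* Select keeps what the criterion accepts, so negate the test to discard *)
Select[tripletList, #[[2]] =!= 1 &]

(* discardSketch is a hypothetical helper, not a built-in: it drops every
   element for which crit gives True and keeps everything else *)
discardSketch[list_, crit_] := Select[list, ! TrueQ[crit[#]] &]

discardSketch[tripletList, #[[2]] === 1 &]
(* both calls return {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}} *)
```

Wrapping the criterion in TrueQ means that elements for which the test stays unevaluated are kept rather than dropped, which seems the safer default for a discard-style operation.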

POSTED BY: Andrew D
Posted 1 month ago

By the way, I should have been using RepeatedTiming, not AbsoluteTiming, to test performance.

For completeness, here are the timings on my computer:

Clear[tripletList];
tripletList = {{1, 2, 5}, {1, 2, 6}, {2, 1, 4}, {2, 2, 3}, {2, 2, 4}};

(* (1) DeleteCases with pattern test *)
RepeatedTiming[DeleteCases[tripletList, _?(#[[2]] == 1 &)], 5]
(* (2) Replace with Condition *)
RepeatedTiming[
 Replace[tripletList,
  element_ /; (element[[2]] == 1) -> Nothing, {1}], 5]
(* (3) Delete *)
RepeatedTiming[Delete[#, Position[#, {_, 1, _}]] &@tripletList, 5]
(* (4) DeleteCases *)
RepeatedTiming[DeleteCases[tripletList, {_, 1, _}], 5]
(* (5) Select *)
RepeatedTiming[Select[tripletList, #[[2]] != 1 &], 5]

(* OUTPUT from (1), (2), (3), (4), and (5), respectively: *)
(* {2.55258*10^-6, {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}} *)
(* {6.02583*10^-6, {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}} *)
(* {4.44223*10^-6, {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}} *)
(* {5.75446*10^-7, {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}} *)
(* {1.89883*10^-6, {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}} *)

So, ordered from slowest to fastest, I have:

(2) Replace with Condition > (3) Delete > (1) DeleteCases with pattern test > (5) Select > (4) DeleteCases

with (4) being about 10 times faster than (2).

POSTED BY: Andrew D