WOLFRAM COMMUNITY

GROUPS:

Wolfram Language

What is the "best" or "Wolfram canonical" way to remove elements in a list according to a condition?

Andrew D

Posted 2 months ago

I have tripletList, a list of triplets of integers:

tripletList = {{1, 2, 5}, {1, 2, 6}, {2, 1, 4}, {2, 2, 3}, {2, 2, 4}};

I wish to delete triplets whose second part is 1. In other words, I wish to operate on tripletList to obtain this result:

{{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}

I realize this is a somewhat nebulous, subjective question, but... what is the "best" or "Wolfram canonical" way to accomplish this? I can think of four approaches:

(* (1) *)
DeleteCases[tripletList, _?(#[[2]] == 1 &)] // AbsoluteTiming

(* (2) *)
Replace[tripletList,
    element_ /; (element[[2]] == 1) -> Nothing, {1}] // AbsoluteTiming

(* (3) *)
Delete[#, Position[#, {_, 1, _}]] &@tripletList // AbsoluteTiming

(* (4) *)
DeleteCases[tripletList, {_, 1, _}] // AbsoluteTiming

(* OUTPUT from (1), (2), (3), and (4), respectively: *)
(* {0.0000147, {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}} *)
(* {0.0000125, {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}} *)
(* {9.2*10^(-6), {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}} *)
(* {3.8*10^(-6), {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}} *)

Clearly, approach (4) is the fastest, being about four times faster than approach (1). Is it fair to say that Wolfram is more or less built on the pattern-matching paradigm?
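Before comparing speed, it may be worth confirming that the four approaches really return the same list. The following is a minimal sketch of such a check; the names `expected` and `results` are introduced here for illustration and are not part of the original post.

```wolfram
tripletList = {{1, 2, 5}, {1, 2, 6}, {2, 1, 4}, {2, 2, 3}, {2, 2, 4}};

(* The result stated in the question *)
expected = {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}};

(* Approaches (1) to (4), without the timing wrappers *)
results = {
   DeleteCases[tripletList, _?(#[[2]] == 1 &)],
   Replace[tripletList, element_ /; (element[[2]] == 1) -> Nothing, {1}],
   Delete[tripletList, Position[tripletList, {_, 1, _}]],
   DeleteCases[tripletList, {_, 1, _}]
   };

(* True if every approach reproduces the expected list *)
AllTrue[results, # === expected &]
```

Approaches (1) and (2) evaluate a test on each element, while (3) and (4) describe the shape of the unwanted element directly as the pattern `{_, 1, _}`.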

POSTED BY: Andrew D

5 Replies

Michael Rogers

Michael Rogers, Emory University

Posted 2 months ago

If timing matters most, then the size of the array and whether it is packed make a difference.

tripletList = RandomInteger[5, {10000, 3}]; (* packed, large-ish *)

Discard[tripletList, #[[2]] === 1 &]; // RepeatedTiming
DeleteCases[tripletList, _?(#[[2]] == 1 &)]; // RepeatedTiming
Replace[tripletList,
   element_ /; (element[[2]] == 1) -> Nothing, {1}]; // RepeatedTiming
Delete[#, Position[#, {_, 1, _}]] &@tripletList; // RepeatedTiming
DeleteCases[tripletList, {_, 1, _}]; // RepeatedTiming
Pick[tripletList, tripletList[[All, 2]] - 1 // Unitize, 1]; // RepeatedTiming
(*
{0.00316529, Null}
{0.00347198, Null}
{0.00272056, Null}
{0.00134858, Null}
{0.000994357, Null}
{0.00007601, Null}
*)

tripletList = Developer`FromPackedArray@RandomInteger[5, {10000, 3}]; (* unpacked, largish *)

Discard[tripletList, #[[2]] === 1 &]; // RepeatedTiming
DeleteCases[tripletList, _?(#[[2]] == 1 &)]; // RepeatedTiming
Replace[tripletList,
   element_ /; (element[[2]] == 1) -> Nothing, {1}]; // RepeatedTiming
Delete[#, Position[#, {_, 1, _}]] &@tripletList; // RepeatedTiming
DeleteCases[tripletList, {_, 1, _}]; // RepeatedTiming
Pick[tripletList, tripletList[[All, 2]] - 1 // Unitize, 1]; // RepeatedTiming
(*
{0.00252688, Null}
{0.00283443, Null}
{0.00247282, Null}
{0.00132409, Null}
{0.000371479, Null}
{0.000183832, Null}
*)
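For readers unfamiliar with packed arrays or with the `Pick` idiom used above, here is a small sketch of both. The variable names (`packed`, `unpacked`, `small`, `secondColumn`, `mask`) are chosen here for illustration only.

```wolfram
(* RandomInteger with a list of dimensions returns a packed array *)
packed = RandomInteger[5, {10000, 3}];
unpacked = Developer`FromPackedArray[packed];

(* Check which of the two is packed *)
Developer`PackedArrayQ /@ {packed, unpacked}

(* The Pick approach, step by step, on the small list from the question *)
small = {{1, 2, 5}, {1, 2, 6}, {2, 1, 4}, {2, 2, 3}, {2, 2, 4}};
secondColumn = small[[All, 2]];      (* {2, 2, 1, 2, 2} *)
mask = Unitize[secondColumn - 1];    (* {1, 1, 0, 1, 1}; 0 marks the rows to drop *)

(* Keep the rows whose mask entry is 1 *)
Pick[small, mask, 1]
(* {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}} *)
```

The mask is built with whole-array arithmetic rather than a test applied to each row, which is a likely reason `Pick` comes out fastest in the timings above, particularly in the packed case.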

POSTED BY: Michael Rogers

Jason Biggs

Jason Biggs, Wolfram Research

Posted 2 months ago

Without question, it is option (4): `DeleteCases[tripletList, {_, 1, _}]`. I don't even care about the timing results (although I see it's the fastest); it's simply the cleanest and clearest choice.

POSTED BY: Jason Biggs

Arben Kalziqi

Arben Kalziqi, Wolfram Research

Posted 2 months ago

How does `Discard[tripletList,#[[2]]===1&]` perform for you?

POSTED BY: Arben Kalziqi

Andrew D

Posted 2 months ago

> How does `Discard[tripletList, #[[2]] === 1 &]` perform for you?

Unfortunately, I'm running version 14.0, and `Discard` was introduced in 14.2. Fortunately, I will be upgrading soon.

Thank you for pointing it out, though. `Discard` looks incredibly useful, especially in that it is complementary to `Select`. I'm curious why it wasn't introduced earlier, though (`Select` and `DeleteCases` were introduced in 1.0 and 2.0, respectively). Maybe `Discard` was considered unnecessary because, instead of discarding some things from a list, you can always just select the other things? Also, maybe the introduction of `Discard` in 14.2 is related to the introduction of `Tabular`, also in 14.2.
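On versions before 14.2, the "select the other things" idea can be sketched with `Select` alone. The helper name `discardCompat` below is hypothetical (it is not a built-in), and it only approximates `Discard` under the assumption that `Discard` removes exactly those elements for which the criterion gives `True`.

```wolfram
(* discardCompat is a hypothetical stand-in, not a built-in function. *)
(* It keeps every element for which the criterion does not give True. *)
discardCompat[list_, crit_] := Select[list, crit[#] =!= True &]

tripletList = {{1, 2, 5}, {1, 2, 6}, {2, 1, 4}, {2, 2, 3}, {2, 2, 4}};

discardCompat[tripletList, #[[2]] === 1 &]
(* {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}} *)
```

Testing `=!= True` rather than negating the criterion means that elements for which the criterion stays unevaluated are kept rather than silently dropped.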

POSTED BY: Andrew D

Andrew D

Posted 2 months ago

By the way, I should have been using RepeatedTiming, not AbsoluteTiming, to test performance.

For completeness, here are the timings on my computer:

Clear[tripletList];
tripletList = {{1, 2, 5}, {1, 2, 6}, {2, 1, 4}, {2, 2, 3}, {2, 2, 4}};

(* (1) DeleteCases with pattern test *)
RepeatedTiming[DeleteCases[tripletList, _?(#[[2]] == 1 &)], 5]
(* (2) Replace with Condition *)
RepeatedTiming[
 Replace[tripletList,
  element_ /; (element[[2]] == 1) -> Nothing, {1}], 5]
(* (3) Delete *)
RepeatedTiming[Delete[#, Position[#, {_, 1, _}]] &@tripletList, 5]
(* (4) DeleteCases *)
RepeatedTiming[DeleteCases[tripletList, {_, 1, _}], 5]
(* (5) Select *)
RepeatedTiming[Select[tripletList, #[[2]] != 1 &], 5]

(* OUTPUT from (1), (2), (3), (4), and (5), respectively: *)
{2.55258*10^-6, {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}}
{6.02583*10^-6, {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}}
{4.44223*10^-6, {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}}
{5.75446*10^-7, {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}}
{1.89883*10^-6, {{1, 2, 5}, {1, 2, 6}, {2, 2, 3}, {2, 2, 4}}}

So, in terms of timing, ordered from slowest to fastest, I have:

(2) Replace with Condition > (3) Delete > (1) DeleteCases with pattern test > (5) Select > (4) DeleteCases

with (4) being about 10 times faster than (2).
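A ranking like this can also be produced programmatically. The following is a rough sketch, assuming a larger random input as in Michael Rogers's reply; the names `approaches` and `timings` are introduced here for illustration, and only three of the approaches are included to keep it short.

```wolfram
tripletList = RandomInteger[5, {10000, 3}];

(* Each approach as a named function of the input list *)
approaches = <|
   "DeleteCases, structural pattern" ->
    Function[list, DeleteCases[list, {_, 1, _}]],
   "Select" ->
    Function[list, Select[list, #[[2]] != 1 &]],
   "Pick with Unitize" ->
    Function[list, Pick[list, Unitize[list[[All, 2]] - 1], 1]]
   |>;

(* Confirm that all approaches agree before comparing their speed *)
SameQ @@ Map[#[tripletList] &, Values[approaches]]

(* Average time in seconds per approach, sorted from fastest to slowest *)
timings = Sort[Map[First[RepeatedTiming[#[tripletList]]] &, approaches]]
```

The absolute numbers will vary by machine and version, so only the relative order is worth reading from the result.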

POSTED BY: Andrew D
