Apply Select to find the elements that have a given relationship?

Posted 6 years ago

Suppose one has the list {1,2,3,5,6,8,9,12,13} and wants to select the elements whose difference from the following element is greater than 1. One could find these elements with the following code:

list = {1, 2, 3, 5, 6, 8, 9, 12, 13};
For[result = {}; i = 1, i < Length[list], i++, 
 If[list[[i + 1]] - list[[i]] > 1
  , AppendTo[result, list[[i]]]]]
result

It becomes a little more complicated with the list {{{2, 5, 3}, 1.2}, {{3, 4, 6}, 3.7}, {{7, 4, 6}, 5.7}, {{7, 4, 10}, 5.8}, {{7, 4, 12}, 5.7}}. Here one wants the sublists in which the third element of the inner list differs by more than 1 from the third element of the inner list in the following sublist. One could find them with the following code:

list = {{{2, 5, 3}, 1.2}, {{3, 4, 6}, 3.7}, {{7, 4, 6}, 
    5.7}, {{7, 4, 10}, 5.8}, {{7, 4, 12}, 5.7}};
For[result = {}; i = 1, i < Length[list], i++; 
 If[(list[[All, 1]][[i, 3]] - list[[All, 1]][[i - 1, 3]]) > 1, 
  AppendTo[result, Part[list, i]   ]]]
result

I presume these operations are also possible with Select, but how should I do this? I can't find it in the Mathematica Help.

POSTED BY: Laurens Wachters
12 Replies

Laurens,

I looked at your code, but I had difficulty getting the replacement to work properly because your list of data may have a new year within it. In that case, the replacement rule would be different in one part of the list than in the other. Under those circumstances, you would need to break up the list or do a Map[] operation that handles each date individually. It became difficult to make the code work reliably (which I suppose you found on your own). I think you are best off with the approach using one of the built-in date functions posted above.
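One possible shape for such a per-date Map is sketched below. It is only an illustration: dayOfYear is a hypothetical helper name, the two rule lists are written out with one entry per month, and LeapYearQ is used for the leap-year test (it also covers century years, which Mod[year, 4] == 0 does not). Because the rule set is chosen inside the function, each date gets its own decision, whereas an If wrapped around the whole list is evaluated only once.

```mathematica
(* days that precede the first of each month: normal year and leap year *)
rules1 = {1 -> 0, 2 -> 31, 3 -> 59, 4 -> 90, 5 -> 120, 6 -> 151,
   7 -> 181, 8 -> 212, 9 -> 243, 10 -> 273, 11 -> 304, 12 -> 334};
rules2 = {1 -> 0, 2 -> 31, 3 -> 60, 4 -> 91, 5 -> 121, 6 -> 152,
   7 -> 182, 8 -> 213, 9 -> 244, 10 -> 274, 11 -> 305, 12 -> 335};

(* dayOfYear is a hypothetical helper: the rule set is picked per date *)
dayOfYear[{y_, m_, d_}] :=
  Replace[m, If[LeapYearQ[{y}], rules2, rules1]] + d

(* Map the helper over a list of dates that spans a year change *)
dayOfYear /@ {{2015, 12, 31}, {2016, 2, 29}, {2016, 3, 1}, {2018, 3, 1}}
(* {365, 60, 61, 60} *)
```

Note that these day numbers restart at 1 every January, so differences taken across a new year would still need separate handling.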

Regards,

Neil

POSTED BY: Neil Singer

Hi Laurens, I often use GatherBy and then pick what I need.

Given your data

data = {{{2018, 6, 1}, 14.1}, {{2018, 6, 4}, 13.71}, {{2018, 6, 5}, 
   13.8}, {{2018, 6, 6}, 13.64}, {{2018, 6, 7}, 13.78}, {{2018, 6, 8},
    13.93}, {{2018, 6, 11}, 13.98}, {{2018, 6, 12}, 
   13.98}, {{2018, 6, 13}, 13.89}, {{2018, 6, 14}, 
   13.64}, {{2018, 6, 15}, 13.3}, {{2018, 6, 18}, 
   13.2}, {{2018, 6, 19}, 12.95}, {{2018, 6, 20}, 
   12.88}, {{2018, 6, 21}, 12.76}, {{2018, 6, 22}, 
   13.05}, {{2018, 6, 25}, 12.75}, {{2018, 6, 26}, 
   13.74}, {{2018, 6, 27}, 13.96}, {{2018, 6, 28}, 
   13.83}, {{2018, 6, 29}, 13.61}, {{2018, 7, 2}, 
   13.37}, {{2018, 7, 3}, 13.37}, {{2018, 7, 5}, 
   13.43}, {{2018, 7, 6}, 13.85}, {{2018, 7, 9}, 
   13.95}, {{2018, 7, 10}, 14.17}, {{2018, 7, 11}, 
   13.99}, {{2018, 7, 12}, 13.99}, {{2018, 7, 13}, 
   13.89}, {{2018, 7, 16}, 13.9}, {{2018, 7, 17}, 
   13.69}, {{2018, 7, 18}, 13.75}, {{2018, 7, 19}, 
   13.73}, {{2018, 7, 20}, 13.12}, {{2018, 7, 23}, 
   12.99}, {{2018, 7, 24}, 13.12}, {{2018, 7, 25}, 
   13.11}, {{2018, 7, 26}, 13.15}, {{2018, 7, 27}, 
   13.06}, {{2018, 7, 30}, 13.16}, {{2018, 7, 31}, 
   13.63}, {{2018, 8, 1}, 13.24}, {{2018, 8, 2}, 
   13.17}, {{2018, 8, 3}, 13.14}, {{2018, 8, 6}, 13.1}, {{2018, 8, 7},
    13.16}, {{2018, 8, 8}, 13.05}, {{2018, 8, 9}, 
   12.94}, {{2018, 8, 10}, 12.77}, {{2018, 8, 13}, 
   12.45}, {{2018, 8, 14}, 12.35}, {{2018, 8, 15}, 
   12.22}, {{2018, 8, 16}, 12.3}, {{2018, 8, 17}, 
   12.3}, {{2018, 8, 20}, 12.3}, {{2018, 8, 21}, 
   12.63}, {{2018, 8, 22}, 12.47}, {{2018, 8, 23}, 
   12.54}, {{2018, 8, 24}, 12.5}, {{2018, 8, 27}, 
   12.77}, {{2018, 8, 28}, 12.76}, {{2018, 8, 29}, 
   12.97}, {{2018, 8, 30}, 12.77}}

weeks = GatherBy[data, DateValue[#[[1]], "Week"] &]

MaximalBy[#, AbsoluteTime[#[[1]]] &] & /@ weeks
POSTED BY: l van Veen

You are perfectly right, Neil. Your code does the job quite well. Thanks a lot! Since I have spent so much time on Replace, I am still anxious to hear from you, however, why my approach with Replace did not work. Is a conditional Replace possible at all? Best regards, Laurens

POSTED BY: Laurens Wachters

Laurens,

DateDifference[] has all of the calendar knowledge built into it. If you run the code above, it will properly handle leap years, month boundaries, etc., because it interprets the dates as actual dates. Am I understanding your last post properly, or am I missing something? If so, I can look at the replacement rules.

Regards,

Neil

POSTED BY: Neil Singer

Dear Neil, I am afraid that things become too complicated if one keeps trying to use just Differences, since the problem then remains at the end of months where the last working day of a week is the 29th, the 30th, or the 31st. Therefore I decided to work with yearly serial numbers of days instead of monthly serial numbers. For this I invented the rules method. For normal years one has to use rules1 and for leap years rules2. To find out whether a year is a leap year I do the test Mod[year,4]==0. The problem I am left with is how to do a Replace with one of the rule sets (i.e. rules1 for normal years and rules2 for leap years) depending on the result of this test. I could easily get what I want with a bit of procedural programming in Mathematica, but I see it as a challenge to get it done with functional programming. In fact, my problem with the leap years boils down to the question of whether a Replace with either one of two sets of rules can be done based on the result of a test.

POSTED BY: Laurens Wachters

Laurens,

I can help you fix the replace, but you might also consider using the built-in date functionality in MMA. You can use DateDifference[] to compute the absolute number of days between two dates.

For example, you can use Vitaliy's code (which is simpler than mine) to pick out the dates that are at least two days apart by partitioning the list of dates (e.g. {2018,6,8}) into overlapping pairs (i.e. take a list {d1,d2,d3,d4,d5} and make it {{d1,d2},{d2,d3},{d3,d4},{d4,d5}}) and then using DateDifference to compare each pair of dates. Note that I had to use Apply because each pair of dates is a list and DateDifference will not take a list, only a sequence of two dates plus options.

Pick[Most[data], 
 Thread[Map[Apply[DateDifference, #] &, 
    Partition[data[[All, 1]], 2, 1]] >= Quantity[2, "Days"]]]
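
As a hedged variant of the same idea, one could also turn each date into a plain day count with AbsoluteTime (seconds since 1 January 1900) and then reuse Differences, so that month and year boundaries need no special handling. The short sample list below is just an excerpt of the data in this thread, and the threshold of 3 days is an assumption chosen so that a single mid-week holiday such as July 4 is not mistaken for a weekend.

```mathematica
sample = {{{2018, 6, 28}, 13.83}, {{2018, 6, 29}, 13.61},
   {{2018, 7, 2}, 13.37}, {{2018, 7, 3}, 13.37}, {{2018, 7, 5}, 13.43},
   {{2018, 7, 6}, 13.85}, {{2018, 7, 9}, 13.95}};

(* whole days since 1 Jan 1900 for every date in the list *)
days = (AbsoluteTime /@ sample[[All, 1]])/86400;

(* keep each entry that is followed by a gap of at least 3 calendar days *)
Pick[Most[sample], Thread[Differences[days] >= 3]]
(* {{{2018, 6, 29}, 13.61}, {{2018, 7, 6}, 13.85}} *)
```

As with the DateDifference version, the final entry of the list is never returned, because it has no following date to compare against.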
POSTED BY: Neil Singer

Dear Neil and Vitaliy, I did find a method to convert the monthly serial numbers of days to yearly serial numbers, but one problem remains. The following code gives the yearly serial numbers along with the prices of each day:

Pick[Most[
  Partition[
   Riffle[Replace[data[[All, 1, 2]], rules, All] + data[[All, 1, 3]], 
    data[[All, 2]]], 2]], 
 Thread[Differences[
    Partition[
      Riffle[Replace[data[[All, 1, 2]], rules, All] + 
        data[[All, 1, 3]], data[[All, 2]]], 2][[All, 1]]] >= 3]]

where e.g.

data = FinancialData["GE", "Close", {2018, 6, 1}]

and

rules1 = {1 -> 0, 2 -> 31, 3 -> 59, 4 -> 90, 5 -> 120, 6 -> 151, 
   7 -> 181, 8 -> 212, 9 -> 243, 10 -> 273, 11 -> 304, 12 -> 334};

In the same way the following code gives consecutively the year, the number of the month, the serial number of the day in the month, the price, and the serial number of the day in the year.

Pick[Most[
  Partition[
   Flatten[Riffle[data, 
     Replace[data[[All, 1, 2]], rules, All] + data[[All, 1, 3]]]], 
   5]], Thread[
  Differences[
    Partition[
      Flatten[Riffle[data, 
        Replace[data[[All, 1, 2]], rules, All] + data[[All, 1, 3]]]], 
      5][[All, 5]]] >= 3]]

As I said, one problem remains. That is that in leap years the rules should be:

rules2 = {1 -> 0, 2 -> 31, 3 -> 60, 4 -> 91, 5 -> 121, 6 -> 152, 
   7 -> 182, 8 -> 213, 9 -> 244, 10 -> 274, 11 -> 305, 12 -> 335};

So the replacement rules depend on whether the year is a leap year. I have tried:

If[Mod[data[[All, 1, 1]], 4] == 0, 
 Replace[data[[All, 1, 2]], rules2, All], 
 Replace[data[[All, 1, 2]], rules1, All]]

but that does not work. So I suspected that this was caused by the fact that the elements that had to be changed did not contain the information needed for the decision which rules to use. So I tried to bring all information in one and the same object in the following numerical way. I start from:

data = FinancialData["GE", "Close", {{2015, 12, 20}, {2016, 3, 10}}]

and

list = data[[All, 1, 1]]*100000 + data[[All, 1, 2]]*1000 + 
  Replace[data[[All, 1, 2]], rules1, All]

The replacement action does work:

Replace[1000*(list/1000 - Floor[list/1000]), rules3, All]

where

rules3 = {59 -> 60, 120 -> 121, 151 -> 152, 181 -> 182, 212 -> 213, 
      243 -> 244, 273 -> 274, 304 -> 305, 334 -> 335};

but the conditional replacement does not:

 If[Mod[Floor[list/100000], 4] == 0, 
     Replace[1000*(list/1000 - Floor[list/1000]), rules3, All]]

Using /; instead of If did not work either. What did I do wrong?

POSTED BY: Laurens Wachters

Thanks Neil. The problem with the 4th of July can be solved by taking a difference >= 3 as the criterion instead of >= 1. But there remain problems at the end of those months where the 29th, 30th, or 31st day is a Friday. By the way, I get the same result with the Pick method that Vitaliy proposed. In my opinion we could solve this problem by converting, for instance, the sublists {2018,2,1}, {2018,2,2}, {2018,2,3} etc. to something like {2018,32}, {2018,33}, {2018,34}. In the same way {2018,3,1} would become {2018,60}. So the second element in each of these new sublists gives the serial number of the day in the year instead of in the month. In that case your method and that of Vitaliy would work properly. But how do I get the conversion to the new sublists? I have been playing with ReplaceAll, but somehow that does not seem to work.

POSTED BY: Laurens Wachters

Laurens,

I tried what I posted and it works for what you want.

Extract[data, Position[Differences[data[[All, 1, 3]]], _?(# > 1 &)]]

to get

{{{2018,6,1},14.1},{{2018,6,8},13.93},{{2018,6,15},13.3},{{2018,6,22},13.05},{{2018,7,3},13.37},{{2018,7,6},13.85},{{2018,7,13},13.89},{{2018,7,20},13.12},{{2018,7,27},13.06},{{2018,8,3},13.14},{{2018,8,10},12.77},{{2018,8,17},12.3},{{2018,8,24},12.5}}

Note that there are some issues. When you change month (for example, jumping from 6/29 to July), you can't just look at the day of the month for the differences. Also, note that July 4 is a holiday, so July 3 came up as an "end of week".

Regards

Neil

POSTED BY: Neil Singer

Thanks a lot, Vitaliy and Neil. Unfortunately, I was naive to expect that I could simply use Select or Pick to obtain weekly instead of daily stock prices. I start from:

data = FinancialData["GE", "Close", {2018, 6, 1}]

This produces the following list:

{{{2018, 6, 1}, 14.1}, {{2018, 6, 4}, 13.71}, {{2018, 6, 5}, 
  13.8}, {{2018, 6, 6}, 13.64}, {{2018, 6, 7}, 13.78}, {{2018, 6, 8}, 
  13.93}, {{2018, 6, 11}, 13.98}, {{2018, 6, 12}, 
  13.98}, {{2018, 6, 13}, 13.89}, {{2018, 6, 14}, 
  13.64}, {{2018, 6, 15}, 13.3}, {{2018, 6, 18}, 
  13.2}, {{2018, 6, 19}, 12.95}, {{2018, 6, 20}, 
  12.88}, {{2018, 6, 21}, 12.76}, {{2018, 6, 22}, 
  13.05}, {{2018, 6, 25}, 12.75}, {{2018, 6, 26}, 
  13.74}, {{2018, 6, 27}, 13.96}, {{2018, 6, 28}, 
  13.83}, {{2018, 6, 29}, 13.61}, {{2018, 7, 2}, 
  13.37}, {{2018, 7, 3}, 13.37}, {{2018, 7, 5}, 13.43}, {{2018, 7, 6},
   13.85}, {{2018, 7, 9}, 13.95}, {{2018, 7, 10}, 
  14.17}, {{2018, 7, 11}, 13.99}, {{2018, 7, 12}, 
  13.99}, {{2018, 7, 13}, 13.89}, {{2018, 7, 16}, 
  13.9}, {{2018, 7, 17}, 13.69}, {{2018, 7, 18}, 
  13.75}, {{2018, 7, 19}, 13.73}, {{2018, 7, 20}, 
  13.12}, {{2018, 7, 23}, 12.99}, {{2018, 7, 24}, 
  13.12}, {{2018, 7, 25}, 13.11}, {{2018, 7, 26}, 
  13.15}, {{2018, 7, 27}, 13.06}, {{2018, 7, 30}, 
  13.16}, {{2018, 7, 31}, 13.63}, {{2018, 8, 1}, 
  13.24}, {{2018, 8, 2}, 13.17}, {{2018, 8, 3}, 13.14}, {{2018, 8, 6},
   13.1}, {{2018, 8, 7}, 13.16}, {{2018, 8, 8}, 13.05}, {{2018, 8, 9},
   12.94}, {{2018, 8, 10}, 12.77}, {{2018, 8, 13}, 
  12.45}, {{2018, 8, 14}, 12.35}, {{2018, 8, 15}, 
  12.22}, {{2018, 8, 16}, 12.3}, {{2018, 8, 17}, 
  12.3}, {{2018, 8, 20}, 12.3}, {{2018, 8, 21}, 
  12.63}, {{2018, 8, 22}, 12.47}, {{2018, 8, 23}, 
  12.54}, {{2018, 8, 24}, 12.5}, {{2018, 8, 27}, 
  12.77}, {{2018, 8, 28}, 12.76}, {{2018, 8, 29}, 
  12.97}, {{2018, 8, 30}, 12.77}}

I now want to pick only the dates and prices on the last available day of each week. With:

Pick[Most[data], Thread[Differences[data[[All, 1, 3]]] >= 2]]

I get the following result:

{{{2018, 6, 1}, 14.1}, {{2018, 6, 8}, 13.93}, {{2018, 6, 15}, 13.3}, {{2018, 6, 22}, 13.05}, {{2018, 7, 3}, 13.37}, {{2018, 7, 6}, 13.85}, {{2018, 7, 13}, 13.89}, {{2018, 7, 20}, 13.12}, {{2018, 7, 27}, 13.06}, {{2018, 8, 3}, 13.14}, {{2018, 8, 10}, 12.77}, {{2018, 8, 17}, 12.3}, {{2018, 8, 24}, 12.5}}

But from this you can see that things go wrong at the beginning of July. So, is there a better way to get weekly stock prices? Note that I want to do analyses for various markets and countries, so I don't want to use things like "Business Day", because then I would have to change things for each market or country. I would in fact prefer to convert the dates in the data list into consecutive day numbers for each year and then use Pick to find the days that are followed by a time gap before the next day for which a price is available. But how do I do the conversion?

POSTED BY: Laurens Wachters

Laurens,

You should try to avoid doing loops and think about list-based approaches. Vitaliy made one such suggestion. Another is to use Extract and Position:

Extract[list, Position[Differences[list], _?(# > 1 &)]]

for your first example. Or use Part ([[All,1,3]] to get a column of only the third element) for the second example:

Extract[list, Position[Differences[list[[All, 1, 3]]], _?(# > 1 &)]]

These give the same answer as Vitaliy gave (note your second loop has a bug and is offset by one element from your description).
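
To see why this works, it may help to look at the intermediate results for the first example. This is only a sketch, and the names diffs and pos are introduced here purely for illustration:

```mathematica
list = {1, 2, 3, 5, 6, 8, 9, 12, 13};

(* gap between each element and the one that follows it *)
diffs = Differences[list]
(* {1, 1, 2, 1, 2, 1, 3, 1} *)

(* positions at which the gap exceeds 1 *)
pos = Position[diffs, _?(# > 1 &)]
(* {{3}, {5}, {7}} *)

(* the same positions in the original list are the elements before each gap *)
Extract[list, pos]
(* {3, 6, 9} *)
```

Since Differences returns one element fewer than the list, position k in diffs lines up with element k of the list, which is why the positions can be passed to Extract unchanged.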

Regards,

Neil

POSTED BY: Neil Singer

Select looks at each element on its own, without looking at its neighbors, so you need to feed it suitably restructured data to do the job.

data = {1, 2, 3, 5, 6, 8, 9, 12, 13};
Select[Partition[data, 2, 1], #[[2]] - #[[1]] > 1 &][[All, 1]]

Out[] = {3, 6, 9}

You don't have to use Select though.

Pick[Most[data], Thread[Differences[data] > 1]]

Out[] = {3, 6, 9}

This works with the more complicated case too:

data = {{{2, 5, 3}, 1.2}, {{3, 4, 6}, 3.7}, {{7, 4, 6}, 5.7}, 
            {{7, 4, 10}, 5.8}, {{7, 4, 12}, 5.7}};

Pick[Most[data], Thread[Differences[data[[All, 1, 3]]] > 1]]

Out[] = {{{2, 5, 3}, 1.2}, {{7, 4, 6}, 5.7}, {{7, 4, 10}, 5.8}}
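
If the same pattern is needed in several places, the Pick approach could be wrapped in a small function. The following is a minimal sketch, not a built-in: selectBeforeGap, its key argument (a function that extracts the number to compare) and the optional gap threshold are all hypothetical names chosen for this example.

```mathematica
(* hypothetical helper: keep the elements whose key value is more than
   `gap` below the key value of the element that follows *)
selectBeforeGap[list_List, key_, gap_ : 1] :=
  Pick[Most[list], Thread[Differences[key /@ list] > gap]]

(* simple list: the elements themselves are the keys *)
selectBeforeGap[{1, 2, 3, 5, 6, 8, 9, 12, 13}, Identity]
(* {3, 6, 9} *)

(* nested list: the key is the third element of the inner list *)
nested = {{{2, 5, 3}, 1.2}, {{3, 4, 6}, 3.7}, {{7, 4, 6}, 5.7},
   {{7, 4, 10}, 5.8}, {{7, 4, 12}, 5.7}};
selectBeforeGap[nested, #[[1, 3]] &]
(* {{{2, 5, 3}, 1.2}, {{7, 4, 6}, 5.7}, {{7, 4, 10}, 5.8}} *)
```

Passing the key as a function keeps the comparison logic in one place, so only the key has to change when the data layout changes.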

POSTED BY: Vitaliy Kaurov