Unable to Find Sequence Continuation or Formula

Posted 9 years ago
Hi,

I am doing research for a project. I have a list of over 1500 numbers that form a sequence. I would like to enter only the first few numbers and, from those, generate the continuation of the sequence up to the 1500th number. I bought a Pro account just for this purpose; however, it is not working. Any help will be appreciated. Here are the first 10 numbers:

105208, 105508, 105637, 105934, 106208, 106258, 106377, 106769, 107293, 107537

Regards.
4 Replies
Posted 9 years ago
Dear HK,

I might not have made myself quite clear. Exactly the same procedure as in my other post works with your full sequence:

    data = Flatten[Import["~/Desktop/sequences/sequence-1.txt", "CSV"]];
    interpol = InterpolatingPolynomial[data, t];

This also works for your sequence of 1500 or so numbers:

    Table[Abs[Floor[interpol]], {t, 1, 1696}]

gives exactly the right sequence! The point that you only want positive numbers is irrelevant, because it is easy to make a rule, like I did above, where you take the Abs of everything. Of course, the polynomial will eventually go to infinity, but even that can be cured very easily: you could, for example, use some Fourier-type construction, or simply say that the sequence is constant after some transient. The thing is that this does not seem to be the point. We can always find an infinite number of ways of writing this.

There are some other interesting facts here:

1) If you look at the differences of consecutive data points

    ListPlot[Differences[data]]

they look very much exponentially distributed:

    Histogram[Differences[data]]

2) Run

    EstimatedDistribution[Differences[data], ExponentialDistribution[\[Mu]]]

and you get

    ExponentialDistribution[0.00169902]

This also shows that your time series is monotonically increasing.

3) Run a hypothesis test to see whether the exponential distribution works:

    DistributionFitTest[Differences[data], ExponentialDistribution[0.0016990249902017881], "HypothesisTestData"]["TestDataTable"]

or alternatively

    DistributionFitTest[Differences[data], ExponentialDistribution[0.0016990249902017881], "HypothesisTestData"]["TestConclusion"]

which gives the conclusion of the test.

4) The autocorrelation function also drops like a stone:

    ListPlot[Transpose[{Range[21] - 1, CorrelationFunction[Differences[data], {0, 20}]}] // N, PlotRange -> All]

so consecutive values are practically uncorrelated.

Ok, then. What can we conclude?
This looks very much like a stochastic process, similar to what you would see in radioactive decay:

http://community.wolfram.com/groups/-/m/t/250923

You also find similar data when you look at arrival times of customers, telephone queues, etc. One can always find an infinite number of formulas that describe the sequence, but for this data that appears to be futile, because such a description will need more or less as many parameters as there are numbers in the sequence.

What would help is to know where the data comes from. Is it some kind of measurement of something? If so, of what?

Cheers,
Marco
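As a rough illustration of the stochastic-process reading above, one could simulate a sequence whose gaps are exponentially distributed and compare its plots with those of the real data. This is only a sketch under assumptions: the seed, the rounding of the gaps up to integers, and the names `rate`, `gaps` and `sim` are chosen here for illustration; the starting value and the length come from the thread, and the rate is the fitted value quoted above (a mean gap of roughly 589).

```mathematica
(* Sketch: an increasing integer sequence with exponentially distributed gaps *)
SeedRandom[1];          (* arbitrary seed, only so that the run is repeatable *)
rate = 0.00169902;      (* rate reported by EstimatedDistribution in the post *)

(* 1695 random gaps, rounded up so that every gap is a positive integer (assumption) *)
gaps = Ceiling[RandomVariate[ExponentialDistribution[rate], 1695]];

(* start at the first number of the thread and add up the gaps: 1696 values *)
sim = Accumulate[Prepend[gaps, 105208]];

(* the same diagnostics as in the post, applied to the simulated sequence *)
ListPlot[Differences[sim]]
Histogram[Differences[sim]]
```

If the plots for `sim` and for the real `data` look alike, that would be in line with treating the numbers as arrival times of a random process rather than as values of a closed formula.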
Posted 9 years ago
Hi,

Thank you for the replies above. I am sorry that I did not give more details. Here are some more:

There are exactly 1,696 numbers in each sequence.
There are no negative numbers.
Each number in the sequence has no more than 6 digits.
The sequence range is from 000000 to 999999; the 1,696 numbers are generated from this range, and then the sequence resets.
Each sequence has a Unique Number (UI) and a Common Number (CN).
The Common Number for all 3 sequences below is 1500.
The Unique Number for each sequence is given below:

Sequence 1 => 57
Sequence 2 => 58
Sequence 3 => 59

You can download the 3 sequences from this link => https://www.dropbox.com/s/wj5dmo1qtiwyce4/sequences.zip

I really hope the above information helps. If anyone has any specific questions, let me know.

Cheers
Posted 9 years ago
Are you sure such a formula exists? Even The On-Line Encyclopedia of Integer Sequences does not have it, or its simplified version:

    # - #[[1]] &@{105208, 105508, 105637, 105934, 106208, 106258, 106377, 106769, 107293, 107537}

    {0, 300, 429, 726, 1000, 1050, 1169, 1561, 2085, 2329}

Explaining the nature of the numbers, giving more numbers, and providing any additional information (ideally complete information about the problem) would be helpful.
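For readers unfamiliar with the pure-function shorthand above, here is a small sketch of the same normalisation written out, together with the gaps between consecutive terms, which are another form one might look up. The names `shifted` and `gaps` are introduced here for illustration.

```mathematica
data = {105208, 105508, 105637, 105934, 106208,
        106258, 106377, 106769, 107293, 107537};

(* subtract the first term from every term; equivalent to  # - #[[1]] & @ data *)
shifted = data - First[data]
(* {0, 300, 429, 726, 1000, 1050, 1169, 1561, 2085, 2329} *)

(* gaps between consecutive terms *)
gaps = Differences[data]
(* {300, 129, 297, 274, 50, 119, 392, 524, 244} *)

(* the shifted sequence is just the running total of the gaps *)
Accumulate[gaps] === Rest[shifted]
(* True *)
```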
Posted 9 years ago
Dear Sam,

I think that such a formula does indeed exist; it will not help a lot, but anyway. In fact, there is an infinite number of sequences that fulfil these requirements. Let me construct one, which is probably not the one HK wants. First we find an interpolating polynomial:

    interpol = InterpolatingPolynomial[{105208, 105508, 105637, 105934, 106208, 106258, 106377, 106769, 107293, 107537}, t]

gives:

    105208 + (300 + (-(171/2) + (113/2 + (-(265/12) + (13/3 + (-(1/45) + (-(17/80) + (71/1152 - (487 (-9 + t))/45360) (-8 + t)) (-7 + t)) (-6 + t)) (-5 + t)) (-4 + t)) (-3 + t)) (-2 + t)) (-1 + t)

This polynomial will, by construction, go through all points in your list. The problem is that if we evaluate the polynomial at integer values beyond the given points, the results will in general not be integers. But we can fix that by applying the Floor function:

    Table[Floor[interpol], {t, 1, 13}]

which gives:

    {105208, 105508, 105637, 105934, 106208, 106258, 106377, 106769, 107293, 107537, 101918, 55301, -168469}

So that is one solution that fulfils your condition.
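A minimal sketch for checking the construction above, assuming `t` has no global value: the polynomial reproduces the ten given numbers exactly, and it is not the only polynomial that does so. The symbol `c` and the name `other` are introduced here for illustration only.

```mathematica
data = {105208, 105508, 105637, 105934, 106208,
        106258, 106377, 106769, 107293, 107537};
interpol = InterpolatingPolynomial[data, t];

(* the polynomial passes through every given point exactly *)
Table[interpol, {t, 1, Length[data]}] === data
(* True *)

(* adding any multiple c of (t-1)(t-2)...(t-10) leaves the ten points untouched,
   because the product vanishes at t = 1, ..., 10 *)
other = interpol + c Product[t - k, {k, 1, 10}];
Table[other, {t, 1, Length[data]}] === data
(* True, for symbolic c *)

(* beyond the data the continuation is whatever the chosen polynomial happens to do;
   for interpol the post reports {101918, 55301, -168469} *)
Table[Floor[interpol], {t, 11, 13}]
```

Every choice of `c` gives a different continuation after the tenth term, which is one way to see why ten numbers alone cannot pin down the next one.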
In fact, you can extend your sequence with any sequence of integers you like, such as the first 10 digits of Pi:

    seq2 = Join[{105208, 105508, 105637, 105934, 106208, 106258, 106377, 106769, 107293, 107537}, RealDigits[Pi, 10, 10][[1]]]

then interpolate:

    interpol2 = InterpolatingPolynomial[seq2, t]

    105208 + (300 + (-(171/2) + (113/2 + (-(265/12) + (13/3 + (-(1/45) + (-(17/80) + (71/1152 + (-(487/45360) + (-(20383/725760) + (71051/2661120 + (-(535847/43545600) + (429343/113218560 + (-(3657119/4151347200) + (107564393/653837184000 + (-(457459/17791488000) + (307472309/88921857024000 + (-(18670559/45731240755200) + (61509179 (-19 + t))/1431118828339200) (-18 + t)) (-17 + t)) (-16 + t)) (-15 + t)) (-14 + t)) (-13 + t)) (-12 + t)) (-11 + t)) (-10 + t)) (-9 + t)) (-8 + t)) (-7 + t)) (-6 + t)) (-5 + t)) (-4 + t)) (-3 + t)) (-2 + t)) (-1 + t)

then plot to check that it does the trick, and then generate a list of integers with the help of the Floor function:

    Table[Floor[interpol2], {t, 1, Length[seq2] + 2}]

which gives

    {105208, 105508, 105637, 105934, 106208, 106258, 106377, 106769, 107293, 107537, 3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 9934449952, 190559866782}

It is quite obvious that this works in principle with any sequence of any length, and you can extend it with all sorts of values, e.g. the first $m$ digits of $\sqrt{2}$, or $\sqrt{5}$, or $\sqrt{7}$.

In fact, if you take the first few elements of the Fibonacci sequence:

    Table[Fibonacci[n], {n, 1, 12}]

    {1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144}

you get

    interpol3 = InterpolatingPolynomial[Table[Fibonacci[n], {n, 1, 12}], t]

    1 + (1/2 + (-(1/6) + (1/12 + (-(1/40) + (1/144 + (-(1/630) + (13/40320 + (-(1/17280) + (17/1814400 + (11 - t)/725760) (-10 + t)) (-9 + t)) (-8 + t)) (-7 + t)) (-6 + t)) (-5 + t)) (-4 + t)) (-3 + t)) (-2 + t) (-1 + t)

and

    Table[Floor[interpol3], {t, 1, 14}]

generates a sequence that is consistent with that list of numbers.
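The construction above can be wrapped in a small helper to make the arbitrariness explicit. This is a sketch, not part of the original posts: `continuation` is a hypothetical name, and the extra values passed to it are chosen arbitrarily.

```mathematica
(* Hypothetical helper: interpolate through known ++ extra, then list the first n terms *)
continuation[known_List, extra_List, n_Integer] :=
  Module[{t, poly},
    poly = InterpolatingPolynomial[Join[known, extra], t];
    Table[Floor[poly /. t -> k], {k, 1, n}]];

known = {105208, 105508, 105637, 105934, 106208,
         106258, 106377, 106769, 107293, 107537};

(* two different "formulas" that both reproduce the ten known numbers *)
continuation[known, {7, 7, 7}, 13]         (* the ten known numbers, then 7, 7, 7 *)
continuation[known, {0, 999999, 42}, 13]   (* the ten known numbers, then 0, 999999, 42 *)
```

Whatever three integers are supplied as `extra` come back as terms 11 to 13, so the ten known numbers are compatible with any continuation one cares to choose.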
This argument means that we have constructed an infinite number of solutions to your problem; most of them will be useless to you. Mathematica would be a great help in any IQ test where you have to continue such sequences: instead of giving only one solution, you can give an infinite number of solutions, even for random sequences. :-)

Cheers,
Marco

PS: Just nit-picking... I fully agree with what you said: we need much more information to attempt to give a useful solution to the problem.