Unable to Find Sequence Continuation or Formula

Posted 11 years ago

Hi,

I am doing research for a project. I have a sequence of over 1500 numbers. I would like to enter only the first few numbers of this sequence and, with the help of those few numbers, generate the continuation of the sequence up to the 1500th number.

I bought a pro account just for this purpose. However, it's not working. Any help will be appreciated. Here are the first 10 numbers.

105208,105508,105637,105934,106208,106258,106377,106769,107293,107537

Regards.

POSTED BY: H K
4 Replies

Dear HK,

I might not have made myself quite clear. The thing is that exactly the same procedure as in my post works with your sequence.

data = Flatten[Import["~/Desktop/sequences/sequence-1.txt", "CSV"]];
interpol = InterpolatingPolynomial[data, t];

This also works for your sequence of 1500 or so numbers:

Table[Abs[Floor[interpol]], {t, 1, 1696}]

gives exactly the right sequence! The point that you only want positive numbers is irrelevant, because it is easy to make a rule - like I did above - where you take the Abs of everything.

Of course, this will eventually go to infinity, but even that can be cured very easily; you could for example use some Fourier type thing - or simply say that the sequence is constant after some transient. The thing is that this does not seem to be the point. We can always find an infinite number of ways of writing this.
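To make the "constant after some transient" idea concrete, here is a minimal sketch. It uses only the ten numbers from the original question as a stand-in for the full file, and the names `first10` and `seqConst` are made up for this example.

```mathematica
(* Stand-in data: the ten numbers HK posted; swap in the full list as needed *)
first10 = {105208, 105508, 105637, 105934, 106208, 106258, 106377,
   106769, 107293, 107537};

(* Hypothetical rule: return the data while it lasts, then stay at the last value *)
seqConst[k_Integer?Positive] := first10[[Min[k, Length[first10]]]]

(* the ten data values, followed by 107537 three more times *)
seqConst /@ Range[13]
```

This reproduces the list exactly and never blows up, but it is of course just as arbitrary as the polynomial.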

There are some other interesting facts here:

1) If you look at the difference of consecutive data points

ListPlot[Differences[data]]

[Image: ListPlot of the differences of consecutive data points]

this looks very much exponentially distributed.

Histogram[Differences[data]]

[Image: histogram of the differences]

2) Run

EstimatedDistribution[Differences[data], ExponentialDistribution[\[Mu]]]

and you get

ExponentialDistribution[0.00169902]

This also shows that your time series is monotonically increasing.
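A note on where that number comes from: for an exponential distribution the maximum likelihood estimate of the rate is one over the sample mean, and maximum likelihood is the default estimator in EstimatedDistribution. Here is a quick sketch using only the ten numbers from the original post, so the value differs from the 0.00169902 above; `first10`, `gaps` and `rate` are just names for this example.

```mathematica
(* Stand-in data: the ten numbers from the original question *)
first10 = {105208, 105508, 105637, 105934, 106208, 106258, 106377,
   106769, 107293, 107537};
gaps = Differences[first10];

(* maximum likelihood estimate of the rate: 1/mean of the gaps *)
rate = 1/Mean[gaps]   (* 9/2329 *)
N[rate]

(* the fitted parameter should agree with N[rate] *)
EstimatedDistribution[gaps, ExponentialDistribution[\[Mu]]]
```

Read the other way round, a rate of 0.00169902 corresponds to a mean gap of about 589, which is roughly what you get by spreading 1696 numbers over the range 0 to 999999.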

3) Run a hypothesis test to see whether the Exponential Distribution works:

DistributionFitTest[Differences[data], ExponentialDistribution[0.0016990249902017881`], "HypothesisTestData"]["TestDataTable"]

[Image: table of test statistics and p-values]

or alternatively

DistributionFitTest[Differences[data], ExponentialDistribution[0.0016990249902017881`],  "HypothesisTestData"]["TestConclusion"]

gives

[Image: test conclusion]

4) The autocorrelation function also drops like a stone:

ListPlot[Transpose[{Range[21] - 1, CorrelationFunction[Differences[data], {0, 20}]}] // N, PlotRange -> All]

[Image: autocorrelation of the differences for lags 0 to 20]

so consecutive values are practically uncorrelated.

Ok, then. What can we conclude? This looks very much like a stochastic process; similar to the stuff you would see in radioactive decay:

http://community.wolfram.com/groups/-/m/t/250923

You also find similar data when you look at arrival times of customers, or telephone queues etc.
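To see what such a process looks like, here is a minimal simulation sketch: exponentially distributed gaps, rounded to integers and accumulated. The rate is the fitted value from above. The seed and the names `gapsSim` and `synthetic` are arbitrary choices, and this is not a claim about how HK's numbers were actually generated.

```mathematica
SeedRandom[1];  (* arbitrary seed, only for reproducibility *)

(* 1695 exponentially distributed gaps, rounded to integers *)
gapsSim = Round[RandomVariate[ExponentialDistribution[0.00169902], 1695]];

(* running sum, starting at 0: a non-decreasing sequence of 1696 integers *)
synthetic = FoldList[Plus, 0, gapsSim];

(* compare these by eye with the plots of the real data *)
ListPlot[synthetic]
Histogram[Differences[synthetic]]
```

If the plots look qualitatively like the ones for the real data, that supports the stochastic picture, but it does not prove it.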

One can find - that is actually always the case - an infinite number of formulas that describe that sequence, but from the data that appears to be futile, because that description will need more or less as many parameters as you have numbers in the sequence.

What would help is to know where that data comes from. Is it some kind of measurement of something? If so, of what?

Cheers, Marco

POSTED BY: Marco Thiel
Posted 11 years ago

Hi,

Thank you for the replies above. I am sorry I did not give more details earlier. Here are some more details.

  • There are exactly 1,696 numbers in each sequence.
  • There are NO negative numbers.
  • Each number in the sequence is no more than 6 digits.
  • The sequence range is from 000000 to 999999; those 1696 numbers are generated from this range, and then it resets.
  • Each sequence has a Unique Number (UI) and a Common Number (CN). The Common Number for all 3 sequences below is 1500.

The Unique Number for each sequence is below

  • Sequence 1 => 57
  • Sequence 2 => 58
  • Sequence 3 => 59

You can download the 3 sequences from this link => https://www.dropbox.com/s/wj5dmo1qtiwyce4/sequences.zip

I really hope the above information helps. If anyone has any specific questions, let me know.

Cheers

POSTED BY: H K

Are you sure such a formula exists? Even The On-Line Encyclopedia of Integer Sequences does not have it or its simplified version:

# - #[[1]] &@{105208, 105508, 105637, 105934, 106208, 106258, 106377, 106769, 107293, 107537}

{0, 300, 429, 726, 1000, 1050, 1169, 1561, 2085, 2329}

Explaining what is the nature of the numbers, giving more numbers, and any additional information - desirably complete info about the problem - would be helpful.

POSTED BY: Sam Carrettie

Dear Sam,

I think that indeed such a formula does exist; it will not help a lot, but anyway. In fact there is an infinite number of sequences that fulfil these requirements. Let me construct one, which is probably not the one HK wants. First we find an interpolating polynomial:

interpol = 
 InterpolatingPolynomial[{105208, 105508, 105637, 105934, 106208, 
   106258, 106377, 106769, 107293, 107537}, t]

gives:

105208 + (300 + (-(171/2) + (113/2 + (-(265/12) + (13/3 + (-(1/45) + (-(17/80) + (71/1152 - (487 (-9 + t))/45360) (-8 + t)) (-7 + t)) (-6 + t)) (-5 + t)) (-4 + t)) (-3 + t)) (-2 + t)) (-1 + t)

This polynomial will, by construction, go through all points in your list.

[Image: plot of the interpolating polynomial passing through the data points]

One might worry that the polynomial does not return integers when we evaluate it at integer values of t. For integer data at the nodes 1, 2, 3, ... it actually does, but applying the Floor function makes sure in any case:

Table[Floor[interpol], {t, 1, 13}]

which gives:

{105208, 105508, 105637, 105934, 106208, 106258, 106377, 106769, 107293, 107537, 101918, 55301, -168469}
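A side check on the integer question: for integer data at t = 1, 2, 3, ... the same polynomial can be written in Newton's forward-difference form, a sum of integer differences times binomial coefficients. At integer t it therefore already returns integers, and Floor leaves the values unchanged. A sketch (`first10`, `fd` and `newton` are names made up here):

```mathematica
first10 = {105208, 105508, 105637, 105934, 106208, 106258, 106377,
   106769, 107293, 107537};

(* leading entries of the 0th, 1st, ..., 9th differences: all integers *)
fd = First /@ NestList[Differences, first10, 9];

(* Newton forward-difference form: fd_k times Binomial[t - 1, k], summed over k *)
newton[t_] := fd . Table[Binomial[t - 1, k], {k, 0, 9}]

Table[newton[t], {t, 1, 13}]
```

This should reproduce the list above, including the three extrapolated values at the end.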

So that is one solution that fulfils your condition. In fact, you can extend your sequence with any sequence of integers you like such as the first 10 digits of Pi:

seq2 = Join[{105208, 105508, 105637, 105934, 106208, 106258, 106377, 
   106769, 107293, 107537}, RealDigits[Pi, 10, 10][[1]]]

then interpolate:

interpol2 = InterpolatingPolynomial[seq2, t]

105208 + (300 + (-(171/ 2) + (113/ 2 + (-(265/ 12) + (13/ 3 + (-(1/ 45) + (-(17/ 80) + (71/ 1152 + (-(487/ 45360) + (-(20383/ 725760) + (71051/ 2661120 + (-(535847/ 43545600) + (429343/ 113218560 + (-(3657119/ 4151347200) + (107564393/ 653837184000 + (-(457459/ 17791488000) + (307472309/ 88921857024000 + (-(18670559/45731240755200) + ( 61509179 (-19 + t))/1431118828339200) (-18 + t)) (-17 + t)) (-16 + t)) (-15 + t)) (-14 + t)) (-13 + t)) (-12 + t)) (-11 + t)) (-10 + t)) (-9 + t)) (-8 + t)) (-7 + t)) (-6 + t)) (-5 + t)) (-4 + t)) (-3 + t)) (-2 + t)) (-1 + t)

then plot to check it does the trick:

[Image: plot of the second interpolating polynomial passing through the extended data]

and then generate a list of integers with the help of the Floor function:

Table[Floor[interpol2], {t, 1, Length[seq2] + 2}]

which gives

{105208, 105508, 105637, 105934, 106208, 106258, 106377, 106769, 107293, 107537, 3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 9934449952, 190559866782}

It is quite obvious that this works in principle with any sequence of any length and you can extend it with all sorts of values, e.g. the first $m$ digits of $\sqrt{2}$, or $\sqrt{5}$, or $\sqrt{7}$.
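For instance, the same recipe with the first 10 digits of the square root of 2 can be sketched like this (`first10`, `seq3` and `interpol4` are names chosen only for this example):

```mathematica
first10 = {105208, 105508, 105637, 105934, 106208, 106258, 106377,
   106769, 107293, 107537};

(* extend the ten numbers with the first 10 decimal digits of Sqrt[2] *)
seq3 = Join[first10, RealDigits[Sqrt[2], 10, 10][[1]]];

interpol4 = InterpolatingPolynomial[seq3, t];

(* the polynomial passes through every point of the extended list: True *)
Table[interpol4, {t, 1, Length[seq3]}] == seq3
```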

In fact if you take the first few elements of the Fibonacci sequence:

Table[Fibonacci[n], {n, 1, 12}] 

{1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144}

you get

interpol3 = InterpolatingPolynomial[Table[Fibonacci[n], {n, 1, 12}], t]

1 + (1/2 + (-(1/ 6) + (1/12 + (-(1/ 40) + (1/ 144 + (-(1/ 630) + (13/ 40320 + (-(1/ 17280) + (17/1814400 + (11 - t)/725760) (-10 + t)) (-9 + t)) (-8 + t)) (-7 + t)) (-6 + t)) (-5 + t)) (-4 + t)) (-3 + t)) (-2 + t) (-1 + t)

and

Table[Floor[interpol3], {t, 1, 14}]

generates a sequence that is consistent with your list of numbers.
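Worth spelling out: the polynomial matches the twelve Fibonacci numbers it was built from, but its continuation is not the Fibonacci sequence, since a polynomial of degree 11 cannot follow it forever. A small check (`fib12` is a name made up here):

```mathematica
fib12 = Table[Fibonacci[n], {n, 1, 12}];
interpol3 = InterpolatingPolynomial[fib12, t];

(* the first twelve values agree with the input: True *)
Table[interpol3, {t, 1, 12}] == fib12

(* side by side for t = 13 and 14: polynomial continuation vs. true Fibonacci number *)
Table[{interpol3, Fibonacci[t]}, {t, 13, 14}]
```

The pairs in the last output differ, so this is one more valid continuation of the same twelve numbers.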

This argument means:

  1. We have constructed an infinite number of solutions to your problem; most of the solutions will be useless to you.
  2. Mathematica would be a great help in any IQ test, where you have to construct these sequences. Instead of giving only one solution you can give an infinite number of solutions, even for random sequences. :-)

Cheers, Marco

PS: Just nit-picking... I fully agree with what you said that we need much more information to attempt to give a useful solution to the problem.

POSTED BY: Marco Thiel