Use InterpolatingPolynomial with a lot of data?

Posted 6 years ago
POSTED BY: Jaime de la Mota
19 Replies

Also, as Jim states in his response, in addition to the data, it would be useful to know more about the origin of the data, because that helps guide the underlying decision of how best to model it.

Regards

POSTED BY: Neil Singer
Posted 6 years ago

I think that you might be putting the cart before the horse. (I'm using that expression to give a hint at my age.)

If you are trying to obtain a prediction equation based on temperature (maybe predicted by elevation? spatial coordinates?, etc.), it would be helpful (although maybe not completely appropriate for this forum) to spell out the whole problem. Starting out with what to do with the predictor variable as you describe seems premature (again, at least for us who don't know the available data and objective).

And can you not define a function that just interpolates from the surrounding 4 grid points?

Finally, when faced with too much data: Sample! This is what the field of statistics is all about. That way you can get a manageable set of calculations without sacrificing the fineness of your spatial grid of data.
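
As a rough sketch of the sampling idea, assuming the data can be arranged as {x, y, T} triples (the synthetic grid and the names allData, sample, and fit are made up for illustration and are not the actual file), one could draw a random subset with RandomSample and fit only that:

```wolfram
(* Hypothetical stand-in for the real data: {x, y, T} triples on a fine grid *)
SeedRandom[1];
allData = Flatten[
   Table[{x, y, 20. + 0.5 x - 0.3 y + RandomVariate[NormalDistribution[0, 0.1]]},
    {x, 0, 99}, {y, 0, 99}], 1];

(* Draw a random subset without replacement *)
sample = RandomSample[allData, 500];

(* Fit on the subset only; a constant term is included by default *)
fit = LinearModelFit[sample, {x, y}, {x, y}];
Normal[fit]
```

The sample size of 500 is arbitrary. Refitting with a few different samples gives a feel for how stable the coefficients are.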

POSTED BY: Jim Baldwin

Hello Jim.

I am using AMPL to try to optimize a route. I have a model in which the speed in the X and Y directions depends on the temperature. AMPL, the software that I am using, needs to be fed the value of T at each point of the optimized route. In the past I have used a known distribution, i.e. (1/(x[i]+y[i])^2), so the value of T was known everywhere. Now I have a file with "real" data. I want to find the interpolating polynomial so that I can feed AMPL an expression like the one above.

Regards. Jaime.

POSTED BY: Jaime de la Mota

... or LinearModelFit if it's linear...

POSTED BY: Neil Singer
Posted 6 years ago

Hello again. I have been testing relatively simple models using NonlinearModelFit; the results are unfortunately not what I am looking for. I am testing the models in question as

ModelSolution["AdjustedRSquared"]

where ModelSolution = NonlinearModelFit[AllData //. {x_List} :> x, a x + b x^2 + c y + d y^2 + e (x*y), {a, b, c, d, e}, {y, x}], and so on.

The R squared coefficient that I am obtaining doesn't go over 0.7, and I would like a higher value. Can you offer any advice on how to proceed?

Regards. Jaime.

POSTED BY: Updating Name
POSTED BY: Neil Singer
POSTED BY: Jaime de la Mota
Posted 6 years ago

Setting one's goal in terms of $R^2$ is not recommended: $R^2$ measures how much of the variability is explained, but there may be a lot or only a little variability to explain. (Sometimes an $R^2$ of 0.99 is not adequate.)

Using NLMXB["EstimatedVariance"]^0.5 (i.e., the root mean square error, using your notation) gives an estimate of precision in the units being predicted, which is more understandable and absolute.
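
To illustrate the difference, here is a small sketch with synthetic data (the data and the name nlm are assumptions for illustration; NLMXB above is Jaime's own fitted model). The noise is generated with a standard deviation of 0.5, so the square root of "EstimatedVariance" should come out near 0.5 in the units of T, whatever the $R^2$ happens to be:

```wolfram
(* Hypothetical data: {x, y, T} triples with known noise of standard deviation 0.5 *)
SeedRandom[2];
data = Flatten[
   Table[{x, y, 10. + 2 x - y + RandomVariate[NormalDistribution[0, 0.5]]},
    {x, 1, 20}, {y, 1, 20}], 1];

nlm = NonlinearModelFit[data, a + b x + c y, {a, b, c}, {x, y}];

rmse = Sqrt[nlm["EstimatedVariance"]]   (* in the units of T *)
nlm["AdjustedRSquared"]                  (* unitless, for comparison *)
```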

POSTED BY: Jim Baldwin

Thanks for the advice. I have finally achieved an R^2 of 92%, but to get it I had to write

NLMX1R = NonlinearModelFit[TX1R //. {x_List} :> x, 
  a *x + b*x^2 + c*x^3 + d*x^4 + e*x^5 + f*x^6 + g*y + h*y^2 + i*y^3 +
    j*(x*y) + k (x^2*y) + l*(x*y^2) + m (x*y)^2 + n*(x^3*y) + 
   o*(y^3*x) + p (x^3*y^2) + q (x^2 + y^3) + r (x*y)^3, {a, b, c, d, 
   e, f, g, h, i, j, k, l, m, n, o, p, q, r}, {y, x}]
Normal[%]
NLMX1R["AdjustedRSquared"]

That would probably cause problems in the optimization software. I will try to learn how to interpret this coefficient.

POSTED BY: Jaime de la Mota

Jaime,

You did not indicate that your x and y vectors were the full length of the temperature vector. In that case the formatting is even easier:

In[1]:= 
xvectl = {1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 
   5};
yvectl = {1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5};

In[2]:= data =  MapThread[{{#1, #2}, #3} &, {xvectl, yvectl, ts}]

Out[2]= {{{1, 1}, 111}, {{2, 1}, 122}, {{3, 1}, 133}, {{4, 1}, 
  144}, {{5, 2}, 211}, {{1, 2}, 222}, {{2, 2}, 233}, {{3, 2}, 
  244}, {{4, 3}, 311}, {{5, 3}, 322}, {{1, 3}, 333}, {{2, 3}, 
  344}, {{3, 4}, 411}, {{4, 4}, 422}, {{5, 4}, 433}, {{1, 4}, 
  444}, {{2, 5}, 511}, {{3, 5}, 522}, {{4, 5}, 533}, {{5, 5}, 544}}

However, Daniel is correct: since the polynomial is guaranteed to go through every point, your interpolation will almost certainly be useless, if MMA can even generate an answer at all.

My suggestion is to use NonlinearModelFit (or its related function, FindFit). In that case you can specify a function of any order and find the best fit. You can graph your data and the fitted function together and iterate until you are happy with the fitted function. You will then have a lower-order "best fit" to the data. In that case you need to change my code above, because the fit functions take a list of {x, y, z} triples:

data =  Transpose[{xvectl, yvectl, ts}]

{{1, 1, 111}, {2, 1, 122}, {3, 1, 133}, {4, 1, 144}, {5, 2, 211}, {1, 2, 222}, {2, 2, 233}, {3, 2, 244}, {4, 3, 311}, {5, 3, 322}, {1, 3, 333}, {2, 3, 344}, {3, 4, 411}, {4, 4, 422}, {5, 4, 433}, {1, 4, 444}, {2, 5, 511}, {3, 5, 522}, {4, 5, 533}, {5, 5, 544}}

You can also experiment with FindFormula but I am not sure how it behaves with large amounts of data. I hope this helps.
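
A minimal sketch of that fit-and-inspect loop, assuming scattered data and a deliberately low-order model (xs, ys, tvals and the model form are made up here; substitute the real vectors and whatever terms seem plausible):

```wolfram
(* Hypothetical scattered data standing in for xvectl, yvectl, ts *)
SeedRandom[3];
xs = RandomReal[{0, 5}, 200];
ys = RandomReal[{0, 5}, 200];
tvals = 100 + 3 xs - 2 ys + 0.5 xs ys +
   RandomVariate[NormalDistribution[0, 0.2], 200];

data = Transpose[{xs, ys, tvals}];   (* list of {x, y, t} triples *)

nlm = NonlinearModelFit[data, a + b x + c y + d x y, {a, b, c, d}, {x, y}];

fitted = Normal[nlm]                           (* fitted expression in x and y *)
maxAbsResidual = Max[Abs[nlm["FitResiduals"]]]

(* Visual check: fitted surface together with the data points *)
Show[Plot3D[fitted, {x, 0, 5}, {y, 0, 5}], ListPointPlot3D[data]]
```

If the residuals show a clear pattern, add or change a term and fit again.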

Regards,

Neil

POSTED BY: Neil Singer

Thank you very much Neil. I will study your proposed solution at once.

POSTED BY: Jaime de la Mota

Keep in mind that an interpolating polynomial will be of very high degree, which means it could wiggle considerably in regions where you would prefer it not do that, and also it will almost certainly give (huge) garbage if you go at all outside the rectangle (or other region) in which the points lie.

There are many ways to improve on this (radial basis methods, splines, local interpolating polynomials, convex combinations from neighboring values,...). But none of these comprise an "analytic" expression. Which leads me to suspect you are dealing with a limitation that will not be able to deliver sound results. I apologize for the pessimism. But I really doubt a huge interpolating polynomial will work well in the setting you describe.
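
The wiggling is easy to reproduce in one dimension. The sketch below uses Runge's classic example, which is not taken from this thread's data, with 11 equally spaced points:

```wolfram
(* Runge's function sampled at 11 equally spaced points on [-1, 1] *)
f[t_] := 1/(1 + 25 t^2);
pts = Table[{t, f[t]}, {t, -1, 1, 1/5}];

p = InterpolatingPolynomial[pts, x];   (* degree-10 polynomial through all points *)

(* Largest deviation from f on a fine grid inside the data range *)
maxErrInside = N[Max[Table[Abs[(p /. x -> t) - f[t]], {t, -1, 1, 1/200}]]]

(* Value slightly outside the data range, where f itself is small *)
valueOutside = N[p /. x -> 3/2]
```

The function itself never exceeds 1, yet the error between the data points should come out larger than that, and the value just outside the range should be larger still by orders of magnitude.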

POSTED BY: Daniel Lichtblau

I suspected as much. That is why I would like to know if there is any way to limit the order of the polynomial to third or fourth order. I don't think I am overlooking anything in my approach to the problem, but still. On one occasion I solved a problem where the windspeed went as dx/dt = constant*y^3 + thrust; I had to write it as:

var f1 {i in N} = w_x*x2[i]^3+u1[i];

DOTX1 {i in N1}: x1[i+1] = x1[i] + (1/6)*step*(f1[i] + 4*midf1[i] + f1[i+1]);

where u1 is the engine thrust. I don't know any way to do this without having the analytic expression of the wind (in the example above, y^3) or the temperature (in my current problem).

That is why I need the polynomial in question or, at least, a reasonable approximation.
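
One way to cap the order, sketched here under the assumption that a least-squares fit is acceptable in place of exact interpolation, is to build the list of monomials up to a chosen total degree and pass it to Fit. The data, maxDeg, basis, poly, and polyText below are illustrative names, and the synthetic triples stand in for the real file:

```wolfram
(* Hypothetical {x, y, T} triples standing in for the temperature file *)
SeedRandom[5];
data = Flatten[
   Table[{x, y, 15. + 0.1 x^2 - 0.05 x y + RandomVariate[NormalDistribution[0, 0.1]]},
    {x, 0, 10}, {y, 0, 10}], 1];

(* All monomials x^i y^j with total degree i + j <= maxDeg *)
maxDeg = 3;
basis = Flatten[Table[x^i y^j, {i, 0, maxDeg}, {j, 0, maxDeg - i}]];

(* Least-squares polynomial in that basis; Fit returns the expression itself *)
poly = Fit[data, basis, {x, y}];

(* Plain-text form with explicit * and ^; InputForm writes very small or very
   large coefficients as 1.2*^-6, so rewrite that marker as e-notation *)
polyText = StringReplace[ToString[Expand[poly], InputForm], "*^" -> "e"]
```

The resulting string uses the same * and ^ operators as the AMPL lines above, but x and y would still have to be replaced by the indexed AMPL variables by hand.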

POSTED BY: Jaime de la Mota

Jaime,

Based on your last response, it seems that your question is one of creating the right structure and not of trying to reduce the computation. If that is the case, then you can restructure your data as follows:

In[1]:= xvect = {1, 2, 3, 4, 5};
yvect = {11, 22, 33, 44};
tmat = {{111, 122, 133, 144}, {211, 222, 233, 244}, {311, 322, 333, 
    344}, {411, 422, 433, 444}, {511, 522, 533, 544}};

In[4]:= ts = Flatten[tmat]

Out[4]= {111, 122, 133, 144, 211, 222, 233, 244, 311, 322, 333, 344, \
411, 422, 433, 444, 511, 522, 533, 544}

In[5]:= xys = Tuples[{xvect, yvect}]

Out[5]= {{1, 11}, {1, 22}, {1, 33}, {1, 44}, {2, 11}, {2, 22}, {2, 
  33}, {2, 44}, {3, 11}, {3, 22}, {3, 33}, {3, 44}, {4, 11}, {4, 
  22}, {4, 33}, {4, 44}, {5, 11}, {5, 22}, {5, 33}, {5, 44}}

In[6]:= data = MapThread[{#1, #2} &, {xys, ts}]

Out[6]= {{{1, 11}, 111}, {{1, 22}, 122}, {{1, 33}, 133}, {{1, 44}, 
  144}, {{2, 11}, 211}, {{2, 22}, 222}, {{2, 33}, 233}, {{2, 44}, 
  244}, {{3, 11}, 311}, {{3, 22}, 322}, {{3, 33}, 333}, {{3, 44}, 
  344}, {{4, 11}, 411}, {{4, 22}, 422}, {{4, 33}, 433}, {{4, 44}, 
  444}, {{5, 11}, 511}, {{5, 22}, 522}, {{5, 33}, 533}, {{5, 44}, 
  544}}

Is this what you wanted?

Regards,

Neil

POSTED BY: Neil Singer

I don't think this is what I am looking for, as you can see in the attached images. I have my X values in one vector, my Y values in another vector and my f(x,y) values in a third; however, the function doesn't seem to work, since the output is just f(x,y). I would like to know why that code does not work, and also whether I can somehow limit the number of terms in the expansion so as not to get a polynomial of order x^{300000}.

Thanks for the answer. Jaime.

Attachment

Attachment

POSTED BY: Jaime de la Mota

Jaime,

do you really need an analytical expression for your data? I guess you would rather need ListInterpolation. Here is a minimal example:

data = RandomReal[{-1, 1}, {20, 30}];
xBorder = {0, 100};
yBorder = {-30, 50};
func = ListInterpolation[data, {xBorder, yBorder}];

Then you can use func like a "regular function", e.g. in:

Plot3D[func[x, y], {x, Sequence @@ xBorder}, {y, Sequence @@ yBorder}]

Does that help? Regards -- Henrik

POSTED BY: Henrik Schachner

Hello. First of all, thanks for your interest in this question and for your answer, but yes, I need the interpolating polynomial. I need it because I have to introduce this temperature into an optimization model constructed using AMPL. So far I have worked with token temperature distributions whose definition as f(x, y) is known. I have looked for a way to use the file as it is, but the optimization software needs an analytic expression, and the interpolating polynomial is the best I can get.

POSTED BY: Jaime de la Mota

... well, difficult! If your data are sufficiently "smooth", maybe you can greatly reduce the resolution.

POSTED BY: Henrik Schachner

Indeed, if there is no remedy I might be forced to do that, or to limit the scope of my problem to a much smaller area. However, my first issue remains: how can I write

InterpolatingPolynomial[{{{0, 0}, 1}, {{1, 0}, 7}, {{0, 1}, 10}, {{2, 1}, 40}, {{3, 3}, 151}, {{1, 2}, 47}}, {x, y}]

in a more efficient way? Can something like this be done?

InterpolatingPolynomial[{{{vectX, vectY}, vectT}}, {x, y}]

POSTED BY: Jaime de la Mota