Use InterpolatingPolynomial with a lot of data?

Posted 6 years ago

I have a matrix of size 681*441, that is, 300321 points. The elements are satellite measurements of temperature in a geographical area, and the observations are equally spaced.

I also have a vector of size 681 and another of size 441 with the latitude and longitude values associated with each of these measurements. I want to construct an interpolating polynomial using the InterpolatingPolynomial function. However, there are two issues with this. The first is that I cannot write the X (longitude), Y (latitude) and T (f(x,y)) values by hand, as in this example:

InterpolatingPolynomial[{{{0, 0}, 1}, {{1, 0}, 7}, {{0, 1}, 
   10}, {{2, 1}, 40}, {{3, 3}, 151}, {{1, 2}, 47}}, {x, y}] 

Also, I am worried that, since there are so many points, the resulting expression will be too complicated for me to implement in optimization software.

Can someone please advise me on how to proceed? Thanks.

POSTED BY: Jaime de la Mota
19 Replies

Also, as Jim states in his response, in addition to the data, it would be useful to know more about the origin of the data, because that also helps guide the decision of how best to model it.

Regards

POSTED BY: Neil Singer
Posted 6 years ago

I think that you might be putting the cart before the horse. (I'm using that expression to give a hint at my age.)

If you are trying to obtain a prediction equation based on temperature (maybe predicted by elevation? spatial coordinates?, etc.), it would be helpful (although maybe not completely appropriate for this forum) to spell out the whole problem. Starting out with what to do with the predictor variable as you describe seems premature (again, at least for us who don't know the available data and objective).

And can you not define a function that just interpolates from the surrounding 4 grid points?
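A minimal sketch of what interpolating from the four surrounding grid points could look like (bilinear interpolation). The names `xs`, `ys`, `tmat` and `bilinear` are hypothetical stand-ins, the toy grid is made up, and the sketch assumes equally spaced coordinates as described in the question:

```mathematica
(* Hypothetical toy grid: xs, ys and tmat stand in for the real
   longitude, latitude and temperature data *)
xs = {0., 1., 2., 3.};
ys = {10., 20., 30.};
(* tmat[[i, j]] is the value at (xs[[i]], ys[[j]]) *)
tmat = Table[2 u + 0.5 v, {u, xs}, {v, ys}];

(* Bilinear interpolation from the four grid points surrounding (x, y);
   assumes equally spaced xs and ys *)
bilinear[x_?NumericQ, y_?NumericQ] := Module[{dx, dy, i, j, s, t},
  dx = xs[[2]] - xs[[1]];
  dy = ys[[2]] - ys[[1]];
  (* index of the lower-left corner, clipped so i + 1 and j + 1 stay in range *)
  i = Clip[Floor[(x - xs[[1]])/dx] + 1, {1, Length[xs] - 1}];
  j = Clip[Floor[(y - ys[[1]])/dy] + 1, {1, Length[ys] - 1}];
  (* fractional position inside the cell *)
  s = (x - xs[[i]])/dx;
  t = (y - ys[[j]])/dy;
  (1 - s) (1 - t) tmat[[i, j]] + s (1 - t) tmat[[i + 1, j]] +
   (1 - s) t tmat[[i, j + 1]] + s t tmat[[i + 1, j + 1]]]

bilinear[1.5, 25.]
```

Inside each grid cell the result is a polynomial of the form a + b x + c y + d x y, but the coefficients change from cell to cell, so this is a piecewise expression rather than one global formula.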

Finally, when faced with too much data: Sample! This is what the field of statistics is all about. That way you can get a manageable set of calculations without sacrificing the fineness of your spatial grid of data.

POSTED BY: Jim Baldwin

Hello Jim.

I am using AMPL to try to optimize a route. I have a model in which the speed in the X and Y directions depends on the temperature. AMPL, the software that I am using, needs to be fed the value of T at each point of the optimized route. In the past I have used a known distribution, i.e. (1/(x[i]+y[i])^2), so that the value of T was known everywhere. Now I have a file with "real" data. I want to find the interpolating polynomial so that I can feed AMPL an expression like the one above.

Regards. Jaime.

POSTED BY: Jaime de la Mota

... or LinearModelFit if it's linear...

POSTED BY: Neil Singer
Posted 6 years ago

Hello again. I have been testing relatively simple models using NonlinearModelFit; the results are unfortunately not what I am looking for. I am testing the models in question as

ModelSolution["AdjustedRSquared"]

where ModelSolution = NonlinearModelFit[AllData //. {x_List} :> x, a x + b x^2 + c y + d y^2 + e (x*y), {a, b, c, d, e}, {y, x}], and so on.

The R squared coefficient that I am obtaining doesn't go over 0.7, and I would like a higher value. Can you offer any advice on how to proceed?

Regards. Jaime.

POSTED BY: Updating Name

Jaime,

You should probably post your data; otherwise it is impossible to say what is wrong. Maybe it needs a different function prototype to fit it. Maybe, as Jim states below, the data needs to be sampled (with appropriate smoothing/interpolation). Maybe the data is so noisy that it is problematic. It's hard to say in the abstract.

To get you started, I modified an example from the documentation. You can plot your prototype curve fit and your data and visually see what is going on to try to gain some insight as to what may be wrong.

model=aa Exp[-bb ((x-x0)^2+(y-y0)^2)];
data=MapThread[{#1[[1]],#1[[2]],1.2 Exp[-34((#1-.56).(#1-.56))]+#2}&,{RandomReal[1,{100,2}],RandomReal[{-.1,.1},100]}];
fit=NonlinearModelFit[data,model,{aa,bb,{x0, 0.5},{y0,0.6}},{x,y}];
Show[Plot3D[fit["BestFit"],{x,0,1},{y,0,1},PlotRange->All],ListPointPlot3D[data,PlotStyle->Directive[PointSize[Medium],Red]]]

Regards,

Neil

POSTED BY: Neil Singer

Hello Neil. The data comes from a weather model. However, I cannot post it directly, since my boss doesn't allow me. Instead I can post a plot of said data.

My goal is to go from the point a=(-3, 40.25) to the point b=(-74, 40.5) in the most efficient way. I have trimmed most of the points, since they were very far from the region of interest. I now have "only" 42441 points. By doing so, my R^2 goes up to 79%. I'd want it to be at least 90%.

I am working with the model data directly; it isn't smoothed.

As for the fit, see the attached file for my model.

Sorry I can't be more specific.

Regards.

Jaime.

POSTED BY: Jaime de la Mota
Posted 6 years ago

Setting one's goal with respect to $R^2$ is not recommended as $R^2$ is a measure of how much of the variability is explained but maybe there's a lot or a little to explain. (Sometimes an $R^2$ of 0.99 is not adequate.)

Using `NLMXB["EstimatedVariance"]^0.5` (i.e., the root mean square error, using your notation) gives an estimate of precision in the units being predicted, which is more understandable and absolute.
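As an illustration of reading the root mean square error next to the adjusted $R^2$, here is a small sketch on synthetic data. The data, the planar model and the names `pts`, `data`, `fit` and `rmse` are made up for the example and are not the model from this thread:

```mathematica
(* Synthetic data for illustration only: a plane plus uniform noise in [-0.1, 0.1] *)
SeedRandom[1];
pts = RandomReal[1, {200, 2}];
data = {#[[1]], #[[2]],
     1 + 2 #[[1]] - 3 #[[2]] + RandomReal[{-0.1, 0.1}]} & /@ pts;

fit = NonlinearModelFit[data, a + b x + c y, {a, b, c}, {x, y}];

(* estimated standard deviation of the residuals, in the units of the response *)
rmse = Sqrt[fit["EstimatedVariance"]]

(* for comparison: the unitless adjusted R^2 *)
fit["AdjustedRSquared"]
```

Here `rmse` should come out close to the standard deviation of the added noise, which can be compared directly with the accuracy needed in the predicted quantity.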

POSTED BY: Jim Baldwin

Thanks for the advice. Finally I have achieved an R^2 of 92%, but I was writing

NLMX1R = NonlinearModelFit[TX1R //. {x_List} :> x, 
  a *x + b*x^2 + c*x^3 + d*x^4 + e*x^5 + f*x^6 + g*y + h*y^2 + i*y^3 +
    j*(x*y) + k (x^2*y) + l*(x*y^2) + m (x*y)^2 + n*(x^3*y) + 
   o*(y^3*x) + p (x^3*y^2) + q (x^2 + y^3) + r (x*y)^3, {a, b, c, d, 
   e, f, g, h, i, j, k, l, m, n, o, p, q, r}, {y, x}]
Normal[%]
NLMX1R["AdjustedRSquared"]

This would probably cause problems in the optimization software. I will try to learn how to interpret this coefficient.

POSTED BY: Jaime de la Mota
POSTED BY: Neil Singer

Thank you very much Neil. I will study your proposed solution at once.

POSTED BY: Jaime de la Mota

Keep in mind that an interpolating polynomial will be of very high degree, which means it could wiggle considerably in regions where you would prefer it not do that, and it will almost certainly give (huge) garbage if you go at all outside the rectangle (or other region) in which the points lie.

There are many ways to improve on this (radial basis methods, splines, local interpolating polynomials, convex combinations from neighboring values,...). But none of these comprise an "analytic" expression. Which leads me to suspect you are dealing with a limitation that will not be able to deliver sound results. I apologize for the pessimism. But I really doubt a huge interpolating polynomial will work well in the setting you describe.

POSTED BY: Daniel Lichtblau

I suspected so. That is why I would like to know if there is any way to limit the order of the polynomial to third or fourth order. I don't think I am overlooking anything in my approach to the problem, but still. On one occasion I solved a problem where the wind speed went as dx/dt = constant*y^3 + thrust; I had to write it as:

var f1 {i in N} = w_x*x2[i]^3+u1[i];

DOTX1 {i in N1}: x1[i+1] = x1[i] + (1/6)*step*(f1[i] + 4*midf1[i] + f1[i+1]);

where u1 is the engine thrust. I don't know any way to do this without having the analytic expression of the wind (in the example above, y^3) or the temperature (in my current problem).

That is why I need the polynomial in question or, at least, a reasonable approximation.
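One way to keep the order at three or four is to give up exact interpolation and fit a low-degree polynomial by least squares instead. The following is only a sketch on synthetic data; `data`, `deg`, `basis`, `fit` and `poly` are hypothetical names, and whether a cubic is accurate enough for real temperature data has to be checked against the residuals:

```mathematica
(* Synthetic stand-in for the {x, y, T} triples *)
SeedRandom[2];
data = Flatten[
   Table[{u, v, Sin[u] Cos[v] + RandomReal[{-0.01, 0.01}]},
    {u, 0., 2., 0.1}, {v, 0., 2., 0.1}], 1];

(* all monomials x^i y^j with total degree 1 <= i + j <= deg;
   LinearModelFit adds the constant term by default *)
deg = 3;
basis = Flatten[Table[x^i y^(k - i), {k, 1, deg}, {i, 0, k}]];

fit = LinearModelFit[data, basis, {x, y}];

(* the fitted polynomial as an explicit expression in x and y *)
poly = Expand[fit["BestFit"]]

(* residual standard deviation, in the units of T *)
Sqrt[fit["EstimatedVariance"]]
```

The result is a single expression with ten coefficients that can be retyped in another tool. It does not pass through the data points, so the residual standard deviation is the number to watch when choosing `deg`.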

POSTED BY: Jaime de la Mota

Jaime,

Based on your last response, it seems that your question is about creating the right structure rather than reducing the computation. If that is the case, then you can restructure your data as follows:

In[1]:= xvect = {1, 2, 3, 4, 5};
yvect = {11, 22, 33, 44};
tmat = {{111, 122, 133, 144}, {211, 222, 233, 244}, {311, 322, 333, 
    344}, {411, 422, 433, 444}, {511, 522, 533, 544}};

In[4]:= ts = Flatten[tmat]

Out[4]= {111, 122, 133, 144, 211, 222, 233, 244, 311, 322, 333, 344, \
411, 422, 433, 444, 511, 522, 533, 544}

In[5]:= xys = Tuples[{xvect, yvect}]

Out[5]= {{1, 11}, {1, 22}, {1, 33}, {1, 44}, {2, 11}, {2, 22}, {2, 
  33}, {2, 44}, {3, 11}, {3, 22}, {3, 33}, {3, 44}, {4, 11}, {4, 
  22}, {4, 33}, {4, 44}, {5, 11}, {5, 22}, {5, 33}, {5, 44}}

In[6]:= data = MapThread[{#1, #2} &, {xys, ts}]

Out[6]= {{{1, 11}, 111}, {{1, 22}, 122}, {{1, 33}, 133}, {{1, 44}, 
  144}, {{2, 11}, 211}, {{2, 22}, 222}, {{2, 33}, 233}, {{2, 44}, 
  244}, {{3, 11}, 311}, {{3, 22}, 322}, {{3, 33}, 333}, {{3, 44}, 
  344}, {{4, 11}, 411}, {{4, 22}, 422}, {{4, 33}, 433}, {{4, 44}, 
  444}, {{5, 11}, 511}, {{5, 22}, 522}, {{5, 33}, 533}, {{5, 44}, 
  544}}

Is this what you wanted?

Regards,

Neil

POSTED BY: Neil Singer

I don't think this is what I am looking for, as you can see in the attached images. I have my X values in a vector, my Y values in another vector and my f(x,y) in a third; however, the function doesn't seem to work, since the output is f(x,y). I would like to know why that code does not work, and also whether I can somehow limit the number of terms in the expansion so that I don't get a polynomial of order x^{300000}.

Thanks for the answer. Jaime.

Attachment

Attachment

POSTED BY: Jaime de la Mota

Jaime,

do you really need an analytical expression for your data? I guess you would rather need ListInterpolation. Here is a minimal example:

data = RandomReal[{-1, 1}, {20, 30}];
xBorder = {0, 100};
yBorder = {-30, 50};
func = ListInterpolation[data, {xBorder, yBorder}];

Then you can use func like a "regular function", e.g. in:

Plot3D[func[x, y], {x, Sequence @@ xBorder}, {y, Sequence @@ yBorder}]

Does that help? Regards -- Henrik

POSTED BY: Henrik Schachner

Hello. First of all, thanks for your interest in this question and for your answer, but yes, I need the interpolating polynomial. I need it because I have to introduce this temperature into an optimization model built in AMPL. So far I have worked with token temperature distributions whose definition as f(x, y) is known. I have looked for a way to use the file as it is, but the optimization software needs an analytic expression, and the interpolating polynomial is the best I can get.

POSTED BY: Jaime de la Mota

... well, difficult! If your data are sufficiently "smooth", maybe you can greatly reduce the resolution.
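A sketch of one simple way to reduce the resolution, keeping every n-th row and column of the grid. The matrix `tmat` below is a made-up stand-in with the dimensions given in the question, and `step` is a hypothetical parameter; real data may need smoothing before thinning:

```mathematica
(* Hypothetical stand-in for the 681 x 441 temperature matrix *)
tmat = Table[Sin[i/100.] + Cos[j/80.], {i, 681}, {j, 441}];

(* keep every 10th row and every 10th column *)
step = 10;
coarse = tmat[[1 ;; -1 ;; step, 1 ;; -1 ;; step]];

Dimensions[coarse]
```

The same spans, applied to the coordinate vectors, pick out the latitudes and longitudes that belong to the retained rows and columns.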

POSTED BY: Henrik Schachner

Indeed, if there is no remedy I might be forced to do that or try to limit the scope of my problem to a way smaller area. However, my first issue remains: How can I write

InterpolatingPolynomial[{{{0, 0}, 1}, {{1, 0}, 7}, {{0, 1}, 10}, {{2, 1}, 40}, {{3, 3}, 151}, {{1, 2}, 47}}, {x, y}]

in a more efficient way? Can something like this be done?

InterpolatingPolynomial[{{{vectX, vectY}, vectT}}, {x, y}]
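The form with `{{{vectX, vectY}, vectT}}` hands the function a single "point" whose coordinates are whole lists, which is not the nested structure used in the documentation example. A minimal sketch of building that structure from three flat vectors of equal length follows; the small vectors are the documentation example split into columns, and `data` and `poly` are illustrative names:

```mathematica
(* Flat vectors of equal length, one entry per measurement *)
vectX = {0, 1, 0, 2, 3, 1};
vectY = {0, 0, 1, 1, 3, 2};
vectT = {1, 7, 10, 40, 151, 47};

(* build {{{x1, y1}, t1}, {{x2, y2}, t2}, ...} *)
data = MapThread[{{#1, #2}, #3} &, {vectX, vectY, vectT}];

poly = InterpolatingPolynomial[data, {x, y}]
```

If the values are stored as a matrix over a grid, the matrix has to be flattened first and the two coordinate vectors expanded to one pair per grid point, in the same order as the flattened values.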

POSTED BY: Jaime de la Mota