WOLFRAM COMMUNITY

GROUPS:

Control Systems Data Science Engineering Mathematics Algebra Wolfram Language

Use InterpolatingPolynomial with a lot of data?

Jaime de la Mota

Posted 6 years ago

I have a matrix of size 681 × 441, that is, 300321 points. The elements are satellite measurements of temperature over a geographical area, and the observations are equally spaced. I also have a vector of size 681 and another of size 441 with the latitude and longitude values associated with those measurements. I want to construct an interpolating polynomial using InterpolatingPolynomial. However, there are two issues. The first is that I cannot write out by hand the X (longitude), Y (latitude) and T (f(x,y)) values as in this example:

InterpolatingPolynomial[{{{0, 0}, 1}, {{1, 0}, 7}, {{0, 1}, 10}, {{2, 1}, 40}, {{3, 3}, 151}, {{1, 2}, 47}}, {x, y}]

The second is that, since there are so many points, I am worried that the resulting expression will be too complicated for me to implement in optimization software. Can someone please advise me on how to proceed? Thanks.

POSTED BY: Jaime de la Mota

19 Replies

Neil Singer

Neil Singer, AC Kinetics, Inc.

Posted 6 years ago

Also, as Jim states in his response, in addition to the data, it would be useful to know more about the origin of the data, because that also helps guide the underlying decision of how best to model it. Regards

POSTED BY: Neil Singer

Jim Baldwin

Posted 6 years ago

I think that you might be putting the cart before the horse. (I'm using that expression to give a hint at my age.) If you are trying to obtain a prediction equation based on temperature (maybe predicted by elevation? spatial coordinates?, etc.), it would be helpful (although maybe not completely appropriate for this forum) to spell out the whole problem. Starting out with what to do with the predictor variable as you describe seems premature (again, at least for us who don't know the available data and objective). And can you not define a function that just interpolates from the surrounding 4 grid points? Finally, when faced with too much data: Sample! This is what the field of statistics is all about. That way you can get a manageable set of calculations without sacrificing the fineness of your spatial grid of data.

POSTED BY: Jim Baldwin
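
The two suggestions in the reply above (interpolating from the surrounding four grid points, and sampling) can be sketched roughly as follows. This is only an illustration on a small synthetic grid: the names lons, lats, temps, tInterp and tempsCoarse, and the linear temperature field, are assumptions standing in for the real data. ListInterpolation with InterpolationOrder -> 1 interpolates linearly in each direction between neighboring grid points, and taking every 5th row and column is one simple way to thin a regular grid.

```wolfram
(* Synthetic stand-in for the real data; all names here are hypothetical *)
lons = Subdivide[-10., 10., 20];   (* 21 equally spaced values *)
lats = Subdivide[30., 45., 30];    (* 31 equally spaced values *)
temps = Table[280. + 0.5 x - 0.2 y, {x, lons}, {y, lats}];

(* Piecewise-linear interpolation from the surrounding grid points *)
tInterp = ListInterpolation[temps, {lons, lats}, InterpolationOrder -> 1];
tInterp[0.3, 37.2]

(* Thinning: keep every 5th row and every 5th column of the grid *)
tempsCoarse = temps[[1 ;; -1 ;; 5, 1 ;; -1 ;; 5]];
Dimensions[tempsCoarse]
```

The result of ListInterpolation is an InterpolatingFunction, which can be evaluated and plotted inside Mathematica but is not a closed-form expression that could be pasted into another tool.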

Jaime de la Mota

Posted 6 years ago

Hello Jim. I am using AMPL to try to optimize a route. I have a model in which the speed in the X and Y directions depends on the temperature. AMPL, the software that I am using, needs to be fed the value of T at each point of the optimized route. In the past I have used a known distribution, i.e. 1/(x[i]+y[i])^2, so that the value of T was known everywhere. Now I have a file with "real" data. I want to find the interpolating polynomial so that I can feed AMPL an expression like the one above. Regards. Jaime.

POSTED BY: Jaime de la Mota

Neil Singer

Neil Singer, AC Kinetics, Inc.

Posted 6 years ago

... or LinearModelFit if it's linear...

POSTED BY: Neil Singer

Updating Name

Posted 6 years ago

Hello again. I have been testing relatively simple models using NonlinearModelFit; unfortunately, the results are not what I am looking for. I am fitting models such as

ModelSolution = NonlinearModelFit[AllData //. {x_List} :> x, a x + b x^2 + c y + d y^2 + e (x*y), {a, b, c, d, e}, {y, x}]

and so on, and evaluating each one with ModelSolution["AdjustedRSquared"]. The R squared coefficient that I obtain does not go above 0.7, and I would like a higher value. Can you offer any advice on how to proceed? Regards. Jaime.

POSTED BY: Updating Name

Neil Singer

Neil Singer, AC Kinetics, Inc.

Posted 6 years ago

Jaime,

You should probably post your data; otherwise it is impossible to say what is wrong. Maybe it needs a different function prototype to fit it. Maybe, as Jim states below, the data needs to be sampled (with appropriate smoothing/interpolation). Maybe the data is so noisy that it is problematic. It's hard to say in the abstract.

To get you started, I modified an example from the documentation. You can plot your prototype curve fit together with your data and see visually what is going on, to try to gain some insight into what may be wrong.

model = aa Exp[-bb ((x - x0)^2 + (y - y0)^2)];
data = MapThread[{#1[[1]], #1[[2]], 1.2 Exp[-34 ((#1 - .56).(#1 - .56))] + #2} &, {RandomReal[1, {100, 2}], RandomReal[{-.1, .1}, 100]}];
fit = NonlinearModelFit[data, model, {aa, bb, {x0, 0.5}, {y0, 0.6}}, {x, y}];
Show[Plot3D[fit["BestFit"], {x, 0, 1}, {y, 0, 1}, PlotRange -> All], ListPointPlot3D[data, PlotStyle -> Directive[PointSize[Medium], Red]]]

Regards,

Neil

POSTED BY: Neil Singer

Jaime de la Mota

Posted 6 years ago

Hello Neil. The data comes from a weather model. However, I cannot post it directly, since my boss does not allow it. Instead I can post a plot of the data. My goal is to go from the point a=(-3, 40.25) to the point b=(-74, 40.5) in the most efficient way. I have trimmed most of the points, since they were very far away from the region of interest. I now have "only" 42441 points. By doing so, my R^2 goes up to 79%. I would like it to be at least 90%. I am working with the model data directly; it isn't smoothed. As for the fit, see the attached file for my model. Sorry I can't be more specific. Regards. Jaime. Attachments: model_fit.nb

POSTED BY: Jaime de la Mota

Jim Baldwin

Posted 6 years ago

Setting one's goal with respect to $R^2$ is not recommended, as $R^2$ is a measure of how much of the variability is explained, but there may be a lot or a little to explain. (Sometimes an $R^2$ of 0.99 is not adequate.) Using `NLMX`B["EstimatedVariance"]^0.5` (i.e., the root mean square error, using your notation) gives an estimate of precision in the units being predicted, which is more understandable and absolute.

POSTED BY: Jim Baldwin
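
As a small illustration of the point above, the sketch below fits a plane to synthetic noisy data and reports both the root mean square error and the adjusted R squared. The data, the noise level of 0.1 and the names data, fit and rmse are assumptions made for the example; "EstimatedVariance" and "AdjustedRSquared" are standard properties of fitted models.

```wolfram
SeedRandom[1];

(* Synthetic {x, y, T} triples: a plane plus Gaussian noise with standard deviation 0.1 *)
data = Flatten[
   Table[{x, y, 1. + 2. x - 3. y + RandomVariate[NormalDistribution[0, 0.1]]},
    {x, 0., 1., 0.1}, {y, 0., 1., 0.1}], 1];

fit = LinearModelFit[data, {x, y}, {x, y}];

(* Root mean square error, in the same units as T *)
rmse = Sqrt[fit["EstimatedVariance"]]

(* Relative measure: the share of the variability that is explained *)
fit["AdjustedRSquared"]
```

Here rmse should come out close to the noise level put into the data, which makes it easy to judge against the precision the application needs.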

Jaime de la Mota

Posted 6 years ago

Thanks for the advice. I have finally achieved an R^2 of 92%, but to get there I was writing

NLMX1R = NonlinearModelFit[TX1R //. {x_List} :> x, 
  a *x + b*x^2 + c*x^3 + d*x^4 + e*x^5 + f*x^6 + g*y + h*y^2 + i*y^3 +
    j*(x*y) + k (x^2*y) + l*(x*y^2) + m (x*y)^2 + n*(x^3*y) + 
   o*(y^3*x) + p (x^3*y^2) + q (x^2 + y^3) + r (x*y)^3, {a, b, c, d, 
   e, f, g, h, i, j, k, l, m, n, o, p, q, r}, {y, x}]
Normal[%]
NLMX1R["AdjustedRSquared"]

This would probably cause problems in the optimization software. I will try to learn how to interpret this coefficient.

POSTED BY: Jaime de la Mota

Neil Singer

Neil Singer, AC Kinetics, Inc.

Posted 6 years ago

Jaime,

You did not indicate that your x and y vectors were the full length of the temperature vector. In that case the formatting is even easier:

In[1]:= xvectl = {1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5};
yvectl = {1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5};

In[2]:= data = MapThread[{{#1, #2}, #3} &, {xvectl, yvectl, ts}]

Out[2]= {{{1, 1}, 111}, {{2, 1}, 122}, {{3, 1}, 133}, {{4, 1}, 144}, {{5, 2}, 211}, {{1, 2}, 222}, {{2, 2}, 233}, {{3, 2}, 244}, {{4, 3}, 311}, {{5, 3}, 322}, {{1, 3}, 333}, {{2, 3}, 344}, {{3, 4}, 411}, {{4, 4}, 422}, {{5, 4}, 433}, {{1, 4}, 444}, {{2, 5}, 511}, {{3, 5}, 522}, {{4, 5}, 533}, {{5, 5}, 544}}

However, Daniel is correct: since the polynomial is guaranteed to go through every point, your interpolation will almost certainly be useless, if MMA can even generate an answer at all.

My suggestion is to use NonlinearModelFit (or its related function, FindFit). With these you can specify a function of any order and find the best fit. You can graph your data and the fitted function together and iterate until you are happy with the fitted function. You will then have a lower-order "best fit" to the data. In that case you need to change my code above, because the fit functions take a list of {x, y, z} triples:

data = Transpose[{xvectl, yvectl, ts}]

{{1, 1, 111}, {2, 1, 122}, {3, 1, 133}, {4, 1, 144}, {5, 2, 211}, {1, 2, 222}, {2, 2, 233}, {3, 2, 244}, {4, 3, 311}, {5, 3, 322}, {1, 3, 333}, {2, 3, 344}, {3, 4, 411}, {4, 4, 422}, {5, 4, 433}, {1, 4, 444}, {2, 5, 511}, {3, 5, 522}, {4, 5, 533}, {5, 5, 544}}

You can also experiment with FindFormula, but I am not sure how it behaves with large amounts of data.

I hope this helps.

Regards,

Neil

POSTED BY: Neil Singer

Jaime de la Mota

Posted 6 years ago

Thank you very much Neil. I will study your proposed solution at once.

POSTED BY: Jaime de la Mota

Daniel Lichtblau

Daniel Lichtblau, Wolfram Research

Posted 6 years ago

Keep in mind that an interpolating polynomial will be of very high degree, which means it could wiggle considerably in regions where you would prefer it not to, and it will almost certainly give (huge) garbage if you go at all outside the rectangle (or other region) in which the points lie. There are many ways to improve on this (radial basis methods, splines, local interpolating polynomials, convex combinations from neighboring values, ...). But none of these yields an "analytic" expression. This leads me to suspect that you are dealing with a limitation that will not be able to deliver sound results. I apologize for the pessimism. But I really doubt a huge interpolating polynomial will work well in the setting you describe.

POSTED BY: Daniel Lichtblau
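
The wiggling described above can be seen in one dimension with the classic Runge example. This is a sketch on a standard textbook function, not on the temperature data; the names runge, nodes and poly are chosen for the illustration.

```wolfram
(* A smooth test function sampled at 11 equally spaced points on [-1, 1] *)
runge[x_] := 1/(1 + 25 x^2);
nodes = Subdivide[-1, 1, 10];

(* Degree-10 polynomial through all 11 points *)
poly = InterpolatingPolynomial[Transpose[{nodes, runge /@ nodes}], x];

(* The polynomial reproduces the data at the nodes ... *)
Max[Abs[(poly /. x -> nodes) - runge /@ nodes]]

(* ... but is far from the function between nodes near the ends of the interval *)
N[Abs[(poly /. x -> 19/20) - runge[19/20]]]

Plot[{runge[x], poly}, {x, -1, 1}, PlotRange -> All]
```

With only 11 points the deviation near the edges is already larger than the function's own maximum value, and the effect generally gets worse as more equally spaced points are added.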

Jaime de la Mota

Posted 6 years ago

I suspected as much. That is why I would like to know if there is any way to limit the polynomial to third or fourth order. I don't think I am overlooking anything in my approach to the problem, but still. On one occasion I solved a problem where the wind speed went as dx/dt = constant*y^3 + thrust; I had to write it as:

var f1 {i in N} = w_x*x2[i]^3 + u1[i];
DOTX1 {i in N1}: x1[i+1] = x1[i] + (1/6)*step*(f1[i] + 4*midf1[i] + f1[i+1]);

where u1 is the engine thrust. I don't know any way to do this without having the analytic expression of the wind (in the example above, y^3) or of the temperature (in my current problem). That is why I need the polynomial in question or, at least, a reasonable approximation.

POSTED BY: Jaime de la Mota
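
One possible way to limit the order, sketched here under assumptions, is a least-squares fit over all monomials up to a chosen total degree rather than an exact interpolation. The synthetic data and the names data, deg, basis, fit and poly are hypothetical; the real {x, y, T} triples would take the place of data.

```wolfram
(* Synthetic {x, y, T} triples; replace with the real data *)
data = Flatten[
   Table[{x, y, 1. + x^2 y - 0.5 y^3}, {x, -1., 1., 0.25}, {y, -1., 1., 0.25}], 1];

(* All monomials x^i y^j with 1 <= i + j <= deg; the constant term is added by LinearModelFit *)
deg = 3;
basis = DeleteCases[Flatten[Table[x^i y^j, {i, 0, deg}, {j, 0, deg - i}]], 1];

fit = LinearModelFit[data, basis, {x, y}];

(* The fitted polynomial as an ordinary expression in x and y *)
poly = Normal[fit]
```

Because the model is linear in its coefficients, LinearModelFit is enough here and no starting values are needed. Raising deg to 4 adds five more terms. The expression can be converted to text with, for example, ToString[poly, InputForm]; whether the optimization software accepts that text unchanged would need to be checked.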

Neil Singer

Neil Singer, AC Kinetics, Inc.

Posted 6 years ago

Jaime,

Based on your last response, it seems that your question is one of creating the right structure and not of trying to reduce the computation. If that is the case, then you can restructure your data as follows:

In[1]:= xvect = {1, 2, 3, 4, 5};
yvect = {11, 22, 33, 44};
tmat = {{111, 122, 133, 144}, {211, 222, 233, 244}, {311, 322, 333, 
    344}, {411, 422, 433, 444}, {511, 522, 533, 544}};

In[4]:= ts = Flatten[tmat]

Out[4]= {111, 122, 133, 144, 211, 222, 233, 244, 311, 322, 333, 344, \
411, 422, 433, 444, 511, 522, 533, 544}

In[5]:= xys = Tuples[{xvect, yvect}]

Out[5]= {{1, 11}, {1, 22}, {1, 33}, {1, 44}, {2, 11}, {2, 22}, {2, 
  33}, {2, 44}, {3, 11}, {3, 22}, {3, 33}, {3, 44}, {4, 11}, {4, 
  22}, {4, 33}, {4, 44}, {5, 11}, {5, 22}, {5, 33}, {5, 44}}

In[6]:= data = MapThread[{#1, #2} &, {xys, ts}]

Out[6]= {{{1, 11}, 111}, {{1, 22}, 122}, {{1, 33}, 133}, {{1, 44}, 
  144}, {{2, 11}, 211}, {{2, 22}, 222}, {{2, 33}, 233}, {{2, 44}, 
  244}, {{3, 11}, 311}, {{3, 22}, 322}, {{3, 33}, 333}, {{3, 44}, 
  344}, {{4, 11}, 411}, {{4, 22}, 422}, {{4, 33}, 433}, {{4, 44}, 
  444}, {{5, 11}, 511}, {{5, 22}, 522}, {{5, 33}, 533}, {{5, 44}, 
  544}}

Is this what you wanted?

Regards,

Neil

POSTED BY: Neil Singer

Jaime de la Mota

Posted 6 years ago

I don't think this is what I am looking for, as you can see in the attached images. I have my X values in one vector, my Y values in another and my f(x,y) values in a third; however, the function doesn't seem to work, since the output is just f(x,y). I would like to know why that code does not work, and also whether I can somehow limit the number of terms in the expansion so that I do not end up with a polynomial of order x^{300000}. Thanks for the answer. Jaime. Attachments: mathematica1.png mathematica2.png

POSTED BY: Jaime de la Mota

Henrik Schachner

Henrik Schachner, Radiation Therapy Center, Weilheim, Germany

Posted 6 years ago

Jaime,

do you really need an analytical expression for your data? I guess you would rather need `ListInterpolation`. Here is a minimal example:

data = RandomReal[{-1, 1}, {20, 30}];
xBorder = {0, 100};
yBorder = {-30, 50};
func = ListInterpolation[data, {xBorder, yBorder}];

Then you can use `func` like a "regular function", e.g. in:

Plot3D[func[x, y], {x, Sequence @@ xBorder}, {y, Sequence @@ yBorder}]

Does that help?

Regards -- Henrik

POSTED BY: Henrik Schachner

Jaime de la Mota

Posted 6 years ago

Hello. First of all, thanks for your interest in this question and for your answer, but yes, I need the interpolating polynomial. I need it because I have to introduce this temperature into an optimization model built in AMPL. So far I have worked with token temperature distributions whose definition as f(x, y) is known. I have looked for a way to use the file as it is, but the optimization software needs an analytic expression, and the interpolating polynomial is the best I can get.

POSTED BY: Jaime de la Mota

Henrik Schachner

Henrik Schachner, Radiation Therapy Center, Weilheim, Germany

Posted 6 years ago

... well, difficult! If your data are sufficiently "smooth", maybe you can greatly reduce the resolution.

POSTED BY: Henrik Schachner

Jaime de la Mota

Posted 6 years ago

Indeed, if there is no remedy I might be forced to do that, or to limit the scope of my problem to a much smaller area. However, my first issue remains. How can I write

InterpolatingPolynomial[{{{0, 0}, 1}, {{1, 0}, 7}, {{0, 1}, 10}, {{2, 1}, 40}, {{3, 3}, 151}, {{1, 2}, 47}}, {x, y}]

in a more efficient way? Can something like this be done?

InterpolatingPolynomial[{{{vectX, vectY}, vectT}}, {x, y}]

POSTED BY: Jaime de la Mota
