Nonlinear fitting of reaction kinetics data with Kronecker delta and bootstrapping

6 Replies

Mathematica can solve the system with one condition: Prod[0] = 0

POSTED BY: Mariusz Iwaniuk

From the initial attempt to fit the data I gather that Ald is the benzaldehyde. The data has numerous y values for the same x values, though. Was this from multiple experiments? If so, it would perhaps make sense to average those y values.

Maybe more important is the question of what exactly is needed. Is it the DE solution? Or identification of the parameter value for kzero? If the latter, I would suggest using ParametricNDSolve, especially if there are known values for initial concentrations. One could proceed like so, using data from this post's notebook.

Average the values:

newdatabenz = 
 Map[{#[[1, 1]], Mean[#[[All, 2]]]} &, SplitBy[databenz, First]]

(* Out[118]= {{0., 0.0174805}, {361., 0.0102426}, {659., 
  0.006913}, {944., 0.00504}, {1241., 0.00391667}, {1528., 
  0.0037}, {1825., 0.00334333}} *)

Get a parametrized numeric ODE solution function for the aldehyde:

solP = ParametricNDSolveValue[{Ald'[t] == -kzero Ald[t] Hydr[t],
    Hydr'[t] == -kzero Ald[t] Hydr[t],
    Prod'[t] == kzero Ald[t] Hydr[t], Ald[0] == .026, Hydr[0] == .025,
     Prod[0] == 0}, Ald, {t, 0, 2000}, kzero];

Find a fit to the averaged data:

kzfit = 
 FindFit[newdatabenz, solP[kzero][t], {{kzero, .1}}, t]

(* Out[121]= {kzero -> 0.181871} *)

Plot the solution curve and data points:

kzval = kzero /. kzfit;
Show[ListPlot[newdatabenz, PlotStyle -> PointSize[0.025]], 
 Plot[solP[kzval][t], {t, 0, 1825}, PlotRange -> All],
 PlotRange -> All]

[Plot: averaged data points with the fitted solution curve]

Not bad except for the first point.

POSTED BY: Daniel Lichtblau

Thank you for your interest, and I apologize if the notebook is not fully self-explanatory. It was initially written on the assumption that everyone looking at it would have read the peer-reviewed publication mentioned in the abstract. I would be happy to provide the text of that publication to anyone who contacts me ( buhlmann@umn.edu ). My reply will probably make more sense to someone who has looked at that publication, but I will try to be as clear as I can.

As you saw, there were multiple sets of y values. You cannot meaningfully average those, though. The data are all from one experiment, but they are very different in character. As described in that publication, the data describe the reaction of one chemical reagent to a product, and the different sets of data are spectral features of the reacting reagent and the product compound. Averaging these is not meaningful. However, each set of data reflects the same reaction kinetics, and therefore they all depend on the same reaction constant, which I labelled kzero.

That really gives two different options. The first is to fit each data set separately, which for three data sets gives three different values for kzero, and then to apply the usual statistics to determine an average and a confidence interval for kzero. The disadvantage of this method is that the confidence interval is then determined with only 2 degrees of freedom, which does not reflect well that the original data contain many more data points. (Imagine, for example, what would happen if you had only two data sets available and made two separate fits: you would get two kzero values, and with only one degree of freedom the confidence interval for kzero would be huge, definitely not reflecting the large number of data points in the initial data.)

The second option (which I chose) is to fit all data in one fit using the Kronecker delta, which is really the main feature of this notebook. I think chemists underutilize the Kronecker delta. The question then is how to estimate the accuracy of the resulting kzero, and for that I used bootstrapping.
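For readers who have not seen the technique, here is a minimal sketch of a simultaneous fit with KroneckerDelta, combined with the ParametricNDSolveValue approach suggested earlier in the thread. It is not the notebook's code. It assumes that each data set is a concentration multiplied by its own scale factor (the names s1, s2, s3 and the helper sig are hypothetical), that signals 1 and 2 follow the aldehyde and signal 3 follows the product, and it uses synthetic data rather than the measured values.

```wolfram
(* Parametric solution for the aldehyde, same ODEs and initial values as in the reply above *)
ald = ParametricNDSolveValue[{Ald'[t] == -kzero Ald[t] Hydr[t],
    Hydr'[t] == -kzero Ald[t] Hydr[t],
    Ald[0] == .026, Hydr[0] == .025}, Ald, {t, 0, 2000}, kzero];

(* ASSUMPTION: signal i = scale factor s_i times a concentration.
   The product follows from mass balance: Prod[t] = Ald[0] - Ald[t]. *)
model = KroneckerDelta[i, 1] s1 ald[kzero][t] +
   KroneckerDelta[i, 2] s2 ald[kzero][t] +
   KroneckerDelta[i, 3] s3 (.026 - ald[kzero][t]);

(* Synthetic stand-in data, rows {i, t, y}; NOT the measurements from the notebook *)
SeedRandom[1];
times = {0., 361., 659., 944., 1241., 1528., 1825.};
sig[j_, tt_] := {1.0, 0.6, 0.8}[[j]] *
   If[j == 3, .026 - ald[0.18][tt], ald[0.18][tt]];
data = Flatten[
   Table[{j, tt, sig[j, tt] + RandomVariate[NormalDistribution[0, 0.0002]]},
    {j, 3}, {tt, times}], 1];

(* One fit over all three data sets: kzero is shared, the scale factors are not *)
fit = FindFit[data, model, {{kzero, .1}, {s1, 1}, {s2, 1}, {s3, 1}}, {i, t}]
```

The index i acts as a second independent variable: for each row only the term whose KroneckerDelta equals 1 contributes, so all rows constrain kzero while each scale factor is constrained only by its own data set.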

I will remark that accessing the article requires payment. I have no objection to that; I'm just noting that it poses a modest impediment to having Community readers read it along with this post. And the post comes close to being self-contained (which is good), other than, for me, some aspects I do not follow and will delve into next.

I'm still not entirely clear on what you want to do with the data. I now understand why averaging would be a bad idea. I confess I did not read too far down because what I saw early on left me confused. Let me explain why. Initially I saw uses of FindFit on the full data set. Only considerably later did the attempts at simultaneous fits appear (using KroneckerDelta). But my confusion hit earlier, so I missed that. So now we know we want simultaneous fits, with kzero constant across all data. Of course, for this to make sense, at least one parameter must vary between the different measurements. But they all seem to arise from the same set of ODEs, with the same initial conditions (since they are measurements from the same experiment). So this is my current point of confusion.

I agree that simultaneous fitting is likely underused, and I have no objection to the idea of using bootstrapping to estimate goodness-of-fit or similar. I'm just trying to figure out what might be needed to approach this in a way that avoids solving a difficult nonlinear system of equations if it is not absolutely necessary, and using ParametricNDSolve strikes me as a possible direction for this.
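As a rough illustration of how bootstrapping can sit on top of a ParametricNDSolveValue-based fit, the sketch below resamples the data points with replacement, refits kzero each time, and reads an interval off the resulting distribution. It uses a single synthetic data set and a hypothetical helper fitK; the number of resamples (200) and the noise level are arbitrary choices, and this is not necessarily the resampling scheme used in the notebook.

```wolfram
ald = ParametricNDSolveValue[{Ald'[t] == -kzero Ald[t] Hydr[t],
    Hydr'[t] == -kzero Ald[t] Hydr[t],
    Ald[0] == .026, Hydr[0] == .025}, Ald, {t, 0, 2000}, kzero];

(* Synthetic stand-in data {t, y}; NOT the measurements from the notebook *)
SeedRandom[2];
times = Range[0., 1800., 150.];
data = Table[{tt, ald[0.18][tt] + RandomVariate[NormalDistribution[0, 0.0003]]},
   {tt, times}];

(* Hypothetical helper: fit kzero to a list of {t, y} points *)
fitK[d_] := kzero /. FindFit[d, ald[kzero][t], {{kzero, .1}}, t];

(* Case resampling: draw as many points as the data set has, with replacement *)
boot = Table[fitK[RandomChoice[data, Length[data]]], {200}];

(* Point estimate and a 95% percentile interval for kzero *)
{fitK[data], Quantile[boot, {0.025, 0.975}]}
```

For a simultaneous fit of several data sets, one would presumably resample within each data set so that every resample still contains points for every signal.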

POSTED BY: Daniel Lichtblau

Thank you for your comments. In reply to your comment about accessibility, I modified my earlier reply to you with a remark stating that I would be happy to share the text of the original publication with anyone who contacts me ( buhlmann@umn.edu ). I stated in my earlier reply that the three different data sets that are fitted simultaneously represent different spectral features. Those who dig deep and look at that publication will see that these spectral features refer to NMR spectroscopy (nuclear magnetic resonance spectroscopy). For those not familiar with this technique, let me give an example that is different but follows a similar logic. Imagine a purple compound reacting in a chemical reaction to give a green product. Three different data sets would be the intensity of absorption of red and of blue light by the reacting compound, and the intensity of absorption of green light by the product. Averaging the different data sets would not make any sense, but as all three data sets represent the same reaction, they are all described by the same set of ODEs. And why is it preferable to solve the ODEs and not just make a fit with any function that gives a nice fit? The reaction constant, kzero, cannot be obtained from a fit with an arbitrary function. It can only be obtained by using a physically meaningful model, such as the one obtained by solving the ODEs. That reaction constant, kzero, is a very useful parameter, as it can be used to predict the speed of reaction in experiments that have not been performed, for example with different concentrations of the reacting compound. Hope that helps.
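To make the last point concrete, here is a small sketch of such a prediction. It treats the initial concentrations as additional parameters of the same ODE system and reuses the kzero value from the fit to the averaged data earlier in the thread purely as a placeholder. The parameter names ald0 and hydr0 and the starting concentrations in the table are hypothetical.

```wolfram
(* Same rate law, but with the initial concentrations exposed as parameters *)
pred = ParametricNDSolveValue[{Ald'[t] == -kzero Ald[t] Hydr[t],
    Hydr'[t] == -kzero Ald[t] Hydr[t],
    Ald[0] == ald0, Hydr[0] == hydr0},
   Ald, {t, 0, 2000}, {kzero, ald0, hydr0}];

kfit = 0.181871; (* placeholder: value reported for the averaged-data fit above *)

(* Fraction of aldehyde remaining at t = 1000 for hypothetical equal starting
   concentrations c0 of both reagents *)
Table[{c0, pred[kfit, c0, c0][1000]/c0}, {c0, {0.01, 0.025, 0.05}}]
```

For equal starting concentrations the rate law reduces to Ald' = -kzero Ald^2, whose solution gives a remaining fraction of 1/(1 + kzero c0 t), so the numerical values can be checked against that closed form.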

You have earned the Featured Contributor Badge. Your exceptional post has been selected for our editorial columns Staff Picks http://wolfr.am/StaffPicks and Publication Materials https://wolfr.am/PubMat, and your profile is now distinguished by a Featured Contributor Badge and is displayed on the Featured Contributor Board. Thank you!

POSTED BY: EDITORIAL BOARD