# Message Boards

Answer
(Unmark)

GROUPS:

11

# Statistical age determination of tree rings

Posted 3 months ago

Statistical age determination of tree rings Dr. Martin Ricker mricker@ib.unam.mx, martin_tuxtlas@yahoo.com.mx Out[]= Sun 17 Jan 2021 20:27:54
Introduction: Here I will explain the details of a notebook that I developed to carry out the algorithm and calculations for the following article: The Ricker, M., G. Gutiérrez-García, D. Juárez-Guerrero, & M.E.K. Evans. 2020. Statistical age determination of tree rings. PLoS ONE 15(9): e0239052 (20 pages). https://doi.org/10.1371/journal.pone.0239052 To provide the big picture of what this is about, it is useful to read first the article´s abstract: Dendrochronology, the study of annual rings formed by trees and woody plants, has important applications in research of climate and environmental phenomena of the past. Since its inception in the late 19th century, dendrochronology has not had a way to quantify uncertainty about the years assigned to each ring (dating). There are, however, many woody species and sites where it is difficult or impossible to delimit annual ring boundaries and verify them with crossdating, especially in the lowland tropics. Rather than ignoring dating uncertainty or discarding such samples as useless, we present for the first time a probabilistic approach to assign expected ages with a confidence interval. It is proven that the cumulative age in a tree-ring time series advances by an amount equal to the probability that a putative growth boundary is truly annual. Confidence curves for the tree stem radius as a function of uncertain ages are determined. A sensitivity analysis shows the effect of uncertainty of the probability that a recognizable boundary is annual, as well as of the number of expected missing boundaries. Furthermore, we derive a probabilistic version of the mean sensitivity of a dendrochronological time series, which quantifies a tree’s sensitivity to environmental variation over time, as well as probabilistic versions of the autocorrelation and process standard deviation. A computer code in Mathematica is provided, with sample input files, as supporting information. Further research is necessary to analyze frequency patterns of false and missing boundaries for different species and sites.
The input: The notebook imports an Excel file (attached below) of type “csv” (“comma delimited”); this type is automatically a single-sheet file, and the Mathematica import of this format works better for large files than the “xlsx” type. The name of our input file is “S2 Alchornea Input.csv”, where “S2” refers to the second supplementary file, and “Alchornea” is the genus name of the tree, whose data was analyzed. The input file is imported from the directory of the Mathematica notebook: In[]:= SetDirectory[NotebookDirectory[]];inputFile="S2 Alchornea Input.csv";originalData=Import[inputFile];TableForm[originalData] Out[]//TableForm=
As shown, the file consists of four columns and any number of rows. The first line contains the column titles; the exact column titles don’t matter, because they are only for orientation, and are not used further. Of the four columns, only the second (radius increments) and fourth (probabilities) are used by the algorithm, while the other two columns contain auxiliary information for labeling the output. The four columns contain the following data: 1) The boundary number is typically an integer, though for missing boundaries, we use for example 10, 10.5, 11, where 10.5 refers to a missing boundary after boundary 10. 2) The radius increment in centimeters (cm), though the unit is only used for labeling. The increments should be positive real number. 3) The boundary type indicates annual (probability = 1), recognizable (probability < 1), or missing boundaries. The column is only for orientation, and is not used further. 4) The probabilities should be positive real numbers larger than 0, and not larger than 1. For a recognizable boundary, the number represents the probability that a recognizable boundary is an annual, regular (and not a false) boundary. For a missing boundary, the number represents the probability that a missing annual boundary is expected at a specific location in a core sample. All probabilities after the base boundary are 0.83 here, but that must not always be so. The first row is the base boundary, numbered as boundary 0, and having a probability of 1 (recognizable and regular). It contains the increment of the first tree ring. The probabilities of the subsequent boundaries may either vary or all be the same. The last boundary of the sample is distinct in that it has no increment (0 cm).
Calculate the probabilities for the different possible ages each tree-ring boundary could have: As the first step, the column titles are deleted, and the data is sorted according to the chronology of the tree-ring boundaries (in our input file, the data is already ordered). Then the input probabilities are extracted in a list and rationalized. Rationalization causes the calculations to be carried out with exact numbers in Mathematica (0.83 becomes 83/100). This is recommended for two reasons: First, probabilities for some ages can become very small, and the sum of all probabilities for all possible ages of a given boundary may not be exactly 1 anymore. Second, interpolation of the discrete points in the cumulative probability function, for calculating confidence limits, may not work anymore with approximate numbers, for numerical reasons. In[]:= data=SortBy[Drop[originalData,1],First];probabilities=Rationalize[Transpose[data][[4]]]; The following code represents the key part to calculate iteratively the expected probabilities of different possible ages that each tree-ring boundary may have, starting at the base boundary (number 0). The base boundary is defined to have an age of 0 years and a probability of being an annual (regular) boundary of 1. In each iteration, an additional boundary is considered, and with it the number of possible ages augments, unless the input probability is 1. The boundary that is farthest away from the base boundary thus has the most possible ages, each with a certain probability. The sum of the probabilities of all possible ages for a given tree-ring boundary is always 1. In each iteration, the variables “calculatedProbsPerAge” is calculated for the corresponding tree-ring boundary. We will go through the first three iterations to see how the code works. In[]:= calculatedProbsPerAge={{-1},{1}};(*Age,probability*)ageResults=Table[If[probabilities[[i]]≠1,newAges=Append[calculatedProbsPerAge[[1]],calculatedProbsPerAge[[1,-1]]+1];newcalculatedProbs=Append[calculatedProbsPerAge[[2]]*(1-probabilities[[i]]),0]+Prepend[calculatedProbsPerAge[[2]]*probabilities[[i]],0],(*probabilities[[i]]==1*)newAges=calculatedProbsPerAge[[1]]+1;newcalculatedProbs=calculatedProbsPerAge[[2]]];(*endofIf*)calculatedProbsPerAge={newAges,newcalculatedProbs},{i,1,Length[data]}]; The input probability of the base boundary is 1 (for i = 1): In[]:= i=1;data[[i]]probabilities[[i]] Out[]= {0,0.124,Annual,1} Out[]= 1 Given the If-clause, the first iteration reduces to the following: In[]:= calculatedProbsPerAge={{-1},{1}};(*Age,probability*)newAges=calculatedProbsPerAge[[1]]+1newcalculatedProbs=calculatedProbsPerAge[[2]]calculatedProbsPerAge={newAges,newcalculatedProbs} Out[]= {0} Out[]= {1} Out[]= {{0},{1}} The starting values before the first iteration of “calculatedProbsPerAge” is set to a (meaningless) age of -1 year and a probability of 1. The addition of 1 in the first iteration results in an age of 0 years and 1 for the probability to represent an annual boundary, for the base boundary. The input data for the second iteration (i = 2) is: In[]:= i=2;data[[i]]probabilities[[i]] Out[]= {1,0.099,Recognizable,0.83} Out[]= 83 100 With the input probability not being 1, the first option of the If-clause applies: In[]:= calculatedProbsPerAge={{0},{1}};(*fromthelastiteration*)i=2;newAges=Append[calculatedProbsPerAge[[1]],calculatedProbsPerAge[[1,-1]]+1]newcalculatedProbs=Append[calculatedProbsPerAge[[2]]*(1-probabilities[[i]]),0]+Prepend[calculatedProbsPerAge[[2]]*probabilities[[i]],0]calculatedProbsPerAge={newAges,newcalculatedProbs} Out[]= {0,1} Out[]= 17 100 83 100 Out[]= {0,1}, 17 100 83 100 The second boundary thus is either 0 years old (it is not annual), with a probability of 0.17, or it is 1 year old with a probability of 0.83. The third iteration (i = 3) follows: In[]:= i=3;data[[i]]probabilities[[i]] Out[]= {2,0.256,Recognizable,0.83} Out[]= 83 100 In[]:= calculatedProbsPerAge={{0,1},{17/100,83/100}};(*fromthelastiteration*)i=3;newAges=Append[calculatedProbsPerAge[[1]],calculatedProbsPerAge[[1,-1]]+1]newcalculatedProbs=Append[calculatedProbsPerAge[[2]]*(1-probabilities[[i]]),0]+Prepend[calculatedProbsPerAge[[2]]*probabilities[[i]],0]calculatedProbsPerAge={newAges,newcalculatedProbs} Out[]= {0,1,2} Out[]= 289 10000 1411 5000 6889 10000 Out[]= {0,1,2}, 289 10000 1411 5000 6889 10000 If the input probability is smaller than 1, then there is a certain probability that the additional tree-ring boundary is false, and thus no age would be added. However, if the additional tree-ring boundary is annual, then an additional age is attached in the list of “newAges”. As for the corresponding probabilities “newcalculatedProbs”, a 0 is appended at the end of the list of the previous probabilities, and all previous probabilities are multiplied with 1 minus the new input probability. The result is a list of the probabilities in the case that the boundary is false. A second list is generated by prepending a 0 to the list of previous probabilities, and all previous probabilities are multiplied with the new input probability. The result is a list of probabilities in the case that boundary is annual. The two lists are added to provide the final probabilities for each possible age, as shown below: In[]:= calculatedProbsPerAge={{0,1},{17/100,83/100}};(*fromthelastiteration*)i=3;TableForm[{row1=Append[calculatedProbsPerAge[[2]]*(1-probabilities[[i]]),0],row2=Prepend[calculatedProbsPerAge[[2]]*probabilities[[i]],0],row1+row2},TableHeadings{{"Probabilities if boundary is false","Probabilities if boundary is annual","Sum of two probabilities"},{"0 years","1 year","2 years"}}] Out[]//TableForm=
Going through all 49 iterations results in the following output for the last tree-ring boundary, whose probabilities as exact numbers add up exactly to 1 (probabilities are shown as real numbers, rather than exact ratios):For easier interpretation, the exact numbers can be converted to real numbers with “N”: N[Transpose[ageResults[[-1]]]] {{0.,1.15225× -37 10 -35 10 -33 10 -31 10 -29 10 -28 10 -26 10 -25 10 -23 10 -22 10 -21 10 -20 10 -18 10 -17 10 -16 10 -15 10 -14 10 -13 10 -12 10 -11 10 -10 10 -10 10 -9 10 -8 10 -7 10 -7 10 -6 10 Total[ageResults[[-1,2]]] Out[]= 1 The list from "Transpose[ageResults[[-1]]]" represents a discrete probability density function that is both left and right-censured, shown in the following graph: In[]:= graph1=Show[ListPlot[Transpose[ageResults[[-1]]],Filling->Axis,PlotMarkers{"●",8},PlotStyle{Red},PlotRangeFull],Frame{{True,True},{True,True}},FrameStyle12,FrameLabel{{Style["Probability",Black,FontSize12],None},{Style["Age (years)",Black,FontSize12],Style["Probability density function of last boundary ("<>ToString[Length[probabilities]-1]<>")",ColorData[59,1],FontSize16]}},ImageSize400,AspectRatio0.75,ImageMargins8] Out[]=
Calculate the additional statistical information, including expected ages and confidence intervals: The result from the last section is a list of 49 sublists (corresponding to 49 boundaries for 48 tree rings), each consisting of 2 sub-sublists of variable length. One sub-sublist contains the possible ages and the other the corresponding probabilities. In[]:= Dimensions[ageResults] Out[]= {49,2} (*Showingtheresultfor4boundaries*)ageResults[[1;;4]] Out[]= {{0},{1}},{0,1}, 17 100 83 100 289 10000 1411 5000 6889 10000 4913 1000000 71961 1000000 351339 1000000 571787 1000000 This is easier to read in table form, and with probabilities as real numbers: In[]:= TableForm[N[ageResults[[1;;4]]]] Out[]//TableForm=
The fourth group corresponds to the third boundary after the base boundary. With a probability of 0.004913 all three boundaries after the base boundary are false, and the age of the wood at that part is still somewhere between 0 and 1 years. With a probability of 0.071961, the boundary is the first annual boundary after the base boundary (its age is 1 year), with 0.351339 the second (its age is 2 years), and with 0.571787 the third (its age is 3 years). The first parameters that are taken from that data are the minimum and maximum possible ages a boundary may have: In[]:= ageRanges=Table[MinMax[ageResults[[i,1]]],{i,1,Length[ageResults]}] Out[]= {{0,0},{0,1},{0,2},{0,3},{0,4},{0,5},{0,6},{0,7},{0,8},{0,9},{0,10},{0,11},{0,12},{0,13},{0,14},{0,15},{0,16},{0,17},{0,18},{0,19},{0,20},{0,21},{0,22},{0,23},{0,24},{0,25},{0,26},{0,27},{0,28},{0,29},{0,30},{0,31},{0,32},{0,33},{0,34},{0,35},{0,36},{0,37},{0,38},{0,39},{0,40},{0,41},{0,42},{0,43},{0,44},{0,45},{0,46},{0,47},{0,48}} Of more interest is the expected age of each boundary. It is calculated for each boundary by multiplying each possible age with the corresponding probability, and taking the sum of the products (exact numbers are again converted to real numbers with “N”): In[]:= expectedAges=Table[Total[ageResults[[i,1]]*ageResults[[i,2]]],{i,1,Length[ageResults]}];N[expectedAges] Out[]= {0.,0.83,1.66,2.49,3.32,4.15,4.98,5.81,6.64,7.47,8.3,9.13,9.96,10.79,11.62,12.45,13.28,14.11,14.94,15.77,16.6,17.43,18.26,19.09,19.92,20.75,21.58,22.41,23.24,24.07,24.9,25.73,26.56,27.39,28.22,29.05,29.88,30.71,31.54,32.37,33.2,34.03,34.86,35.69,36.52,37.35,38.18,39.01,39.84} The formula for calculating the variances is slightly more complicated. The square of the age is taken before multiplication with the corresponding probability, the sum is taken again, and the square of the expected age is rested. The variances are presented together with the number of data points, so that subsequently one could also calculate the corresponding standard errors. In[]:= varianceAgesNum=Table[{Total[ageResults[[i,1]]^2*ageResults[[i,2]]]-expectedAges[[i]]^2,Length[ageResults[[i,1]]]},{i,1,Length[ageResults]}];N[varianceAgesNum] Out[]= {{0.,1.},{0.1411,2.},{0.2822,3.},{0.4233,4.},{0.5644,5.},{0.7055,6.},{0.8466,7.},{0.9877,8.},{1.1288,9.},{1.2699,10.},{1.411,11.},{1.5521,12.},{1.6932,13.},{1.8343,14.},{1.9754,15.},{2.1165,16.},{2.2576,17.},{2.3987,18.},{2.5398,19.},{2.6809,20.},{2.822,21.},{2.9631,22.},{3.1042,23.},{3.2453,24.},{3.3864,25.},{3.5275,26.},{3.6686,27.},{3.8097,28.},{3.9508,29.},{4.0919,30.},{4.233,31.},{4.3741,32.},{4.5152,33.},{4.6563,34.},{4.7974,35.},{4.9385,36.},{5.0796,37.},{5.2207,38.},{5.3618,39.},{5.5029,40.},{5.644,41.},{5.7851,42.},{5.9262,43.},{6.0673,44.},{6.2084,45.},{6.3495,46.},{6.4906,47.},{6.6317,48.},{6.7728,49.}} The standard deviation of the possible ages around the expected age is the square root of the variance: In[]:= standardDeviations=Sqrt[Transpose[varianceAgesNum][[1]]];N[standardDeviations] Out[]= {0.,0.375633,0.531225,0.650615,0.751266,0.83994,0.920109,0.993831,1.06245,1.1269,1.18786,1.24583,1.30123,1.35436,1.40549,1.45482,1.50253,1.54877,1.59367,1.63735,1.67988,1.72137,1.76187,1.80147,1.84022,1.87816,1.91536,1.95185,1.98766,2.02284,2.05743,2.09143,2.1249,2.15785,2.1903,2.22227,2.2538,2.28489,2.31556,2.34583,2.37571,2.40522,2.43438,2.46319,2.49167,2.51982,2.54767,2.57521,2.60246} To calculate confidence intervals, the inverse of the continuous Cumulative Distribution Function is needed. The points of ages at different probabilities are interpolated with straight lines, resulting in a piecewise linear (“PL”) Function: In[]:= input=ageResults[[-1(*or1+48*)]];inversePLCDF48=Interpolation[Transpose[{Accumulate[input[[2]]],input[[1]]}],InterpolationOrder1] Out[]= As noted already above, the sum of all probabilities is exactly 1, which is the upper bound of the domain in the interpolating function. The lower bound of the domain is here 115225400457255426923013053222916919834651165519677685328641 / 96 10 -37 10 which most people would set equal to zero. However, there is a very low non-zero probability in our model that boundary number 48 is still 0 years old; this is because no boundary after the (first) base boundary had a probability of 1 to represent (necessarily) an annual boundary. The inverse Cumulative Distribution Function is used to find 95%-confidence limits for the possible age of the corresponding tree-ring boundary. To continue with exact numbers, the limit probabilities of 0.025 and 0.975 are rationalized: In[]:= lowerConfProb=Rationalize[0.025]upperConfProb=Rationalize[0.975] Out[]= 1 40 Out[]= 39 40 In[]:= N[inversePLCDF48[lowerConfProb]]N[inversePLCDF48[upperConfProb]] Out[]= 33.9617 Out[]= 44.1027 The function values (converted to real numbers) for graphing the inverse Cumulative Distribution Function are: In[]:= functionValues48=Transpose[{Accumulate[ageResults[[-1,2]]],ageResults[[-1,1]]}];N[functionValues48] Out[]= {{1.15225× -37 10 -35 10 -33 10 -31 10 -29 10 -28 10 -26 10 -25 10 -23 10 -22 10 -21 10 -19 10 -18 10 -17 10 -16 10 -15 10 -14 10 -13 10 -12 10 -11 10 -10 10 -10 10 -9 10 -8 10 -7 10 -7 10 -6 10 In[]:= graph2=Show[ListLinePlot[Transpose[{Accumulate[Transpose[Transpose[ageResults[[-1]]]][[2]]],Transpose[Transpose[ageResults[[-1]]]][[1]]}],PlotStyle{Red,AbsoluteThickness[1.3]},PlotMarkers{"●",7},PlotRangeFull],Axes->False,Frame{{True,True},{True,True}},FrameStyle12,FrameLabel{{Style["Age (years)",Black,FontSize12],None},{Style["Cumulative probability",Black,FontSize12],Style["Cumulative distribution function\nof last boundary ("<>ToString[Length[probabilities]-1]<>")",ColorData[59,1],FontSize16]}},ImageSize400,AspectRatio0.75,ImageMargins8] Out[]= The graph shows that on the left, the age is extremely sensitive to a change in cumulative probability, then there is a wide range where age is only moderately affected by a change in cumulative probability, and then there is a final range where age is again highly sensitive to a change in cumulative probability. The previous explanations were all for the last tree-ring boundary. The procedure to calculate the continuous Cumulative Distribution Function and determine 95%-confidence limits has to be carried out for each boundary after the base boundary: In[]:= confIntervals=Prepend[Table[input=ageResults[[i]];inversePLCDF=Interpolation[Transpose[{Accumulate[input[[2]]],input[[1]]}],InterpolationOrder1];{If[inversePLCDF[[1,1,1]]<lowerConfProb,inversePLCDF[lowerConfProb],input[[1,1]]],inversePLCDF[upperConfProb]},{i,2,Length[data]}],{0,0}];N[confIntervals] Out[]= {{0.,0.},{0.,0.96988},{0.,1.96371},{0.279137,2.95628},{1.06575,3.94732},{1.6 |