Block correlation analysis

From Cybis Wiki

When you check the quality of a match between a dendro curve and a reference curve, you should check not only how the two curves match overall, but also how various short blocks (segments) of the two curves match each other. This type of checking is also named "Correlation by segment".

Blocks (segments) of the curve being tested are defined with a selectable length (e.g. 30) and with starting points lagged by a selectable step, e.g. 10 years. A block starting at 0 then covers 0-29, the one starting at 10 covers 10-39, followed by 20-49, 30-59, etc. This means that the blocks being tested may overlap.
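The block layout can be written down as a short Python sketch. The function name `block_ranges` and its parameters are made up for this illustration and are not part of CDendro; the sketch also assumes that only complete blocks are used.

```python
def block_ranges(series_length, block_length=30, step=10):
    """Return (start, end) index pairs (end inclusive) for overlapping blocks.

    Hypothetical helper: assumes only blocks that fit completely
    inside the series are tested.
    """
    ranges = []
    start = 0
    while start + block_length <= series_length:
        ranges.append((start, start + block_length - 1))
        start += step
    return ranges


# A series of 123 rings with block length 30 and lag 10
blocks = block_ranges(123)
print(blocks[:4])   # the first four blocks: 0-29, 10-39, 20-49, 30-59
print(len(blocks))  # number of blocks
```

With a lag smaller than the block length, each ring year is covered by several blocks, which is why the blocks overlap.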

In a CDendro "Block correlation analysis" we also find out where each block matches best, next best and third best against the reference. These results are useful when the curve being tested is not yet crossdated. The block check may then reveal a missed or extra ring, see the example below.
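The idea of finding the best, next-best and third-best position of one block can be sketched as follows. This is only an illustration under simplifying assumptions: it uses a plain Pearson correlation on the raw values, counts offsets forward from the start of the reference, and only tries positions where the whole block overlaps the reference. CDendro itself normalizes the curves first (e.g. the P2Yrs method seen in the output below) and reports positions in its own way. The names `pearson` and `best_block_matches` are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant sequence has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def best_block_matches(block, reference, top=3):
    """Slide the block along the reference; return the `top` best
    (offset, correlation) pairs, best first."""
    scores = []
    for offset in range(len(reference) - len(block) + 1):
        window = reference[offset:offset + len(block)]
        scores.append((offset, pearson(block, window)))
    scores.sort(key=lambda s: s[1], reverse=True)
    return scores[:top]


# Synthetic demo data (not real ring widths): cut a 30-value block
# out of a random "reference" and look for it again.
rng = random.Random(1)
reference = [rng.random() for _ in range(200)]
block = reference[40:70]
best = best_block_matches(block, reference)
print(best[0][0])  # offset of the best match
```

In this synthetic demo the block is an exact copy of part of the reference, so the best offset is 40. Real ring-width blocks never match perfectly, which is why the second and third best positions are worth looking at as well.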

Note that the position of each best, next-best (etc.) block match is also recalculated into the position that the whole curve would get if that block match were the correct one. That value is named "SetsSampleTo".
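The recalculation can be illustrated with a few lines of Python. The arithmetic is inferred from the example output further down this page (reference dated to 1995), not taken from any CDendro documentation, and the function name `sets_sample_to` is only chosen for this sketch: the block's start index is subtracted from its hit position to get the relative position of the whole sample, and that relative position is subtracted from the end year of the reference.

```python
def sets_sample_to(hit_at, block_start, reference_end_year):
    """Convert a block hit position into the whole-sample position.

    Returns (relative_year, calendar_year). Arithmetic inferred from
    the example tables on this page; treat it as an assumption.
    """
    relative_year = hit_at - block_start
    return relative_year, reference_end_year - relative_year


# Rows taken from the example block table (block start 10):
print(sets_sample_to(16, 10, 1995))   # hit at 16  -> relative 6, year 1989
print(sets_sample_to(287, 10, 1995))  # hit at 287 -> relative 277, year 1718
```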

"Extended block correlation analysis"
There is a variant of the "Block correlation analysis" of CDendro, named "Extended block correlation analysis". Another name could have been "Growing block-length correlation analysis". In this variant, after a block has been tested, it is lengthened by the start-of-next-block distance (e.g. 10 when blocks are lagged by 10 years), and the lengthened block is tested for best and next-best matches. It is then lengthened again and a new test is run. This goes on until the block cannot be made longer. All the resulting "SetsSampleTo" values are saved and sorted.
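The growing block lengths can be enumerated with a small generator. This is a sketch of the description above, not CDendro's actual code; the name `growing_blocks` is hypothetical, and it assumes that a block "cannot be made longer" once the next lengthening would run past the end of the series.

```python
def growing_blocks(series_length, block_length=30, step=10):
    """Yield (start, length) for every block and each lengthened version.

    Assumption: lengthening stops when start + length would exceed
    the series length.
    """
    start = 0
    while start + block_length <= series_length:
        length = block_length
        while start + length <= series_length:
            yield start, length
            length += step
        start += step


combos = list(growing_blocks(123))
print(combos[:3])   # the first block at lengths 30, 40 and 50
print(len(combos))  # total number of block tests to run
```

Each (start, length) pair needs its own search along the whole reference, so the number of tests grows quickly with the sample length. This is consistent with the remark that the extended analysis takes time on long series.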

This extended type of analysis takes time to run when samples and reference are long, so it has to be explicitly turned on before the test is started in CDendro.

To make this type of extended test work well when you are searching for a match, it may be suitable to set the block length quite high, e.g. to 70. Shorter blocks of length 30, 40, 50 or 60 will in any case find good-looking best and next-best positions which are nevertheless wrong.

Here is the result of an attempt to crossdate a curve against a reference. At first there is no reasonable match at all, so an (extended) block correlation analysis has been run.

SN009A_of_D:\ake\tree\sn\SNMEAS.rwl using No detrend
compared to the reference 
D:\ake\tree\DEC\Namdo of length 414 using No detrend  Dated to 1995
Minimum overlap used when finding best match: 50

Table sorted by Proportion of last two years growth (2,0,T (P2Yrs)/TTest 

--Rel Over  *P2Yrs------  (year)              *Corr
-year  lap   CorrC TTest                     StdDev
  191  123    0.31   3.5  (1804)             (0.25)
   63  123    0.29   3.3  (1932)             (0.12)
  331   81    0.33   3.2  (1664)             (0.16)
  340   72    0.32   2.8  (1655)             (0.21)
  255  123    0.24   2.8  (1740)             (0.14)
  313   99    0.26   2.6  (1682)             (0.19)
  227  123    0.23   2.6  (1768)             (0.18)
   51  123    0.23   2.6  (1944)             (0.08)

SN009A_of_D:\ake\tree\sn\SNMEAS.rwl compared to the reference D:\ake\tree\DEC\Namdo
Best matches for the whole sample:   191 3.53 (1804)     63 3.31 (1932)    331 3.16 (1664)  
The sample is currently dated to 1989 which is used in the "Aimed at" column.
Block length: 30   Table sorted by Proportion of last two years growth (2,0,T (P2Yrs)/TTest

Block  -----Aimed------   -------Best    ------------Three best matches with {hitAt,Prop2Yrs,SetsSampleTo}---
start  --------at  year   around that    ---1stBestMatch-------    ---2ndBestMatch-------    ---3rdBestMatch-------
    0      6 0.41  1989       6 0.41       139 0.49  139 (1856)       61 0.46   61 (1934)       32 0.46   32 (1963)
   10     16 0.62  1979      16 0.62        16 0.62    6 (1989)      287 0.44  277 (1718)        6 0.43   -4 (1999)
   20     26 0.72  1969      26 0.72        26 0.72    6 (1989)      271 0.60  251 (1744)       93 0.51   73 (1922)
   30     36 0.70  1959      36 0.70        36 0.70    6 (1989)      281 0.59  251 (1744)      370 0.54  340 (1655)
   40     46 0.54  1949      46 0.54       260 0.59  220 (1775)       46 0.54    6 (1989)      380 0.50  340 (1655)
   50     56 0.08  1939      54 0.36*      381 0.58  331 (1664)      320 0.51  270 (1725)      183 0.47  133 (1862)
   60     66-0.09  1929      65 0.27*      123 0.53   63 (1932)      315 0.51  255 (1740)      121 0.50   61 (1934)
   70     76-0.41  1919      75 0.64*       75 0.64    5 (1990)      289 0.59  219 (1776)      327 0.51  257 (1738)
   80     86-0.47  1909      85 0.72*       85 0.72    5 (1990)      337 0.57  257 (1738)      162 0.56   82 (1913)
   90     96-0.56  1899      95 0.78*       95 0.78    5 (1990)      281 0.63  191 (1804)      145 0.61   55 (1940)
Lowest block CorrC = -0.56 at index 90, year=1899

Results may be influenced by your minimum overlap setting and the existence of zero rings.

SetsSampleTo alternatives sorted from blocktest above - only BEST matches:
 1664      1775      1856      1932      1989(3)   1990(3)  
Most recurrent year: 1989(3)  1990(3)

SetsSampleTo alternatives sorted from blocktest above best/nextbest/3rdBest:
 1655(2)   1664      1718      1725      1738(2)   1740      1744(2)   1775      1776      1804      1856     
 1862      1913      1922      1932      1934(2)   1940      1963      1989(4)   1990(3)   1999     
Most recurrent year: 1989(4)

SetsSampleTo alternatives sorted from extended block length test, all BEST matches:
 1664(2)   1738(2)   1768      1775      1804(3)   1856      1932(10)  1989(18)  1990(8)  
Most recurrent year: 1989(18)

SetsSampleTo alternatives sorted from extended block length test, BEST and NEXT best matches:
 1664(8)   1718      1725      1738(5)   1740(5)   1744(8)   1768(3)   1775      1776      1804(5)   1854     
 1856      1932(20)  1934      1963(2)   1964      1989(19)  1990(9)  
Most recurrent year: 1932(20)

Note the year group 1989-1990, which is the best match for six of our ten 30-year blocks (3 + 3) and accounts for as many as 28 matches (19 + 9) in the extended test with best and next-best matches!
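The tallying behind the "Most recurrent year" lines, and the idea of adding up neighbouring years into a year group, can be sketched like this. The year list is copied from the "1stBestMatch" column of the first block table above; the function names are hypothetical and the grouping by a fixed year range is an assumption of this sketch.

```python
from collections import Counter


def tally_years(years):
    """Count how often each SetsSampleTo year occurs, sorted by year."""
    return sorted(Counter(years).items())


def year_group_support(years, first_year, last_year):
    """Number of matches whose year falls within first_year..last_year."""
    return sum(1 for y in years if first_year <= y <= last_year)


# 1stBestMatch years for the ten blocks starting at 0, 10, ..., 90
first_best = [1856, 1989, 1989, 1989, 1775, 1664, 1932, 1990, 1990, 1990]

print(tally_years(first_best))
print(year_group_support(first_best, 1989, 1990))  # blocks backing 1989-1990
```

Looking at a small group of adjacent years rather than at single years matters here: a missed or extra ring splits the support for the true date over two neighbouring years.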

The jump from 1989 to 1990 in the "1stBestMatch" column is caused by a false ring that should be removed. After removing that ring from our measurements we get these results:

SN009A_of_D:\ake\tree\sn\SNMEAS.rwl using No detrend
compared to the reference 
D:\ake\tree\DEC\Namdo of length 414 using No detrend  Dated to 1995
Minimum overlap used when finding best match: 50

Table sorted by Proportion of last two years growth (2,0,T (P2Yrs)/TTest 

--Rel Over  *P2Yrs------  (year)              *Corr
-year  lap   CorrC TTest                     StdDev
    6  123    0.60   8.3  (1989) (as dated)  (0.15)
   32  123    0.37   4.4  (1963)             (0.04)
  258  123    0.28   3.2  (1737)             (0.16)
  142  123    0.26   2.9  (1853)             (0.15)
  331   82    0.31   2.9  (1664)             (0.14)
  -46   77    0.29   2.6  (2041)             (0.12)
  289  123    0.23   2.6  (1706)             (0.15)
  340   73    0.29   2.6  (1655)             (0.20)

There is a match at relative year 6  i.e. at  1989

SN009A_of_D:\ake\tree\sn\SNMEAS.rwl compared to the reference D:\ake\tree\DEC\Namdo
Best matches for the whole sample:     6 8.32 (1989)     32 4.41 (1963)    258 3.16 (1737)  
The sample is currently dated to 1989 which is used in the "Aimed at" column.
Block length: 30   Table sorted by Proportion of last two years growth (2,0,T (P2Yrs)/TTest

Block  -----Aimed------   -------Best    ------------Three best matches with {hitAt,Prop2Yrs,SetsSampleTo}---
start  --------at  year   around that    ---1stBestMatch-------    ---2ndBestMatch-------    ---3rdBestMatch-------
    0      6 0.41  1989       6 0.41       139 0.49  139 (1856)       61 0.46   61 (1934)       32 0.46   32 (1963)
   10     16 0.62  1979      16 0.62        16 0.62    6 (1989)      287 0.44  277 (1718)        6 0.43   -4 (1999)
   20     26 0.72  1969      26 0.72        26 0.72    6 (1989)      271 0.60  251 (1744)       93 0.51   73 (1922)
   30     36 0.70  1959      36 0.70        36 0.70    6 (1989)      281 0.59  251 (1744)      370 0.54  340 (1655)
   40     46 0.54  1949      46 0.54        46 0.54    6 (1989)      371 0.52  331 (1664)      260 0.47  220 (1775)
   50     56 0.40  1939      56 0.40       330 0.55  280 (1715)      114 0.54   64 (1931)        5 0.52  -45 (2040)
   60     66 0.34  1929      66 0.34       124 0.59   64 (1931)      122 0.46   62 (1933)      202 0.46  142 (1853)
   70     76 0.70  1919      76 0.70        76 0.70    6 (1989)      328 0.53  258 (1737)      359 0.52  289 (1706)
   80     86 0.69  1909      86 0.69        86 0.69    6 (1989)      126 0.60   46 (1949)       76 0.58   -4 (1999)
   90     96 0.79  1899      96 0.79        96 0.79    6 (1989)      282 0.71  192 (1803)       -1 0.58  -91 (2086)
Lowest block CorrC = 0.34 at index 60, year=1929

Results may be influenced by your minimum overlap setting and the existence of zero rings.

SetsSampleTo alternatives sorted from blocktest above - only BEST matches:
 1715      1856      1931      1989(7)  
Most recurrent year: 1989(7)

SetsSampleTo alternatives sorted from blocktest above best/nextbest/3rdBest:
 1655      1664      1706      1715      1718      1737      1744(2)   1775      1803      1853      1856     
 1922      1931(2)   1933      1934      1949      1963      1989(7)   1999(2)   2040      2086     
Most recurrent year: 1989(7)

SetsSampleTo alternatives sorted from extended block length test, all BEST matches:
 1715      1856      1931(2)   1989(51) 
Most recurrent year: 1989(51)

SetsSampleTo alternatives sorted from extended block length test, BEST and NEXT best matches:
 1664      1682(2)   1706      1715      1718      1737(4)   1744(10)  1775(6)   1803(2)   1853(3)   1856     
 1931(5)   1933      1934      1949      1963(19)  1989(51) 
Most recurrent year: 1989(51)

Block checking may also be useful when doing a quality analysis of e.g. an old published reference curve. When all blocks match properly against another reference curve, that old published curve may be correct. If, however, the late end of the curve matches at one time in the other reference and the early end matches at quite another time, then we may suspect that samples from different times were mixed together when that old published reference curve was originally created.
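Checking whether different parts of a curve point at different dates can be sketched as a search for runs of consecutive blocks that agree with each other. This is an illustrative helper only (the name `stable_runs` and the `min_run` threshold are assumptions, not a CDendro feature). The demo data are the "1stBestMatch" years from the first block table above, where the disagreement was only one year because of a false ring; for a mixed-up reference curve the runs would point at widely separated years.

```python
from itertools import groupby


def stable_runs(block_starts, implied_years, min_run=2):
    """Return (year, first_block_start, last_block_start) for each run of
    at least `min_run` consecutive blocks implying the same year."""
    runs = []
    pairs = zip(block_starts, implied_years)
    for year, group in groupby(pairs, key=lambda p: p[1]):
        starts = [start for start, _ in group]
        if len(starts) >= min_run:
            runs.append((year, starts[0], starts[-1]))
    return runs


block_starts = list(range(0, 100, 10))
first_best = [1856, 1989, 1989, 1989, 1775, 1664, 1932, 1990, 1990, 1990]

runs = stable_runs(block_starts, first_best)
print(runs)
```

Two or more stable runs with different years indicate that the curve does not fit the reference as one piece, and the block starts of the runs show roughly where along the curve the problem sits.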