KERNEL FEATURE CROSS-CORRELATION FOR UNSUPERVISED QUANTIFICATION OF DAMAGE FROM WINDTHROW IN FORESTS

In this study estimation of tree damage from a windthrow event using feature detection on RGB high resolution imagery is assessed. An accurate quantitative assessment of the damage in terms of volume is important and can be done by ground sampling, which is notably expensive and time-consuming, or by manual interpretation and analyses of aerial images. This latter manual method also requires an expert operator investing time to manually detect damaged trees and apply relation functions between measures and volume which are also error-prone. In the proposed method RGB images with 0.2 m ground sample distance are analysed using an adaptive template matching method. Ten images corresponding to ten separate study areas are tested. A 13x13 pixels kernel with a simplified linear-feature representation of a cylinder is applied at different rotation angles (from 0° to 170° at 10° steps). The higher values of the normalized cross-correlation (NCC) of all angles are recorded for each pixel for each image. Several features are tested: percentiles (75, 80, 85, 90, 95, 99, max) and sum and number of pixels with NCC above 0.55. Three regression methods are tested, multiple regression (mr), support vector machines (svm) with linear kernel and random forests. The first two methods gave the best results. The ground-truth was acquired by ground sampling, and total volumes of damaged trees are estimated for each of the 10 areas. Damaged volumes in the ten areas range from ~1.8 x10 m to ~1.2x10 m. Regression results show that smv regression method over the sum gives an R-squared of 0.92, a mean of absolute errors (MAE) of 255 m and a relative absolute error (RAE) of 34% using leave-one-out cross validation from the 10 observations. These initial results are encouraging and support further investigations on more finely tuned kernel template metrics to define an unsupervised image analysis process to automatically assess forest damage from windthrow.


INTRODUCTION
The main aim of remote sensing technologies is to estimate variables of interest over larger areas and with lower costs when compared to manual measurements or other means.Assessment of damages after an event is an important aspect of remote sensing applications.Windthrow events, when extreme, can break stems and flatten large patches of forests (Figure 1).Depending on the type of forest, this can have a direct economic impact (loss of wood products) or a more indirect economical effect, e.g. on tourism.As stated in Mitchell (1995), referring to North America forests, "salvaging windthrow is costly and may disrupt silvicultural and management plans.Non-salvaged windthrown trees provide bark beetle habitat, increase fuel loading, and limit the mobility of wildlife and recreationists."Another effect of windthrow is over the biodiversity, as forest structure changes suddenly and dramatically, thus impacting over ecological niches and thus on fauna and flora.The ability to spatially assess the magnitude of windthrow damage is very important, as it can affect decisions depending on forest type and management.Damage directly impacts on owners economically as there might be insurance companies involved and an accurate assessment of the damage becomes crucial for payment of insurance.Remote sensing is a logical solution and has been used in assessing forest health and status, in particular laser scanner from aerial imagery to monitor biomass (Koch, 2010;Pirotti, 2010a;Pirotti et al., 2014).Depending on the total area of interest images from satellites, aerial, remotely piloted aircrafts or even static cameras can be used to estimate hazards (Pirotti et al., 2015).In the following work we assess results of a method to quantify damage using aerial RGB imagery and compare which regression technique give the best estimation of the amount of damaged volume over ten areas.

STUDY AREA
The study was carried out in Tuscany Region, one of the most forested area in Italy, with the 31% of its total surface covered by forest (INFC, 2005).In 2015, the Tuscany Region was hit by a heavy wind storm between the late evening of 4 th (18 UTC) and the morning (06 UTC) of 5 th March, with wind gusts over 165 km/h, and a dominating wind direction from northeast (LaMMA, 2015).The storm caused extensive damages in the forest cover, especially in conifers plantations.To test our methodology, we selected 10 study areas located in the Province of Arezzo, Firenze, Lucca and Pistoia, which are characterized by different levels of tree damage (Tab.1).

METHOD
In this investigation, RGB images with 0.2 m ground sample distance (GSD) acquired on May 2015 were used.The approach consists in using a template which resembles the object which we want to detect and quantify -in this case felled tree trunksand convolve a similarity index to assess which pixels in the image represent a felled tree and which do not.The template is practically a kernel of values which is applied to all the images with a convolution (Lee et al., 2006).
The criteria behind the template matching approach is that felled trees from windthrow will have a certain directionality due to stems being similar to linescylinders -in the image.Convolution of the kernel representing a linear feature (Figure 2) over the original image is done by applying a normalized crosscorrelation (NCC) function (Di Stefano et al., 2004) as in equation 1: (1) where N and M are respectively the number of rows and columns of the kernel (13 in our case) and I is value of the pixel.
The hypothesis is that the resulting NCC will have higher values when over a felled tree stem and when the kernel feature direction is close to the direction of the tree stem on the ground.To record this, the "best" fit was kept, recording the higher value of NCC from all directions as shown in equation 2: Ten images corresponding to ten separate study areas are tested.A 13x13 pixels kernel with the simplified linear-feature representation of a cylinder (Figure 2) is applied at different rotation anglesa sequence from 0° to 170° at 10° steps.The highest values of the NCC from the 18 rotated kernels are recorded (equation 2), creating a raster with values representing the corresponding NCC value.The frequency distribution of NCC values in each image is then used to predict the damaged volume.The hypothesis is that areas with many felled trees will provide a distribution of NCC values shifted more towards higher values in the image due to matching well with the kernel thanks to its similarity.The following features are tested: percentiles 75, 80, 85, 90, 95, 99, 99.9 and 1 (maximum value), sum of NCC values above 0.55 and number of pixels with value above 0.55 as shown in the equation below: where n and m are respectively the number of rows and columns in the image, and mNCC is the value from equation 2.
The ground truth values of damaged volume have been estimated with field sampling in the areas using the Line Intersect Sampling method, as described in section 3.2.Field data were used to assess the features extracted from our method model.

Models used
The following three methods for machine learning were used for the regression model to the data for predicting damaged volume: support vector machines (svm), multiple regression (mr) and random forest (randomForest).For svm most common kernels were tested -radial basis function (rbf), polynomial, linear, laplacian and spline.

Ground truth
In each area, the total volume of damaged trees was estimated using the Line Intersect Sampling (LIS) method (Kaiser, 1983).LIS is a form of cluster sampling in which population elements crossed by a line transect are selected into the sample (Gregoire and Valentine, 2003).To estimate the wood volume of fallen trees on forest floor, the LIS method entails measuring at the point of intersection the diameter of each piece of fallen tree intersected by a transect line of a given length, then the wood volume of fallen trees per hectare is computed as (Van Wagner, 1968): where V ˆ is the wood volume (m 3 /ha) of fallen trees per hectare, L is the total length (m) of the transect line, d is the diameter (cm) at the point of intersection, and n is the number of intersections between the transect line and the pieces of fallen trees.The total length of the transect line is an important variable when implementing the LIS method (Van Wagner, 1968).In order to establish the length per hectare of the transect line, a Monte Carlo Simulation was carried out using different lengths ranging between 80 and 1000 m/ha.On the bases of the relative standard error obtained by the Monte Carlo Simulation (Figure 3), the total length per hectare of the transect line was set to 500 m.In each area, the LIS method was designed with a GIS system by randomly chosen 25 points per hectares, on which a randomly oriented sampling line of 20 m was centered and adopted to select fallen trees.The diameters at the point of intersections between the sampling lines and the pieces of fallen trees were measured in the field during summer 2015.Finally, Equation 4 was used to estimate the wood volume (m 3 /ha) of fallen trees.

RESULTS
In Figure 4 we can report some observations that are worth to be noted.First of all there is an angle value at which a lowest value seems to be common to all images, i.e. ~120°.This angle value always has the lowest NCC value in all areas.This implies that the wind damage caused trees to fall at different angles, but with a lower probability in that certain one.
The three machine learning methods used have resulted in the relative accuracies presented in Figure 6.Svm and mr models give the best results.The former performed best with the linear kernel.This result was expected, as regression is best on a linear basis, as this method is supposed to quantify damaged trees based on similarity with a kernel.The correlation between features extracted from convolution of the kernel to the image is expected to be lineari.e. more similarity implies more dead wood thus more damage.The svm model performed slightly better than mr, giving a best R 2 = 0.92 from cross correlation using the leaveone-out (LOO), a mean of absolute errors (MAE) of 255 m 3 , and a relative absolute error (RAE) of 34%.The mr model provided a R 2 =0.88.In both cases using both the count of pixels with NCC higher than 0.55 (num) and the sum of the NCC value of these pixels (sum) as in Figure 6 and Tab. 2. There was no significant difference between using the sum and num features, with the latter giving R 2 = 0.91 with svm. Figure 7 shows the final results plotting the values that are also visible in Table 2. Further discussion to final results is presented in the next section.

DISCUSSION
Automatic assessment of damaged areas is an important field for research.GIS-supported applications can deploy damage maps by using various sources to infer damaged ground cover (Piragnolo et al., 2015(Piragnolo et al., , 2014;;Pirotti, 2010a).In this case, we used aerial imagery to estimate damage from windthrow in ten areas, getting a very positive result.Template matching approaches have been used in many fields for recognition of features in images.In forestry also in detecting tree canopies and estimate their position (Pirotti, 2010b).In this particular case we detected felled trees, which result as damage from a severe windthrow event.The application of a kernel representing the template of the feature that we want to quantify, allows to extract some metrics which quantify similarity, as has been done in other approaches (Pirotti et al., 2014).There is undoubtedly a linear relationship of the number of pixels having high correlation and the degree of damage.A visual improvement to the results would be to create a heatmap over larger extents reporting a color-coded thematic map of the damage degree.This type of information can also be used to study the spatial distribution of damage, to see if there is a relationship with other variables such as morphology of the terrain, forest management technique, or tree species.This has been done in Sinton et al. ( 2000) and the availability of imagery and this automatic method to quantify damage would have allowed the author to take further steps for analysis of historical windthrow damages.Most of our test areas were small in size, with three areas being significantly larger.In all cases our method performed well.This method can be fully automated, as the principle that is used can be applied to all situations were the damage is recent thus the overall texture of the felled trees in the aerial image is well represented.It can also be noted that the method might vary in accuracy depending on the image resolution, requiring a bit of tuning of the kernel that acts as a template.This can be the focus of another study, but in general it can be argued that lower resolutions (e.g. from remotely piloted aircrafts -RPAs) will still work, because what is looked for is the linearity of the objects, and higher resolutions will just sample better this characteristic of the land cover.Figure 7: Plot of results -R 2 is 0.88the bottom plot is zoomed to the lower left part.

ID
On the other hand higher resolution means processing will be much more computationally intensive, probably not giving a significant improvement.Therefore any resolution of the image that allows to detect linearity is applicable, with preference to lower resolutions for improving the speed of the process.
Resampling the images at lower resolutions and applying the same method iteratively can be the focus of an interesting investigation, to assess how the accuracy decays at lower resolutions.

CONCLUSIONS
Results show that using NCC to assess similarity of a feature to a template can accurately assess damage.It is important to note that this is due to the effect that windthrow has over trees, allowing to detect a well-defined shape over the ground from aerial images.Damaged versus healthy vegetation is easily differentiated and this method provides a viable solution.

Figure 1 :
Figure 1: Overview of study areas.

Figure 3 .
Figure 3. Monte Carlo Simulation: relative standard error obtained using different lengths per hectare of the transect line.

Figure 4 .
Figure 4. Results of percentile values from the frequency distributions of NCC over the ten areas.

Figure 5 .
Figure 5. Results over an area with a large patch of felled trees (top) and the corresponding highest NCC values from the kernel rotations (bottom).

Figure 6 .
Figure 6.Results from the three machine learning methods, validation is done with Mean of Absolute Errors (MAE), Relative Absolute Error (RAE) and Pearson's correlation value squared (R 2 ) all were validated with Leave One Out cross validation method (LOO).

Table 2 .
Results summary: ID is the area name, sum and num are respectively the sum of NCC values above 0.55 and the count of such pixels, Obs.Vol and Pred.Vol are respectively the groundsampled and predicted volumes of damaged trees in m 3 .