APPLICATION OF TREE DETECTION METHODS OVER LIDAR DATA FOR FOREST VOLUME ESTIMATION

Lidar (light detection and ranging) data are becoming increasingly important in the analysis of the most relevant forest parameters. This study compares two recent methods for segmenting single trees from the ALS (Airborne Laser Scanning) point cloud and the CHM (Canopy Height Model). The methods used were the Li et al. method, developed in 2012, and the Multi CHM method, developed in 2015. The parameters analysed were height and diameter for the individual trees, and volume and density for the entire forest. The efficiency of each method was verified by comparing the estimated parameters with those measured in 30 test areas. To better identify the parameters needed for a correct calibration of the algorithms, the population was divided into three layers according to vertical structure and chronological class. Comparing the volumes obtained with the two methods against those calculated for the test areas shows a tendency to over-segment for the Multi CHM method, while the appropriately calibrated Li method corresponds better to reality. The F-score values for the volumes obtained with the Li method are between 0.52 and 0.69, while those obtained with the Multi CHM method are between 0.47 and 0.55. When compared with relascopic measures for each of the 48 parcels, mean absolute differences of ~127 m³/ha and ~141 m³/ha were found for Li2012 and MultiCHM respectively.


INTRODUCTION
Lidar (light detection and ranging) data are becoming increasingly important in the analysis of the Earth's surface (Pirotti, 2019) and, in particular, of the main forest parameters. Nowadays lidar technology is used in the drafting of forest management plans, mainly as support for stratifying the stand into forest units and for verifying parcel boundaries (Leckie et al., 2003).
Volume is currently estimated with methods based on estimating the basal area and converting it into a volumetric parameter. These methods consist of measuring diameters over the whole forest or in test areas, or of applying the relascopic technique. Knowing the volume of a stand is fundamental for correct planning that guarantees the opportunity to use woody products without depleting the wooded areas.
Remote sensing methods for estimating forest volumes have been the topic of a very large number of investigations ever since remote sensing became an accessible technology. Optical and radar imagery are often used together (Vaglio Laurin et al., 2016). Vegetation indices have provided biomass models that work better at low biomass values and saturate over thick vegetation (van der Meer et al., 2000).
Laser scanning opened a new frontier in the estimation of volume and biomass. It has attracted particular interest due to its unique advantage, i.e. the capability to penetrate the foliage and capture both tree structures and the ground (Lim et al., 2003). Traditional forest inventories measure the diameter at breast height (DBH) of the tree and estimate volume with allometric models (Dalponte et al., 2018; Jucker et al., 2017). Volume and aboveground biomass (AGB) are then tied to other models related to species and other factors. Estimation of DBH from airborne laser scanner data is only possible through allometric equations. Accurate canopy height models from laser scanning surveys allow area-based and single-tree based methods (Pirotti et al., 2017), and derived informative layers such as damage assessment.
Algorithms for individual tree detection and segmentation from lidar data have been widely investigated (Dalponte et al., 2008; Dalponte and Coomes, 2016; Pirotti et al., 2017). All methods, to the authors' knowledge, require some type of parameter tuning. The parameters are provided as scalar values or linked to functions, and they change depending on forest characteristics such as tree density, structure and tree height (Pirotti et al., 2017). It is well known that forest structure can be aggregated up to a point, as forest management is spatially aggregated in parcels that supposedly have a constant population structure. Parcels are joined to lookup tables that link height, diameter and volume, whose proportions are supposedly constant for the forest population in the parcel.
The aim of the study is to verify the applicability of lidar data in the estimation of forest volume using two methods that have been implemented in the R environment.

STUDY SITE
The study area is located in the province of Trento, in the Italian Alps. The entire area extends on the orographic left of the Val di Fiemme, close to Lake Stramentizzo (46°15'45''N, 11°23'27''E), a hydroelectric basin on the river Avisio. The total area is about 708 ha. The minimum altitude of the area is 790 m a.s.l., near Lake Stramentizzo, while the maximum altitude is 1718 m a.s.l. in loc. Palleta; the prevailing altitude is between 1000 and 1400 m a.s.l. The orography is overall uniform; the slopes are quite steep, with a progressive decrease of the slope as the altitude increases. The dominant exposure is North, with variations between North-West and North-East.
The forest area is composed almost exclusively of conifers, with a predominance of spruce (Picea abies L.), which mixes with white fir (Abies alba Mill.) and larch (Larix decidua Mill.) depending on the station. Broadleaved trees are scarce and consist mainly of beech (Fagus sylvatica L.). The large presence of single-layer spruce forests is attributable, in addition to environmental conditions suitable for the species, to the single-species reforestation that until a few decades ago was carried out after clear cuts. The VAIA storm at the end of October 2018 caused heavy damage in the property (Figure 1 and Figure 2). The damaged area is ~104 ha, for a total of ~51,000 m³, estimated from previous forest plans updated to the present.

Stratification
For this study, three types of layer were identified, related to the structure of the forest and its evolutionary stage: multi-layer, mature single-layer and young single-layer. The layers were delineated on screen by overlaying the semi-transparent CHM on the orthophoto; this operation, combined with blinking (rapidly switching the layer on and off), allowed a provisional delineation of the layers, which was subsequently verified in the field (Alberti et al., 2013). Because volumes are studied per single tree, classifying the stand as a function of density is unnecessary.

Sampling
In the study area, 30 test areas were established, each with a radius of 15 m and an area of 707 m². The centre of each test area was not positioned at random. The characteristics of the stand were considered when placing the test areas, seeking areas with mean dendrometric parameters and excluding areas where biotic or abiotic disturbances had occurred recently. 15 areas were assigned to the multi-layer, 10 areas to the mature single-layer and 5 areas to the young single-layer.
Two pickets were placed to georeference each test area, and their positions were acquired with the Trimble TSC3 controller combined with the Trimble R8s GNSS receiver. The points were acquired with a horizontal error of less than 40 cm.
The centre of the test area was acquired through triangulation with the pickets. When acquiring the position of the trees in the test areas, the species, diameter and height were recorded for every tree. Peculiar characteristics, such as a steep inclination of the trunk or dead individuals, were also noted (Montaghi et al., 2013). The trees and pickets were surveyed with the Trimble S6 total station combined with the Trimble TSC3 controller.
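As a quick check of the plot geometry, the sketch below recomputes the area of a 15 m radius circle and the factor that scales a per-plot quantity to a per-hectare value. The function names are illustrative and do not come from the study.

```python
import math

def plot_area_m2(radius_m):
    """Area of a circular test plot in square metres."""
    return math.pi * radius_m ** 2

def per_hectare_factor(radius_m):
    """Multiplier that scales a per-plot total to a per-hectare value."""
    return 10_000.0 / plot_area_m2(radius_m)

area = plot_area_m2(15.0)          # close to the 707 m² reported in the text
factor = per_hectare_factor(15.0)  # each plot represents about 1/14 of a hectare
```

Under this geometry, a volume measured in one plot is multiplied by roughly 14.1 to express it in m³/ha.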

Method A - Li2012
This method, developed by Li et al. (2012), segments individual trees from the ALS point cloud. The algorithm is particularly suitable for the study area because it was developed for mixed coniferous forests on rough terrain (Li et al., 2012). The method is based on the principle that the horizontal spacing between trees is smaller at the bottom of the crowns than near the apices. The correct inter-distance thresholds must be identified for each layer so as to limit errors due to over- and under-segmentation. These thresholds depend on the shape and position of the trees.
First, the ALS cloud must be normalized in order to facilitate the identification of local maxima. In the segmentation phase the algorithm finds the highest point in the cloud and labels it as the "target tree". It then associates all the points within a predetermined threshold with the "target tree", excluding the points belonging to other trees. Once segmented, the target tree is removed from the point cloud, so that the algorithm can detect the next local maximum (Li et al., 2012). All operations were implemented with the RStudio software.
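The top-down loop described above can be sketched as follows. This is a deliberately simplified illustration and not the implementation used in the study: it assumes a single fixed spacing threshold `dt` instead of the adaptive thresholds of Li et al. (2012), it omits their local-maximum test, and the function names are hypothetical.

```python
import math

def _dist2d(a, b):
    """Horizontal distance between two (x, y, z) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def segment_trees(points, dt=1.5):
    """Assign a tree id to each (x, y, z) point of a height-normalized cloud.

    Simplified sketch: fixed spacing threshold `dt`, no local-maximum test.
    """
    labels = [-1] * len(points)
    # Indices of the still unassigned points, highest first.
    remaining = sorted(range(len(points)), key=lambda i: -points[i][2])
    tree_id = 0
    while remaining:
        target = [remaining[0]]  # the highest remaining point seeds the target tree
        others = []              # points judged to belong to other trees
        for i in remaining[1:]:
            d_target = min(_dist2d(points[i], points[j]) for j in target)
            d_other = min((_dist2d(points[i], points[j]) for j in others),
                          default=math.inf)
            # Keep the point if it is close to the target tree and not
            # closer to the points already set aside for other trees.
            if d_target <= dt and d_target <= d_other:
                target.append(i)
            else:
                others.append(i)
        for i in target:
            labels[i] = tree_id
        tree_id += 1
        remaining = others  # still in descending height order
    return labels
```

Each pass removes one tree from the cloud, so the number of passes equals the number of segmented trees. A larger `dt` merges neighbouring crowns (under-segmentation) and a smaller one splits them (over-segmentation), which is why the threshold has to be calibrated per layer.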

Method B - MultiCHM
The MultiCHM method, described by Eysn et al. (2015), is based on the iterative generation of CHMs at decreasing heights and the detection of local maxima with a local maximum filter (LMF) for each band.
Generating the CHM requires normalizing the ALS point cloud with a DTM. From the normalized ALS point cloud a CHM is generated between two predetermined elevations. To prevent errors due to pixels with elevations greater than the crown surface, the CHM is created by assigning the 95th percentile to each raster cell (Eysn et al., 2015). The CHM without outliers is used for the detection of local maxima (LM). The points found in this operation provide the position and height of the trees, and these data are saved in a database at each iteration. Then a lower-elevation CHM is generated and the above operations are repeated until the ALS points are exhausted (Eysn et al., 2015). The algorithm used in this method detects apices through a moving window with a fixed size of 3x3 pixels.
To prevent double counting, all the points belonging to a single tree that are located below its actual apex must be eliminated. The points saved in the database are sorted in descending order of height, and the highest point is identified as a "real apex". The following "real apices" are identified by checking previously established horizontal and vertical distance thresholds. If the distance is below these thresholds the point is eliminated; if the distance is greater, the point is identified as a "real apex" (Eysn et al., 2015).
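A minimal sketch of one iteration of this procedure is given below: a percentile-based CHM followed by a 3x3 local maximum search. The cell size, the `min_height` cut-off and all function names are assumptions made for illustration. The restriction of the points to a height band, which the iterative method applies before each pass, is left to the caller.

```python
import math

def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between ranks."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def percentile_chm(points, cell_size=1.0, q=95.0):
    """Rasterize (x, y, z) points; each cell holds the q-th percentile of z.

    Row 0 is the southernmost row; empty cells are None.
    """
    x0 = min(p[0] for p in points)
    y0 = min(p[1] for p in points)
    cells = {}
    for x, y, z in points:
        key = (int((y - y0) // cell_size), int((x - x0) // cell_size))
        cells.setdefault(key, []).append(z)
    n_rows = max(r for r, _ in cells) + 1
    n_cols = max(c for _, c in cells) + 1
    chm = [[None] * n_cols for _ in range(n_rows)]
    for (r, c), zs in cells.items():
        chm[r][c] = percentile(zs, q)
    return chm

def local_maxima_3x3(chm, min_height=2.0):
    """Return (row, col, height) of cells strictly higher than their 3x3 neighbours."""
    tops = []
    for r, row in enumerate(chm):
        for c, h in enumerate(row):
            if h is None or h < min_height:
                continue
            neighbours = [
                chm[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, len(chm)))
                for cc in range(max(c - 1, 0), min(c + 2, len(row)))
                if (rr, cc) != (r, c) and chm[rr][cc] is not None
            ]
            if all(h > n for n in neighbours):
                tops.append((r, c, h))
    return tops
```

Using a percentile rather than the maximum makes a cell insensitive to a few isolated high returns, at the cost of slightly lowering the recorded apex height.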
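The double-counting filter can be sketched as below. How the horizontal and vertical thresholds are combined is an assumption here: a candidate is discarded only when it lies within both thresholds of an already accepted apex. The threshold values and names are placeholders, not the ones used in the study.

```python
import math

def filter_apices(candidates, max_h_dist=2.0, max_v_dist=5.0):
    """Keep one apex per tree from (x, y, z) local maxima of all CHM bands.

    Candidates are visited from highest to lowest. A candidate is dropped
    when an accepted apex is closer than both thresholds (assumed rule).
    """
    accepted = []
    for x, y, z in sorted(candidates, key=lambda p: -p[2]):
        duplicate = any(
            math.hypot(x - ax, y - ay) < max_h_dist and (az - z) < max_v_dist
            for ax, ay, az in accepted
        )
        if not duplicate:
            accepted.append((x, y, z))
    return accepted
```

With this rule a low candidate directly under a tall crown survives when the height gap exceeds `max_v_dist`, which lets an understory tree in a multi-layer stand be kept as a separate apex.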

Comparison
In order to test the accuracy of the Li and MultiCHM methods, their results were compared at the level of test areas, parcels and layers.
The aim of these checks is to determine which of the two methods is more suitable for the stand and to determine the total volume at layer and parcel level. All operations were conducted with QGIS software.

Sample area check.
The procedure for matching the trees measured in the forest with those resulting from the methods considers position only. A fixed buffer of 2 m, applied to the points corresponding to the trees measured in the forest, was chosen as the search area. This threshold is needed because of the slight natural inclination of the trees and because of inaccuracies in georeferencing the trees surveyed in the field. The points corresponding to the trees extracted with the two methods were then clipped using the 2 m buffer as a mask. They were subsequently screened for irregularities.
The tree points were then assigned to three classes:
• true positives (TP): points arising from the method that correctly match ground truths;
• false negatives (FN): ground truths without a corresponding point from the method;
• false positives (FP): points erroneously extracted by the method, as they have no real correspondence.
These values, however, do not return a single index of the accuracy of the method. For this reason, the F-score accuracy measure (Eq. 2), calculated as the harmonic mean of precision and recall, was applied:
\[ F = \frac{2\,p\,r}{p + r} \tag{2} \]
where p and r are respectively the precision and recall values, with p = TP / (TP + FP) and r = TP / (TP + FN).
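The position-based matching and the F-score can be sketched together as follows. The one-to-one nearest-neighbour pairing and the choice to count every unmatched detection as a false positive are assumptions of this sketch; the study performed the matching in GIS software, and its exact pairing rule is not spelled out.

```python
import math

def match_and_score(field_trees, detected, buffer_m=2.0):
    """Match detected (x, y) tree points to field (x, y) trees and score them.

    Each field tree takes the nearest unused detection within `buffer_m`.
    """
    unmatched = list(detected)
    tp = 0
    for fx, fy in field_trees:
        best, best_d = None, buffer_m
        for k, (dx, dy) in enumerate(unmatched):
            d = math.hypot(fx - dx, fy - dy)
            if d <= best_d:
                best, best_d = k, d
        if best is not None:
            unmatched.pop(best)  # a detection can match only one field tree
            tp += 1
    fn = len(field_trees) - tp   # field trees with no detection
    fp = len(unmatched)          # detections with no field tree
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return {"TP": tp, "FN": fn, "FP": fp, "precision": p, "recall": r, "F": f}
```

Over-segmentation inflates FP and lowers precision, while under-segmentation inflates FN and lowers recall. The harmonic mean penalizes both, so a method cannot score well simply by detecting very many or very few trees.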

Check on parcel.
The two methods used for the volumetric estimation of stands were compared not only by test area but also at parcel level. Applying the methods at parcel level is very important for a comparison with the historical data of the property. It is important to note that historical data should be used as an indication and not as a true value, as they are also derived from estimations.

Check on strata.
Comparing results by layer may help in understanding the sensitivity of the methods to the variability of the stands. A careful analysis of the results obtained with the MultiCHM method and the Li method makes it possible to identify the areas in which each method is suitable, and thus to understand their limits and potential.

RESULTS AND DISCUSSION
Total volume per parcel was aggregated from single trees and compared with reference values for each sample area, each parcel and each stratum.
The term "differences" will be used instead of "errors", because the figures used as control come from relascopic measurements, which are not accurate enough to serve as a true reference (e.g. 10 times more accurate than the method being tested). In the literature, errors from relascopic measures are between 4 and 10%, with higher error values for uneven-aged dense forests (Pesonen et al., 2009; Piqué et al., 2011). Our area included both even-aged and multi-strata forests, therefore we can assume an error in relascopic measures in the higher range. Accounting for this factor is important. It does not decrease the validity of the method, but it clarifies that differences do not necessarily show that one method is better, only how close it comes to a more common estimation method, i.e. sample relascope areas.

Sample areas
The position, diameter and height of each tree in the sample areas were known, so a strict check could be carried out. The two tables above show that MultiCHM performed better at detecting tree positions, but mostly for smaller trees, whereas Li2012 performed better at detecting the larger trees and thus gave a better prediction of volume. It must be noted that any automatic tree detection method depends strongly on how its parameters are tuned, and that the way the two methods were applied might be further improved. The Li2012 method was slightly tuned by applying different values of a parameter called dt2, which is an adaptive distance threshold (Li et al., 2012). Different dt2 values were used for the three strata; even-aged, multiplanar and

Parcels
Comparison was carried out by aggregating total volume per parcel; the results of the two methods are presented and compared with relascopic measures taken over 350 sample areas.
Table 3. Volume values (m³/ha) and differences.
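The parcel-level aggregation and the summary of differences can be sketched as below. The function names are illustrative, and the numbers in the usage lines are invented placeholders, not values from Table 3.

```python
def volume_per_ha(tree_volumes_m3, parcel_area_ha):
    """Sum single-tree volumes and express the total in m³/ha."""
    return sum(tree_volumes_m3) / parcel_area_ha

def mean_absolute_difference(estimated, reference):
    """Mean of |estimated - reference| over paired parcel values (m³/ha)."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(e - r) for e, r in zip(estimated, reference)) / len(estimated)

# Placeholder values for two hypothetical parcels.
lidar = [300.0, 450.0]        # lidar-derived volumes, m³/ha
relascope = [350.0, 400.0]    # relascopic volumes, m³/ha
mad = mean_absolute_difference(lidar, relascope)
```

Because the absolute value is taken before averaging, over- and under-estimates in different parcels do not cancel out, unlike in a mean signed difference.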
Overall differences were aggregated into accuracy metrics that summarize the differences between the two methods (Table 4). Figure 3 shows that the differences are not correlated with area size. There are 5 gross differences, in parcels 32, 28, 35, 48 and 30. These are due to the difficulty of assigning an updated volume value from relascopic figures, as those areas underwent a windthrow event during the VAIA storm. These partially damaged parcels had to be recalculated by removing the damaged part from all calculations. This process introduces additional errors that very likely caused those differences.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2020, 2020 XXIV ISPRS Congress (2020 edition)

Strata
Aggregating values over the three strata (mature single-layered, multi-layered and young stands) gives the results in the following table. An initial look at the table shows a higher accuracy in young stands, even though tree-based methods are known to often perform worse there because of the higher tree density. It must be noted that the Li2012 method was tuned for the three strata: young stands had dt2=1.25, single-layered stands had dt2=2.0 and multi-layered stands had dt2=1.5. This differentiation was not applied to MultiCHM, partly because its parameters are different. As mentioned at the beginning of this article, tuning parameters is a fundamental step for a good result. It is well known that a method can be over-fitted to a certain scenario, but in this case fitting was done over 30 sample areas and applied to 48 parcels that were then compared (see Table 3). This provides a better idea of the replicability of the method.
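The stratum-specific calibration amounts to a small lookup applied before segmentation. The sketch below records the dt2 values reported in the text; the key names and the assumption that the unit is metres are illustrative.

```python
# dt2 values reported for each stratum (unit assumed to be metres).
DT2_BY_STRATUM = {
    "young single-layer": 1.25,
    "mature single-layer": 2.0,
    "multi-layer": 1.5,
}

def dt2_for(stratum):
    """Return the calibrated dt2 threshold for a stratum name."""
    try:
        return DT2_BY_STRATUM[stratum]
    except KeyError:
        raise ValueError(f"no dt2 calibrated for stratum {stratum!r}") from None
```

Raising an error for an unknown stratum, rather than falling back to a default, keeps an uncalibrated area from being segmented silently with a threshold tuned for a different structure.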

Damaged trees volume estimation
The volume of timber felled by the VAIA storm, as estimated by the property staff, is ~50,200 m³. The estimate was made in the days following the storm by extracting the volumetric values both from the current forest management plan and from the register of clear cuts made in previous years in the areas affected by the storm.
For the damaged areas, a value of 1.5 was used for the inter-distance threshold dt2. The volume obtained by applying the Li method to the damaged areas is 51,084 m³, an overestimate of 1.76% compared to the volume estimated by the forestry company. The volumes estimated by the Li method were confirmed by the quantity of timber cleared in 2019: wood-use work carried out from April to December 2019 covered about 50% of the areas affected by the storm, with the extraction of about 27,000 m³ of material, including lumber and biomass for wood chips.
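The reported overestimate can be reproduced from the two volumes given in the text. The helper name is illustrative.

```python
def relative_difference_pct(estimate, reference):
    """Signed difference of an estimate from a reference, in percent."""
    return (estimate - reference) / reference * 100.0

# Li2012 estimate vs. staff estimate for the damaged areas, both in m³.
print(round(relative_difference_pct(51084, 50200), 2))  # prints 1.76
```

The result is positive because the lidar-derived volume exceeds the staff estimate, which is itself approximate.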

CONCLUSIONS
Estimating volume in forests using lidar is an important task that deserves further investigation due to the many different forest scenarios. In this work a simple comparison was carried out. The objective is not to define the best method, but to show how figures can vary due to different factors, such as disturbance (windthrow from the VAIA storm) and estimation method (relascope vs lidar-derived models).
This study reported two methods. Comparing the volumes obtained with these methods against those calculated for the test areas shows a tendency to over-segment for the Multi CHM method, while the appropriately calibrated Li2012 method corresponds better to reality. The F-score values for the volumes obtained with the Li method are between 0.52 and 0.69, while those obtained with the Multi CHM method are between 0.47 and 0.55. When compared with relascopic measures for each of the 48 parcels, mean absolute differences of ~127 m³/ha and ~141 m³/ha were found for Li2012 and MultiCHM respectively.