EVALUATION OF AZURE KINECT DERIVED POINT CLOUDS TO DETERMINE THE PRESENCE OF MICROHABITATS ON SINGLE TREES BASED ON THE SWISS STANDARD PARAMETERS

In the last few years, a number of low-cost 3D scanning sensors have been developed to reconstruct the real-world environment. These sensors were primarily designed for indoor use, which makes their performance and accuracy highly unpredictable outdoors. The Azure Kinect belongs to this category of low-cost 3D scanners and has been successfully employed in outdoor applications. In addition, the sensor offers portability and live visualization during data acquisition, features that make it extremely interesting for forestry. In the context of forest inventory, these advantages would make the acquisition of tree parameters more efficient. In this paper, a protocol was established for the acquisition of 3D data in forests using the Azure Kinect. The resulting point cloud was compared against photogrammetry. Results demonstrated that the Azure Kinect point cloud was of suitable quality for extracting tree parameters such as diameter at breast height (DBH, with a standard deviation of 2.2 cm). Furthermore, the quality of the visual and geometric information of the point cloud was evaluated in terms of its feasibility to identify microhabitats. Microhabitats represent valuable information on forest biodiversity and are included in Swiss forest inventory measurements. In total, five different microhabitats were identified in the Azure Kinect point cloud. The measurements were therefore comparable to those of sensors such as terrestrial laser scanning and photogrammetry. We therefore argue that the Azure Kinect point cloud can efficiently identify certain types of microhabitats, and this study presents a first approach to its application in forest inventories.


INTRODUCTION
In the field of forestry, terrestrial laser scanning (TLS) is the gold standard for 3D reconstruction of trees (Rehush et al., 2018). TLS devices can create very high-resolution point clouds but are expensive, time-consuming and complex to operate. Therefore, interest in novel terrestrial-based alternatives is increasing (https://e-services.cost.eu/files/domain_files/CA/Action_CA20118/mou/CA20118-e.pdf, accessed 2022-03-22). The Azure Kinect, a Time-of-Flight (ToF) camera system, presents a promising alternative based on its cost, availability, and ease of use (Neupane et al., 2021). This device has proven feasible for tree measurements, such as measuring the diameter at breast height (DBH) of urban trees (McGlade et al., 2020) or automating fruit localization and sizing (Neupane et al., 2021). However, to our knowledge, few to no studies have yet attempted to use the Azure Kinect to collect forest inventory data. The objective of this study was to evaluate the use of the Azure Kinect in a forest environment and to assess how well its computer vision algorithms handle the extreme and rapid changes in brightness and the homogeneity of the texture patterns in the background. For this purpose, we tested whether the device can accurately detect Tree-related Microhabitats (TreMs), which are morphological tree features important for specific species, such as bats, birds, or mammals (Bütler et al., 2020). TreMs are usually assessed manually by an inventory crew of two people, which is extremely time-consuming and depends heavily on the training and knowledge of the team involved. In this paper, we present a procedure to automate the detection of TreMs using the Azure Kinect, with a particular focus on the accuracy of the results.

The use of terrestrial remote sensing in forest inventories
For forest inventories, the most popular terrestrial remote sensing instrument is TLS. Indeed, the long range and precision of TLS, coupled with its high-resolution and detailed point clouds, make it the standard in forestry applications. However, its use comes with a very high cost and intensive processing requirements. To overcome these drawbacks, researchers started to look at cheaper solutions such as close-range photogrammetry (CRP) (Mokroš et al., 2018). The resulting point cloud is of similar quality at a more affordable hardware cost; however, the post-processing time is still high and requires substantial computational power. Therefore, the development of further lower-cost technologies remains of interest. Thanks to the progress made in the development of novel terrestrial-based technologies, this category of low-cost sensor began to be considered as an alternative measurement device in forest inventory (https://e-services.cost.eu/files/domain_files/CA/Action_CA20118/mou/CA20118-e.pdf, accessed 2022-03-22). For example, Hyyppä et al. (2017) demonstrated that Kinect and Google Tango provided accurate individual tree stem measurements of DBH and stem curvature approximations. In addition, Gollob et al. (2021) evaluated the capability of the iPad Pro 2020 to collect forest measurements.
The Azure Kinect is the successor of the Kinect used in the above-mentioned study (Hyyppä et al., 2017) and has recently been used successfully in a variety of agricultural measurements, such as leaf area estimation on tomato plants (Masuda, 2021), automatic branch detection of jujube trees (Ma et al., 2021) and fruit localization and sizing (Neupane et al., 2021). Moreover, Neupane et al. (2021) ranked it first out of eight other depth cameras tested for having the highest potential to perform measurements under challenging light conditions. To date, the sensor's suitability for forest inventory measurements remains to be explored; exploring it is the purpose of this study.

TreMs inventory
Forest biodiversity is often measured using biodiversity indicators, such as the volume of dead wood (Christensen et al., 2005) or, more recently, the measurement of TreMs (Bütler et al., 2020). TreMs inventories can provide an estimate of species richness in forests, as they represent the presence of certain ecosystems on the tree trunk (Courbaud et al., 2022). In order to provide a standardized definition of TreMs, Larrieu et al. (2018) proposed a classification of 47 TreMs forms into 7 groups and 15 sub-categories. Brändli et al. (2021) further simplified the classification of TreMs to the 19 most significant types in terms of their occurrence and importance in Switzerland. This classification has also been adopted for the Swiss national forest inventory (Düggelin et al., 2020). In this inventory, the 19 categories are associated with specific characteristics that allow for a simple and efficient classification. In our study, we employed the classification of Brändli et al. (2021) to identify the presence of microhabitats in our forest 3D scans.

METHOD
Two measurement campaigns were carried out during this study. A preliminary campaign was necessary due to the novelty of the sensor and the limited literature available on the subject. With novel terrestrial-based technologies, the choice of the algorithm used to reconstruct the point cloud is as important as the choice of the sensor. Therefore, it was necessary to determine which algorithm would be most appropriate in a forest environment. In addition, strong changes in luminosity could have a dramatic impact on the performance of the reconstruction algorithm. To limit this impact, it was necessary to establish a plan covering parameters such as the velocity of movement, the path of travel and the appropriate field of view. The first campaign hence allowed the development of a measurement protocol and, additionally, gave tangible evidence that the Azure Kinect was able to recreate a functional 3D model of a single tree.

Measurement Protocol
For the measurement, one loop was made around the tree while holding the device at chest height. The radius of measurement was kept in the range of 1.5 to 5 meters. The BAD-SLAM algorithm was started at the beginning of the measurement, which allowed the reconstruction of the point cloud to be visualized in real time; the point cloud was immediately saved on the device. Additional images were taken using a camera to build reference data from terrestrial photogrammetry. The images for the photogrammetric dataset were acquired using a Nikon D3200 DSLR camera with an 18-20 mm lens. With a capturing distance of around 2 m, this translated into a theoretical Ground Sampling Distance (GSD) of around 0.4 to 0.5 mm. For each tree sample, images were taken in a convergent manner in loops. Two loops were performed to provide backup data, yielding an average of 80 images for each tree sample. Furthermore, in order to facilitate scaling in the resulting 3D model, two coded targets and a measuring tape were placed on the ground and were visible in some of the images. The distance between the coded targets was measured using a measuring tape. This ensured that at least two scale bars could be used for absolute orientation purposes during the photogrammetric processing.
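The theoretical GSD quoted above follows from the pinhole relation GSD = pixel pitch × object distance / focal length. The short sketch below reproduces the order of magnitude. The pixel pitch of about 3.85 µm is an assumption that is not stated in the text (it corresponds to a nominal sensor width of roughly 23.2 mm across 6016 pixels), and the function name is illustrative only.

```python
def ground_sampling_distance_mm(pixel_pitch_um, focal_length_mm, distance_m):
    """Theoretical GSD in mm: pixel pitch scaled by distance / focal length."""
    pixel_pitch_mm = pixel_pitch_um / 1000.0
    distance_mm = distance_m * 1000.0
    return pixel_pitch_mm * distance_mm / focal_length_mm


# ASSUMPTION: pixel pitch of about 3.85 micrometres (not given in the text).
PIXEL_PITCH_UM = 3.85

# Focal lengths and capture distance as reported in the protocol.
for focal_mm in (18.0, 20.0):
    gsd = ground_sampling_distance_mm(PIXEL_PITCH_UM, focal_mm, 2.0)
    print(f"f = {focal_mm:.0f} mm, d = 2 m -> GSD = {gsd:.3f} mm")
```

Under this assumption, the GSD at 2 m is close to 0.4 mm, which matches the lower end of the reported range; slightly longer capture distances move it towards 0.5 mm.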

Datasets
Hönggerberg forest. For the preliminary research, three trees in the Hönggerberg forest of Zürich, Switzerland were scanned: one maple, one beech, and one spruce (Acer platanoides, Fagus sylvatica and Picea abies, respectively). The selected species are among the most common tree species in Swiss forests (https://www.lfi.ch/publikationen/publ/posterserie_LFI3_A4-en.pdf, accessed 2022-07-03) and possess an easily recognisable bark texture.
Rameren forest. For the second measurement campaign, three trees in the Rameren forest were scanned: one cherry, one hornbeam, and one Douglas fir (Prunus subg. Cerasus, Carpinus betulus and Pseudotsuga menziesii, respectively). The trees were chosen because each had at least one microhabitat and the surrounding area was sufficiently open for measurement purposes.

Preprocessing
A reference point cloud was first created for data assessment, using the images taken by the camera and processed using Agisoft Metashape software. For each tree, image orientation was performed followed by dense matching to produce a dense point cloud. The result was then scaled using the two automatically detected coded targets and measuring tape placed at the foot of each tree. Furthermore, each point cloud was georeferenced to a pre-existing reference TLS point cloud. The georeferencing only involved 3D conformal translation and rotation, with the scale factor assumed as fixed in both the photogrammetric and Azure Kinect point clouds.

Post-processing
In post-processing, two analyses were carried out using the software CloudCompare (https://www.cloudcompare.org/, accessed 2022-03-22). The first analysis aimed to assess the quality of the Azure Kinect point cloud quantitatively against the two reference point clouds. We first employed the photogrammetric point cloud in the M3C2 (Multiscale Model to Model Cloud Comparison) plugin in CloudCompare to assess the geometric precision and accuracy of the Azure Kinect point cloud. M3C2 is a method to calculate the signed distance between two point clouds (Lague et al., 2013). To initiate the algorithm, the main parameters were identified. For example, we used the normals of point cloud number 1 in the scale box and estimated the projection diameter and the maximum depth using the guess function. A further analysis involved comparing the DBH value of all tree point clouds to a manual reference measurement obtained with a calliper. To calculate the DBH of a point cloud, the method described by Čerňava et al. (2017) was used: a 6-cm section between 1.27 and 1.33 m height was extracted from each of the scanned trunks. Then, RANSAC was employed to fit a cylinder to each of the extracted sections.
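The DBH extraction step can be illustrated with a minimal sketch. It is not the procedure used in this study: instead of a RANSAC cylinder fit, it substitutes a plain algebraic least-squares circle fit (Kåsa fit) on the horizontal projection of the 1.27 to 1.33 m slice. It therefore assumes a vertical trunk, heights measured from the ground, coordinates in metres, and a slice free of outliers. All function names are hypothetical.

```python
import math


def extract_slice(points, z_min=1.27, z_max=1.33):
    """Keep the (x, y) coordinates of points whose height lies in [z_min, z_max]."""
    return [(x, y) for x, y, z in points if z_min <= z <= z_max]


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_circle(xy):
    """Algebraic least-squares circle fit: u^2 + v^2 = a*u + b*v + c.

    Returns (centre_x, centre_y, radius). Coordinates are centred first
    to keep the normal equations well conditioned.
    """
    n = len(xy)
    if n < 3:
        raise ValueError("at least three points are needed")
    mx = sum(x for x, _ in xy) / n
    my = sum(y for _, y in xy) / n
    suu = suv = svv = su = sv = 0.0
    rhs = [0.0, 0.0, 0.0]
    for x, y in xy:
        u, v = x - mx, y - my
        w = u * u + v * v
        suu += u * u
        suv += u * v
        svv += v * v
        su += u
        sv += v
        rhs[0] += u * w
        rhs[1] += v * w
        rhs[2] += w
    normal = [[suu, suv, su], [suv, svv, sv], [su, sv, float(n)]]
    det = _det3(normal)
    if abs(det) < 1e-15:
        raise ValueError("degenerate point configuration")
    solution = []
    for k in range(3):  # Cramer's rule, one unknown at a time
        mk = [row[:] for row in normal]
        for i in range(3):
            mk[i][k] = rhs[i]
        solution.append(_det3(mk) / det)
    a, b, c = solution
    cx, cy = a / 2.0, b / 2.0
    radius = math.sqrt(c + cx * cx + cy * cy)
    return cx + mx, cy + my, radius


def dbh_from_points(points):
    """Diameter (same unit as the input) of the circle fitted to the breast-height slice."""
    _, _, radius = fit_circle(extract_slice(points))
    return 2.0 * radius
```

A robust estimator such as RANSAC remains preferable on real scans, because stray points from branches, ivy or sensor noise bias a plain least-squares fit.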
The second analysis involved TreMs detection in each individual point cloud. First, we examined the colour component of the point cloud and attempted to visually recognise TreMs on the bark of each individual tree. Afterwards, we employed CloudCompare tools to compute the roughness as well as the Gaussian curvature values of the point cloud. The neighbourhood radius considered for every calculation was set to 10 cm. Finally, the roughness computation was used to investigate whether a tree species could be identified.
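To make the roughness measure concrete, the sketch below computes, for one point, its distance to the least-squares plane fitted to all neighbours within a given radius (10 cm here, as in the text). This mirrors the usual definition of point cloud roughness, but it is only an illustration under stated assumptions and not the CloudCompare implementation: the neighbour search is brute force, the point itself is included in the plane fit, and all function names are hypothetical.

```python
import math


def _matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def fit_plane(neigh):
    """Return (centroid, unit normal) of the least-squares plane, or None if degenerate."""
    n = len(neigh)
    if n < 3:
        return None
    c = [sum(p[i] for p in neigh) / n for i in range(3)]
    cov = [[sum((p[i] - c[i]) * (p[j] - c[j]) for p in neigh) for j in range(3)]
           for i in range(3)]
    trace = cov[0][0] + cov[1][1] + cov[2][2]
    # The plane normal is the eigenvector of cov with the smallest eigenvalue,
    # i.e. the dominant eigenvector of (trace * I - cov). Repeated squaring of
    # that matrix converges to a multiple of normal * normal^T.
    m = [[(trace if i == j else 0.0) - cov[i][j] for j in range(3)]
         for i in range(3)]
    for _ in range(30):
        m = _matmul3(m, m)
        scale = max(abs(x) for row in m for x in row)
        if scale == 0.0:
            return None  # all neighbours coincide
        m = [[x / scale for x in row] for row in m]
    columns = [[m[0][j], m[1][j], m[2][j]] for j in range(3)]
    col = max(columns, key=lambda v: sum(x * x for x in v))
    norm = math.sqrt(sum(x * x for x in col))
    return c, [x / norm for x in col]


def roughness(points, index, radius=0.10):
    """Distance from points[index] to the plane fitted to its neighbourhood."""
    p = points[index]
    neigh = [q for q in points if math.dist(p, q) <= radius]
    plane = fit_plane(neigh)
    if plane is None:
        return float("nan")
    centroid, normal = plane
    return abs(sum((p[i] - centroid[i]) * normal[i] for i in range(3)))
```

On smooth bark the value stays close to zero, while fissured bark or protruding structures such as ivy stems give larger values. For full scans a spatial index (for example an octree or k-d tree) would replace the brute-force search.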

M3C2 (Multiscale Model to Model Cloud Comparison)
The mean values and standard deviations summarised in Table 1 are below the centimeter level. This reflects a consistent alignment between the two clouds and confirms that the Azure Kinect point cloud is of reliable quality for measurements that require centimeter-level precision. Since our reference point cloud (photogrammetry) is capable of achieving sub-millimeter accuracy, a standard deviation of at most 0.5 cm would indicate good quality of our point cloud, in particular for forestry applications. Furthermore, the mean value in Table 1 is higher for the cherry tree than for the other trees, which suggests a possible systematic error in the point cloud alignment for this tree. One explanation could be the steeper topography where the tree was located, which could have led to inaccuracies in the measurement with the Azure Kinect sensor. Nonetheless, the magnitude of the systematic error was sufficiently small (less than 1 cm), demonstrating the feasibility of the Azure Kinect point cloud for extracting forest inventory parameters. In the following section, we present results of the DBH calculation for each tree.

DBH (Diameter at Breast Height)
DBH, a tree parameter widely used in forestry, allows forest professionals to derive valuable tree information. This includes the volume of a tree and the number and diversity of TreMs that a tree harbours (Bütler et al., 2021). The computed DBH values of this study are summarized in Table 2. As expected, the Azure Kinect had the lowest precision and accuracy for DBH estimation, with an average error of 2 ± 2.2 cm (cf. Table 3). The results obtained by classical photogrammetry were the closest to the reference value (calliper), with an average error of 1.3 ± 1.4 cm (cf. Table 3). The second most accurate results were obtained from TLS, with an average error of 1.3 ± 1.6 cm (cf. Table 3).
Nevertheless, the Azure Kinect DBH measurements were more accurate than the best results previously reported for the iPad Pro 2020 (σ = ±2.78 cm). Our results confirmed that the Azure Kinect has very good potential in the field of low-cost precision sensors. To evaluate the Azure Kinect point cloud qualitatively, we employed microhabitat detection as the baseline for our analysis.
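The DBH accuracy figures above are given in the form "average error ± standard deviation" relative to the calliper reference. The sketch below shows one way such a summary could be computed. It assumes that the average is the mean absolute error and that the spread is the sample standard deviation of the signed errors; the text does not specify either convention, and the function name is hypothetical.

```python
import statistics


def dbh_error_summary(estimates_cm, references_cm):
    """Return (mean absolute error, sample standard deviation of signed errors) in cm."""
    if len(estimates_cm) != len(references_cm):
        raise ValueError("each estimate needs exactly one reference value")
    if len(estimates_cm) < 2:
        raise ValueError("at least two trees are needed for a standard deviation")
    # Signed error per tree: point cloud estimate minus calliper reference.
    errors = [est - ref for est, ref in zip(estimates_cm, references_cm)]
    mean_abs_error = statistics.fmean(abs(e) for e in errors)
    spread = statistics.stdev(errors)  # ASSUMPTION: sample (n - 1) standard deviation
    return mean_abs_error, spread
```

Calling the function once per sensor, with the DBH estimates of all trees and the matching calliper values, yields one "error ± spread" pair per sensor. With only a handful of trees, both numbers are sensitive to single outliers.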

Microhabitats Detection
Microhabitat trees in forests have increased in importance, as they are very good indicators of biodiversity, which is of major importance in forest management.

Point Clouds | Microhabitats Types
Photogrammetry | Type and depth of tree cavity, root buttress cavity, mosses, lichens and ivy
Azure Kinect | Type and depth of tree cavity, root buttress cavity, mosses and ivy
TLS | Type and depth of tree cavity, root buttress cavity, mosses
Table 4. Summary of microhabitats in the Rameren datasets

Table 4 presents the different microhabitats detected on the three sampled trees. Interestingly, we found that the Azure Kinect point cloud allowed us to detect one more microhabitat than the TLS point cloud and one fewer than the photogrammetric point cloud. We explain this difference in the following paragraphs.
Visual Inspection. The Azure Kinect point cloud (Figure 1) did not appear to be affected by noise, unlike the TLS point cloud (Figure 3). This observation was further verified when visualising the point cloud in three dimensions. By rotating around the point cloud, we could clearly distinguish the creeper on the trunk of the Douglas fir tree with Azure Kinect and photogrammetry, whereas this structure was not visible with TLS. Although the density of the Azure Kinect point cloud was lower than that of the other two methods, we did not observe a critical loss of information. However, the low colour contrast of the Azure Kinect point cloud reduced the sharpness of details on the tree trunk and made it harder to differentiate between microhabitat structures. For example, in Figure 4, the colour information contained in the photogrammetric point cloud (left image) can be used to identify the presence of lichens without ambiguity. In comparison, the colour information was not interpretable in the Azure Kinect point cloud (right image). Although the presence of lichen was recognizable in the Azure Kinect point cloud from the white-gray pattern on the trunk, it was not possible to assign it to a growth form such as foliose, fruticose or crustose, which is crucial information for microhabitat classification. The same observation also applied to the differentiation of mosses. Such a limitation was to be expected due to the lower resolution of the Azure Kinect camera. On a medium scale, the Azure Kinect point cloud allowed us to visually recognize the presence of three-dimensional structures on the trunk, including ivy, liana, and cavities. However, for an effective identification of these types of microhabitats, additional information is required. Therefore, a curvature analysis of the trunk structures is presented in the next section to refine our approach to accurate microhabitat identification.
Point Cloud Curvature. The curvature of each point cloud was calculated with the CloudCompare software. Curvature describes how the direction of the surface normal vector changes, and a surface can accordingly be locally elliptic, parabolic or hyperbolic. The curvature value allowed us to detect microhabitats in the Azure Kinect point cloud that were less identifiable using visual analysis. Figure 5 gives an example of each TreMs type we could isolate with the curvature analysis. The root buttress cavity (left image) was the most challenging microhabitat to detect due to its vicinity to the ground. In general, this part is most affected by noise due to the multipath effect. The change in curvature is hence not as clear as for the other types but is still usable. Liana and ivy (middle image) were the most efficiently detectable due to the structure of the curvature. The result was remarkable and completely cleared up any doubts about the presence or absence of this type of microhabitat. The cavities (right image) represented the microhabitat type that was least identifiable using visual analysis. The curvature analysis partly resolved the issue by allowing us to easily rule out structures that did not belong to this category, but it did not allow us to refine the classification between cavities. Instead, the depth of the cavity was required to carry out a deeper classification.

Species Identification
The point cloud roughness values of each of the scanned tree trunks were also evaluated to determine whether this information would make it possible to distinguish individual tree species. Figure 6 depicts the point cloud of a hornbeam tree bonded to a spruce tree as a good illustration of our results. The roughness values of the two trunks revealed different bark patterns. This distinction was clearest using photogrammetry, followed by Azure Kinect, where a difference was noticeable. TLS was the least accurate in distinguishing between the two species. Nonetheless, the comparison is somewhat biased due to noise in the TLS point cloud that interfered with the correct execution of the algorithm. Removing the noise from the scans would have improved the results but was contrary to the purpose of this paper, which was to evaluate the raw data.
In this paper, our results confirmed that the Azure Kinect has very good potential in the field of low-cost precision sensors. However, increasing the sample size in future work will allow further validation of the presented method. In addition, the detection of TreMs was performed manually in the software CloudCompare. This last part could be greatly improved with the help of AI (Artificial Intelligence). In fact, a machine learning algorithm has already been implemented to identify TreMs in TLS point clouds (Rehush et al., 2018). Applying the same approach to the Azure Kinect point cloud therefore seems a logical follow-up since, based on this paper's observations, the quality of the Azure Kinect point cloud is similar to that of a TLS point cloud.

CONCLUSION
In conclusion, this study demonstrated that the Microsoft Azure Kinect sensor has promising abilities to measure the physical parameters of trees in our dataset collected in Swiss forests. The DBH of the three scanned trees was successfully inferred with acceptable precision. The quality of the point cloud was also shown to be of sufficient accuracy to detect asperities on the tree trunk at a centimeter resolution. These results show that the Microsoft Azure Kinect could potentially be a great alternative to TLS for the detection of TreMs, although its measurement range is significantly shorter than that of TLS. The scanning process is straightforward and data can be generated in real time, allowing the user to double-check the completeness of the point cloud while scanning.