UAV-BASED PHOTOGRAMMETRIC POINT CLOUDS - TREE STEM MAPPING IN OPEN STANDS IN COMPARISON TO TERRESTRIAL LASER SCANNER POINT CLOUDS

In both ecology and forestry, there is a high demand for structural information on forest stands. Forest structures, due to their heterogeneity and density, are often difficult to assess. Hence, a variety of technologies are being applied to obtain this "difficult to come by" information. Common techniques are aerial images or ground- and airborne Lidar. In the present study we evaluate the potential use of unmanned aerial vehicles (UAVs) as a platform for tree stem detection in open stands. A flight campaign was conducted over a test site near Freiburg, Germany, covering a target area of 120 × 75[m]. The dominant tree species of the site is oak (Quercus robur), with almost no understory growth. Over 1000 images with a tilt angle of 45° were shot. The flight pattern consisted of two antipodal staggered flight routes at a height of 55[m] above the ground. We used a Panasonic G3 consumer camera equipped with a 14-42[mm] standard lens and a 16.6 megapixel sensor. The data collection took place in leaf-off state in April 2013. The area was prepared with artificial ground control points for transformation of the structure-from-motion (SFM) point cloud into real-world coordinates. After processing, the results were compared with a terrestrial laser scanner (TLS) point cloud of the same area. In the 0.9[ha] test area, 102 individual trees above 7[cm] diameter at breast height were located in the TLS-cloud. We chose the software CMVS/PMVS-2 since its algorithms are developed with a focus on dense reconstruction. The processing chain for the UAV-acquired images consists of six steps: a. cleaning the data: removing blurry, under- or over-exposed and off-site images; b. applying the SIFT operator [Lowe, 2004]; c. image matching; d. bundle adjustment; e. clustering; and f. dense reconstruction. In total, 73 stems were considered as reconstructed and located within one meter of the reference trees.
In general, stems were far less accurate and complete than in the TLS-point cloud. Only a few stems were considered to be fully reconstructed. From the comparison of reconstruction achievement with respect to height above ground, we can state that reconstruction accuracy decreased in the crown layer of the stand. In addition, we cut 50[cm] slices in z-direction and applied a robust cylinder fit to the stem slices. Radii of the TLS-cloud and the SFM-cloud correlated surprisingly well, with a Pearson's correlation coefficient of r = 0.696. This first study showed promising results for UAV-based forest structure modelling. Yet, there is a demand for additional research with regard to vegetation stages, flight pattern, processing setup and the utilisation of spectral information.


INTRODUCTION
Utilization of UAVs in the environmental sector
Recently, unmanned aerial vehicles (UAVs) have evolved into off-the-shelf platforms for remote sensing applications and photogrammetric data acquisition. UAV-based digital surface models and ortho images have already been successfully evaluated and proved to be sound products. Moreover, UAVs facilitate data acquisition with a high spatial and temporal resolution, also referred to as hyperspatial and hypertemporal [Lucieer et al., 2012], which has attracted a lot of attention regarding the applicability of UAVs in the environmental sector [Haala et al., 2011]. Compared to other platforms such as airborne Lidar, the relatively low costs of information acquisition and the high flexibility of such devices have led to intense research in that field. UAVs of different sizes and technologies have been used in applications such as precision agriculture [Valente et al., 2013, Zhang and Kovacs, 2012], fire monitoring [Merino et al., 2012] and forest characterization [Tao et al., 2011, Wallace et al., 2012]. Payloads range from off-the-shelf digital cameras [Rosnell and Honkavaara, 2012] to FLIR systems [Turner et al., 2011] and laser scanners [Wallace et al., 2012]. In addition, more and more efficient algorithms for transforming the enormous amount of collected images into 3D information are becoming publicly available through web services or open source tools. Examples of the former, offering free processing, are Microsoft Photosynth and 123D Catch Beta. Bundler [Snavely et al., 2006] and CMVS/PMVS-2 [Furukawa and Ponce, 2010a] are options for the latter. The output of these programs is a sparse to dense point cloud generated through image matching. The point cloud carries additional information such as color (taken from the images) or point normals (estimated during processing).

Dense 3D reconstruction for forest stand parameters
In both ecology and silviculture, there is a high demand for accurate forest structure information. Modern forest resource planning and applications such as industrial wood flow optimization require accurate inventories of forest stands, including information such as diameter at breast height (DBH) distribution or tree density [Moberg and Nordmark, 2006] and tree quality parameters. Likewise, high-resolution 3D information is needed in many scientific fields and is essential for accurate mapping of vegetation biomass [Frolking et al., 2009], habitat quality [Vierling et al., 2008] or carbon storage [Asner, 2009]. To obtain this three-dimensional information, Lidar is widely and commercially used in forest inventories [Hyyppä et al., 2008]. Aerial laser scanning (ALS) offers high area coverage, but at relatively high costs and with lower point densities, the latter limiting tree detection accuracy [Tesfamichael et al., 2009]. In contrast to ALS, terrestrial laser scanning (TLS) produces very high point densities, although these decrease with increasing distance to the sensor and cover only relatively small areas. Point cloud generation based on UAV platforms and image matching could possibly fill the gap between ALS and TLS by covering quite large areas and delivering high point densities for accurate detections, while requiring relatively few resources. Attempts to mount Lidar sensors on UAVs have shown great potential. However, several limitations are indicated in [Wallace et al., 2012], e.g. the generation of accurate and dense point clouds is only possible at quite low altitudes (below 50[m]). This restricts applications in stands with taller trees. Area coverage is furthermore limited, as the relatively high weight of suitable Lidar sensors only allows for shorter flight durations (employing a conventional octocopter).
A technique to extract 3D information from UAV flights is given by the structure-from-motion (SFM) processing chain, which produces point clouds on the basis of feature matches within overlapping images. Several studies have shown a high potential of the combination of UAV image flights and the SFM processing chain, particularly with regard to urban objects and terrain modeling [Remondino et al., 2011]. So far, fewer attempts have been made to extract vegetation structure or, specifically, information on forest stands. Previous studies indicated difficulties and limitations of this technique [Harwin and Lucieer, 2012, Rosnell and Honkavaara, 2012] in complex and heterogeneous spatial structures such as vegetation. Thus, 3D reconstruction of vegetation requires an approach which differs from that used for the above-mentioned objects in terms of the data acquisition setup and the data processing and its parameters. In this initial study, we present and evaluate an approach to reconstruct and automatically detect individual trees, based on an aerial survey and processing setup adjusted to broad-leaf forest stands.

Study Site
The study site is located in an old oak-dominated (Quercus robur, 90%) stand close to Freiburg i. Brsg., Germany (48°0'51" N, 7°44'39" E). The stand has little undergrowth and a low stem density. Relevant species besides oak are hornbeam (Carpinus betulus, 10%) and maple (Acer pseudoplatanus). The entire area (36.4[ha]) is protected due to its near-natural forest community with a significant portion of standing dead wood, bird nests and bat occurrence. The study site itself covers only 0.9[ha] (75 × 125[m]). Figure 1 shows an overview of the site. Clearly visible are its sparse tree layer and the ground cover of Carex brizoides.

TLS survey of the study site
To compare the results of the UAV data, eight terrestrial laser scans were carried out during leaf-off state (date of scan campaign: 12.02.2013).
Table 1: TLS parameters

UAV setup and data acquisition
With respect to the study area and the processing chain, the following requirements for the UAV platform were identified: a) capability of starting and landing within the forest through the canopy; b) flying at low altitudes close to the canopy top; and c) ability to perform small-meshed flight patterns. In accordance with these criteria, an octocopter based on a modified MK Okto2 (Highsystems GmbH) was chosen as the UAV platform.
A standard consumer camera (Panasonic Lumix G3) was used as the sensor. Its high resolution, the capability to use different lenses and the good signal-to-noise ratio of the sensor were the crucial factors in this choice. The camera was fixed on a flexible mount which enables tilting it vertically and horizontally. The mount also compensates for changes of orientation while cruising. The camera was triggered at a pulse rate of 1.4[Hz].
After evaluation of different flight patterns and the corresponding 3D reconstructions as well as camera settings, a double zigzag pattern was chosen (see Figure 2). It consists of two flights shifted along the direction of view.
Two final image flights were performed on a slightly cloudy afternoon in mid-April 2013 (air temperature 22[°C]; wind speed 2.5[m/s]; rel. humidity 50%; radiation 620[W/m²]). Image acquisition took place along two antipodal staggered flight routes at a height above ground of 55[m] and a cruising speed of 2.5[m/s].
The horizontal viewing direction was kept stable at 118° in order to maximize image overlap and hence feature matching within the SFM processing chain. In total, 1129 images were taken.
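The flight parameters reported above determine how densely the scene is sampled along the flight line. The following minimal sketch works through that arithmetic using only the stated cruising speed (2.5[m/s]), trigger rate (1.4[Hz]), flying height (55[m]) and camera tilt (45°). The function names are illustrative, and the slant range assumes flat terrain, which is a simplification.

```python
import math


def image_spacing(speed_m_s: float, trigger_hz: float) -> float:
    """Distance travelled between two consecutive exposures, in metres."""
    return speed_m_s / trigger_hz


def slant_range(height_m: float, tilt_deg: float) -> float:
    """Distance along the optical axis to flat ground for a tilted camera.

    At 45 degrees the result is the same whether the tilt is measured
    from nadir or from the horizon.
    """
    return height_m / math.cos(math.radians(tilt_deg))


spacing = image_spacing(2.5, 1.4)      # about 1.79 m between exposures
axis_range = slant_range(55.0, 45.0)   # about 77.8 m to the ground

print(f"{spacing:.2f} m between exposures")
print(f"{axis_range:.1f} m along the optical axis")
```

A spacing of under two metres at a viewing distance of almost 80 metres is consistent with the very high image overlap that the feature matching step relies on.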

UAV-data processing
Pictures taken by the UAV were manually inspected for blur, under- or over-exposure, visible horizon and off-site content. Even though the algorithms used are known to be robust to changing image quality [Lowe, 2004], orthophoto generation is affected by the image choice. The cleaned set of images was processed with a typical structure-from-motion tool chain, in our case VisualSFM [Furukawa and Ponce, 2010b]. It pipes the results of each processing step to the succeeding tool and visualizes intermediate results. A dense point cloud is generated in five steps:
1. applying the SIFT operator [Lowe, 2004]: the scale-invariant feature transform detector is used to automatically generate tie points for each image;
2. image matching: with the help of the detected SIFT features, each image is matched with its corresponding images;
3. bundle adjustment: a multicore bundle adjustment [Wu et al., 2011] is applied to estimate the camera parameters;
4. clustering: the image set is clustered for efficient dense reconstruction;
5. dense reconstruction: point clouds are generated from the images and camera parameters.
Each step can be manipulated by a variety of parameters. Choosing suitable settings for each step is a complicated procedure requiring a high number of tests. In addition, the settings influence each other, so a systematic analysis of the effect of a single parameter is difficult. We chose a trade-off between reconstruction completeness and correctness: settings favouring high correctness, in terms of a high spatial accuracy of the reconstructed 3D points, come at the expense of a comprehensive reconstruction.
With seven ground control points, an RMSE of 0.12[m] could be achieved for the transformation into real-world coordinates. With the transformation, the UAV point cloud was scaled by a factor of 2.43. To eliminate outliers and single points, we filtered the point cloud with a neighbour relation approach described in [Rusu et al., 2008]. Figure 5 shows a view into the cleaned point cloud.
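The registration of the SFM point cloud with ground control points amounts to estimating a seven-parameter Helmert (similarity) transformation: one scale, a rotation and a translation. The sketch below is one possible way to do this, using the closed-form least-squares solution via singular value decomposition (the Umeyama approach); the source does not state which estimator was actually used. The control point coordinates are synthetic, and only the scale factor of 2.43 is taken from the text.

```python
import numpy as np


def estimate_helmert(src, dst):
    """Least-squares similarity transform so that dst ~ scale * R @ src + t."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    cov = dst_c.T @ src_c / len(src)          # cross-covariance matrix
    u, sing, vt = np.linalg.svd(cov)
    d = np.ones(3)
    if np.linalg.det(u) * np.linalg.det(vt) < 0:
        d[2] = -1.0                           # avoid a reflection
    rot = u @ np.diag(d) @ vt
    var_src = (src_c ** 2).sum() / len(src)
    scale = float((sing * d).sum() / var_src)
    trans = mu_dst - scale * rot @ mu_src
    return scale, rot, trans


def rmse(src, dst, scale, rot, trans):
    """Root mean square of the 3D residuals at the control points."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    residuals = dst - (scale * src @ rot.T + trans)
    return float(np.sqrt((residuals ** 2).sum(axis=1).mean()))


# Synthetic example: seven control points in arbitrary model coordinates.
rng = np.random.default_rng(0)
model_pts = rng.uniform(-10.0, 10.0, size=(7, 3))
angle = np.radians(30.0)
true_rot = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                     [np.sin(angle), np.cos(angle), 0.0],
                     [0.0, 0.0, 1.0]])
world_pts = 2.43 * model_pts @ true_rot.T + np.array([100.0, 200.0, 50.0])

scale, rot, trans = estimate_helmert(model_pts, world_pts)
print(round(scale, 3), rmse(model_pts, world_pts, scale, rot, trans))
```

With noise-free synthetic points the residual is numerically zero; with measured control points the same `rmse` function would yield a value comparable to the 0.12[m] reported above.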
horizontal slicing: The test site consists of relatively flat terrain, which allows a simple z-value threshold to be used for horizontal slicing into height layers. Identical slices of 50[cm] thickness were cut out of both point clouds.
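A minimal sketch of such z-threshold slicing is given below. It assumes that the points are available as (x, y, z) tuples and that the ground lies at a constant height `z_ground`, which is only reasonable on flat terrain such as that described here. The function and parameter names are illustrative.

```python
import math
from collections import defaultdict


def slice_by_height(points, thickness=0.5, z_ground=0.0):
    """Group (x, y, z) points into horizontal layers of equal thickness.

    Layer k covers heights from z_ground + k * thickness (inclusive)
    up to z_ground + (k + 1) * thickness (exclusive).
    """
    layers = defaultdict(list)
    for x, y, z in points:
        k = math.floor((z - z_ground) / thickness)
        layers[k].append((x, y, z))
    return dict(layers)


cloud = [(0.0, 0.0, 0.1), (0.1, 0.0, 0.4), (0.0, 0.1, 0.6), (5.0, 5.0, 1.3)]
layers = slice_by_height(cloud)
for k in sorted(layers):
    print(k, len(layers[k]))
```

Applying the same function with identical `thickness` and `z_ground` to both point clouds yields slices that can be compared layer by layer.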
extraction of coherent point clusters: The slices contain a high amount of spurious points. However, tree parts are characterized by dense accumulations of points. We extracted coherent point clusters by defining cluster criteria regarding the minimum and maximum number of points per cluster as well as the maximum distance between cluster points. The best results were achieved with a range of 40 to 10,000 points per cluster.
If a TLS-segment and an SFM-segment were located within one meter of each other, we considered this segment as successfully reconstructed. In cases of n : m pairs, we obtained a 1 : 1 relationship by taking into account the estimated radius rather than the distance. For example, if three SFM-segments were found within one meter of one TLS-segment, the segment with the radius closest to the TLS-segment's radius was considered the match.
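The matching rule can be sketched as follows. Each segment is represented here as a hypothetical (x, y, radius) tuple. TLS segments are processed in input order and each SFM segment is used at most once; this greedy ordering is an assumption, since the text only specifies the one-meter search distance and the closest-radius criterion.

```python
import math


def match_segments(tls_segments, sfm_segments, max_dist=1.0):
    """Return (tls_index, sfm_index) pairs with a 1:1 relationship.

    Candidates are all unused SFM segments within max_dist (horizontal
    distance) of a TLS segment; the one with the closest radius wins.
    """
    matches = []
    used = set()
    for i, (tx, ty, tr) in enumerate(tls_segments):
        candidates = [
            (abs(sr - tr), j)
            for j, (sx, sy, sr) in enumerate(sfm_segments)
            if j not in used and math.hypot(sx - tx, sy - ty) <= max_dist
        ]
        if candidates:
            _, best = min(candidates)
            matches.append((i, best))
            used.add(best)
    return matches


tls = [(0.0, 0.0, 0.30), (10.0, 0.0, 0.20)]
sfm = [(0.2, 0.1, 0.45), (0.5, 0.0, 0.28), (-0.3, 0.4, 0.10),
       (10.4, 0.2, 0.22), (20.0, 20.0, 0.30)]
print(match_segments(tls, sfm))  # [(0, 1), (1, 3)]
```

In the example, three SFM segments lie within one meter of the first TLS segment, and the one with radius 0.28 is selected because it is closest to the reference radius of 0.30.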

RESULTS
Results are presented for tree detection and radius estimation separately, even though the two are related: the radius estimation was executed only if a tree (segment) had been reconstructed.

Tree detection
The test site contains 102 trees above 7[cm] diameter at breast height (DBH). Of these, 73 (71.6%) were considered as reconstructed and located within one meter of the reference trees. This does not account for the reconstruction quality along the z-axis, which varies considerably. A tree counts as detected if it has been reconstructed with at least 5000 points. The left tree in Figure 3, for example, counts as detected. To account for incomplete trees in the SFM-point cloud, the detection rate was also evaluated as a function of the height layer. Figure 4 illustrates the decrease of matches with increasing height and shows a clear drop in matches at ca. 20[m] height. This can be explained by the beginning of the crown, whose components (smaller branches) are more difficult to reconstruct.
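The overall detection rate and a per-layer detection rate can be computed as sketched below. The overall figures (73 of 102 trees) are taken from the text, whereas the per-layer counts are purely hypothetical placeholders that only illustrate the calculation behind a plot such as Figure 4.

```python
def detection_rate_by_layer(reference_counts, matched_counts):
    """Fraction of reference segments matched in each height layer.

    Both arguments map a layer index to a segment count; layers without
    any match are reported with a rate of 0.0.
    """
    return {
        layer: matched_counts.get(layer, 0) / n_ref
        for layer, n_ref in sorted(reference_counts.items())
        if n_ref > 0
    }


overall_rate = 73 / 102  # detected trees / reference trees (from the text)
print(f"overall detection rate: {overall_rate:.1%}")

# Hypothetical counts per height layer, for illustration only.
reference_counts = {0: 4, 1: 4, 2: 2}
matched_counts = {0: 3, 1: 2}
layer_rates = detection_rate_by_layer(reference_counts, matched_counts)
print(layer_rates)  # {0: 0.75, 1: 0.5, 2: 0.0}
```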

Radius estimation
With the radius estimation, a consistent trend could be observed. Figure 3 shows exemplary matched tree pairs. Figures 6 and 7 are based on the right tree. Radius estimation of the 50[cm] slices results in variations along the z-axis. Similar to the matches, a slightly better estimation is obtained in the lower parts of the stand, while reliability decreases as the height of the slice increases.
Over all matched slices, the estimated radii of both data sets were compared. The radii of the reconstructed slices tend to be underestimated. Pearson's test of linear correlation was significant at a two-tailed p value of p < 0.001, with a correlation coefficient of r = 0.696.
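The comparison of paired radii rests on Pearson's correlation coefficient, which can be computed directly from its definition as sketched below. The radii in the example are hypothetical and are not the study data. The mean difference is included only as one simple way to express a tendency towards under- or overestimation.

```python
import math


def pearson_r(xs, ys):
    """Pearson's linear correlation coefficient of two paired samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mean_difference(reference, estimate):
    """Mean of (estimate - reference); negative values mean underestimation."""
    return sum(e - r for r, e in zip(reference, estimate)) / len(reference)


# Hypothetical slice radii in metres (TLS reference vs. SFM estimate).
tls_radii = [0.30, 0.28, 0.35, 0.22, 0.40, 0.31]
sfm_radii = [0.27, 0.29, 0.30, 0.20, 0.36, 0.33]

print(f"r = {pearson_r(tls_radii, sfm_radii):.3f}")
print(f"mean difference = {mean_difference(tls_radii, sfm_radii):+.3f} m")
```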

DISCUSSION AND OUTLOOK
Old open broad-leaf stands are in many ways attractive to foresters and environmentalists. On the one hand, they often represent a high economic value (e.g. veneer production); on the other hand, such stands have a high ecological value as habitat for endangered species. Characterisation of such areas is therefore highly desirable. UAVs can be seen as a contribution towards new, flexible and cheaper sensor platforms delivering high-quality airborne spectral and 3D information. We showed that single tree detection based on 3D point clouds generated through image matching is possible in sparse forests. In the following sections we discuss the results and the different settings made while conducting this study.

Flight planning
In several pre-studies, we experimented with different flight patterns and camera settings. In traditional airborne image flights, pictures are taken in nadir direction and in parallel flight strips with usually 60% to 80% overlap. For the purpose of 3D reconstruction of vertical structures, in this case single tree stems, nadir pictures are insufficient, since stems and other vertical surfaces are often occluded by the branches of taller trees. The zigzag pattern with a 45° direction of view performed very well regarding the connectivity of the images and the pixel coverage of the trees. Furthermore, this pattern reduces shadow effects, where parts of a crown hide, in the worst case, an entire tree. However, one drawback of our setting is that the direction of view is fixed in one direction (west to east). With this setting we achieve a relatively complete coverage of the stand, but we miss the back side of the scene. Thus, tree trunks cannot be reconstructed completely; they appear open on the back side. The same effect can be observed in single TLS scans in stands. Several studies show that a half-reconstructed tree can still allow reliable diameter estimation [Bienert et al., 2007, Liang et al., 2008, Lovell et al., 2011].

Structure from motion process
The open source software chain used in this study was able to process the image set and returned very promising results. Considering the vast number of configurable parameters in each step, evaluating the total performance of the system becomes difficult.
Beginning with the first step, SIFT detection, there has been evidence of a strong influence of specific settings [Lingua et al., 2009]. We conclude that the standard settings of pmvs2 are not optimal for the reconstruction of strong jumps in distance to the camera. This especially applies to trees, where the small branches of the crowns are difficult to reconstruct due to the large difference in distance between them and the ground behind them. It seems that increasing the number of pixels sampled (pmvs2: wsize) for calculating the photometric consistency score leads to better results. Likewise, lowering the threshold for accepting a reconstruction leads to more 3D points.
Finding the best combination is a cumbersome process and needs to be further optimized.

Reconstruction results
In general, the reconstruction of the ground with mainly grass vegetation is complete, except in shadowed areas, and seems to be accurate. However, a detailed evaluation was not part of this study. Tree reconstruction can be seen as partly successful. The variation of reconstruction along the height axis is influenced by the density of structures in the crown. This means that thinner branches are more likely to remain undetected than the up to 50 times wider tree trunks near the ground. As discussed in [Rosnell and Honkavaara, 2012], we can confirm that the strong three-dimensionality of forests poses challenges for SFM processing and data acquisition. After testing various flight patterns, we can conclude that reconstruction generally becomes more challenging at lower altitudes, while at higher altitudes accuracy decreases due to the coarser ground sampling distance. Another issue identified is that trees are rarely reconstructed as a connected point set; rather, they include gaps in the trunk and crown. Reducing shadow effects in the image set used for the SFM process can minimise the problem at the cost of increasing the complexity of flight planning.

Radius estimation
Results of the radius estimation were unexpectedly good. Robust, RANSAC-based cylinder fitting proved able to cope fairly well with the high variation within the stem points. The tendency to overestimate radii can be explained by the relaxed threshold for accepting a reconstruction, which leads to less accurate results.

Future work
This first study has shown good results and promising potential for UAV-based 3D reconstruction of forest stands, yet more research has to be conducted with regard to the following topics:
flight pattern: Covering the scene with a coherent set of images, ensuring that the objects of interest are mapped from every side while minimizing shadow effects, is very challenging. Testing new flight patterns can help to capture the forest more completely. First attempts for agricultural sites can be found in [Valente et al., 2013].
processing setup: Many parameters in each processing step of the point cloud generation influence the results. Identification of the relevant parameters for forest stand reconstruction will be addressed in future work.
vegetation phase: We are planning to repeat this study in different seasons. Especially in summer, when trees are foliated, results are likely to change dramatically. However, other parameters relevant to ecology or forestry, such as canopy closure or species detection, might become measurable.
utilization of spectral information: One advantage of computer-vision-based point clouds over Lidar point clouds is that they come with spectral data. At this stage we did not use this information; however, we believe that it can improve results and will become very valuable when using leaf-on data.

Figure 1: View of the study area including ground control points (highlighted in the lower left).

Figure 2: The blue and grey zigzag lines show the flight path of the UAV. Orange arrows indicate the direction of the camera and the camera angle of 45°.

For the TLS campaign (12.02.2013), the scanning device was a Faro Focus 3D 120. Five spheres for registration of the scans were placed in the target area. The mean registration error of the eight scans was 0.0035[m]. After registration, the point density was very high due to the massive overlap of the different scans. To speed up processing, two filters were applied: a) a voxel grid filter with a grid size of 1[cm] and b) an outlier removal to eliminate ghost and isolated points. Details can be found in Table 1. The entire point set was cropped to the study area.

Scan resolution: 0.0035°
Average number of points per scan: 24 million
Total points in registered scans: 125 million
Points after reduction: 22 million
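The voxel grid filter applied to the registered TLS scans can be illustrated with the short sketch below. It assumes that all points falling into the same cubic cell are replaced by their centroid, which is a common variant of this filter; the source does not specify the exact implementation. Only the grid size of 1[cm] is taken from the text.

```python
import math
from collections import defaultdict


def voxel_grid_filter(points, voxel_size=0.01):
    """Replace all points inside the same cubic voxel by their centroid."""
    voxels = defaultdict(list)
    for point in points:
        key = tuple(math.floor(c / voxel_size) for c in point)
        voxels[key].append(point)
    return [
        tuple(sum(coords) / len(members) for coords in zip(*members))
        for members in voxels.values()
    ]


dense = [(0.001, 0.001, 0.001), (0.003, 0.005, 0.007), (0.505, 0.505, 0.505)]
reduced = voxel_grid_filter(dense)
print(len(dense), "->", len(reduced))  # 3 -> 2
```

The first two points share one 1[cm] cell and collapse into a single centroid, which is how the filter reduces the redundancy caused by overlapping scans.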

Figure 3: TLS-trees (green) and reconstructed trees (red). Left: a partly reconstructed smaller tree. Right: a tree with a fully reconstructed stem and parts of the crown.

Even with the zigzag pattern, the vertical camera angle of 45° was held constant. A focal length of 14[mm], an aperture of f/9, an ISO film speed equivalent of 800 and a shutter speed of 1/800[sec] were set on the camera.

Figure 4: Matches between reference and reconstructed trees in relation to height above ground.

Figure 5: View into the cleaned SFM-point cloud.

radius estimation: Each cluster of 50[cm] thickness was used as the input point set for a RANSAC-based cylinder fit. To ensure that the model cylinder fits the point set, we defined criteria for accepting the estimated parameters: a) the radius must be between 0.05 and 1.0[m]; b) the axis of the cylinder must point in z-direction with a maximum deviation of 30°; c) if no appropriate model coefficients can be found after 5000 iterations, the iteration stops. To eliminate outliers, accepted radii must satisfy $r_{\mathrm{accept}} \leq r + 2\sigma_r$.
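The idea behind the robust fit can be sketched with a simplified stand-in: instead of a full cylinder, the code below fits a circle with RANSAC to the slice points projected onto the horizontal plane. It therefore does not model the cylinder axis or the 30° tilt criterion. The radius limits and the iteration cap follow the text, while the inlier tolerance `tol`, the fixed random seed and the reading of the outlier rule as mean plus two standard deviations are assumptions.

```python
import math
import random
import statistics


def circle_from_three_points(p1, p2, p3):
    """Centre and radius of the circle through three 2D points (None if collinear)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-12:
        return None
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return cx, cy, math.hypot(x1 - cx, y1 - cy)


def ransac_circle(points, tol=0.02, r_min=0.05, r_max=1.0, max_iter=5000, seed=0):
    """Return the best (cx, cy, r) model and its inlier count, or (None, 0)."""
    rng = random.Random(seed)
    best_model, best_inliers = None, 0
    for _ in range(max_iter):
        model = circle_from_three_points(*rng.sample(points, 3))
        if model is None:
            continue
        cx, cy, r = model
        if not r_min <= r <= r_max:   # acceptance criterion on the radius
            continue
        inliers = sum(abs(math.hypot(x - cx, y - cy) - r) <= tol for x, y in points)
        if inliers > best_inliers:
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


def filter_radii(radii, n_sigma=2.0):
    """Keep radii not exceeding mean + n_sigma * standard deviation."""
    limit = statistics.fmean(radii) + n_sigma * statistics.pstdev(radii)
    return [r for r in radii if r <= limit]


# Half-open stem (only one side visible) with radius 0.30 m plus three outliers.
slice_points = [(2.0 + 0.30 * math.cos(k * math.pi / 40),
                 3.0 + 0.30 * math.sin(k * math.pi / 40)) for k in range(41)]
slice_points += [(2.0, 3.0), (2.9, 3.9), (1.0, 2.0)]

model, n_inliers = ransac_circle(slice_points)
print(f"radius = {model[2]:.3f} m, inliers = {n_inliers}")
```

The example uses a half circle on purpose: as discussed for stems that appear open on the back side, a robust fit can still recover the radius when only part of the circumference has been reconstructed.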

Figure 6: A 50[cm] slice of a TLS stem (green) and a reconstructed stem (red) and their corresponding fitted diameters.

Figure 7: A 50[cm] slice of a TLS stem (green) and a reconstructed stem (red) and their corresponding fitted diameters projected on a plane.

Table 2: Summary of the structure-from-motion reconstruction.
In Table 2, the basic characteristics of the structure-from-motion procedure are summarized.
registration & cleaning: Ground control points were used to estimate the transformation parameters of a Helmert transformation.