SECTION-BASED TREE SPECIES IDENTIFICATION USING AIRBORNE LIDAR POINT CLOUD

The application of LiDAR data in forestry initially focused on mapping forest community, particularly and primarily intended for largescale forest management and planning. Then with the smaller footprint and higher sampling density LiDAR data available, detecting individual tree overstory, estimating crowns parameters and identifying tree species are demonstrated practicable. This paper proposes a section-based protocol of tree species identification taking palm tree as an example. Section-based method is to detect objects through certain profile among different direction, basically along X-axis or Y-axis. And this method improve the utilization of spatial information to generate accurate results. Firstly, separate the tree points from manmade-object points by decision-tree-based rules, and create Crown Height Mode (CHM) by subtracting the Digital Terrain Model (DTM) from the digital surface model (DSM). Then calculate and extract key points to locate individual trees, thus estimate specific tree parameters related to species information, such as crown height, crown radius, and cross point etc. Finally, with parameters we are able to identify certain tree species. Comparing to species information measured on ground, the portion correctly identified trees on all plots could reach up to 90.65%. The identification result in this research demonstrate the ability to distinguish palm tree using LiDAR point cloud. Furthermore, with more prior knowledge, section-based method enable the process to classify trees into different classes.


INTRODUCTION
The availability of LiDAR data has grown exponentially during the last decade (Alt, 1962), with the advancement of sensor technology such as the increased sampling rate and accuracy, Li-DAR data is potentially beneficial to alter traditional detection in forestry into new development.
The application of LiDAR data in forestry initially focused on mapping forest community, and primarily intended for managing and monitoring large-scale forest such as timber volume (Nilsson, 1996, TopEye, 1997) and estimation of mean height at stand and plot level.Recently, as the smaller footprint and higher sampling density LiDAR data emerge, it was demonstrated to detect individual overstory in forests (Brandt berg, 1999, Hyypp, 1999), estimate crowns parameters (Popescu and Zhao, 2008) and identify species (Brandtberg et al., 2003a, Holmgren and Persson, 2004, Moffiet et al., 2005).The new full waveform scanners that provide a higher point density and additional information about the reflecting characteristics of trees, also proved feasible to detect individual trees in forests and classify tree species (Brandt berg, 1999, Hyypp, 1999).But the critical issues including the calibration and the decomposition of full waveform data with a series of Gaussian functions (Reitberger et al., 2008) are usually difficult parts of the whole work.And the data acquisition is usually costly.Thus, LiDAR data based on a single wavelength band and relatively low flying heights are generally used for individual tree species identification.Furthermore, recent studies have shown that individual tree or larger areas canopy height distributions, can be obtained from LiDAR data acquired under leaf-on (Moffiet et al., 2005) or leaf-off conditions (Brandtberg et al., 2003a, Nsset, 2005).
Still there is great interest in automatic tree classification and in- * Corresponding author: lenvimore@whu.edu.cndividual species identification in wilderness areas by detecting structural parameters such as crown size, crown base height and other synthetic geometry parameters.The purpose of this paper is to develop a standard procedure for tree species identification using LiDAR data.
The overall goal of the study to identify individual palm tree using airborne LiDAR data.More specific steps are as follows: 1. Rough Classification of LiDAR point cloud; 2. Remove buildings using the decision-tree-based rules in section profiles; 3. Locate trees in depth image and pinpoint individual tree using section analysis; 4. Estimate specific parameters, such as crown height and crown radius; 5. Identify palm tree;

Study Site
The study area Wuzhizhou island is located in southern China (N18 • 18'51.07", E109 • 45'42.60")as Figure 1 covering 1.48km 2 .Forest comprises more than 95% of the island while bare rocks and constructions mainly comprise the rest 5%.
As a typical eco-tourism destination, Wuzhizhou island has approximately 2000 species of natural plants including banyans, coconut trees, palm trees, oak trees, cinnamomum japonicum trees, dragon trees and shrubs etc.For each specie of tree we could describe detailed visual and structural features such as crown base height and crown radius.In this paper, palm tree, which has a distinguishing crown shape and sparse branches, is employed as identification objective to illustrate the tree identification procedure.For specific dataset, the LiDAR system provided a 30 degree swath from nadir, for a total scan angle of 60 degree.With a cross-hatch grid of flight lines, the average laser point density is 4 per square meter.The point density translates into an average distance between laser points of about 1m.The average swath width was 350 m, with 4 flight lines in a northsouth direction.

Technical Design
With high quality LiDAR data available, the flow chart for processing described above in introduction is shown in Figure 2. 3.1.1Preprocessing: Firstly, build a digital surface model (DSM) by interpolating lidar point elevations to a regular grid with a spatial resolution of 0.5m using the triangulated irregular network (TIN) method.Lidar points used for creating the DSM included only the highest points in 0.5m by 0.5m cells so as to allow an accurate characterization of the top canopy surface.Then establish a raster Digital Terrain Model (DTM) used for the normalization of raw point heights to get absolute tree height from the raw point clouds and eliminate the interference from terrain and the buildings.The DTM is generated from the raw point clouds with TIN progressive adaptive filtering algorithm (P., 2012).Finally create a Crown Height Model (CHM) with a spatial resolution of 0.5m by subtracting the DTM from the DSM.The CHM represents a three dimensional surface that characterizes vegetation height across the landscape.
3.1.2Removing Building Points: Based on the CHM, the point clouds can be roughly classified into ground points, building points, short trees, medium height trees and tall trees.In order to extract the crown points, the echo information of point data is significant to identify the remaining artificial object points such as buildings from the profile of the current section.Besides, unique sign like blank signal under series horizonal points is also important.It is known that laser beams can penetrate tree branches in forest area and most LiDAR systems are capable of receiving multi-echo-return points.Notice that the building points has only the first return and the last return signal (Charaniya et al., 2004).With sections profile method, the building points are extracted using decision-tree-based method.Thus, the rest of the points are basically representing trees.Notice that building points are either the only-first return points or the last return points (Charaniya et al., 2004).An interesting phenomenon as highlighted by circle C is that the returns on the roof of building have the lowest height values, which indicating tree branches invaded the space atop of the buildings.However, in most cases, there should be no points over the roofs.
There are several buildings sparsely dotting the study site and can easily be mistaken as trees.From all above, the removal of building points from the high vegetation points starts with the determination of the section width w which depends on the average distance d of the ground points classified from the original points.
In order to exclude the building points completely, the sections of the whole study area are selected both along the X-axis and Y-axis.And the section number can be defined as Nx and Ny respectively.The parameters mentioned above can be calculated by formula (1), ( 2) and (3).Take the j section along the Y direction as an example.In a certain section, the Y value of each points can be regarded as a constant value.The building points are excluded using decisiontree-based method following the steps below: 1. Assume the total points in the j section is N .Assort all the points by the X value of the ground points, and calculate the distance in X direction, if di,i+1 > d threshold , then the Xi and Xi+1 is defined as the edge points.(see Figure 4) 2. Divide Xi and Xj into n sub-segments.To detect the wall points of the buildings, let the edge points Xi = Xi − d 2 .n is calculated as formula (5).
3. Take cluster Ctree into processing.Attain all the tree points in the range of current sub-segment S of XiXi+1.Count the number Ns of tree points and assort tree points by Z-ascending, then all the Ztmin and Ztmax of tree points in current S should be calculated next.If the edge points are in the current sub-segment, the current sub-segment can be called as edge-sub-segment.
4. Finally the building points can be excluded by following rules shown in Table 1.By adding the parameters mentioned above, the normal roof points number threshold values NT , the elevation calibration standard deviation σH and the minimum building height HT are also concerned.Table 1.The conditions to exclude the buildings in sub-segment Figure 4. Section display Picture shows the division of sub-segment in section, the original clusters of points and the related parameters Theoretically, sections can be derived from any directions.But two perpendicular directions are enough here to remove buildings.

Generating Tree Point Grid:
After removing all the building points, the ground points C ground and tree points Ctree are used to derive the tree locations.In the airborne LiDAR data, the tree points are mostly scattered on top of the tree crowns.As a result, the absolute CHM is calculated by subtracting the height value of DTM at each point from the height value of DSM.Thus, tree heights can be taken directly from the CHM.
The CHM maps are the rough surface of the canopy.Assume the highest point of a tree can be regarded as the location of the tree, then generate the grids of crown points first.The size of each cell is defined depending on the average ground point distance d.Put all the tree points to the grids according to the geographical range of the grid, and the value of each grid (GV ) can be defined by following formula (6).
where GV = the height of grid cell m = tree point amount in current grid ZCHM = the height of tree points

Initial Estimation of Tree Location:
The cell containing the tree top point should have the biggest value among all the eight neighboring cells.As a result, the template as defined by formula (7) and formula (8), will be used to derive the tree top grids GVtop.After all the GVtop, the P which has the biggest high value can be seen as the original tree top point Ptop, and all the Ptop cluster can be signed as CP top , .
where T = the convolution template to get original tree top cell HV = the convolution result HT = the threshold of HV to select tree top cell GV = the Z value of current gird cell GVtop = the grid containing original tree top point i, j = row and line number of the whole cell N = the tree points amount in current cell p, pt = certain tree point in current cell Zp, Zp t = Z value of the correspondingly points

Point segmentation of individual:
In section above, the size of grid cell is small enough to identify all of the tree tops in small scale and therefore, there must some neighboring small tree tops belong to one large tree.As the assumption that the topmost tree points stands for the tree location, the CP top will be assorted descending by Z-values.
Every tree is approximately axisymmetric.Take the tree peak point as the center of the section rectangle in the top view.And as the complex location relationship between the trees shown in Figure 5, two more directional section profiles will be added to extract the topmost tree following similar process to building removal.In the forest area, most of the tree location relationships fall into the categories as (c)-(f) in Figure 5.In the area with human activity, the trees will usually follow the condition (a) and (b) in Figure 5.It is worth noting that condition (d) and (e) are the most common and only in dense forest.The tree points can be projected onto the profiles which are defined by the center of the section, like Figure 6 shows.The more directional sections we use the more accurate the results will be.The tree parameters can be extracted from four directional section profiles such as S1, S2, S3 and S4 (see Figure 6).From figures in profile, the complex location conditions of different tree lead to different decision parameter and method.In order to identify palm tree precisely, the parameters (see Figure 7) can be extracted by analyzing the four section profiles in the steps detailed below: 1. Assort all the tree top points CP top descending by Z-values, and the first point P in the new CP top will be defined as the highest tree in the study area.If it is not the first point to locate the tree, the k th point P is the highest in the remained CP top points.
2. Divide 2D section profile divided into N sub-segments, record all the maximum Z value in each sub-segment St, and the segment ID is t(0 t N ).If there is no tree points in the sub-segment St, Zt =0.From the St, find out the P edge and the Pcross from both side of the using the formula (10).Compute all the section profiles of the current tree top and get all the tree edge points.Then there will be nine points to describe the current tree including tree top point, can be describe as the T R k , and k is the tree top ID in the CP top .The definition of Pcross and P edge .
(10) where 0 t − i, i < N, 0 < t < N, t, i ∈ N Pt = tree top point in current section profile Pt−i = the i th point on left side of Pt Zt−i = the Z value of Pt−i 3. Delete the fake tree tops around current tree point P by judging whether the remained tree top points are included in the new octagon which will define the original tree boundary.
4. After all the tree top points computed, the tree number can be derived simultaneously.Most of the tree points can be added to the corresponding original tree boundary.And some tree points in the gap between trees are still not added to any tree boundary.The nearest distance to the tree stem is used to classify the remaining tree points to their right trees, as the formula (11) shows.
where Premain = one of the unclassified points T r lef t = tree points cluster at left segment T r right = tree points cluster at right segment d lef t = 3d distance of Premain to T r lef t d right = 3d distance of Premain to T r right

Identification of Palm Tree
Generate necessary parameters for cluster analysis method to identify the palm tree species classify based on the key tree points and the segmented tree points.For each tree segments points, the extraction and definitions of the parameters are calculated by the following formula ( 12)-( 20).In order to make the parameters more distinctive, it is necessary to stretch the parameter to the interval (0-100).
H breast = ZP top − ZP edge (13) All of the cluster analysis methods for the traditional image analysis is available for the tree species classification but not identification.With parameters above, we enable the traditional cluster analysis to identify tree species using parameters as prior knowledge.Notice, θcrown and θcross can not be used together, thus, the first classification/identification processing begins with the P edge .If there is a P edge in the tree segment, the parameter Ratiocrown, Ratio breast and θcrown are used the most in areas with human activities.For four sections profiles for each tree segment points, there will be eight sets of parameters.If there is more than one θcrown or θcross values, then the average values are taken.

RESULT
Using the approach presented above, parameters in sub-segment after excluding buildings from analysis are shown as table below.In the flat area where the trees are sparse, most of the palm trees can be identified by the parameters trained by the Maximum Likelihood method.To examine the identification accuracy, the sample Area 1, Area 2 and Area 3 chose from the study area with different level of tree density are used.Area 1 is flat with sparsely located palm trees, Area 2 is tropical forest area and Area 3 in between Area 1 and Area 3. Palm trees data measured on the ground and the detected are shown as follows.Table 2 indicates that the total error identification percentage in the Area 1 is 2.44%, in the Area e is 9.52%.Surprisingly, the accuracy in Area 2 is 0, which clearly shows that the tree is much more difficult to detect in the natural tropical forest area because the high density make tree are fused closely.But for all of the conditions, the identification accuracy of palm tree specie is 91.51% while the total error identification percentage is 8.49%.

DISCUSSION
In this paper, the relative height from the ground is important to derive possible tree points.The height threshold is usually set around 1.0 2.0m depending on the forest condition of study site.
Given that more than 90% of the Wuzhizhou island is covered by forest, the threshold is set to 1.0m.
As the roofs and walls of the buildings affecting the tree points identification analysis, most of the buildings extraction approaches only removes the points representing the roofs while in our study, we have a further research on removing the wall points completely as well with the echo information in use.
For the purpose of deriving each individual tree location, the resolution of the Tree Points Grid is significant determining the number of the palm trees we can detect.Two typical sample test area are selected to test the section based methodology.Area 1 is the area with human activities, and Area 2 is located in the hilly forest.Considering that the point density is 4 per square meters, the average distance between points is 0.5m.In Area 1, the manual counted number of tree is 109, and in the Area 2, it is 73.The raster grid resolution is changed from 0.2m to 10m to attain the best resolution for the tree point grid, and the length of the tree point grid (see Figure 9).The original tree location detection method is very sensitive to the grid resolution, and decision-treebased method can maintain the final tree location and numbers.
From the chart below, 0.4m of grid length is suitable to detext almost all the trees, and the method is more effective in the Area 1 where the trees are located relatively sparsely.
Figure 9. Relationship between detection amount and grid size

Conclusion
According to Table 3 above, we conclude that section profile based decision tree method benefits the separated located area (Area 1) where most of the palm trees can be detected and identified, while palm trees in the forest area with high density (Area 2) of tree are much harder to be detected.As most of the trees of study area are in natural forest, and the location relationship between trees is very complex.Given that palm trees generally distribute scattered and under a few condition they are conjoint but still much more recognizable than the shrubs, Therefore, tree species identification is supposed to be applied combining the condition of their growth environment.In the condition like the Area 1, most of the tree crown segment always has a P edge , but in the Area 2, the tree crown segment rarely has a P edge unless there is an obvious high tree among the shrubs, and the parameters such as Ratiocrown, Ratio breast and θcross are used frequently instead of Ratiocrown, Ratio breast and θcrown.Moreover, parameters are suggested to be generated and adopted based on practical conditions since the location relationship among trees is quite complex.
In order to get all the trees species class, ground-based LiDAR data is needed to get more details under the top crowns, but it is still a challenging work because of the rocky terrain.Furthermore, the biomass and the law of tree distribution can be researched.Besides, neural network algorithm is a promising method to be applied if it trains with structural parameters.
In summary, LiDAR data can provide density precisely 3D point data sensitive to the changes of altitude, and it is quite useful to attain the profile data of all the objects on the ground.Geometry characteristics, frankly, the parameters calculated using the decision-tree-based method, are valuable to detect and identify the structure and classification of objects.

Figure 1 .
Figure 1.Overview of the Wuzhizhou island

Figure
Figure 2. Work flow

Figure 3 .
Figure 3. Different information of point classification As for figure (b), the only-first return points are red, the medium return points are yellow, the last return points are blue.
3) where w = the section width d = the average distance of the ground points Nx = the section number along X direction Ny = the section number along Y direction Xmax = the maximum X value of the study site Xmin = the minimum X value of the study site Ymax = the maximum Y value of the study site Ymin = the minimum Y value of the study site Decision If NS NT and Ztmin HT All the points in S can be classified to cluster C building If NS ¿ NT and Ztmin > HT There are branches upon the roofs.If point P is either in the only-first return or in the last return, and ZP − Ztmin > 2σH , then P ∈ C building If NS ¿ NT and Ztmin < HT , and the next sub-segment is not the same condition The minimum Z value in the two sub-segment is Ztmin,n.If Ztmax < Zmin−n, all the points in S can be classified to the cluster C building .And the point P which contents with Ztmin < ZP < Zmin−n, P ∈ C building .If NS ¿ NT and Ztmin < HT , and the next sub-segment is the same condition Drop, and process it in section along Y-axis.

Figure 5 .
Figure 5.The location relationship between the trees

Figure 6 .
Figure 6.Top view of one tree with four different section profiles The red point stands for the location of the tree top, and different color refers to different directional section profiles respectively

Figure 7 .
Figure 7.The parameters in tree section profile

)
Rcross = f abs(ZP top − ZP cross ) tree point in current tree segment Htree = ZP top Zcmax = ZP top Zcmin = min(ZP i ) ZP i = Z value of the point Pi

Figure
Figure 8. Identification Result

Table 2 .
Parameters of palm tree

Table 3 .
Accuracy assessment in different areaThe identification result of partly study site in Area 3. Ground points are orange, building points are red and palm tree points are green.Points in red circle of the second figure are of one palm tree points being missed.