A POINT CLOUD CLASSIFICATION APPROACH BASED ON VERTICAL STRUCTURES OF GROUND OBJECTS

This paper proposes a novel method for point cloud classification that uses the vertical structural characteristics of ground objects. Because urbanization is developing rapidly, urban ground objects change frequently. Conventional photogrammetric methods cannot update ground object information efficiently, so LiDAR (Light Detection and Ranging) technology is employed for this task. LiDAR data, namely point cloud data, record detailed three-dimensional coordinates of ground objects, but the data are discrete and unorganized. To classify ground objects with point cloud data, we first construct horizontal grids and vertical layers to organize the points, then calculate vertical characteristics, including density and measures of dispersion, and form a characteristic curve for each grid. With PCA processing and the K-means algorithm, we analyze the similarities and differences of the characteristic curves. Curves with similar features are assigned to the same class, and the points corresponding to these curves are classified accordingly. The whole process is simple but effective, and the approach does not need the assistance of other data sources. In this study, point cloud data are classified into three classes: vegetation, buildings, and roads. When the horizontal grid spacing and vertical layer spacing are 3m and 1m respectively, the vertical characteristic is density, and the number of dimensions after PCA processing is 11, the overall classification precision is about 86.31%. The result can help us quickly understand the distribution of various ground objects.


INTRODUCTION
Ground objects, including buildings, vegetation and roads, are important components of urban facilities. Conventional photogrammetric methods can be used to survey the distribution and change of ground objects, but these methods have a long production cycle and cannot meet the demands of rapid urban development. LiDAR is an active remote sensing technology that acquires dense point clouds presenting precise 3D information of the ground surface (Macfaden et al., 2012; Rottensteiner et al., 2005; Kim et al., 2011). LiDAR has now been applied to identify and classify ground objects, and this technology can improve identification efficiency and classification accuracy.
Many studies (Bork et al., 2007; Antonarakis et al., 2008; Dalponte et al., 2008) on point cloud classification have been conducted, and some of them combined point cloud data with other data sources, such as high spatial resolution images, to get more accurate results. Various studies have been conducted to obtain higher classification accuracy. For instance, Zhang et al. (2016) split the point cloud into hierarchical clusters and extracted the shape features of the multilevel point clusters; the classification precision was then improved by utilizing these robust and discriminative shape features. Lin et al. (2014) introduced a method that analyzes local geometric characteristics of a point cloud by using a weighted covariance matrix. In this way, more reliable eigen-features were obtained and the classification accuracy was improved. Zhu et al. (2017) utilized multi-level semantic relationships, such as point-homogeneity and supervoxel-adjacency, to classify point clouds. Weinmann et al. (2015) presented a four-component framework to select better-performing neighborhoods and features, and thus improved classification results. In addition, point cloud classification has been applied in many fields. Acharjee et al. (2015) proposed a novel filter algorithm based on point cloud classification. Ground object classification with point clouds can also be applied to urban planning and urban construction (Guo et al., 2014; Niemeyer et al., 2012; Ramiya et al., 2015).
The point cloud data of different ground objects have different vertical structural features. For example, the point cloud distribution of a tree changes from the bottom to the top because the structures of the trunk and the crown are different. Since the laser cannot penetrate the tops of buildings, we can only acquire point cloud data of building roofs, so there are few points at the bottom of buildings. Point cloud data of buildings and trees can therefore be separated according to the distribution of points in the vertical direction. In this paper, the vertical structural characteristics of the airborne point cloud data of different ground objects are analyzed, and these characteristics are taken as the basis for classifying urban point clouds.

METHODS
Figure 1 shows the whole process of urban point cloud classification.

Study area and urban airborne LiDAR point cloud data acquisition
An airborne LiDAR system integrates GPS (Global Positioning System), an IMU (Inertial Measurement Unit) and a laser scanner, which are mounted on aircraft or Unmanned Aerial Vehicles (UAV). Three-dimensional coordinates and intensity information of ground objects are obtained with the airborne LiDAR system, and point cloud data are thus generated.
The study area needs to contain the dominant ground objects, including roads, buildings and vegetation, so we selected an area in Wuhan that satisfies this requirement. The airborne LiDAR point cloud data were acquired in summer, when trees had abundant foliage and ground objects in the area had prominent vertical features. The acquired airborne LiDAR data contained about 540,000 points, and the largest elevation difference in the area was nearly 34m. Although the shapes of buildings in the study area were relatively regular, vegetation, mainly trees, was close to buildings and some buildings were low, which made point cloud classification considerably more difficult. Figure 2 is the overhead view of the study area presented by an aerial image and point cloud data.
Disturbances are inevitable when acquiring point cloud data, so raw point cloud data contain some random errors and systematic errors. In order to rebuild the ground surface with point cloud data, the raw data need to be calibrated and preprocessed to eliminate these errors before they can be used for point cloud classification.

Horizontal grids construction and vertical layers segmentation
Airborne LiDAR data have high density and dispersion, and the data are unorganized. In order to organize and manage the points, virtual square grids are constructed. Each point in the raw point cloud data has three-dimensional coordinates (Xi, Yi, Zi). Firstly, the maximum and minimum values of the X and Y coordinates, namely Xmax, Xmin, Ymax and Ymin, are found to determine the horizontal extent of the point cloud data. Secondly, an appropriate grid spacing (l) is set according to the sizes of ground objects. Then the number of grids (M × N) can be calculated based on formulas (1) and (2). The grids are numbered from 1 to M × N, and each point can find its corresponding grid (NUMi) according to its coordinates (Xi, Yi). NUMi can be identified based on formulas (3) to (5). In formulas (3) and (4), Hi is the identification number of the point in the X direction and Wi is the identification number of the point in the Y direction. In this way, each point is put in its corresponding grid and the relation between the point coordinates and the grid number is built.
Each grid is then segmented into several layers with a proper spacing (S), and the points in the grid are distributed to different layers according to their elevations. If the elevation of a point is Zi, its corresponding layer number (Vi) can be identified by formula (6). Figure 3 shows the process of constructing horizontal grids and segmenting vertical layers. In subsequent processes, point cloud data in any grid and any layer can be chosen based on NUMi and Vi.
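The grid and layer indexing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' formulas (1) to (6): it assumes ceiling division for the grid counts, a row-major numbering NUMi = (Wi - 1) × M + Hi, and layers counted from the lowest elevation in the whole data set. The function names `grid_shape` and `assign_cells` are hypothetical.

```python
import math

def grid_shape(xs, ys, spacing):
    """Number of grid columns (M) and rows (N) covering the XY extent."""
    m = max(1, math.ceil((max(xs) - min(xs)) / spacing))
    n = max(1, math.ceil((max(ys) - min(ys)) / spacing))
    return m, n

def assign_cells(points, spacing, layer_spacing):
    """Return a (grid_number, layer_number) pair per point, both 1-based."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    zs = [p[2] for p in points]
    x_min, y_min, z_min = min(xs), min(ys), min(zs)
    m, n = grid_shape(xs, ys, spacing)
    cells = []
    for x, y, z in points:
        # Column and row index; clamp so points on the max edge stay in the last cell.
        h = min(int((x - x_min) // spacing), m - 1) + 1
        w = min(int((y - y_min) // spacing), n - 1) + 1
        num = (w - 1) * m + h  # assumed row-major numbering, 1..M*N
        # Assumed: layers are counted upward from the global minimum elevation.
        v = int((z - z_min) // layer_spacing) + 1
        cells.append((num, v))
    return cells
```

With a grid spacing of 3m and a layer spacing of 1m, a point at (3.0, 0, 1.0) in a data set whose minimum corner is the origin falls in the second grid column and the second layer under these assumptions.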
For each grid, once the characteristic value of each layer has been calculated, the characteristic curve of the grid can be formed. The sequence numbers of the layers, namely j (j = 1, 2, …, n), are taken as abscissas and the characteristic values of the layers are taken as ordinates, which gives n discrete points. A characteristic curve is plotted by connecting these discrete points. Different vertical characteristics generate different characteristic curves. To illustrate the process of plotting characteristic curves, representative grids of the three dominant kinds of ground objects are selected (Figure 4). When a curve is discretized (Figure 6), feature points on the curve, such as turning points and points with maximum and minimum characteristic values, need to be preserved, so a proper sampling interval should be set. Assuming that the number of grids is m, the discretized characteristic curves of all grids form an n×m matrix (Figure 7). The column vectors of this matrix span an n-dimensional space. In this space, the characteristic values of the i-th grid in each dimension are y1i, y2i, …, yni respectively. The characteristic curve of the i-th grid can therefore be regarded as a point in n-dimensional space with coordinates (y1i, y2i, …, yni). To classify the characteristic curves, we only need to classify the corresponding characteristic points in n-dimensional space.
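As a minimal sketch of how the n×m matrix could be assembled when density is the vertical characteristic, the following Python function counts points per layer for every occupied grid. It assumes each point has already been given a 1-based (grid number, layer number) pair; the name `density_matrix` and the choice to keep only occupied grids are assumptions made for illustration.

```python
def density_matrix(cells, n_layers):
    """Build the discretized density curves of all occupied grids.

    cells: list of (grid_number, layer_number) pairs, both 1-based.
    Returns (grid_ids, matrix), where matrix[j][i] is the number of points
    in layer j + 1 of the grid grid_ids[i], so each column is one curve.
    """
    grid_ids = sorted({g for g, _ in cells})
    column = {g: i for i, g in enumerate(grid_ids)}
    matrix = [[0] * len(grid_ids) for _ in range(n_layers)]
    for g, v in cells:
        if 1 <= v <= n_layers:  # ignore points above the top layer
            matrix[v - 1][column[g]] += 1
    return grid_ids, matrix
```

Each column of the returned matrix is the coordinate vector (y1i, y2i, …, yni) of one grid in the n-dimensional space described above.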
Figure 7. The matrix of discretized curves

PCA processing and unsupervised classification: PCA can reduce the dimension of data and emphasize the differences between data sets. Once the n×m matrix of vertical structural characteristic curves has been created, PCA is employed to reduce its dimension. This process eliminates redundant features and preserves the significant features of the matrix, that is, it removes unhelpful dimensions, so that the number of dimensions decreases from n to k. The column vectors of the processed matrix are then classified into three categories by using the K-means algorithm, which is an efficient unsupervised classification algorithm. Once the vectors have been classified, their corresponding curves, grids and the points in these grids are classified as well.
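The two steps can be sketched with NumPy as follows. This is a sketch under assumptions rather than the authors' implementation: PCA is done by singular value decomposition of the centred data, and K-means is plain Lloyd's algorithm with a deterministic farthest-point initialization, which the paper does not specify. The function names `pca_reduce` and `kmeans` are hypothetical.

```python
import numpy as np

def pca_reduce(curves, k):
    """Reduce an (n_layers, m_grids) curve matrix to (k, m_grids) PCA scores."""
    x = np.asarray(curves, dtype=float).T  # one row per grid
    x = x - x.mean(axis=0)                 # centre each layer feature
    # Right singular vectors of the centred data are the principal axes.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return (x @ vt[:k].T).T

def kmeans(columns, n_clusters=3, n_iter=100):
    """Lloyd's algorithm on the columns of a (k, m) array; returns m labels."""
    pts = np.asarray(columns, dtype=float).T
    # Assumed initialization: start from the first point, then repeatedly
    # add the point farthest from all centres chosen so far.
    centres = [pts[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.linalg.norm(pts - c, axis=1) for c in centres], axis=0)
        centres.append(pts[d.argmax()])
    centres = np.array(centres)
    for _ in range(n_iter):
        dist = np.linalg.norm(pts[:, None, :] - centres[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each centre; keep the old one if a cluster is empty.
        new = np.array([pts[labels == c].mean(axis=0) if np.any(labels == c)
                        else centres[c] for c in range(n_clusters)])
        if np.allclose(new, centres):
            break
        centres = new
    return labels
```

The cluster labels are arbitrary integers, so assigning them to vegetation, buildings and roads still requires inspecting the clusters or comparing them with reference data.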

RESULTS
The classification method proposed in this paper has several processing phases. Several parameters, including the spacings of horizontal grids and vertical layers, the vertical structural characteristic, and the number of dimensions after PCA processing, need to be set manually in these phases. A variable-controlling method, in which one parameter is varied while the others are held fixed, is employed to explore the effects of different parameter settings. In addition, three precision indicators, MR (Misclassification Rate), OR (Omission Rate) and CR (Correct classification Rate), are used to evaluate the classification result. They represent the ratios of misclassified points, omitted points and correctly classified points, respectively, to the total number of points of a given category.
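The three indicators can be computed from per-point reference and predicted labels as in the sketch below. The text does not define them formally, so this reading is an assumption: misclassified points are points of other categories that were assigned to the category, omitted points are points of the category that were assigned elsewhere, and all three counts are divided by the number of reference points of the category. The function name `precision_indicators` is hypothetical.

```python
def precision_indicators(reference, predicted, category):
    """Return MR, OR and CR of one category as fractions of its reference points."""
    total = sum(1 for r in reference if r == category)
    if total == 0:
        raise ValueError("category absent from reference labels")
    correct = sum(1 for r, p in zip(reference, predicted)
                  if r == category and p == category)
    # Assumed: omitted = reference points of the category labelled as something else.
    omitted = total - correct
    # Assumed: misclassified = points of other categories labelled as this category.
    misclassified = sum(1 for r, p in zip(reference, predicted)
                        if r != category and p == category)
    return {"MR": misclassified / total,
            "OR": omitted / total,
            "CR": correct / total}
```

Under this interpretation OR and CR of a category always sum to 1, while MR is independent of them.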
We first combined different horizontal grid spacings and vertical layer spacings to explore the influence of these two parameters. Four combinations of horizontal grid spacing and vertical layer spacing were tested: 20m and 3m, 10m and 2m, 5m and 1m, and 1m and 1m. For all four combinations, the vertical structural characteristic was set as density and the number of dimensions after PCA processing was set as 11.
Figure 7 and Table 1 present the classification results and classification precision of the point cloud data with these four parameter combinations. Among them, the overall classification precision is highest when the horizontal grid spacing and vertical layer spacing are set as 5m and 1m respectively.
Then two vertical characteristics, density and measures of dispersion, are applied to point cloud classification, with the horizontal grid spacing, vertical layer spacing and the number of dimensions after PCA processing set as 2m, 1m and 11. The classification results and precision are evaluated as well (Figure 8, Table 2). When the other parameters are the same, using density as the vertical characteristic gives higher classification precision. Density performs better not only in the overall classification result but also for each of the three kinds of ground objects.

Classification precision of using different vertical characteristics
The number of dimensions (N) after PCA processing affects how well the characteristic curves are expressed. With the horizontal grid spacing and vertical layer spacing set as 3m and 1m, and density chosen as the vertical characteristic, we compared the classification results when N is set as 11 and as 5 (Figure 9, Table 3). The overall precision is 86.31% when N is 11, which is much higher than when N is 5. When N is set as 5, some building points are classified as vegetation while some vegetation points are classified as buildings, so the classification result is not satisfactory.
Appropriate horizontal grid spacing and vertical layer spacing are difficult to find. If the horizontal grid spacing is excessively large, such as 20m, a grid may contain more than one ground object, and the classification result is coarse. If the grid spacing is overly small, such as 1m, the ground objects in a grid may be incomplete and their vertical characteristics cannot be presented well. The vertical layer spacing has the same problem as the horizontal grid spacing. Vertical characteristics cannot be depicted completely and effectively if the layer spacing is large, because a layer then contains many points whose characteristic is represented by a single value, and some features are lost. In this study, the final grid spacing and layer spacing are set as 3m and 1m respectively. This combination may not be the optimum one, and some points are not classified into the correct class with it, but the generated grids and layers depict the vertical characteristics appropriately.
Concerning the vertical characteristic, measures of dispersion apparently cannot reflect the characteristics of the point distribution, and the classification result obtained with measures of dispersion as the vertical characteristic is not satisfactory. In contrast, the classification result improves significantly when density is used as the vertical characteristic.
PCA processing helps reduce the dimension of the discretized curves and emphasize their prominent features. However, the number of dimensions after PCA processing needs to be controlled. If N is excessively small, some features that are useful for classification may be removed by mistake, resulting in lower classification precision. As the results above show, classification precision is higher when N is 11 than when N is 5, which supports this analysis.
The point cloud classification approach used in this research utilizes the vertical characteristics of the point cloud, and the points are classified into three main classes without the assistance of other data sources. The method is applicable to ground object classification in a large study area and can improve classification efficiency. The classification result provides a general distribution of ground objects in the study area. Nevertheless, the classification result is not precise enough, because the classification is based on regular grids while most ground objects are irregular. In further study, we plan to improve the precision of ground object segmentation and to classify ground objects into more detailed classes, not merely three dominant classes, to obtain more accurate classification results.

CONCLUSION
In this paper, vertical structural characteristic curves, which are formed from various vertical structural characteristics, are utilized to classify urban point cloud data. The method does not need to filter ground points and does not need the help of other data sources, which makes the processing simpler. Moreover, the validity of the method is demonstrated by comparing the classification results with the classification reference. However, the point cloud data are classified into only three categories, and ground objects in the same category may have minor differences. For example, a city has both broadleaved trees and conifer trees, and both are classified into the vegetation category. In addition, the categories of ground objects in the study area are limited, so the universality of the algorithm cannot be proven. The classification algorithm therefore needs to be improved to get more precise results, and the study area should be expanded to evaluate the effectiveness of the algorithm.

Figure 3. Horizontal grid construction and vertical segmentation

Vertical structural characteristic curves formation
Airborne LiDAR data are of high density, which helps obtain detailed distribution structures of ground objects in both the vertical and horizontal directions. Before classifying the point cloud, features of different types of ground objects need to be extracted from the massive point cloud data to parameterize the vertical characteristics of ground objects. For the point cloud data in each grid, the type and the way of expression of their vertical characteristics are explored, and vertical characteristic values are calculated for each grid. Characteristic curves of different types of vertical characteristics are formed, and the effectiveness of these curves for point cloud classification is evaluated.
The vertical characteristics of point cloud data used in this research include point density (D) and measures of dispersion (MD), and these characteristics are calculated for every grid. Point density is the number of points in a layer; for each horizontal grid, the number of points in each layer represents the characteristic of the point distribution in the vertical direction. MD can be quantified in many forms, such as Variance (V), Standard Deviation (SD) and Coefficient of Variation (CV). Dispersions in the X, Y and Z dimensions need to be considered when measuring the dispersion of a point cloud dataset (A = {ai ∈ R^d | ai = (Xi, Yi, Zi), i = 1, 2, …, n}). In this study, we choose CV as one of the vertical characteristics. CV can be classified as coefficient of range, coefficient of standard deviation and coefficient of average difference on the basis of different dispersions, and the most frequently used CV is the coefficient of standard deviation. The CVs in the X, Y and Z directions are CVX, CVY and CVZ respectively.
Representative grids of the three dominant kinds of ground objects (Figure 4) are selected to plot their density curves. The point cloud data are segmented into 16 layers, and the points are distributed differently among these layers. The density curve plots (Figure 5) of the three kinds of ground objects indicate that the vertical characteristics of different ground objects vary. Point cloud classification can therefore be achieved by utilizing the differences in the vertical characteristics of point clouds.
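For the dispersion measure, the coefficient of standard deviation is the standard deviation divided by the mean. The sketch below computes it per coordinate axis for the points of one layer, using only the Python standard library. It is an illustration under assumptions: the population standard deviation is used, the absolute value of the mean is taken as the denominator, and the function names `coefficient_of_variation` and `layer_cv` are hypothetical. How the three values are combined into a single MD value is not specified in the text and is not attempted here.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Coefficient of standard deviation: population SD divided by |mean|."""
    mu = mean(values)
    if mu == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return pstdev(values) / abs(mu)

def layer_cv(points):
    """Return (CVX, CVY, CVZ) for a non-empty list of (X, Y, Z) points."""
    xs, ys, zs = zip(*points)
    return tuple(coefficient_of_variation(axis) for axis in (xs, ys, zs))
```

Because CV is relative to the mean, its value depends on the coordinate origin, so any comparison between grids presupposes a consistent choice of reference coordinates.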

Figure 4. Side view of point cloud of trees, buildings and roads (from left to right)

Figure 6. Discretization of a characteristic curve

When a characteristic curve of a grid is discretized, it can be expressed as a two-dimensional vector Pi^T = {(x1, y1i), (x2, y2i), …, (xn, yni)}.

Figure 9. Classification results of setting different number of dimensions after PCA processing

Figure 10. Classification result generated by the best combination of parameters

Table 1. Classification precision of different combinations of horizontal grid spacing and vertical layer spacing

Figure 7. Classification results of different combinations of horizontal grid spacing and vertical layer spacing

Table 3. Classification precision of setting different number of dimensions after PCA processing

In order to get the optimum classification result, various combinations of parameters were tested. The following combination performs best: point density as the vertical characteristic, a grid spacing of 3m, a vertical layer spacing of 1m, and 11 dimensions after PCA processing. With this combination of parameters, the CR of buildings is 84.08%, the CR of ground is about 89.38% and the CR of vegetation is about 85.47%, which are considerably high in our research.