CLUSTERING OF MULTISPECTRAL AIRBORNE LASER SCANNING DATA USING GAUSSIAN DECOMPOSITION

With the evolution of the LiDAR technology, multispectral airborne laser scanning systems are currently available. The first operational multispectral airborne LiDAR sensor, the Optech Titan, acquires LiDAR point clouds at three different wavelengths (1.550, 1.064, 0.532 m), allowing the acquisition of different spectral information of land surface. Consequently, the recent studies are devoted to use the radiometric information (i.e., intensity) of the LiDAR data along with the geometric information (e.g., height) for classification purposes. In this study, a data clustering method, based on Gaussian decomposition, is presented. First, a ground filtering mechanism is applied to separate non-ground from ground points. Then, three normalized difference vegetation indices (NDVIs) are computed for both non-ground and ground points, followed by histograms construction from each NDVI. The Gaussian function model is used to decompose the histograms into a number of Gaussian components. The maximum likelihood estimate of the Gaussian components is then optimized using Expectation – Maximization algorithm. The intersection points of the adjacent Gaussian components are subsequently used as threshold values, whereas different classes can be clustered. This method is used to classify the terrain of an urban area in Oshawa, Ontario, Canada, into four main classes, namely roofs, trees, asphalt and grass. It is shown that the proposed method has achieved an overall accuracy up to 95.1% using different NDVIs. * Corresponding author


INTRODUCTION
Over the past two decades, airborne LiDAR data have been used in urban applications (Yan et al., 2015).Numerous studies have focused on extracting one object type in urban scenes, such as separation of ground from non-ground points in order to generate digital terrain models (DEMs) (Bartels et al. 2006), building extraction (Huang et al., 2013), road extraction (Samadzadegan et al., 2009) and curbstones mapping (Zhou and Vosselman, 2012).The number of extracted classes has been extended to three main urban classes, including ground, vegetation and buildings (Samadzadegan et al., 2010).Nowadays, multi-classes extraction is a very important topic for building 3D city models, maps updating and emergency purposes.However, most of recent studies focus on geometric characteristics of the LiDAR data (Xu et al., 2014;Blomley et al., 2016).
Recently, Teledyne Optech has launched the world's first airborne multispectral LiDAR sensor "Optech Titan".This sensor acquires LiDAR data in three channels C1, C2, and C3 at wavelengths of 1.550 m, 1.064 m, and 0.532 m, respectively.The data are acquired at pulse repetition frequencies (PRF) ranges from 50 to 300 kHz/channel, maximum combined PRFs of 900 kHz.The sensor's scan frequency and scan angle is up to 210 Hz and ±30°, respectively.The typical altitude ranges from 300 to 2000 m above ground level (AGL) in all channels for topographic applications, and from 300 to 600 m AGL in C3 for bathymetric applications.
In the past year, a few investigations have been reported on the use of the Optech Titan multispectral LiDAR data for different applications.Fernandez-Diaz et al. (2016) presented an overview on the use of the Titan data in land cover classification, measuring water depths in shallow water areas and forestry mapping.Most investigations have focused on extracting one or two objects, such as vegetation mapping (Nabucet et al., 2016), road mapping (Karila et al., 2017), discrimination of vegetation from built-up areas (Morsy et al., 2016a), or land/water classification (Morsy et al., 2017a).
The multispectral Titan data have been explored for land cover classification by converting the LiDAR points into raster images (Bakuła et al., 2016;Morsy et al., 2016b;Zou et al. 2016).In Morsy et al. (2016b), raster images were created from the three recorded intensity values and the LiDAR height was used to create a digital surface model (DSM).A maximum likelihood classifier was then applied to each single intensity image, the combined three intensity images, and the combined three intensity images with DSM.The combined intensity images demonstrated an overall accuracy of 65.5% for classifying the terrain into six classes with an improvement of 17% more than the single intensity images.Moreover, the use of DSM increased the overall accuracy to 72.5%.Zou et al. (2016) segmented the LiDAR images into objects based on multiresolution segmentation integrating different scale parameters.The objects were then classified based on a set of indices, namely NDVI, ratio of green, ratio of returns counts, and difference of elevation.The used method achieved above 90% overall accuracy for classifying the terrain into nine classes.
In this paper, we present a clustering method of multispectral LiDAR data collected by the Optech Titan sensor for land cover classification.The points' elevations are first used to separate non-ground from ground points.The recorded intensity values at the three wavelengths of the sensor are then used to calculate three NDVIs, followed by constructing NDVIs' histograms.The Gaussian function model is used to decompose the histograms into a number of Gaussian components.The intersection points of those components are used as threshold values to cluster the non-ground or ground points into roofs and trees or asphalt and grass, respectively.

STUDY AREA AND DATASET
The proposed clustering method was tested on a data subset covering an urban area in Oshawa, Ontario, Canada.The study area includes a variety of land objects such as building roofs, sidewalks, road surface, grass, shrubs, power lines, cars and trees.A flight mission was conducted on September 3 rd , 2014 in order to acquire multispectral LiDAR data using the Optech Titan sensor.A subset with a dimension of 490 m by 470 m was clipped for testing, and the LiDAR data are shown in Figure 1.The Optech Titan sensor acquired the LiDAR data in the three channels from 1075 m flying height with scanning angle of ±20°.The data were collected at 200 kHz/channel PRF and 40 Hz scan frequency.The tested LiDAR data contains about 796226, 825176, and 742158 points from C1, C2, and C3, respectively.Since the 3D reference points are not available, a number of polygons were digitized from an aerial image (Figure 2).Within those polygons, all points were labelled as reference points for the roofs, asphalt and grass classes, while only the canopy points of the trees class were labelled as reference points for the trees class.However, all the understory vegetation or ground points are classified.Table 1 provides the number of reference points for each class with total number of reference points equals to 36421.  1.The number of the reference points

METHODOLOGY
The multispectral LiDAR data clustering was applied directly to the 3D point clouds.The LiDAR points from the three channels were first combined and three intensity values for each single LiDAR point were estimated (Morsy et al., 2017b).For any point in C1, the median intensity value from C2, I C2 , and C3, I C3 , was estimated using the neighbouring points within a searching circle of 1 m radius from C2 and C3, respectively.The same applied for C2 and C3, so that each single LiDAR point has six attributes: x, y, z, I C1 , I C2 and I C3 .The LiDAR data clustering method was then conducted as shown in Figure 3.The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W7, 2017 ISPRS Geospatial Week 2017, 18-22 September 2017, Wuhan, China

Ground Filtering
The first step in the clustering method is to separate non-ground from ground points.Thus a ground filtering mechanism was applied to the LiDAR points in order to accomplish this step.
The points' elevations were first sorted in ascending order.Then, a statistical analysisskewness balancing was applied in order to label the higher elevations as non-ground points.The slope of each point with respect to its neighbouring points was then calculated and a threshold value (S_thrd) was applied in order to label additional non-ground points.In addition, few points with higher elevations were not labelled as non-ground points.Therefore, the remaining points were filtered using a moving circle with radius (r) and height threshold value (H_thrd) in order to label ground points.

NDVIs' Histograms Construction
The three raw intensity values I C1 , I C2 and I C3 were employed to form three NDVIs, namely NDVI C2-C1 , NDVI C2-C3 , and NDVI C1- C3 , which were defined in Morsy et al. (2016a), for non-ground and ground points as follows: Figure 4. NDVI C2-C3 histograms constructed from non-ground points (upper) and ground points (lower) A histogram of each NDVI, which ranges from -1 to1, was constructed with a bin size of 0.1.Figure 4 shows examples of NDVIs' histograms.Thus, a total of six histograms were constructed (three for non-ground points and three for ground points).

Gaussian Decomposition
A sum of Gaussian distributions G(x), described by Persson et al. (2005), was used to fit each NDVI histogram as follows: Where, x : the bin value N : the number of Gaussian components p j : relative weight of the Gaussian (j) f j (x) : the Gaussian function and defined as: Where,  j : the mean of Gaussian (j) To be able to model the histograms, the parameters N,  j and σ j have to be estimated.Since either the non-ground or ground NDVIs' histograms will be clustered into two classes (i.e., roofs and trees or asphalt and grass), the number of Gaussian components was selected to equal 2 (i.e., N=2).The peak detection algorithm was used to detect all histograms' peaks and their locations (i.e., the mean values) (Jutzi and Stilla, 2006).
The location of the highest two peaks was then used as the mean values ( j ) of the two Gaussian components.Based on the fact that the single Gaussian has two inflection points, the zero crossing of the second derivative was used to obtain the positions of the inflection points of each Gaussian component, and hence the Gaussian's half width (σ j ) was calculated (Hofton et al., 2000).
For each Gaussian component (j), the p j ,  j and σ j were fitted using the maximum likelihood estimate with Expectation -Maximization (EM) algorithm (Dempster et al., 1977).Following the notation of the EM algorithm described by Persson et al. (2005), the expectation (E) step computes the probability (w ij ) that each data bin (x i ) belongs to Gaussian (j), starting with that the two Gaussians have equal relative weights (p j ), using the following equation: The maximization (M) step computes the maximum likelihood estimates of the parameters (p j , µ j and σ j ) as follows: (7) (2) (3) (1) (5) The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W7, 2017 ISPRS Geospatial Week 2017, 18-22 September 2017, Wuhan, China Where, y i : the amplitude at bin x i n : the number of histogram's bins The two steps are repeated until convergence or a maximum number of iterations were achieved.The process stops when either (1) the difference between any iteration and the previous iteration is less than 0.001, or (2) the number of iterations is greater than 1000 (Oliver et al., 1996).The quality of the fitted Gaussians of the full waveform LiDAR data has been evaluated using the following formula as defined by Hofton et al. (2000) and Chauve et al. (2007): Thus, the quality factor ξ should be less than a prescribed accuracy.Similarly, in this paper, the quality of the fitted Gaussians was tested using the aforementioned equation.
According to Chauve et al. (2007), a histogram was considered to be well fitted if ξ was less than 0.5.The intersection point of the two adjacent Gaussian components was used as a threshold value (NDVI_thrd) to cluster the LiDAR data of non-ground and ground histograms into two classes each.If point had zero intensity value in any two channels, as a result of points' combination and intensity estimation, this point was labelled as unclassified point.

RESULTS AND DISCUSSION
The LiDAR data were clustered into four classes, namely roofs, trees, asphalt and grass.The accuracy assessment was conducted through the creation of confusion matrices, and hence the overall accuracies and kappa statistics were calculated.After the LiDAR points from the three Titan channels were combined, a ground filtering mechanism based on the skewness balancing and the points' slopes with respect to neighbouring points was applied.Thus, if the point's slope was greater than a threshold value (S_thrd=10), the point was labelled as a non-ground point.The consideration of the points' slopes makes the ground filtering mechanism applicable for not only flat but also sloped terrains.The output ground points were filtered using a moving circle with radius (r=10 m), so if the height difference was greater than a threshold value (H_thrd=1 m), the point was labelled as a non-ground point and the remaining points were labelled as ground points.
The three NDVIs were subsequently calculated using Equations 1, 2 and 3. NDVIs' histograms were then constructed and normalized for non-ground and ground points.A sum of two Gaussian distributions was used to model the histograms using EM algorithm.The fitting results of the non-ground and ground histograms are shown in Figure 5 and 6, respectively.The ξ of the fitted Gaussians are provided in Table 2.  2. The ξ of the fitted Gaussians Although the calculated ξ for all fitted Gaussians are less than the prescribed accuracy 0.5, some Gaussians are over-fitted as shown in Figure 5a and Figure 6c, whereas ξ was reduced from 0.34 to 0.08 and from 0.15 to 0.09, respectively when only one Gaussian component was fitted using non-linear least squares.However, in this case, the NDVI_thrd cannot be obtained, and hence the data cannot be separated into two different classes.Thus, other modelling functions could better fit those histograms as mentioned in (Chauve et al., 2007), such the Lognormal function, where the NDVI_thrd = +σ or the generalized Gaussian function, where the NDVI_thrd can be obtained from the intersections of two adjacent components.

Results Based on Gaussian Decomposition
Table 3 lists the NDVI_thrd obtained from the intersection of the two Gaussian components for both non-ground and ground histograms.The vegetation (i.e., trees or grass) has high reflectance at C1, C2 and C3.As a result, the calculated NDVIs of the vegetation points exhibited higher values than the builtup areas (i.e., roofs or asphalt).So that, for a particular point, if NDVI C2-C1 , NDVI C2-C3 or NDVI C1-C3 ≤ NDVI_thrd, the point was labelled as roofs or asphalt; otherwise, it was labelled as trees or grass.Figure 6 shows the classified point clouds based on the three NDVIs.The confusion matrices and overall accuracies are provided in Table 4, 5 and 6.As aforementioned the non-ground NDVI C2-C1 histogram (Figure 5a) was over-fitted, which affected the threshold value definition (shifted to the right).As a result, about 34.9% of the tree points were omitted (65.1% producer accuracy), and those points were wrongly classified as roofs.This omission caused a misclassification of the roofs class with about 26.9% (73.1% user accuracy).Similarly, ground NDVI C1-C3 histogram was over-fitted (Figure 6c) and the threshold value was shifted to the right as well.Thus, about 38.6% of the grass points were omitted (61.4% producer accuracy) and mainly classified as asphalt.In addition, about 39.2% (60.8% producer accuracy) of the roofs' points were wrongly classified as trees when NDVI C1- C3 was used.
Generally, there are three sources of classification errors, including the unclassified class with maximum error of 1.9%, the ground filtering with maximum error of 2.1%, and the NDVIs' histograms clustering.It should be point out that most of the unclassified non-ground points are power lines.Since the power lines points were only recorded in C1, the intensity values of those points were set to zero in C2 and C3, and hence labelled as unclassified points.This is useful for further extraction of more urban classes.

Comparison with Previous Studies
Previous studies achieved overall classification accuracies from 85% to 89.5%.The multispectral aerial/satellite imagery was combined with normalized digital surface model derived from LiDAR data (Huang et al., 2008;Chen et al., 2009) or combined with LiDAR height and intensity data (Charaniya et al., 2004;Hartfield et al., 2011;Singh et al., 2012).Compared to previous studies, the presented work in this research used the LiDAR data only and achieved an overall accuracy of 95.1% based on NDVI C2-C3 .Morsy et al. (2017b) used Jenks natural breaks optimization method to cluster the NDVIs histograms of the non-ground and ground points into the same four classes.We applied the same clustering method to the tested dataset in this research for comparison.The achieved overall accuracy based on NDVI C2-C1 and NDVI C2-C3 is very close, while the use of NDVI C1-C3 demonstrated a difference in the overall accuracy of 10.7%.This is mainly because that a significant improvement of roofs and grass detection was achieved using Jenks breaks optimization.

CONCLUSIONS
This paper presented a multispectral airborne laser scanning data clustering method for land cover classification of urban areas.The multispectral data were collected by the Optech Titan sensor operating in three channels with wavelengths of 1.550 m, 1.064 m, and 0.532 m.The 3D LiDAR points in the three channels were combined and three intensity values were assigned to each single LiDAR point as a pre-processing step.The clustering method starts with separating non-ground from ground points, followed by NDVIs computation.NDVIs' histograms were subsequently constructed and each histogram was decomposed into two Gaussian components.The intersection point of the two Gaussian components was then used as a threshold value to cluster non-ground points into roofs and trees, and ground points into asphalt and grass.This method achieved overall accuracies of 85.1%, 95.1%, and 79.2% using NDVI C2-C1 , NDVI C2-C3 , and NDVI C1-C3 , respectively.The use of Gaussian decomposition succeeds in finding threshold values in order to cluster the NDVIs histograms into four urban classes.However, in some cases, Gaussian components were over-fitted.As a result, the threshold value definition was affected and caused classification errors.Using other modelling could perform better in fitting the NDVIs histograms and obtaining the threshold values.The presented research work demonstrates the potential use of multispectral LiDAR data in land cover classification.In addition, compared to previous studies, the Gaussian decomposition achieved higher overall accuracy.Further investigations will be devoted to combine the results obtained from the three NDVIs.

Figure
Figure 2. Aerial image with the reference polygons Class Roofs Trees Asphalt Grass Number of points 111776 12315 5973 6957 Table1.The number of the reference points

Figure
Figure 3.The clustering method workflow

Table 7 .
Morsy et al. (2017b)DVI_thrd values for the non-ground and ground NDVIs using Jenks breaks optimization.Table8shows a comparison between the overall accuracies achieved using Gaussian decomposition, proposed in this research, and Jenks natural breaks optimization method, presented byMorsy et al. (2017b).NDVI_thrd values obtained from Jenks natural breaks optimization