POTENTIAL OF FULL WAVEFORM AIRBORNE LASER SCANNING DATA FOR URBAN AREA CLASSIFICATION - TRANSFER OF CLASSIFICATION APPROACHES BETWEEN MISSIONS

Full-waveform (FWF) LiDAR (Light Detection and Ranging) systems have the advantage over conventional airborne discrete-return laser scanner systems of recording the entire backscattered signal of each emitted laser pulse. FWF systems can provide point clouds which contain extra attributes such as amplitude and echo width. In this study, an FWF data set collected in 2010 for Eisenstadt, a city in the eastern part of Austria, was used to classify four main classes: buildings, trees, water bodies and ground, by employing a decision tree. Point density, echo ratio, echo width, normalised digital surface model and point cloud roughness are the main inputs for classification. The accuracy of the final results, expressed by correctness and completeness measures, was assessed by comparing the classified output to a knowledge-based labelling of the points. Completeness and correctness between 90% and 97% were reached, depending on the class. While such results and methods have been presented before, we additionally investigate the transferability of the classification method (features, thresholds ...) to another urban FWF LiDAR point cloud. Our conclusion is that, of the features used, only echo width requires new thresholds. A data-driven adaptation of thresholds is suggested.


INTRODUCTION
Airborne LiDAR has already proven to be a state-of-the-art technology for high-resolution and highly accurate topographic data acquisition with active and direct determination of the earth surface elevation (Vosselman and Maas, 2010). Generally, two different generations of receiver units exist: discrete echo recording systems, which are able to record multiple echoes online and typically store up to four echoes per laser shot (Lemmens, 2009), and full-waveform (FWF) recording systems, which capture the entire time-dependent variation of the received signal power with a defined sampling interval such as 1 ns (Mallet and Bretar, 2009; Wagner et al., 2006). With signal processing methods, FWF data provide additional information on the reflecting characteristics of the objects, which is relevant in urban classification and offers the opportunity to overcome many drawbacks of classical multi-echo LiDAR data.
Airborne LiDAR data have been used in various applications in urban environments, particularly for mapping and modelling the city landscape in 3D with its artificial land cover types such as buildings, power lines, bridges and roads. Moreover, as urban environments are active regions with respect to alteration in land cover, urban classification plays an important role in updating changed information (Matikainen et al., 2010). If FWF data are available, amplitude, echo width, and the integral of the received signal are additional information. Furthermore, a higher number of detected echoes has been reported for FWF data in comparison to discrete return point clouds. These additional attributes were successfully used in classification (Alexander et al., 2010). The classification methods applied range from simple decision trees to support vector machines (SVM). Ducic et al. (2006) applied a decision tree based on the amplitude, pulse width, and number of pulses attributes of full-waveform data in order to distinguish vegetation points from non-vegetation points. Rutzinger et al. (2008) used a decision tree based on the homogeneity of echo width to classify points from full-waveform ALS data and detect tall vegetation (trees and shrubs). Mallet et al. (2008) used an SVM to classify four main classes in urban areas (buildings, vegetation, artificial ground, and natural ground). In these studies the parameters of the classification (threshold values, etc.) are set by expert knowledge or learned from training data. Thus, these values are optimal for the investigated data set.
The transferability of classification approaches between different full-waveform LiDAR data sets has received less attention so far (Lin, 2015). The aim of this paper is therefore to:
• demonstrate that high classification accuracy can be reached with decision trees,
• study whether this classification approach, using the selected features and thresholds, can be transferred to another data set, and
• suggest a method to re-compute the echo width threshold for different missions acquiring urban full-waveform point clouds.
In this study, the following attributes are used:
• echo width: full-waveform attribute, describing the variation of the target along the ranging direction,
• Sigma0: local smoothness,
• echo ratio: a measure of surface penetration, and
• nDSM: normalized digital surface model, i.e. height above ground.
Four main classes are derived for the built-up areas of Eisenstadt and Vienna: buildings, vegetation, water body, and ground. They are classified with a decision tree method using OPALS (Pfeifer et al., 2014).
To quantify the transferability of the parameters between the different regions/data sets, the parameters are first set for the Eisenstadt data set and then applied to the Vienna data set. This indicates which parameters are stable across study areas and which need conversion.

Study area
Eisenstadt is a town in the south eastern part of Austria. It is characterized by buildings of medium size. The centre of Eisenstadt was selected for the analysis, located at 47°50'51" N, 16°31'5" E.
Vienna is the capital of Austria and is characterized by old, large buildings in the centre, but also by open park areas and trees along a boulevard. The centre of the study area is located at 48°12'26" N, 16°21'52" E.

Data
Full-waveform airborne LiDAR data were available for the two cities. The Eisenstadt area was scanned with a Riegl LMS-Q560 sensor in April 2010. The resulting point density was approximately 8 points/m² in the non-overlapping areas, while the laser-beam footprint was not larger than 60 cm in diameter. The Vienna city-centre area was scanned with the same scanner model in January 2007. The resulting point density was 12 points/m² in the non-overlapping areas, and the laser footprint was not larger than 30 cm. The investigated area covers 2.5 km² for Eisenstadt and 1.4 km² for Vienna.
Both raw full-waveform data sets were processed in the same way using the software OPALS and sensor manufacturer software. First, Gaussian decomposition (Wagner et al., 2006) ... In addition to the LiDAR data, RGB orthophotos, projected in the same coordinate system, were used for visual interpretation.

METHODOLOGY
First, a number of attributes are computed for each point, using the paradigm of point cloud processing (Otepka et al., 2013). From these attributes different images are computed ("gridding") at a pixel size of 1 m. A terrain model is also derived. Then, a decision tree is applied to classify each pixel into one of the four classes: building, vegetation, ground, and water body. Image algebra (e.g., morphological operations) is used in between to refine the results. The quality of the results is assessed using the completeness and correctness measures.
Mallet et al. (2008) showed that for urban area classification from LiDAR data a combination of attributes should be used to obtain classification results of high quality. Their analysis of feature (attribute) importance demonstrated that attributes considering the local dispersion of the point cloud, attributes describing geometric properties, and the echo width of FWF LiDAR should be used together. This guided the selection of attributes for the present study.

DTM creation
The Digital Terrain Model (DTM) gives important geometric information about objects in urban areas, e.g. object heights, and was therefore derived directly from the LiDAR data. To calculate the DTM, the LiDAR ground points were first selected by applying the robust filtering algorithm (Kraus and Pfeifer, 1997; Pfeifer and Mandlburger, 2008) implemented in the software SCOP++. Then, the DTM was interpolated from the selected ground points using the moving plane interpolation implemented in OPALS.
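The interpolation step can be illustrated with a minimal sketch. It assumes that a moving plane interpolation fits a tilted plane to the k nearest ground points of each grid node and takes the plane height at the node; the function names, the brute-force neighbour search and the value k=8 are illustrative assumptions and do not reproduce the OPALS implementation.

```python
import numpy as np

def moving_plane_height(ground_xyz, x0, y0, k=8):
    """Height at (x0, y0) from a plane fitted to the k nearest ground points.

    Illustrative sketch only; k and the brute-force search are assumptions.
    """
    ground_xyz = np.asarray(ground_xyz, dtype=float)
    dx = ground_xyz[:, 0] - x0
    dy = ground_xyz[:, 1] - y0
    nearest = np.argsort(dx**2 + dy**2)[:k]
    # Plane z = a + b*dx + c*dy in coordinates centred on the grid node,
    # so the intercept a is the interpolated height at the node.
    design = np.column_stack([np.ones(nearest.size), dx[nearest], dy[nearest]])
    coeffs, *_ = np.linalg.lstsq(design, ground_xyz[nearest, 2], rcond=None)
    return coeffs[0]

def interpolate_dtm(ground_xyz, x_nodes, y_nodes, k=8):
    """Evaluate the moving plane at every node of a regular grid (rows = y)."""
    dtm = np.empty((len(y_nodes), len(x_nodes)))
    for i, y0 in enumerate(y_nodes):
        for j, x0 in enumerate(x_nodes):
            dtm[i, j] = moving_plane_height(ground_xyz, x0, y0, k)
    return dtm
```

For ground points lying exactly on a tilted plane, the sketch returns the plane height at each node; for real data a spatial index would replace the brute-force sort.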

Attributes for the classification
Prior to attribute computation in each point, the LiDAR point clouds are checked in order to remove erroneous points which would influence the accuracy of further processing steps. The relative height of each point above the DTM, nH = z(point) - z(DTM), was computed. All points with nH below -1 m or above 40 m are removed. For the Vienna data set the highest buildings are approx. 100 m high, and no erroneously high points were found in the data. Thus only the lower threshold was applied for Vienna.
The value nH defines the attribute nDSM, i.e. the normalized digital surface model (object height). The nDSM represents, as written above, the height of points above the terrain. In the classification it is used to distinguish all points above the terrain, such as buildings and vegetation, from the ground points.
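As a minimal sketch, the outlier check on the relative height can be written as follows. The function name and the array-based interface are assumptions for illustration; the limits of -1 m and 40 m are those given in the text, and passing upper=None mimics the Vienna case in which only the lower limit was applied.

```python
import numpy as np

def filter_by_relative_height(z_point, z_dtm, lower=-1.0, upper=40.0):
    """Return nH = z(point) - z(DTM) and a mask of points to keep.

    z_dtm holds the DTM height below each point (assumed already sampled).
    """
    n_h = np.asarray(z_point, dtype=float) - np.asarray(z_dtm, dtype=float)
    keep = n_h >= lower
    if upper is not None:
        keep &= n_h <= upper
    return n_h, keep

# Example: one point below the terrain and one far above it
n_h, keep = filter_by_relative_height([100.0, 98.5, 150.0, 105.0],
                                      [100.0, 100.0, 100.0, 100.0])
```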
To distinguish building and vegetation points the Echo Ratio (Höfle et al., 2009; Rutzinger et al., 2008) is used. The echo ratio (ER) is a measure for local transparency and roughness and is calculated in the 3D point cloud. The ER is derived for each laser point and is defined as follows:
\[ \mathrm{ER}\,[\%] = \frac{n_{3D}}{n_{2D}} \cdot 100 \]
where $n_{3D}$ is the number of points within a given distance measured in 3D (sphere) and $n_{2D}$ is the number of points within the same distance measured in 2D (unbounded vertical cylinder).
For buildings and ground, the ER reaches a high value (approximately 100%), whereas for vegetation and other permeable objects ER < 100%. The ER is computed using the OpalsEchoRatio module with a search radius of 1 m in slope-adaptive mode. For the further analyses the slope-adaptive ER is aggregated in 1 m cells using the mean value within each cell.
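The behaviour of the echo ratio can be illustrated with a minimal brute-force sketch, taking ER as the ratio of the 3D (sphere) neighbour count to the 2D (vertical cylinder) neighbour count in percent. It is not the OpalsEchoRatio implementation: the slope-adaptive mode is omitted, each point is assumed to count itself as a neighbour, and the function name is hypothetical.

```python
import numpy as np

def echo_ratio(points, radius=1.0):
    """ER [%] per point: neighbours in a sphere over neighbours in a vertical cylinder.

    Brute-force O(n^2) sketch for small point sets; not slope-adaptive.
    """
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    d2_xy = diff[..., 0]**2 + diff[..., 1]**2
    d2_xyz = d2_xy + diff[..., 2]**2
    n_2d = (d2_xy <= radius**2).sum(axis=1)   # unbounded vertical cylinder
    n_3d = (d2_xyz <= radius**2).sum(axis=1)  # sphere
    return 100.0 * n_3d / n_2d

# Points on a flat roof versus echoes stacked vertically in a tree crown
roof = np.array([[0.0, 0.0, 10.0], [0.5, 0.0, 10.0], [0.0, 0.5, 10.0]])
crown = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 2.0], [0.0, 0.0, 4.0]])
```

In this toy example the roof points reach 100%, while each crown point sees three points in the cylinder but only itself in the sphere.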
The attribute Sigma0 is the plane fitting accuracy (standard deviation of the residuals) of the orthogonal regression plane in the 3D neighbourhood (ten nearest neighbours) of each point. It is measured in metres. Not only the roofs, but also the points on a vertical wall lie in flat neighbourhoods. Echo Ratio and Sigma0 are both dispersion measures. In terms of their values they behave inversely to each other (vegetation: low ER, high Sigma0). Moreover, Sigma0 considers only a spherical neighbourhood and looks for smooth surfaces, which may also be oriented vertically. The ER, on the other hand, considers (approximately) the measurement direction of the laser rays (vertical cylinder). These two attributes play an important role in discriminating trees from buildings. The Sigma0 image with a grid size of 1 m was created using the OpalsGrid module with moving least squares interpolation.
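A minimal sketch of such a plane fitting accuracy is given below. It assumes that the orthogonal regression plane is obtained from the eigen-decomposition of the neighbourhood scatter matrix, that the query point is part of its own neighbourhood, and that the standard deviation uses k - 3 degrees of freedom; these choices and the function name are illustrative assumptions rather than the OPALS definition.

```python
import numpy as np

def sigma0(points, k=10):
    """Std. dev. of orthogonal residuals to the best-fit plane of the k nearest points."""
    points = np.asarray(points, dtype=float)
    if k <= 3 or len(points) < k:
        raise ValueError("need k > 3 and at least k points")
    result = np.empty(len(points))
    for i, p in enumerate(points):
        d2 = np.sum((points - p)**2, axis=1)
        neighbours = points[np.argsort(d2)[:k]]
        centred = neighbours - neighbours.mean(axis=0)
        # The smallest eigenvalue of the scatter matrix equals the sum of
        # squared orthogonal residuals to the best-fitting plane.
        eigenvalues = np.linalg.eigvalsh(centred.T @ centred)
        ssr = max(eigenvalues[0], 0.0)
        result[i] = np.sqrt(ssr / (k - 3))
    return result

# A vertical wall is as "flat" as a roof; randomly scattered points are not
wall = np.array([[0.0, y, z] for y in range(5) for z in range(5)], dtype=float)
scattered = np.random.default_rng(0).uniform(0.0, 1.0, size=(30, 3))
```

Because the fit is orthogonal rather than vertical, the wall points yield values near zero, which matches the remark that smooth surfaces may also be oriented vertically.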
The Echo Width (EW) represents the range distribution of all individual scatterers contributing to one echo. The width of the echo pulse provides information on the surface roughness, the slope of the target (especially for large footprints), or the depth of a volumetric target. Therefore, the echo width is narrow in open terrain areas and increases for echoes backscattered from rough surfaces (e.g. canopy, bushes, and grasses). Terrain points are typically characterized by small echo widths and off-terrain points by larger ones. The echo width also increases with increasing width of the emitted pulse. It is measured in nanoseconds. The OpalsCell module is used to create the EW image with a final grid size of 1 m.
The local density of echoes can be used for detecting water surfaces. As demonstrated by Vetter et al. (2009), water areas typically feature areas void of detected echoes or with very sparse returns. Density is measured in points per square metre and was also computed for 1 m cells.
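The density image can be sketched as a simple count of echoes per raster cell. The function name and the grid definition by origin, cell size and number of cells are assumptions made for this illustration.

```python
import numpy as np

def point_density(x, y, x_min, y_min, n_cols, n_rows, cell=1.0):
    """Echo density in points per square metre on a regular grid (rows = y, columns = x)."""
    x_edges = x_min + cell * np.arange(n_cols + 1)
    y_edges = y_min + cell * np.arange(n_rows + 1)
    counts, _, _ = np.histogram2d(y, x, bins=[y_edges, x_edges])
    return counts / cell**2

# Three echoes on a grid of two 1 m cells
density = point_density(np.array([0.2, 0.7, 1.5]), np.array([0.1, 0.3, 0.4]),
                        x_min=0.0, y_min=0.0, n_cols=2, n_rows=1)
```

Cells with zero or very few echoes in such an image are the candidates for water.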
The attributes used for classification are thus: nDSM, Echo Ratio, Sigma0, Echo Width, and Density.

Object classification
First, each pixel is classified using the decision tree shown in Fig. 1, which includes the threshold values. After the first two classes, water and building (candidates), are extracted, mathematical morphology is applied to refine the building results. The pixels not yet classified are then tested against the vegetation criteria. If they are not vegetation, they are considered to be ground.
Water is first identified, based on the low point density.As mentioned above, water has very low backscatter, and often no detected echo.
Building objects are distinguished from other objects by height (above 3 m) and surface roughness. ER is used to distinguish buildings from tree objects. However, because of the various shapes of building roofs, and because some buildings are covered by high trees, ER alone is not sufficient and would include vegetation in the building class. Thus, EW is used to detect only hard surfaces. Buildings are contiguous objects and typically have a minimum size. This is taken into account by analysing all the pixels classified as buildings so far with mathematical morphology. A closing operation is applied first to fill all small holes inside the buildings, and then an opening is performed to remove few-pixel detections ("noise") from the building set. This also makes the outlines of buildings smoother.
ER, Sigma0 and EW are then used to classify trees. The building mask is applied so that only pixels not classified before are considered. This result is also refined with morphological image operations. Finally, all pixels not classified so far are considered ground.
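The sequence of tests described above can be sketched on raster images as follows. All numeric thresholds except the 3 m building height are hypothetical placeholders, since the values of Fig. 1 and Table 2 are not reproduced here; the exact combination of ER, Sigma0 and EW in the tree rule and the 3x3 structuring element are likewise assumptions.

```python
import numpy as np

WATER, BUILDING, TREE, GROUND = 1, 2, 3, 4

# HYPOTHETICAL placeholder thresholds (only ndsm_min = 3 m comes from the text)
THRESHOLDS = {
    "density_max": 1.0,       # points/m^2, below this a cell is water
    "ndsm_min": 3.0,          # m, minimum building height
    "er_building_min": 75.0,  # %
    "ew_building_max": 4.5,   # ns
    "er_tree_max": 75.0,      # %
    "sigma0_tree_min": 0.1,   # m
    "ew_tree_min": 5.0,       # ns
}

def _window_stack(mask):
    """Stack the nine 3x3-shifted copies of a boolean mask (outside = False)."""
    padded = np.pad(mask, 1, constant_values=False)
    rows, cols = mask.shape
    return np.stack([padded[r:r + rows, c:c + cols]
                     for r in range(3) for c in range(3)])

def dilate(mask):
    return _window_stack(mask).any(axis=0)

def erode(mask):
    return _window_stack(mask).all(axis=0)

def closing(mask):
    """Dilation followed by erosion: fills small holes."""
    return erode(dilate(mask))

def opening(mask):
    """Erosion followed by dilation: removes few-pixel detections."""
    return dilate(erode(mask))

def classify(density, ndsm, er, ew, sigma0, t=THRESHOLDS):
    """Assign one of four classes to every pixel of the attribute images."""
    water = density < t["density_max"]
    building = (~water & (ndsm > t["ndsm_min"])
                & (er > t["er_building_min"]) & (ew < t["ew_building_max"]))
    building = opening(closing(building)) & ~water
    # Assumed tree rule: permeable, rough and wide-echo pixels
    tree = ((er < t["er_tree_max"]) & (sigma0 > t["sigma0_tree_min"])
            & (ew > t["ew_tree_min"]))
    tree = opening(closing(tree)) & ~water & ~building
    classes = np.full(density.shape, GROUND, dtype=np.uint8)
    classes[tree] = TREE
    classes[building] = BUILDING
    classes[water] = WATER
    return classes
```

With a 3x3 element, the closing fills single-pixel holes inside a roof and the opening removes isolated building candidates, which then fall through to the tree test and finally to ground.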

Echo Width normalisation
An initial assumption was that the thresholds for the decision tree derived for one data set can also be used for the other data set. The rationale was that:
• Density is a physical measure (points per square meter) and the overall shot density was similar (8 vs. 10 points per square meter).
• Height above ground (nDSM) is a measure independent of the measurement device and also independent of the sampling distance.
• Echo Ratio is by definition a relative measure and should therefore adapt itself to the data distribution.
• Sigma0 is the local plane fitting accuracy. For data sets of similar measurement accuracy (same sensor model used for both areas) and similar neighbourhoods, in both number of neighbours and spatial extent, it should deliver comparable values.
• Echo width obviously depends on the width of the emitted pulse (same sensor model used for both data sets), but may also depend on the footprint diameter (which was different in the two data sets investigated) or other effects.
Because of these doubts about the transferability of echo width, a method to normalize echo width is suggested. Weak, low-amplitude echoes typically lead to a poor determination of echo width. Thus only stronger echoes (larger amplitude) are used for deriving the echo width normalization parameters.
Assuming that each data set contains some bright, flat surfaces (orthogonal to the incident LiDAR signal), a minimum echo width, EW min, was chosen based on single echoes (i.e. extended targets) of high amplitude and narrow width. A maximum echo width, EW max, was chosen based on the assumption that tree crowns can be found in each data set. These cause large echo widths. Thus, strong, first-of-many echoes with a large width were chosen for the maximum echo width. One way to find specific values of EW min and EW max is to use quantiles of the distribution of echo width and amplitude. Using quantiles is suggested because of their robust stochastic properties.
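One possible realisation of this quantile-based selection is sketched below. The quantile levels (5% and 99%), the amplitude limit of 1% of the highest amplitude and the "strongest third" of first-of-many echoes follow the values reported in the results of this paper; the function name, the boolean flag arrays and the default interpolation of the quantiles are assumptions.

```python
import numpy as np

def ew_normalisation_bounds(ew, amplitude, is_single, is_first_of_many,
                            strong_fraction=0.01, q_min=0.05, q_max=0.99):
    """Estimate EW_min and EW_max from quantiles of strong echoes."""
    ew = np.asarray(ew, dtype=float)
    amplitude = np.asarray(amplitude, dtype=float)
    is_single = np.asarray(is_single, dtype=bool)
    is_first_of_many = np.asarray(is_first_of_many, dtype=bool)

    # EW_min: low quantile of echo width among strong single echoes
    strong_single = is_single & (amplitude > strong_fraction * amplitude.max())
    ew_min = np.quantile(ew[strong_single], q_min)

    # EW_max: high quantile of echo width among the strongest third
    # (by amplitude) of the first-of-many echoes
    amplitude_cut = np.quantile(amplitude[is_first_of_many], 2.0 / 3.0)
    strong_first = is_first_of_many & (amplitude >= amplitude_cut)
    ew_max = np.quantile(ew[strong_first], q_max)
    return ew_min, ew_max
```

The quantile levels are exposed as parameters so that other choices can be tested on a new mission without changing the procedure.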
The normalized value of EW for the two data sets can then be computed using:
\[ EW_{norm} = \frac{EW - EW_{min}}{EW_{max} - EW_{min}} \]
It is noted that this can lead to negative normalized EW values, which may be left as they are or set to zero. Values larger than 1 can also appear, e.g. for very wide echoes not considered in the normalization due to low amplitude.
A different method to normalize EW values is proposed by Lin (2015), who used the concept of Fuzzy Small membership.
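A short sketch of the normalization follows, assuming a linear rescaling (EW - EW min) / (EW max - EW min), which is consistent with the remarks on negative values and values above 1. The function name and the clip_negative option are illustrative.

```python
import numpy as np

def normalise_echo_width(ew, ew_min, ew_max, clip_negative=False):
    """Linearly rescale echo width so that EW_min maps to 0 and EW_max to 1."""
    if ew_max <= ew_min:
        raise ValueError("ew_max must be larger than ew_min")
    ew_norm = (np.asarray(ew, dtype=float) - ew_min) / (ew_max - ew_min)
    if clip_negative:
        # Optionally set negative values to zero; values above 1 are kept
        ew_norm = np.maximum(ew_norm, 0.0)
    return ew_norm

# Echo widths in ns with illustrative bounds of 4 ns and 9 ns
raw = normalise_echo_width([3.0, 4.0, 9.0, 14.0], ew_min=4.0, ew_max=9.0)
clipped = normalise_echo_width([3.0, 4.0, 9.0, 14.0], 4.0, 9.0, clip_negative=True)
```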

Classification results
The thresholds for the classification were set manually, based on exploratory analysis of the data sets and on expectations about the objects. This was done for both data sets independently.
The main properties of the ER, EW, Sigma0, nDSM, and Density values for both Eisenstadt and Vienna are summarized in Table 1. Based on these properties, combined with empirical selection, the threshold for each parameter was set as listed in Table 2.
Table 2. The threshold values used for the decision tree classification of buildings, trees and water bodies for Eisenstadt and Vienna.
The results were evaluated quantitatively and qualitatively.
The water classification, based on the low point density characteristic of water regions, produces a good result. All the water bodies in the area of interest are classified. However, some small parts of the study area, where the laser signal could not reach the ground because of occlusion by high buildings, are misclassified. This could possibly be improved with the overlap of another strip.
While buildings in general can be classified well, very complex roof shapes and walls cause difficulties. It was observed that, when the thresholds are selected conservatively, the shape of the building is maintained while its size is reduced slightly.
The tree class includes high trees but also lower vegetation (bushes, etc.), including heights below 3 m. Especially for the latter category EW proved helpful in distinguishing between vegetation and building edges and also in identifying single trees. For very tall trees, Sigma0 and ER allow reliable detection.
Ground includes all objects such as roads, grassland, car parks and fields. A further split into artificial and natural ground was explored but finally not performed. Both Sigma0 and Amplitude were considered candidates for this separation. Natural ground tends to have higher Amplitude than artificial ground. However, while valid locally, no global thresholds could be found in the data sets studied.
The final classification results were then assessed based on Correctness and Completeness (Heipke et al., 1997). Some buildings, trees and water bodies were digitized manually as reference data. When the results of the automated extraction are compared to the reference data, an entity classified as an object that also corresponds to an object in the reference is a True Positive (TP). A False Negative (FN) is an entity corresponding to an object in the reference that is classified as background, and a False Positive (FP) is an entity classified as an object that does not correspond to an object in the reference. A True Negative (TN) is an entity belonging to the background both in the classification and in the reference data. The Completeness and Correctness for the building, tree, and water classes are given in Table 3. It is also illustrated for one building in Figure 4. The two main classes, building and tree, reach values above 93%.
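The two quality measures follow from these counts as Completeness = TP / (TP + FN) and Correctness = TP / (TP + FP). The sketch below evaluates them per pixel for one class; treating each pixel as an entity and the function name are assumptions for illustration.

```python
import numpy as np

def completeness_correctness(classified, reference):
    """Completeness = TP / (TP + FN), Correctness = TP / (TP + FP) for one class."""
    classified = np.asarray(classified, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    tp = np.sum(classified & reference)
    fn = np.sum(~classified & reference)
    fp = np.sum(classified & ~reference)
    return tp / (tp + fn), tp / (tp + fp)

# Toy example: 3 true positives, 1 false positive, 2 false negatives
completeness, correctness = completeness_correctness(
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 1, 0, 0])
```

True negatives enter neither measure, so a large background area does not inflate the reported values.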

Echo width normalisation
After estimating the threshold values, the thresholds used for both regions are compared to find which parameters remain stable across data sets and which need to be normalised. As can be seen in Table 2, the thresholds of ER, nDSM, Sigma0 and Density can be applied for both Eisenstadt and Vienna. In other words, those values are transferable between different regions. However, the EW threshold is notably different. Thus, the normalization suggested in Sec. 3.4 was applied to evaluate its usability.
Figures 5 and 8 show the distribution of EW for the two regions. The ranges of EW are unexpectedly wide, from 4.003 ns to 66.877 ns for Vienna and from 0 to 29.000 ns for Eisenstadt, given the emitted pulse width of approx. 4 ns. However, more than 96% of the EW values fall in a narrower range, from approx. 7 ns to 18 ns for Vienna and from approx. 3 ns to 10 ns for Eisenstadt. This difference motivates the normalization.
As suggested in Sec. 3.4, EW min and EW max were derived from quantiles of the echo width of strong echoes.
For the two different study areas the thresholds applied for Echo Ratio, Sigma0, and nDSM were the same. Echo Width was shown to depend on the flight mission parameters. The cause was not studied, but the footprint size may have an influence. It is noted that using the differential cross section instead of echo width would not necessarily change this. The differential cross section is obtained by deconvolving (Jutzi and Stilla, 2006; Roncat et al., 2011) the received signal with the emitted pulse shape (more precisely, the system waveform).
A simple model for normalizing echo width was suggested. Improvements of this model, e.g. the choice of minimum and maximum echo width for normalization, or alternatives such as histogram matching, could be investigated.

Figure 1. Decision tree for the classification.

Figure 5. Histogram of echo width values in the Eisenstadt region.

Figure 7. Eisenstadt dataset: Scatterplot of Echo Width versus Amplitude for the first-of-many echoes. The red horizontal line separates the weak (66.6%) from the strong echoes ("highest third"). The vertical line shows the maximum echo width EW max, which is the EW at the 99% quantile of the strong echoes.

Figure 8. Histogram of echo width values in the Vienna region.

Figure 10. Vienna dataset: Scatterplot of Echo Width versus Amplitude for the first-of-many echoes. The red horizontal line separates the weak (66.6%) from the strong echoes ("highest third"). The vertical line shows the maximum echo width EW max, which is the EW at the 99% quantile of the strong echoes.

Table 3. Accuracy assessments of the Building, Tree and Water classes in the Eisenstadt region.

Table 4. The EW thresholds for Eisenstadt and Vienna.

Following Sec. 3.4, the minimum EW, EW min, is the 5% quantile of single, strong echoes. Strong echoes are those with an amplitude of more than 1% of the highest amplitude found in the data set. Thus, only 5% of all "strong" echoes have a shorter EW than this EW min (see Figures 6 and 9). The maximum EW is chosen as the 99% quantile of EW from the strongest third of the first-of-many echoes, i.e. those with the highest amplitude (Figures 7 and 10). The parameters for normalizing EW for Eisenstadt and Vienna are summarized in Table 4. After applying the normalization, the thresholds for normalized EW in Eisenstadt and Vienna are presented in Table 5. While the normalization brings those values closer together (buildings have normalized EW below 6% and 11% respectively, and trees have normalized EW above 3% and 9% respectively), they are not as close together as for the other thresholds (Table 2).

Table 5. Normalized EW thresholds for Eisenstadt and Vienna.

Table 6. Accuracy assessments of the Building and Tree classes in the Eisenstadt region after Echo Width normalisation.