SEMANTIC CLASSIFICATION OF SANDSTONE LANDSCAPE POINT CLOUD BASED ON NEIGHBOURHOOD FEATURES

The technology of airborne laser scanning enables fast and accurate gathering spatial data containing also echoes from the terrain below the vegetation canopy that is beneficial for topographic mapping of wooded sandstone landscapes in Czechia, Poland, and Germany. The challengeable task is to determine the ground points in the point cloud because commonly used filtration methods do not successfully distinguish between vegetation and rock pillars and faces. In this paper, we replace filtration with classification approach using the features derived from characteristics of points within a neighbourhood of optimized sizes, such as eigenvalue-based features and echo ratio. Random Forest classifier is trained and tested on the manually labelled dataset with a density of almost 650 points/m from the Adršpach-Teplice Rocks. The overall accuracy reaches 87% but recall and precision of non-ground points are unsatisfactory. Misclassified non-ground points are located also within trees, thus we do not consider the result as suitable for DTM processing without further processing. * Corresponding author


INTRODUCTION
The sandstone phenomenon forms an unusual landscape in the northern part of Czechia and also extends beyond the borders of Germany and Poland (e.g. Bohemian and Saxon Switzerland, Stołowe Mountains, Adršpach-Teplice Rocks). In its wildest form, the landscape is called "rock cities" due to the similarities of steep walls and deep and narrow gorges with buildings along the streets in human-made towns. Intricately passable terrain together with the fact that in climatic conditions of the central Europe the sandstone areas are covered with dense vegetation complicate topographic mapping of these regions.
The airborne laser scanning (ALS) is a suitable method for topographic mapping in wooded areas as it allows for gathering data about the shape of the terrain below vegetation. The more challengeable part of dealing with ALS data is the processing because the categorization of object types, which the echo comes from, is not a trivial task. A lot has been discussed about the filtration aiming to select the ground points but commonly used algorithms (Axelsson, 2000;Kraus and Pfeifer, 1998) tend to smooth the rock formations or imperfectly filter out the vegetation points, depending on parameter settings (Lysák, 2016;Tomková, 2018).
Thus, in our study in order to determine the terrain points, we apply a state-of-the-art approach for classifying ALS point cloud from a sandstone landscape instead of using the filtration algorithms. The goal is to evaluate whether it is beneficial to distinguish between terrain and off-terrain echoes in the classification algorithm according to the properties of point neighbourhood instead of using a widely-used filtration algorithm based strictly on a point position in the space.
The extraction of relevant features is essential for classification. Adaptive neighbourhood selection is one of the approaches used to maximize the relevance of the features . It is employed together with features describing shape of point cloud within the neighbourhood expressed mainly by eigenvalues (Jutzi and Gross, 2009;Weinmann, 2016). The Random Forest classifier is trained and used to predict point's categorization.
The point cloud used for this experiment captures a part of Adršpach-Teplice Rocks and was acquired in December 2019. Laser scanner was carried on a drone in low altitude with speed about 2.1 m/s, The density of all echoes reaches 640 points/m 2 , density of the last echoes 530 points/m 2 . There are no buildings inside the area of interest and other man-made objects (e.g. stairs and fences) are small and rare. Due to this fact, we discriminated only two classes of points: (i) ground points, i.e. echoes from terrain and rocks and (ii) non-ground points corresponding by far mostly to echoes from vegetation.
The paper is organized as follows. Section 2 describes the methodology and it is further divided into subsections about point neighbourhood, extraction of features, selection of the relevant features and finally, the classification. As the acquired dataset is rather unique, its characteristics are more thoroughly discussed in Section 3. The results of our experiment are presented in Section 4 and Section 5 provides concluding remarks and suggestions for future research.

Neighbourhood Definition
The followed framework of point cloud classification requires knowledge of point characteristics derived from its neighbourhood (Weinmann, 2016). The neighbourhood of certain point allows to examine a part of or the whole object which the point belongs to and thus, to see the point in the context of the object. The neighbourhood can be defined using two different approaches: (i) as a number of k nearest neighbours or (ii) as a distance from the point either in 3D (spherical neighbourhood) or in 2D (cylindrical neighbourhood). The advantage of k nearest neighbours lies in adaptation to varying point density in the dataset. However, in comparison to the distance neighbourhood, it does not necessarily preserve the relationship to a real object size and therefore, the context of the object might be lost. In both cases, determination of the parameters k or radius based on prior knowledge of the dataset is essential.
To avoid assumptions and combine benefits of both approaches, the optimal parameters can be estimated automatically for each point of the dataset. According to Weinmann et al. (2014), it results in more relevant features and thus better performance of classification. Blomley et al. (2014) proposed a method for the optimization of cylindrical neighbourhood based on shape distribution features known from object recognition. Demantké et al. (2011) used the dimensionality features derived from eigenvalues for the selection of optimal radius of spherical neighbourhood. By minimizing the Shannon entropy of dimensionality features, this method looks for the neighbourhood which favours one dimensionality the most. Weinmann et al. (2014) followed up with finding optimal k nearest neighbours by minimizing the Shannon entropy E of relativized eigenvalues e i : = − 1 ln( 1 ) − 2 ln( 2 ) − 3 ln ( 3 ) (1) Although Weinmann et al. (2014) proposed it as a more general approach than used by Demantké et al. (2011) because of avoiding assumptions on the dataset for determining minimal and maximal possible values, adjustment of the upper limit of k nearest neighbours is necessary in our research. It is described in detail in Section 4.1.

Feature Extraction
When designing the experiment, all potentially useful features are calculated and afterwards, only those that have been proven being truly relevant and suitable for the classification, are selected. A lot of features have been developed for describing the local neighbourhood of a point, e.g. shape distribution features (Blomley et al., 2014), SHOT descriptors (Tombari et al., 2010), point feature histograms (Rusu et al., 2009). In the proposed experiment, we largely followed a strategy of combining interpretable 3D and 2D features (Weinmann et al., 2013;Hackel et al., 2016). In total, the employed feature vector consists of 18 features divided into 5 categories as shown in Table 1. The 3D eigenvalue-based features are derived from the normalized eigenvalues e 1 , e 2 , e 3 of 3D structure tensor representing the covariance matrix of neighbouring points (Jutzi and Gross, 2009). Similarly, the 2D eigenvalue-based features are based on normalized eigenvalues e 1 , e 2 of neighbouring points projected onto a ground plane. The echo ratio is the ratio of the number of points within a spherical neighbourhood and points within a cylindrical neighbourhood with the same radius and infinite height (Höfle et al., 2009). Higher values indicate a flat surface while lower values correspond to a vertically scattered object.

Number of points Height
Radius Difference of heights Δ

Local point density
3D eigenvalue-based features Change of curvature = 3 1 + 2 + 3 Table 1. 18 values of a feature vector; e i is the i th eigenvalue (downwardly ordered) and n z is the third component of the normal vector n.

Feature Selection
The effect known as Hughes phenomenon, confirmed by e.g. Weinmann et al. (2013), declares that more features as input into classification do not ensure better performance. The selection of relevant features is therefore a part of our workflow. Feature selectors can be divided into three categories: filter-based, wrapper-based and embedded. In contrast to the other categories, filter-based selectors are classifier-independent. It determines slightly lower resulting performance of classification but it also implies better robustness and less time-consuming processing.
The univariate filter-based strategies evaluate the most suitable features according to their scores in metrics that address properties like information, consistency, or dependency in feature-class relation. In addition, the multivariate strategies take into consideration also the correlation between features, so apart from irrelevant features, they also remove those that are redundant. As examples of multivariate filter-based selectors, Correlation-based Feature Selection algorithm (CFS) (Hall, 1999) and Fast Correlation-based Filter (FCBF) (Yu and Liu, 2003) can be mentioned. For a simple and fast selection of the generally relevant features, the univariate filters are satisfactory. In our work, we intend to evaluate normalized features based on the correlation and information gain.

Classification
Finally, the feature vector serves as an input into the classification which results in labelled points. Supervised classification methods require to be learned using training dataset with known labels. For the 3D scene analysis, approaches like Nearest Neighbour classifier (Jutzi and Gross, 2009), Maximum Likelihood classifier (Bartels and Wei, 2006), Support Vector Machine (Lodha et al., 2006), and Random Forest (Chehata et al., 2009) are extensively employed (Weinmann, 2016). Those classifiers evaluate each point separately and therefore their results might be spatially inhomogeneous and noisy. It does not meet the reality where for example the existence of vegetation point surrounded by ground points has the low probability. Contextual classification can improve this drawback by taking into consideration apart from the feature vector also class labels of neighbouring points. Conditional Random Fields (Niemeyer et al., 2012) and Markov Networks (Shapovalov et al., 2010) can be mentioned as representatives of the contextual classification. The disadvantage of these approaches lies in the increasing computational effort and exactness and completeness of modelling the relationships.
In our experiment, we decided to use Random Forest classifier due to the simple interpretability of trained trees and decision rules. The performance and computational efficiency of Random Forest together with the optimal neighbourhood selection was emphasized by Weinmann (2015). The Random Forest consists of certain number of independently trained decision trees and the resulting class is chosen by major voting by all of these trees. While decision tree is considered as a weak classifier due to high sensitivity to small changes in training dataset, the Random Forest utilizes the ensemble of decision trees to be a robust and stable classifier. Each tree is trained using the randomly selected sample of training data with replacement, resulting in the same number of observations as the whole training dataset. This method is called bagging (bootstrap aggregating). In each sampling run, also a subset of features is randomly selected. Not only for this reason, it is important to select suitable features as an input for the classifier, otherwise it might happen that some trees will be trained using mainly or only inconsistent and irrelevant features.

DATASET
As the research is focused on specific type of landscape and highly dense point cloud, it is appropriate to mention the characteristics of used dataset in detail. While the study area is briefly described in Section 3.1, the circumstances of acquisition of the point cloud are presented in Section 3.2, and Section 3.2 aims to clarify the process of preparing the point cloud to be used as training and testing sets in the classification. More details about the dataset can be found in Tomková, Lysák (2020).

Location of Interest
The location of interest is situated in the Adršpach-Teplice Rocks in Czechia, around the highest rock pillar Milenci (50°36'39.112"N, 16°6'49.084"E, and 81.4 meters high), see Figure 1. The area is protected since 1933 because of unique landscape as well as isolated occurrence of plant and animal species. With its 18 km 2 , it is the most extensive region of this kind in Europe. The local touristic circuit through the rock features is visited by more than 300,000 people every year. The shapes are results of erosion of about 130 meters heavy compact sandstone plateau sedimented in Late Cretaceous and lifted during Alpine orogeny.

Point Cloud Acquisition
The study area of approx. 100 × 100 m 2 is a part of a larger region which was scanned with the RIEGL's miniVUX-1UAV scanner carried on a hexacopter DJI Matrice 600 Pro. The acquisition was held in the middle of December 2019 in the In total there are 6.6 million points in the location of interest. Almost 70% of points in the dataset are the single returns, 84% last returns. Although it is common practice to search for the ground points only in last returns, it is not suitable for this kind of landscape because also many of first or intermediate echoes might come from rock faces due to their steepness and height variance. The purpose of obtaining the point cloud with such a high density is to evaluate the effect of data density on the rock shapes in the resulting digital relief model (DTM) within our future research.

Pre-processing
The pre-processing consisted of two steps. The aim of the first one was to derive a high-quality point cloud. The second goal was to manually assign the class to each point in order to prepare the training and testing sets for the points cloud classification. A part of the first step was a relative alignment of flight strips that locally left residuals due to difficulties with finding planes on uneven surfaces.
As it was already mentioned, the commonly applied filtration methods do not produce satisfactory results in the studied landscape. Thus, the reference dataset needs to be prepared manually. The functions of the toolbox LAS Dataset in ArcGIS Desktop (Esri, 2019, Figure 2) was used for this purpose. Each point was classified as ground or non-ground according to the interpretation of the profile and 3D view of the data and knowledge of the location. Despite the point cloud density, in the cases of low vegetation and thickets, the separation between the two classes might not be absolutely reliable.
To train the classification algorithm in acceptable computational time, the whole dataset was thinned. A random point within a cube of size 0.08 meters was selected resulting in dataset of about 1 million points. For the neighbourhood selection and calculation of features, the original dataset was used. Thus, the mentioned point density and other characteristics of the dataset remain valid. Figure 2. Profiles of manually filtered thinned data. Brown dots correspond to ground points and green dots to non-ground. In the images, Uhlířská rock pillar (Figure 1) with the tree growing from its face can be seen on the left and a high spruce between a rock face and a smaller rock formation on the right.

EXPERIMENT AND RESULTS
In the following part, we focus on a performance of our workflow. First, statistics of the optimal neighbourhood search are introduced in Section 4.1, followed by the feature selection presented in Section 4.2. Finally, the results of the established classification pipeline are shown in Section 4.3.

Optimal Neighbourhood
The optimal neighbourhood is searched by minimizing the entropy of eigenvalues derived from the neighbouring points, based on approach of Weinmann et al. (2014). Because the goal is to take into consideration the context of neighbouring points, we determine the minimal radius of neighbourhood to 0.1 m. Instead of 100 nearest neighbours at most as Weinmann et al. (2014) proposed, the upper limit is set to 2 000 due to the high density of the point cloud. Also, the step between 10 and 2 000 points is adapted to 30 intervals due to a higher computational effort with increasing number of choices. The intervals are growing logarithmically because we suppose that there is a higher probability of capturing relevant context within smaller numbers of points that tend to be closer to each other.
For the evaluation of results of this step, we consider the radius as a more intuitive indicator than number of neighbouring points. Figure 3 shows the distribution of the optimal neighbouring radius. It is obvious that there is no observable trend in the size of the optimal neighbourhood with respect to the classes. The lower the radius, the higher the frequency of Figure 3. Distribution of radius of the optimal neighbourhood with respect to the manually labelled classes. The points with bigger neighbourhood than 2.0 m (only 3.5% of all) are not shown in the graph.
occurrence is registered. In contrast to figures in Weinmann et al. (2015), the graph does not show the trend with increasing frequency at higher values because of the higher upper limit used in our experiment. Mean of radii computed from the dataset equals to 0.75 m, however, it is distorted by outliers with the maximum of 7.4 meters. For the half of the points, smaller radius than 0.6 m is chosen.

Feature Selection Assessment
The relevance of features is evaluated using two metricscorrelation and mutual information. Based on the results, only eight features are finally employed as an input into classification task -all the eigenvalue-based features except for the sum of eigenvalues, plus echo ratio. These features reach the best rating in both evaluation methods (Figure 4). A high value of mutual information is gained in case of height. But according to the correlation and also our knowledge of the dataset, height cannot predict the class because trees and rock pillars can be similarly high.  Table  1 for explanation of feature abbreviations.

Classification Results
Before the classification is trained, several parameters including number of trees and their maximal depth is tuned using the Grid Search approach. The dataset is randomly split into training and testing sets in ratio 7:3. There are about 80% of ground and 20% of non-ground points in each dataset that shows the imbalanced mixture of classes. The confusion matrix showing the percentage of correctly and incorrectly classified points in terms of classes and also class-wise recall and precision are in Table 2. The overall accuracy of testing set reaches 86.7% but due to the prevalence of ground points, the metric is distorted. If all the points are classified as ground, the overall accuracy will be 80%. Thus, Weinmann et al. (2015) proposed to consider class-wise precision and mainly recall. While these metrics for ground points are satisfactory, the classification of non-ground class is less successful. Only 64.0% of real non-ground points are identified and the rest is categorized as ground. As can be seen in Figure 5, these misclassified points (orange dots) are not just small thickets that are too hard to recognize but also isolated points within the trees. On the other hand, the error of almost 29% (complement to recall) of predicted non-ground points that should be in the ground class, is more acceptable. Moreover, a part of them are probably very low thickets and herbal cover which is difficult to separate from the ground even manually because there are no echoes from the ground below them. Other part of these misclassified points is correlated with occurrence of steep rock faces.  Figure 5. 3D view of a part of testing data. Brown and green dots remark correctly classified ground and non-ground points, orange dots are misclassified non-ground and purple misclassified ground points.

DISCUSSION AND CONCLUSIONS
The experiment presented in this paper mostly follows a traditional approach of point cloud classification, but applied on the unusual dataset. The primary motivation for this research is to test whether the classification can substitute filtration which is often not successful in determining the ground points in this kind of landscape, where the mixture of rock pillars and vertical faces together with high-grown trees and thickets is present. As the results show, the classification reaches high performance in total (comparable to Weinmann et al., 2017) but according to the imbalanced dataset, class-wise metrics are more explainable. The precision and recall of ground points are satisfactory. The lower precision of non-ground class means that a significant number of vegetation points remain in the ground class which might be an effect of low ratio of non-ground points in the dataset. When looking deeper into the spatial distribution of these points, it is evident that these misclassifications are among others single points of trees. These errors would result in non-existent shapes in modelling of DTM. On the other hand, the occurrence of the misclassified ground points is slightly more acceptable from the perspective of further processing, because these points would not significantly affect the shape or even the appearance of rock pillar in DTM due to the high density of the point cloud and relative spatial isolation of these misclassified points.
The isolation of misclassified points in general leads us to an idea of future work with using the contextual classifiers instead of point-wise approach. The other modification of the workflow that might increase the performance of classification in the case of steep faces is to pull out these points as another class.
Concerning the isolated ground points on trees, an idea for future research is that an increased minimal size of the neighbourhood may result in the characteristics that better reflect the context and thus, better distinguish between classes.
Our future work will also focus on classification of thinned point cloud to explore the effect of lower point density on the feature relevance and separability.