STORM DRAIN DETECTION AND LOCALISATION ON MOBILE LIDAR DATA USING A PRE-TRAINED RANDLA-NET SEMANTIC SEGMENTATION NETWORK

ABSTRACT: As the expansion of cities and urban areas results in the construction of more impermeable road surfaces, a well-designed urban drainage system becomes increasingly important. However, the accurate and up-to-date mapping of storm drains needed to build reliable drainage models is often lacking. In recent years, road infrastructure has increasingly been mapped by highly efficient mobile mapping systems, but these lack automatic interpretation of the massive amount of data they produce. In this paper we present a fully automatic storm drain detection method to extract and locate storm drain inlets in mobile mapping lidar data. The point cloud is first segmented by a pre-trained RandLA-Net model which, although never trained to segment storm drains, is able to segment storm drain clusters in the hardscape class. The results from this class are further processed by enforcing different requirements so that only storm drain clusters are extracted and located. Our approach is evaluated on a large testing dataset with 171 storm drains and achieves 81.9%, 95.2% and 88.1% for recall, precision and F1-score respectively. The majority of the false positive and false negative detections are due to incorrect point cloud segmentation by RandLA-Net. In terms of localisation, our approach achieves an RMSE of 5.5 cm on the centre location, while the dimensions of the bounding box are on average 23% off compared to the ground truth.


INTRODUCTION
As cities and communities continue to grow, rural areas are increasingly being urbanized. As a result, natural permeable soil is replaced by impermeable city roads and other urban infrastructure, leading to an increased risk of flooding. A well-designed urban drainage system is of great importance in order to drain excess rainwater and prevent large urban floods. In order to assess the current performance of a drainage system, an up-to-date and accurate drainage network containing the locations and dimensions of storm drain inlets and manhole covers is essential. However, such data is often lacking, incomplete or outdated (Bertsch et al., 2017, Wang et al., 2021). In order to map the current drainage infrastructure, traditional surveying methods such as total station or GPS measurements are carried out. Although very accurate, these methods are time consuming and labour intensive compared to newer, more automated remote sensing applications (Jalayer et al., 2014). With UAV imagery or mobile mapping data (image and/or lidar), large neighbourhoods can be captured with similar accuracy in only a fraction of the time needed for traditional surveying methods. However, most of the time is spent not on capturing the mobile mapping data but on interpreting and mapping it. Therefore, automating this task is of great importance to make these remote sensing applications viable for capturing the road infrastructure. As reported in (Alshaiba et al., 2020), this could reduce the overall cost by 22% and result in a time saving of up to 91% compared to traditional manual surveying methods.
Recent research has mainly focussed on automatic detection of different objects of the road infrastructure such as buildings, poles, traffic signs or the general road structure (road delineations, road markings, etc.) (Guan et al., 2016).
However, detection of drainage infrastructure is generally overlooked and research on automatic storm drain detection is limited. This is partially due to the small dimensions of a storm drain compared to buildings, cars or poles, which makes it more difficult to extract them from these massive datasets. Although some work has been successful in finding storm drains on lidar intensity ground images using deep learning (Yu et al., 2014, Yu et al., 2015, Yu et al., 2020), storm drain detection on raw lidar data is limited (Alshaiba et al., 2020).
In this paper, we propose a fully automatic storm drain detection method to extract and locate storm drain inlets in mobile mapping lidar data. First, a pre-trained state-of-the-art deep learning architecture semantically segments the massive mobile mapping lidar point cloud into different classes such as building, road, vegetation and hardscape. The latter includes man-made road furniture objects such as poles, benches, traffic signs, etc. Additionally, the pre-trained model is able to segment storm drains within this class although it was never trained to do so. The hardscape point clusters are further processed in order to extract and locate only the storm drains in the dataset. Both detection and localisation performance are evaluated on a large testing dataset.
The remainder of this paper is organized as follows. Section 2 provides the related work on storm drain detection on remote sensing data. Section 3 presents our methodology. In Section 4 the experimental results are presented. Finally, the conclusions are presented in Section 5.

Figure 1. Schematic overview of the proposed workflow. The mobile mapping point cloud is segmented into different classes of which the hardscape class is used to extract storm drain clusters. In each cluster, the storm drain location and size are determined by finding the minimal bounding rectangle of the cluster points.

RELATED WORKS
In the last decade, most research has focussed on general object detection such as buildings, road structure (road delineation and markings for example), cars, poles, etc. (Guan et al., 2016). To achieve this, various approaches were investigated, ranging from basic manually designed low-level features to more complex machine learning and deep learning methods able to learn complex high-level features. In terms of drainage infrastructure detection, most work addresses the detection of manhole covers on both lidar and image data, while research on storm drain detection is more limited. While a similar approach could be implemented for manhole cover and storm drain detection, storm drains are significantly smaller, making object detection more difficult.
In (de Vitry et al., 2018), manhole covers and storm drains are detected on UAV imagery. To improve detection results, only the areas around the road edges are processed. Drainage covers are detected using a sliding window approach and a trained Viola-Jones classifier. For each detection cluster, seven properties are computed and used as input for a second classifier to filter out false positive detections. Different classifiers such as a Linear SVM, Logistic Regression and Artificial Neural Network were tested. Their approach achieves an average precision score (AP-score) of 65% and 73% for their single-view and multi-view implementation respectively. A more complex image-based deep learning approach is investigated in (Boller et al., 2019) using 1000 high resolution Google Street View images. A Faster R-CNN using ResNet-101 as backbone architecture was trained on 4000 panoramic images and achieved an average precision of 72.3% and 74.5% for manhole covers and storm drains respectively, while the smaller water supply network valves only achieved 49.5% precision. Similar drainage covers are detected using RetinaNet with ResNet as backbone architecture in (Santos et al., 2020). Although tested on a small testing dataset, the trained RetinaNet network shows promising results by outperforming the Faster R-CNN model. As expected, average precision for storm drain inlets was on average between 5% and 23% lower than for manhole covers.
Detection of drainage systems on lidar data is rarely performed on the raw point cloud. In (Alshaiba et al., 2020), manhole covers are detected by filtering the point cloud with a predefined intensity interval and fitting a minimal bounding rectangle around each cluster. False positives are discarded by applying different filters that remove bounding boxes which are too small or too big. Although this simple method achieves usable results, it is not robust as the approach cannot distinguish between a manhole cover and a dark square patch of asphalt in the road. Instead of using the raw mobile lidar data, the point cloud is generally converted into a ground intensity image in order to use state-of-the-art image detection methods.
Yu Yongtao published several papers (Yu et al., 2014, Yu et al., 2015, Yu et al., 2020) on this topic, investigating different detection approaches. In (Yu et al., 2014), circular manholes and rectangular storm drains are detected using a marked point model on the intensity ground image. Later, (Yu et al., 2015) proposed a deep learning approach which used a Deep Boltzmann Machine to generate high-level features in combination with a random forest model to classify the sliding window patches. In their most recent work (Yu et al., 2020), the Boltzmann/random forest combination was replaced by a deep learning classifier and a super-pixel segmentation strategy. Additionally, the marked point approach from (Yu et al., 2014) is used to accurately determine the final location and dimensions of the manhole covers. A similar approach on intensity ground images was proposed in our previous work (Mattheuwsen and Vergauwen, 2020), where we transfer learned several different backbone architectures on a relatively small training dataset to detect manhole covers. Our approach achieves 97.3% recall and 97.3% precision on the road surface and is able to locate the storm drains with a 2D confidence interval of 16.5 cm. In (Wei et al., 2019), manholes are detected on both high detail lidar ground images and colour RGB images using a combination of manually designed low-level features as input for an SVM classifier and a sliding window approach. The combination of both the lidar images and RGB images achieves high accuracy results, although this was achieved by a modified mobile mapping system which captured data at a very high point density and image resolution unachievable by commercial systems.
As this literature study clearly shows, detection of storm drains or manhole covers on lidar data is reduced from a point cloud detection task to an image detection task. This opens the door to well established image processing techniques which are generally more suited for detecting these smaller objects. However, with the 2D projection of the point cloud, the vertical information of the data is lost. In our previous work (Mattheuwsen and Vergauwen, 2020), we investigated an approach which captured the vertical information within a channel of the intensity ground image, but were unsuccessful in improving the results. With this paper, we investigate a different route and aim to detect storm drains using deep learning point cloud segmentation networks instead of reverting to an image-based detection approach. Additionally, the use of a pre-trained deep learning model is explored so that no training dataset has to be created and no new deep learning model has to be trained.

METHODOLOGY
In this section, a detailed overview is presented of our proposed storm drain detection workflow which can be split up in three parts. Firstly, the mobile mapping point cloud is semantically segmented using a pre-trained deep learning model called RandLa-Net, into different general classes such as building, road, vegetation and hardscape objects. Secondly, the hardscape points are clustered together and the storm drain clusters are extracted that comply with the enforced requirements. Lastly, a minimal bounding rectangle is fitted around the surface points of the storm drain clusters in order to find the location and dimensions of each storm drain. The complete workflow is shown in Figure 1.

Semantic segmentation of hardscape objects:
In recent years, point cloud deep learning has made tremendous advancements in the performance of semantic segmentation. The RandLA-Net model (Hu et al., 2020) recently achieved promising results. RandLA-Net is a very efficient and lightweight 3D semantic segmentation network that can process large scale point clouds. Unlike other approaches, it uses random sampling and a novel local feature aggregation module to outperform the state-of-the-art in a fraction of the processing time as shown in Figure 2. In our approach, we use a pre-trained RandLA-Net model to semantically segment the mobile mapping point cloud. While different segmentation classes typically include objects such as buildings, roadways, vegetation, poles, etc., storm drains are rarely part of the segmentation task. In order to expand the segmentation task to include a new object, new training data including the new object is typically required. However, storm drains are relatively small compared to these other objects, resulting in a large class imbalance which leads to poor segmentation performance, rendering this option ineffective (Phan and Yamamoto, 2020). A possible solution for class imbalance is to combine several smaller objects into one general class, resulting in a better class representation. However, this approach requires additional post-processing steps to extract and distinguish the different objects from each other within the combined class. A similar approach was applied in the large-scale terrestrial laser scanning (TLS) benchmark dataset Semantic3D (Hackel et al., 2017). Aside from the typical classes such as building, vegetation and road, the dataset also contains a combined class which consists of man-made objects such as light poles, traffic signs, benches, fences, etc.
Although storm drains are not specifically included in this class, a RandLA-Net model trained on the Semantic3D dataset is able to distinguish storm drains from the road surface and segment them within this class. This is possible because of the difference in representation of a storm drain in the mobile mapping point cloud compared to the terrestrial laser scanning point cloud of the training dataset. As shown in Figure 3, a significant cluster of points is captured below the road surface in the mobile mapping point cloud while this is not the case when captured by a TLS. This is due to the difference in acquisition height and method, which is higher and dynamic for the MMS instead of static and generally closer to the ground for TLS. It is assumed that this subtle difference in the MMS point cloud causes the RandLA-Net model to segment this cluster as hardscape. As deep learning models are perceived as black box models, we are unable to pinpoint the decisive factor causing this. In our approach, we use the pre-trained RandLA-Net model that is available online and was trained on the Semantic3D dataset to semantically segment the mobile mapping point cloud. An example of the segmentation performance is shown in Figure 2. As presented in (Hu et al., 2020), the model achieves state-of-the-art segmentation results, outperforming all existing methods, especially on the hardscape class by 10%.
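As a minimal sketch of how the hardscape points could be isolated once per-point labels are available, the snippet below filters a labelled point list by class name. The class order follows the Semantic3D label definition; the tuple layout (x, y, z, intensity), the function name `select_class` and the `first_label` offset are illustrative assumptions and not part of the RandLA-Net code base.

```python
from typing import List, Sequence, Tuple

# Semantic3D class names in label order 1..8 (label 0 means "unlabelled").
SEMANTIC3D_CLASSES = (
    "man-made terrain", "natural terrain", "high vegetation", "low vegetation",
    "buildings", "hard scape", "scanning artefacts", "cars",
)

Point = Tuple[float, float, float, float]  # x, y, z, intensity (assumed layout)


def select_class(points: Sequence[Point], labels: Sequence[int],
                 class_name: str = "hard scape",
                 first_label: int = 1) -> List[Point]:
    """Return the points whose predicted label matches class_name.

    first_label is the integer given to the first class. It is a parameter
    because some inference scripts export 0-based predictions (assumption).
    """
    wanted = SEMANTIC3D_CLASSES.index(class_name) + first_label
    return [p for p, lab in zip(points, labels) if lab == wanted]
```

Only the points returned by such a filter are passed on to the cluster extraction step.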
Extraction of storm drain clusters from segmented point cloud:
Using the segmentation results from the pre-trained RandLA-Net model, the points from the hardscape class are further processed in order to distinguish storm drain clusters from other objects in the hardscape class such as light poles, traffic signs, benches, etc. We define several requirements with which each cluster has to comply in order to be considered a storm drain. First, a maximum intensity threshold Ith is enforced on the hardscape points, as storm drains are generally made out of cast iron or another metal, resulting in a relatively low intensity value. Second, the remaining points are clustered into smaller segments separated by a minimum distance threshold DCC using the Connected Components tool from CloudCompare (GPL Software, 2022). Figure 4 shows the clustering results where the hardscape points are coloured in purple with their corresponding bounding box in green or red depending on whether it contains a storm drain or not. As can be seen in Figure 4, the majority of other hardscape object clusters are well above the ground surface while the storm drain clusters are mainly below. The main requirement is therefore defined as follows:
• Requirement 1: the cluster centre must be at least a distance Dth below the ground surface
For each cluster, the vertical distance between its centre and the ground surface mesh is computed and clusters not fulfilling this requirement are filtered out. The mesh is created using a combination of CloudCompare's Cloth Simulation Filter plug-in to extract the ground points and the Poisson Surface Reconstruction plug-in to generate the mesh on the ground points.
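The intensity filter, the clustering and Requirement 1 can be sketched in a few lines. This is an illustration under stated assumptions rather than the workflow used here: a plain distance-threshold region growing stands in for CloudCompare's Connected Components tool, the cluster centre is taken to be the centroid, and a hypothetical callable `ground_height(x, y)` stands in for the ground surface mesh. All function and parameter names are invented for the example.

```python
import math
from collections import deque
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[float, float, float, float]  # x, y, z, intensity (assumed layout)
Cell = Tuple[int, int, int]


def cluster_points(points: Sequence[Point], d_cc: float) -> List[List[Point]]:
    """Group points linked by chains of neighbours at most d_cc apart."""
    def cell_of(p: Point) -> Cell:
        return tuple(math.floor(c / d_cc) for c in p[:3])

    # Hash the points into cubic cells of size d_cc so that every neighbour
    # within d_cc lies in the same cell or in one of the 26 adjacent cells.
    grid: Dict[Cell, List[int]] = {}
    for idx, p in enumerate(points):
        grid.setdefault(cell_of(p), []).append(idx)

    seen = [False] * len(points)
    clusters: List[List[Point]] = []
    for start in range(len(points)):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        members: List[Point] = []
        while queue:  # breadth-first region growing
            i = queue.popleft()
            members.append(points[i])
            cx, cy, cz = cell_of(points[i])
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    for dz in (-1, 0, 1):
                        for j in grid.get((cx + dx, cy + dy, cz + dz), ()):
                            close = math.dist(points[i][:3], points[j][:3]) <= d_cc
                            if not seen[j] and close:
                                seen[j] = True
                                queue.append(j)
        clusters.append(members)
    return clusters


def is_below_ground(cluster: Sequence[Point],
                    ground_height: Callable[[float, float], float],
                    d_th: float) -> bool:
    """Requirement 1: the cluster centroid lies at least d_th below the ground."""
    n = len(cluster)
    cx = sum(p[0] for p in cluster) / n
    cy = sum(p[1] for p in cluster) / n
    cz = sum(p[2] for p in cluster) / n
    return ground_height(cx, cy) - cz >= d_th


def extract_candidates(hardscape: Sequence[Point], i_th: float, d_cc: float,
                       d_th: float,
                       ground_height: Callable[[float, float], float]
                       ) -> List[List[Point]]:
    """Intensity filter, clustering and Requirement 1 in sequence."""
    dark = [p for p in hardscape if p[3] <= i_th]
    return [c for c in cluster_points(dark, d_cc)
            if is_below_ground(c, ground_height, d_th)]
```

Any threshold values passed to this sketch are placeholders; the values actually used are those listed in Table 1.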
While this single requirement removes the majority of false positive clusters, two additional requirements are defined to further improve the results:
• Requirement 2: the cluster must contain at least Smin points and not more than Smax points
• Requirement 3: the X and Y dimensions of the cluster bounding box must be neither smaller than Bmin nor greater than Bmax
All remaining clusters that comply with these three initial requirements are considered to contain a storm drain cover. Our values for the different parameters are summarized in Table 1 and were determined based on visual analysis of some example storm drain clusters in the dataset.
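Requirements 2 and 3 amount to two range checks per cluster, which the following sketch makes explicit. It assumes an axis-aligned bounding box in X and Y and the same (x, y, z, intensity) tuple layout as an illustrative convention; the function name and the threshold arguments are hypothetical, and the actual values are those of Table 1.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float, float]  # x, y, z, intensity (assumed layout)


def meets_size_requirements(cluster: Sequence[Point],
                            s_min: int, s_max: int,
                            b_min: float, b_max: float) -> bool:
    """Check Requirement 2 (point count) and Requirement 3 (box extent)."""
    # Requirement 2: number of points within [s_min, s_max].
    if not s_min <= len(cluster) <= s_max:
        return False
    # Requirement 3: X and Y extent of the axis-aligned bounding box
    # within [b_min, b_max].
    xs = [p[0] for p in cluster]
    ys = [p[1] for p in cluster]
    extent_x = max(xs) - min(xs)
    extent_y = max(ys) - min(ys)
    return b_min <= extent_x <= b_max and b_min <= extent_y <= b_max
```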
Localisation of the storm drain:
For each remaining cluster, the storm drain centre and size (width and height) are determined by finding the minimal bounding rectangle around the cluster. However, not all cluster points are taken into account, as only the points close to the surface should be used to find the boundary. Only cluster points within the height range defined by Hsur ± Htol are considered as storm drain cover points. Hsur is the surface height as computed in Requirement 1 while Htol is the user-defined height tolerance. Figure 5(a) shows how only the surface points Psur of the storm drain cluster fall within the selection area. Using these points, the 2D minimal bounding rectangle with the smallest area is computed using a traditional convex hull approach. This approach makes use of the fact that the minimum bounding rectangle of a point set is the same as the minimum bounding rectangle of its convex hull, which simplifies and speeds up the computation (Toussaint, 1983). Additionally, a side of the minimum bounding rectangle must be collinear with a side of the convex hull. Using this information, the minimum bounding rectangle is computed as follows. First, the convex hull and its corresponding corners are determined using the surface points Psur. Second, for each edge of the convex hull, the bounding rectangle aligned with that edge and its area are computed. The minimum bounding rectangle of the storm drain cluster is the edge-aligned rectangle with the smallest area. Additionally, the cluster points will contain some noise points outside of the boundary of the storm drain, causing an overestimation of the dimension and location. Therefore, the optimal minimum bounding rectangle is found using RANSAC where for each iteration only a randomized subset of Psur is used to compute the convex hull and corresponding minimum bounding rectangle around the subset.
The bounding rectangle with the smallest area is considered to be the most accurate fit of the storm drain boundary, and the points within this rectangle are deemed to be inliers. Figure 6 shows an example of the different bounding rectangles of the surface points using RANSAC. The magenta rectangle indicates the best fitting boundary around the surface points while the blue rectangles show the less optimal results from RANSAC.
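The localisation step can be illustrated with a short standard-library sketch. It selects the surface points inside Hsur ± Htol, computes the convex hull with Andrew's monotone chain, tries one rectangle per hull edge, and repeats this on random subsets while keeping the smallest rectangle. The subset fraction, the iteration count and all function names are assumptions made for the example; the subset loop is a simplified reading of the RANSAC step described above, not the original implementation.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

XY = Tuple[float, float]
Rect = Tuple[float, XY, Tuple[float, float], float]  # area, centre, sides, angle


def select_surface_xy(cluster, h_sur: float, h_tol: float) -> List[XY]:
    """Keep the (x, y) of cluster points whose height lies in h_sur +/- h_tol."""
    return [(p[0], p[1]) for p in cluster if abs(p[2] - h_sur) <= h_tol]


def convex_hull(pts: Sequence[XY]) -> List[XY]:
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o: XY, a: XY, b: XY) -> float:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[XY] = []
    upper: List[XY] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_bounding_rectangle(pts: Sequence[XY]) -> Optional[Rect]:
    """Smallest-area rectangle with one side collinear with a hull edge."""
    hull = convex_hull(pts)
    best: Optional[Rect] = None
    for i in range(len(hull)):
        (x0, y0), (x1, y1) = hull[i], hull[(i + 1) % len(hull)]
        theta = math.atan2(y1 - y0, x1 - x0)
        c, s = math.cos(theta), math.sin(theta)
        # Hull coordinates in a frame whose first axis follows this edge.
        us = [x * c + y * s for x, y in hull]
        vs = [-x * s + y * c for x, y in hull]
        w, h = max(us) - min(us), max(vs) - min(vs)
        if best is None or w * h < best[0]:
            um, vm = (max(us) + min(us)) / 2, (max(vs) + min(vs)) / 2
            centre = (um * c - vm * s, um * s + vm * c)  # rotate back
            best = (w * h, centre, (w, h), theta)
    return best


def best_rectangle(surface_xy: Sequence[XY], subset_fraction: float,
                   iterations: int, rng: random.Random) -> Optional[Rect]:
    """Keep the smallest rectangle found over random subsets of the points."""
    pts = list(surface_xy)
    if not pts:
        return None
    k = min(len(pts), max(3, round(subset_fraction * len(pts))))
    best: Optional[Rect] = None
    for _ in range(iterations):
        rect = min_bounding_rectangle(rng.sample(pts, k))
        if rect is not None and (best is None or rect[0] < best[0]):
            best = rect
    return best
```

Because a subset can never need a larger rectangle than the full point set, the result of `best_rectangle` is at most as large as the rectangle around all surface points. Very small subset fractions would shrink the rectangle below the true storm drain boundary, so the fraction has to be chosen with care.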
While the majority of the false positive clusters are removed during the cluster extraction step, two additional requirements are defined using the localisation results in order to further improve the results:
• Requirement 4: Psur must contain at least Pmin points
• Requirement 5: the width/height ratio of the minimum bounding rectangle must be between Rmin and Rmax
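Requirements 4 and 5 can be expressed as a single check on the localisation output. The sketch below is illustrative only: the function name and argument names are hypothetical, and the thresholds must be taken from Table 1.

```python
def passes_localisation_checks(n_surface_points: int,
                               width: float, height: float,
                               p_min: int, r_min: float, r_max: float) -> bool:
    """Check Requirement 4 (surface point count) and Requirement 5 (ratio)."""
    # Requirement 4: enough surface points to support the rectangle fit.
    if n_surface_points < p_min:
        return False
    # A degenerate rectangle has no meaningful width/height ratio.
    if height <= 0:
        return False
    # Requirement 5: width/height ratio within [r_min, r_max].
    return r_min <= width / height <= r_max
```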
The different values for all the parameters are shown in Table 1.

RESULTS

Dataset
All data used for testing the proposed method was captured by a modified Lynx mobile mapper SG from Optech. This system is equipped with a dual lidar sensor setup from the Lynx M1 Mobile Mapper and is able to capture highly accurate point clouds at 500 kHz. A more detailed summary and analysis of this mobile mapping system is presented in (Mattheuwsen et al., 2019). The testing dataset is a 2.5 km residential area located north of the city of Ghent in Belgium. This neighbourhood of 400 m by 300 m was originally captured by 300 million points but was subsampled to around 80 million points at 2.5 cm as RandLA-Net does not make use of sub-centimetre resolution. All storm drains in the dataset were manually labelled, resulting in 171 storm drain locations including their dimensions (width and height).

Figure 6. The minimum bounding rectangle computed using RANSAC is shown in magenta, with the less optimal bounding rectangles in blue. Green points within the magenta rectangle are inliers while red points are the outliers.

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2022 XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France

Storm drain detection results
The detection performance of our proposed method is evaluated using recall, precision and F1-score. The individual influence of the different requirements is discussed and the final false positive and false negative detections are analysed.
The detection results of our proposed method on the testing dataset are shown in Figure 7 while the performance parameters are shown in Table 2. Overall, our approach achieves decent detection results and is able to find 140 of the 171 storm drains successfully with only 7 false positive (FP) detections. Without enforcing the five requirements for the hardscape clusters, detection performance would be poor with almost 1173 FP detections. By imposing requirement 1, more than 95% of the false positive detections are removed. The four remaining requirements seem to have less impact in removing FP detections although they succeed in removing the more difficult cases and fine tune the detection results. After applying all requirements, our method is able to achieve 81.9% recall, 95.2% precision and an F1-score of 88.1%.
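The reported scores follow directly from the detection counts: 140 true positives, 7 false positives and 171 − 140 = 31 false negatives. A minimal sketch of the computation, with an illustrative function name:

```python
from typing import Tuple


def detection_scores(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (recall, precision, F1) from detection counts."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return recall, precision, f1


recall, precision, f1 = detection_scores(tp=140, fp=7, fn=31)
print(f"recall={recall:.1%} precision={precision:.1%} F1={f1:.1%}")
# recall=81.9% precision=95.2% F1=88.1%
```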
Out of the 147 detections only 7 are false detections, mainly caused by low vegetation as shown in Figure 8(a-b). Small bushes low to the ground sometimes get falsely segmented as hardscape by RandLA-Net, and the different requirements were not able to filter out these false positive detections. In order to remove these false detections, it is possible to set the requirement parameters from Table 1 more strictly, although setting them too strictly will result in a lower recall rate. For example, increasing the distance threshold Dth to 10 cm would remove all but one FP detection but also 13 TP detections. A different solution is to look into additional requirements that only filter out the false positive detections as investigated in (de Vitry et al., 2018). Additionally, it turned out that during the manual labelling of the ground truth data, a storm drain was missed, but it was detected by our method. As Figure 8(c) shows, the storm drain is located in a small ditch on the side of the road. This shows the advantage of automatic detection over manual labelling, as we assumed that storm drains are generally located on the side of the road in the gutter instead of checking the whole dataset.
With a recall of 81.9%, 31 storm drains are undetected in the dataset. It is possible that true positive storm drain clusters were filtered out by requirements that are too strict. However, the results in Table 2 show that even without any requirements our approach is only able to detect 142 of the 171 storm drains (≈83% recall). This indicates that the main problem is the pre-trained RandLA-Net model segmenting the storm drains not as hardscape but as road or grass, as shown in Figure 9(a-b). Additionally, the undetected storm drains in Figure 7 appear to display a pattern in the dataset. The streets they are located in, indicated by the blue rectangle in Figure 7, turn out to be different from all the other streets as they are narrower one-way streets with the storm drains and gutter in the centre of the road instead of on the side of the road. Although initially this may not seem like a problem, a closer look at one of the storm drains in these streets reveals that they differ in appearance. As Figure 9(c-d) shows, the noise/reflection points under the storm drain in the centre of the road are less obvious compared to a storm drain captured on the side of the road. Possibly because of this difference, the RandLA-Net model segments the storm drain as road or vegetation, rather than hardscape. As the model trained on the Semantic3D dataset was not specifically trained to segment storm drains, a possible solution is to train a dedicated model for this task. As previously mentioned, creating a dataset with a separate class for storm drains would yield a massive class imbalance. Therefore, a viable solution is to include storm drains within another class such as the hardscape class of Semantic3D. Additionally, some of these problem areas could be predicted as the storm drain locations show a regular pattern within a street. A similar approach was proposed in (Bertsch et al., 2017) where it is assumed that storm drains are equally distanced from each other.
By checking several pattern rules, the user could be notified of possible undetected storm drains in a certain area or street so they can be manually mapped.
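One such pattern rule could be a simple spacing check along a street. The sketch below is purely hypothetical and was not part of the evaluated method: it assumes the detections of one street have been projected to a distance along the street axis (in metres) and flags gaps that exceed a chosen multiple of the median spacing. The function name and the default tolerance of 1.5 are illustrative choices.

```python
from statistics import median
from typing import List, Sequence, Tuple


def flag_spacing_gaps(chainages: Sequence[float],
                      tolerance: float = 1.5) -> List[Tuple[float, float]]:
    """Return (start, end) pairs of consecutive detections that lie further
    apart than tolerance times the median spacing of the street."""
    ordered = sorted(chainages)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        return []  # too few detections to estimate a typical spacing
    typical = median(gaps)
    return [(a, b) for a, b in zip(ordered, ordered[1:])
            if b - a > tolerance * typical]
```

Each flagged interval would then be reported to the user as a stretch of road where a storm drain may have been missed.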

Storm drain localisation results
Finally, the localisation accuracy of our proposed method is investigated by comparing the predicted location and dimensions with the ground truth data. As mentioned in the methodology section, the surface points Psur may contain some outliers which influence the localisation method negatively. For that reason, RANSAC is utilised where only a randomized subset of the surface points is taken into account to determine the minimum bounding rectangle during each iteration. In order to assess this approach, the localisation was performed several times with a varying subset size between 50% and 100% of the surface points. The 2D localisation results are analysed in terms of mean error, root mean square error (RMSE) and 95% confidence interval (= 2 × RMSE) and are shown in Table 3. As these results show, our approach is able to determine the centre location of a storm drain with an RMSE of 5.5 cm. In general, the localisation accuracy improves when using a smaller subset of points, which confirms that the surface points do contain some outliers. By removing these outliers, the 95% confidence interval improves significantly from 16.1 cm to 13.4 cm for 100% and 60% of the surface points respectively. Additionally, the predicted width and height of the localisation results with the 50% subset are compared with the ground truth dimensions. Our approach achieves an absolute mean error of 11.3 cm and 7.0 cm for the width and height respectively. Taking into account the average storm drain dimensions of 50 by 30 cm, our method makes an error of about 23% in both width and height (11.3/50 ≈ 0.23 and 7.0/30 ≈ 0.23), which is significant. Additionally, the height of the bounding rectangle is estimated too big and too small about equally often, while the width is mainly overestimated.
This could mean that the remaining surface points Psur do not always describe the true boundary of the storm drain. In future work, a different localisation method could be investigated such as the marked point approach proposed in (Yu et al., 2014).
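For reference, the localisation statistics used in this section can be computed as sketched below. The function name and the representation of centres as (x, y) pairs are illustrative assumptions; the 95% confidence interval follows the 2 × RMSE convention stated above.

```python
import math
from typing import Sequence, Tuple

XY = Tuple[float, float]


def localisation_stats(predicted: Sequence[XY],
                       truth: Sequence[XY]) -> Tuple[float, float, float]:
    """Return (mean error, RMSE, 95% interval = 2 x RMSE) of the 2D centre error."""
    errors = [math.dist(p, t) for p, t in zip(predicted, truth)]
    mean_error = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mean_error, rmse, 2 * rmse


# Relative dimension errors from the reported absolute mean errors (in cm),
# using the average storm drain dimensions of 50 by 30 cm.
width_relative_error = 11.3 / 50.0   # about 0.23
height_relative_error = 7.0 / 30.0   # about 0.23
```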

CONCLUSION
In this paper we presented a fully automatic storm drain detection method to extract and locate storm drain inlets in mobile mapping lidar data. The point cloud is segmented by a RandLA-Net segmentation model which was pre-trained on the Semantic3D dataset. This pre-trained model is able to segment the point cloud into different classes, one of which is the hardscape class containing general man-made objects. Although storm drains are not specifically included within any class, the model segments them as hardscape even though it was never trained to do so. The hardscape class is further processed to extract and locate only the storm drains in the dataset by enforcing different requirements which filter out the false positive detections.
The detection and localisation performance are evaluated on a large testing dataset containing 171 storm drains. Our approach is able to achieve 81.9% recall and 95.2% precision. The majority of the false positive detections are due to incorrect segmentation of the point cloud and could not be filtered out by the requirements as this would reduce the recall rate significantly. Additionally, the 31 undetected storm drains are mainly due to incorrect segmentation by the RandLA-Net model, which could be solved by transfer learning a new RandLA-Net model that is specifically trained to segment storm drains within the hardscape class. Alternatively, several storm drain pattern rules could warn the user of possible undetected storm drains. Finally, the proposed method is able to localise the storm drain centre with an RMSE of 5.5 cm, while the dimensions of the bounding rectangle showed a 23% error compared to the ground truth dimensions. It is mainly the dimension prediction of the bounding rectangle that could benefit from an improved approach.