CLOUD-BASED SOLUTION FOR NATIONWIDE POWER LINE MAPPING

Automatic tools for power line mapping and monitoring are increasingly required by modern societies. Since traditional methods, like ground-based onsite inspections, are very labour- and time-intensive, the use of Geomatics techniques is becoming the most promising solution. However, there is a need for an all-in-one solution that supports the entire 3D mapping pipeline in a nationwide data context. The aim of this paper is to introduce a novel cloud-based solution for nationwide power line mapping. The innovative aspects of the system are threefold. First, it exploits image-based 3D reconstruction algorithms to derive dense point clouds over power line corridors, thus demonstrating the potential of photogrammetry as a promising alternative to costly LiDAR surveys. Second, it supplies an all-in-one web-based pipeline that automatically manages all steps of power line mapping, from 3D data generation to clearance anomaly detection. Finally, it exploits cloud-computing technology to handle massive input data. First tests show promising results for (i) 3D image-based reconstruction, (ii) point cloud classification and (iii) anomaly detection.

Figure 1. Examples of image-based 3D reconstruction of power line corridors. Starting from helicopter-based imagery (a, c), dense point clouds of cables, transmission towers, pylons and surrounding environment can be generated (b, d).


INTRODUCTION
The development of automatic solutions for mapping transmission and distribution electricity grids (hereinafter, power lines) is increasingly required by energy companies, as modern society needs a reliable and continuous supply and distribution of electric power. The tremendous impact of power outages on people and businesses is clearly demonstrated by two well-known blackout events, which left 95% of Italy and more than 10 million people in Europe without power in 2003 and 2006, respectively. These events, among others, showed how the safety of power line corridors, including both infrastructure components (cables, towers, insulators, switches, etc.) and surrounding key objects (terrain, buildings, trees, etc.), plays a vital role in present-day society. In particular, power line monitoring involves two main aspects: the detection of potential hazards and the analysis of power line structural stability. The former is critical and relies on clearance anomaly detection, which checks whether the distance between power line and non-power line objects is within the safety range. Since traditional methods, like ground-based onsite inspections by foot patrol, are very labour- and time-consuming, the use of Geomatics platforms, sensors and techniques is becoming the most promising solution. The wide overviews given by Mu et al. (2009), Mirallès et al. (2014) and Matikainen et al. (2016) clearly describe the efforts of the Geomatics community in providing advanced mapping solutions for power line corridors. Although integrated solutions have also been proposed (Kremer, 2011), airborne LiDAR, especially from helicopter-based platforms, appears to be the most widely adopted technology for power line monitoring (Zhu and Hyyppä, 2014; Guo et al., 2016a; Chen et al., 2018). Indeed, it provides a fast method of data collection and classification, with a high degree of automation and accurate height information.
However, airborne LiDAR surveys are still an expensive data collection technique. On the other hand, advances in the radiometric quality of images, as well as in photogrammetry and computer vision, particularly those related to the development of innovative DIM (dense image matching) algorithms (Haala and Rothermel, 2012; Remondino et al., 2014), have increased automation in the image-based 3D reconstruction of scenes, with the goal of generating high spatial resolution 3D point clouds. If suitable redundancy and a good geometric configuration of image rays are available, photogrammetric point clouds can today feature a spatial resolution equal to the GSD (ground sample distance) of the original imagery, and a vertical accuracy below the GSD level. Despite this, only a few attempts have been made so far to reconstruct 3D point clouds of power lines from multiple images, mainly acquired from UAV-based platforms (Jozkow et al., 2015; Jiang et al., 2017; Zhang et al., 2017), while, generally, airborne imagery has been exploited only to extract the 2D position of power line components (Oberweger et al., 2014).

Power line 3D mapping
Starting from a LiDAR or photogrammetric point cloud of the power line corridor, the 3D mapping task generally involves three main steps, namely (i) point cloud classification, to extract power line points, (ii) point cloud modelling, to reconstruct the geometry of single power line components and (iii) clearance anomaly detection, to identify potential interference issues. The most recent studies exploit machine learning methods and a large number of features to predict power line and non-power line class labels, e.g. by applying a Random Forest classifier (Kim and Sohn, 2013) or a JointBoost classifier (Guo et al., 2015). Once points are semantically interpreted, the modelling process aims to reconstruct single power line elements in 3D with a data-driven or a model-driven approach. Generally, continuous mathematical models are fitted to the cable points to reconstruct power line spans. Either catenary curve fitting (Sohn et al., 2012; Jozkow et al., 2015) or other parametric models (Ritter and Benger, 2012; Guo et al., 2016a) are used to create the final model. Few studies focus on the automatic reconstruction of power pylons, adopting a data-driven (Han, 2012), a model-driven (Guo et al., 2016b) or a hybrid (Zhou et al., 2017) approach. Finally, distances between the infrastructure and surrounding objects can be measured to evaluate the clearance hazard. For instance, in Chen et al. (2018) the clearance measurements are solved piecewise based on differential geometry: the spots where the minimum distance is lower than the safe threshold are considered as anomalies.

Scalability and Cloud processing
Many research projects have developed automatic algorithms to accomplish the single steps of power line 3D mapping. However, an all-in-one solution that addresses the entire 3D mapping pipeline, including the final web visualization and access of mapping results, is still missing. Furthermore, if nationwide scalability is required, it is crucial to devise a solution that can efficiently process a massive amount of data. So far, some image-based 3D reconstruction services (Vergauwen and Gool, 2006; Tefera et al., 2018) and point cloud processing frameworks (Liu and Boehm, 2015) running in the Cloud have been developed. Nevertheless, a solution specifically designed for processing big geospatial data for power line mapping is still missing. Finally, the potential of power line 3D mapping via photogrammetric techniques is still underexploited, despite its higher cost-effectiveness compared to LiDAR.

Paper objectives
This paper reports a step forward in power line mapping, by introducing a novel cloud-based processing solution for nationwide applications. This solution combines state-of-the-art methods embedded in a web-based platform, designed to:
• automatically perform the entire photogrammetric 3D reconstruction pipeline, from images to dense point clouds (Figure 1);
• automatically classify point clouds and detect clearance anomalies (either from photogrammetric or LiDAR data, if existing LiDAR surveys are available);
• visualize 3D results and 2D ancillary data (maps, anomalies, images, etc.) in a web viewer;
• manage new and existing spatial and non-spatial data, within a unique responsive web-based environment;
• update existing power line maps.
In the following sections, the processing workflow (Section 2), the platform infrastructure and functionality (Section 3) and exemplary results (Section 4), will be described and discussed.

METHODOLOGY
The general workflow of data processing is summarized in Figure 2 and explained in the next sections.

Flight planning and image acquisition
Power lines feature various types of wires, depending on the transported tension (or voltage): low tension (LT), middle tension (MT) and high tension (HT) lines. According to the tension, wires (cables) have a diameter from some mm to some cm. There are three main issues that should be considered when planning a photogrammetric survey of a power line corridor: 1) image scale should be large enough to represent the cable structure with enough pixel information; 2) image overlap should be large in order to increase the redundancy of image rays, thus enabling a reliable 3D reconstruction of wires; 3) given the elongated shape of transmission lines, the flight efficiency should be maximized via a single-line image network. Aircraft, helicopters and unmanned aerial vehicles (UAVs) have been used for power line mapping tasks. Each of them features specific advantages: aircraft are generally used for HT lines, while helicopters are more suitable for LT and MT lines. Indeed, the benefits of helicopters are twofold: compared to aircraft, they are able to fly closer to the power lines, thus achieving sub-centimetre GSD, and can follow a line that has sharp turns; compared to UAVs, they can cover larger areas more efficiently. In the case of an LT line, with a helicopter platform mounting a dual-camera system with oblique backward and forward views (4864 x 3248 px), image acquisitions are planned according to the following rules:
• few mm mean GSD on the ground (ca. 4 mm);
• at least 75% image overlap;
• single-line network, which follows the corridor shape and is partially misaligned with respect to the power line, to avoid self-occluded areas of the infrastructure elements;
• good coverage of the corridor (ca. 30 m on each side of the transmission line), to detect any potential interference between power line and non-power line objects.
In order to georeference the image network and avoid field surveying measurements as much as possible, accurate navigation data (GNSS/IMU observations) are collected and post-processed. A typical image network geometry is displayed in Figure 3, together with an example of the acquired images showing the high level of detail over the transmission line. A typical helicopter-based image network consists of some 1000 images (incl. both forward- and backward-looking views) acquired on a strip of approximately 2 km.
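The planning rules above imply a simple relation between GSD, overlap and exposure spacing. The following is a minimal sketch of that arithmetic, assuming a flat scene, a uniform GSD over the whole frame and the 3248 px side oriented along the flight direction. None of these assumptions is stated in the text, and the oblique viewing geometry of the real camera system is ignored, so the numbers are indicative only. All function names are hypothetical.

```python
import math


def footprint_m(pixels: int, gsd_m: float) -> float:
    """Ground length covered by one image side (flat scene, uniform GSD assumed)."""
    return pixels * gsd_m


def exposure_spacing_m(along_track_px: int, gsd_m: float, overlap: float) -> float:
    """Distance between consecutive exposures that yields the requested overlap."""
    return footprint_m(along_track_px, gsd_m) * (1.0 - overlap)


def exposures_per_strip(strip_length_m: float, spacing_m: float) -> int:
    """Number of exposures per camera needed to cover a single-line strip."""
    return math.ceil(strip_length_m / spacing_m) + 1


# Values quoted in the text: ca. 4 mm GSD, at least 75% overlap, ca. 2 km strip.
# Assumption (not from the text): the 3248 px side points along the flight line.
GSD_M = 0.004
spacing = exposure_spacing_m(3248, GSD_M, overlap=0.75)
n_per_camera = exposures_per_strip(2000.0, spacing)
print(f"exposure spacing: {spacing:.2f} m, exposures per camera: {n_per_camera}")
```

Under these simplifications the image count per strip has the same order of magnitude as the "some 1000 images" reported for a 2 km strip with two cameras.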

Image processing and 3D reconstruction
The photogrammetric 3D reconstruction problem is addressed in three steps, namely 2D feature-based matching, bundle block adjustment (BBA) and dense image matching (DIM). First, image correspondences are identified across the different views at the original image resolution, by adopting a feature-based method with the SIFT operator (Lowe, 2004). Second, image orientation parameters are estimated within a free-network BBA. To increase the precision of the triangulated 3D points, a threshold on the minimum intersection angle between image rays is set (10 deg.). To solve the scale and datum ambiguities, the free-network bundle adjustment is followed by a similarity transformation, using the post-processed on-board navigation observations as (mandatory) input. The adoption of field-surveyed GCPs (ground control points) as reference data requires (i) costly and labour-intensive campaigns, especially in the case of long and hard-to-access power line corridors, and (ii) a time-consuming procedure of point marking, which strongly depends on the user's expertise. Therefore, a georeferencing approach based on on-board GNSS/IMU data is preferred here, and was confirmed to give an accuracy level that meets the requirements of the present application (few decimetres). Finally, a dense 3D reconstruction via a pixel-based image matching algorithm is carried out. This is performed using the first-level image pyramid and a 5-pixel size for the NCC (normalized cross correlation) window. An image block of some 1000 images normally produces a 3D point cloud of approximately 200,000,000 points. The entire image processing workflow is based on the open source pipeline COLMAP, with processing parameters customized ad hoc to find the best compromise between efficiency (due to the massive size of input data) and accuracy/completeness of power line reconstruction.
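The georeferencing step, i.e. mapping the free-network solution onto the GNSS-derived camera positions, can be illustrated with a closed-form least-squares similarity transformation. The sketch below uses the well-known Umeyama (1991) solution in NumPy. It is an illustrative stand-in, not the routine used inside the described pipeline, and the function names and synthetic camera centres are hypothetical.

```python
import numpy as np


def estimate_similarity(src: np.ndarray, dst: np.ndarray):
    """Least-squares scale s, rotation R, translation t with dst ~ s * R @ src + t."""
    n = src.shape[0]
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    cov = dst_c.T @ src_c / n                    # 3x3 cross-covariance
    U, S, Vt = np.linalg.svd(cov)
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0                           # avoid returning a reflection
    R = U @ D @ Vt
    var_src = (src_c ** 2).sum() / n
    s = float((S * np.diag(D)).sum() / var_src)
    t = mu_dst - s * R @ mu_src
    return s, R, t


def apply_similarity(points: np.ndarray, s: float, R: np.ndarray, t: np.ndarray):
    """Transform an (n, 3) array of points into the target datum."""
    return s * points @ R.T + t


# Synthetic example: camera centres in an arbitrary free-network frame and
# their (hypothetical) GNSS positions, related by a known transformation.
rng = np.random.default_rng(0)
centres_free = rng.uniform(-1.0, 1.0, size=(20, 3))
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle), np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
centres_gnss = 2.5 * centres_free @ R_true.T + np.array([100.0, 200.0, 50.0])

s, R, t = estimate_similarity(centres_free, centres_gnss)
residuals = np.linalg.norm(apply_similarity(centres_free, s, R, t) - centres_gnss, axis=1)
print(f"scale: {s:.3f}, max residual: {residuals.max():.2e} m")
```

With real navigation data the residuals would reflect the GNSS/IMU accuracy rather than vanish, and a robust estimator may be preferable when some observations are outliers.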

Point cloud classification
Once the photogrammetric point clouds have been generated (or an external LiDAR point cloud has been imported, in case an existing LiDAR dataset is used), the classification phase starts, in order to semantically interpret the 3D points. The aim is to extract the following classes of interest: pylons, transmission towers, cables, vegetation, buildings, water, road and ground. The selection of these classes follows the national legislation on clearance anomaly detection, which sets specific clearance thresholds for these different power line / non-power line objects. To achieve this, we have adopted a classification approach following Weinmann et al. (2014). It is formulated as a supervised learning problem and executed in three steps:
• feature computation: the selection of point features plays an essential role in machine learning problems, as it can strongly enhance the algorithm performance in terms of both speed and accuracy. Five geometric features are experimentally used here as relevant and suitable measures to characterize our point clouds: distance to plane, eigenvalues of the neighbourhood, elevation, local vertical dispersion and verticality. Additionally, features based on HSV (hue, saturation and value) colorimetric content and on the number of returns are specifically exploited for photogrammetric and LiDAR point clouds, respectively. The training dataset of correct labels was manually annotated on existing power line point clouds. In particular, the ratio between the numbers of points assigned to each class was adjusted to avoid generating a biased learning model.
• model training: starting from the point cloud (previously shifted to a local coordinate system, to avoid working with cartographic coordinates), with the computed 3D features and the correct labels, a classifier is trained using Random Forest (Breiman, 2001). This learning method was experimentally selected based on its efficiency and prediction accuracy. The literature shows that, among machine learning techniques, Random Forests are an excellent tool to learn feature representations, given their robust classification power and easily interpretable learning mechanism (Belgiu and Dragut, 2016).
• prediction: once the classifier is generated, the prediction process can be performed on the input point cloud, by traversing the tree structure with the feature information.
The adopted point cloud classification method is based on the supervised approach implemented in the Computational Geometry Algorithms Library (Giraudot and Lafarge, 2018) and the Random Forest Template Library (ETHZ Random Forest, 2018).
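The three steps above (feature computation, model training, prediction) can be sketched in Python as follows. This is only an illustration under several assumptions: scikit-learn's `RandomForestClassifier` stands in for the CGAL and ETHZ Random Forest implementations used by the platform, the feature definitions (e.g. verticality as one minus the absolute vertical component of the local normal) are common choices rather than the exact ones adopted here, `class_weight="balanced"` replaces the manual balancing of the training set, and the two-class point cloud is synthetic.

```python
import numpy as np
from scipy.spatial import cKDTree
from sklearn.ensemble import RandomForestClassifier


def geometric_features(points: np.ndarray, k: int = 10) -> np.ndarray:
    """Per-point features computed from the k nearest neighbours.

    Columns: distance to local plane, three eigenvalues (descending),
    elevation, local vertical dispersion, verticality.
    """
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k)
    feats = np.empty((len(points), 7))
    for i, nbrs in enumerate(idx):
        nb = points[nbrs]
        centroid = nb.mean(axis=0)
        centred = nb - centroid
        evals, evecs = np.linalg.eigh(centred.T @ centred / k)  # ascending order
        normal = evecs[:, 0]                  # direction of least variance
        feats[i, 0] = abs((points[i] - centroid) @ normal)
        feats[i, 1:4] = evals[::-1]
        feats[i, 4] = points[i, 2]
        feats[i, 5] = nb[:, 2].std()
        feats[i, 6] = 1.0 - abs(normal[2])
    return feats


# Synthetic scene: a flat "ground" patch (label 0) and one "cable" (label 1).
rng = np.random.default_rng(42)
ground = np.column_stack([rng.uniform(0, 20, 400),
                          rng.uniform(0, 20, 400),
                          rng.normal(0.0, 0.02, 400)])
cable = np.column_stack([np.linspace(0, 20, 100),
                         np.full(100, 10.0),
                         8.0 + rng.normal(0.0, 0.02, 100)])
points = np.vstack([ground, cable])
labels = np.array([0] * 400 + [1] * 100)

# Shift to a local origin before computing features (see "model training").
points_local = points - points.min(axis=0)
X = geometric_features(points_local)

train = rng.random(len(points)) < 0.7
clf = RandomForestClassifier(n_estimators=50, class_weight="balanced", random_state=0)
clf.fit(X[train], labels[train])
accuracy = clf.score(X[~train], labels[~train])
print(f"hold-out accuracy on the synthetic scene: {accuracy:.2f}")
```

The synthetic classes are trivially separable by elevation, so the score says nothing about real corridors. The example only shows how features, labels and classifier fit together.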

Anomaly detection and vectorization
Starting from the classified point cloud, the clearance anomaly detection computes the distances between the power line objects (pylons, transmission towers and cables) and the non-power line objects (buildings, vegetation, water, road and ground), to detect the spots where the safe clearance thresholds are not respected.
Considering the scalability of the platform under development and the forthcoming massive amount of data (ca. 3 GB per km), it is inefficient to estimate an analytical solution of the point-to-point distance. Therefore, the adopted anomaly detection step is formulated as a nearest neighbour search problem based on k-d trees, to iteratively compute the closest points. Indeed, k-d trees are efficient space-partitioning data structures, derived as a generalization of binary search trees. In particular, the root represents the whole point cloud, whereas the leaves (also called buckets) provide a completely disjoint partition of the points.
To generate a balanced k-d tree structure, thus ensuring that every k-d tree entry has the same probability, a sliding midpoint rule is applied, i.e. the axis and splitting point defined at each node are selected in such a way as to avoid long and thin cells.
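As a minimal sketch of this nearest neighbour formulation, the snippet below builds a SciPy k-d tree on the power line points and flags every non-power line point that lies closer than a class-specific clearance distance. The threshold values, class names and function name are hypothetical placeholders, not the legal values used by the platform. In SciPy, `balanced_tree=False` requests midpoint-based splits instead of median splits, which is assumed here to correspond to the sliding midpoint rule described above.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical clearance thresholds in metres (placeholders, not legal values).
CLEARANCE_M = {"vegetation": 3.0, "building": 5.0}


def clearance_anomalies(powerline_pts, other_pts, other_labels, thresholds):
    """Boolean mask of non-power line points violating their class threshold."""
    # Midpoint-based splitting instead of SciPy's default median split.
    tree = cKDTree(powerline_pts, balanced_tree=False)
    flagged = np.zeros(len(other_pts), dtype=bool)
    for cls, limit in thresholds.items():
        mask = other_labels == cls
        if not mask.any():
            continue
        # Neighbours farther than `limit` are reported with infinite distance.
        dist, _ = tree.query(other_pts[mask], k=1, distance_upper_bound=limit)
        flagged[mask] = np.isfinite(dist)
    return flagged


# Toy scene: a straight cable 10 m above ground and four surrounding points.
cable = np.column_stack([np.linspace(0.0, 10.0, 11),
                         np.zeros(11),
                         np.full(11, 10.0)])
others = np.array([[5.0, 0.0, 8.0],    # vegetation, 2 m below the cable
                   [5.0, 0.0, 0.0],    # vegetation, 10 m below the cable
                   [5.0, 4.0, 10.0],   # building, 4 m beside the cable
                   [5.0, 0.0, 4.0]])   # building, 6 m below the cable
other_labels = np.array(["vegetation", "vegetation", "building", "building"])

flagged = clearance_anomalies(cable, others, other_labels, CLEARANCE_M)
print(flagged)
```

Building the tree on the (usually much smaller) set of power line points and querying the surrounding points keeps memory use modest, which matters for corridor-scale data.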
Once the k-d trees are constructed, they can be recursively queried for the closest neighbours of any given point. Therefore, after introducing the clearance threshold specifically defined for each class, the algorithm only returns those non-power line points that are closer than this distance to any power line point. The k-d tree algorithm available in SciPy (Jones et al., 2001) is adopted here. In order to efficiently manage a large amount of data, the NumPy array structure (Van Der Walt et al., 2011) is furthermore exploited.

Finally, once classes of interest are labelled and (potential) anomalies are detected, the 2D position of power line elements and anomaly spots should be identified on the map. This task is accomplished in two steps, i.e. data clustering and 2D vectorization. First, a density-based spatial clustering method is applied to segment each single pylon (or transmission tower), anomaly spot and power line span. For this, we used the DBSCAN algorithm (Ester et al., 1996) available in scikit-learn (Pedregosa et al., 2011). At this stage, false anomalies (generated due to, e.g., errors in the classification results, see Section 4) are detected and eliminated by introducing a threshold on the minimum number of points a cluster should include to be accepted as an anomaly spot. Second, the vectorization step is performed: the 2D positions of pylons (or transmission towers) and anomalies are identified by the barycentre of their clusters, while power lines are modelled as linear segments connecting the points of each span cluster.
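The clustering, density check and barycentre computation can be sketched with scikit-learn's `DBSCAN` as follows. The parameter values (`eps`, `min_samples`, `min_cluster_size`) and the function name are hypothetical and would need tuning to the point density of real data. The toy input is a deterministic grid, chosen so that the outcome does not depend on random sampling.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def anomaly_spots(anomaly_pts, eps=0.3, min_samples=3, min_cluster_size=10):
    """2D barycentres of the clusters that pass the minimum-size check."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(anomaly_pts)
    spots = []
    for cluster_id in np.unique(labels):
        if cluster_id == -1:                  # DBSCAN marks noise with -1
            continue
        members = anomaly_pts[labels == cluster_id]
        if len(members) < min_cluster_size:   # likely a false anomaly
            continue
        spots.append(members[:, :2].mean(axis=0))
    return np.array(spots).reshape(-1, 2)


# Toy input: one dense patch of 50 flagged points and one sparse group of 3.
gx, gy = np.meshgrid(np.arange(5) * 0.2, np.arange(10) * 0.2)
dense = np.column_stack([gx.ravel(), gy.ravel(), np.full(gx.size, 3.0)])
sparse = np.array([[20.00, 20.00, 3.0],
                   [20.05, 20.00, 3.0],
                   [20.00, 20.05, 3.0]])
flagged_pts = np.vstack([dense, sparse])

spots = anomaly_spots(flagged_pts)
print(spots)
```

The sparse group still forms a DBSCAN cluster, but it is rejected by the size threshold, which mirrors the a-posteriori density check described above.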

INFRASTRUCTURE
The infrastructure of the web-based platform is summarized in Figure 4. It adopts the AWS (Amazon Web Services) Cloud technology, to parallelize the processing workflows and to be scalable (AWS, 2018). The platform is developed with the IaaS (infrastructure as a service) paradigm, using different types of instances and services. In particular, the technological setup adopts the AWS Simple Storage Service (S3) for storage, MongoDB on an m4.large instance (2 CPU) as a NoSQL DB, and an m4.large instance for the 2 HTTP servers and 2 Java application servers. Finally, an AWS g3.4xlarge instance with 8 GPU, 32 CPU is used for the GPU server. In total, the cost of the service amounts to 1.9 USD per hour. The system consists of a fully automated process and includes two main components: the manager and the rendering applications.

Manager application
The manager application (WebApp Manager) manages and runs the automated data processing steps. These depend on the input data, either imagery or LiDAR data. In the first case, images are uploaded to AWS S3 and processed as reported in Section 2.2, followed by point cloud classification and anomaly detection. If LiDAR point clouds are uploaded to AWS S3, the semantic interpretation starts directly. At the end of the classification (Section 2.3), semantic point clouds and derived vector representations of power line elements and anomalies are exported and passed to the visualization framework (rendering application). The back end of the WebApp Manager is multi-GPU and implemented in Java. Since multiple users can run different processing steps simultaneously, a FIFO (first in first out) multi-queue scheduling strategy is implemented, to handle concurrent reconstruction processes. In particular, resources are evenly distributed among the users, based on the number of available CPU cores and GPUs on the server. The front end of the WebApp Manager provides the user with a GUI to visualize all the 2D/3D datasets uploaded to the system and ready to be processed, together with the status of the workflow steps. When a process ends, results can be displayed in the rendering application.
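The scheduling idea, i.e. one FIFO queue per user with resources shared evenly among users, can be illustrated with a small sketch. The actual back end is written in Java and its internals are not described in the text, so the class below is a hypothetical Python illustration: it keeps one queue per user and fills free processing slots by taking at most one job per user per round. Class, method and job names are invented for the example.

```python
from collections import deque


class MultiQueueScheduler:
    """One FIFO queue per user, served round-robin over a fixed number of slots."""

    def __init__(self, n_slots: int):
        self.n_slots = n_slots      # e.g. number of GPUs available for processing
        self.queues = {}            # user -> deque of pending job names
        self.running = []           # (user, job) pairs currently being processed

    def submit(self, user: str, job: str) -> None:
        """Append a job to the submitting user's own FIFO queue."""
        self.queues.setdefault(user, deque()).append(job)

    def dispatch(self):
        """Fill free slots, taking at most one job per user in each round."""
        started = []
        while len(self.running) < self.n_slots and any(self.queues.values()):
            for user in list(self.queues):
                if len(self.running) >= self.n_slots:
                    break
                queue = self.queues[user]
                if queue:
                    job = queue.popleft()
                    self.running.append((user, job))
                    started.append((user, job))
        return started

    def finish(self, user: str, job: str) -> None:
        """Release the slot held by a completed job."""
        self.running.remove((user, job))


scheduler = MultiQueueScheduler(n_slots=2)
for name in ("recon-1", "recon-2", "recon-3"):
    scheduler.submit("alice", name)
scheduler.submit("bob", "classify-1")

first_round = scheduler.dispatch()
scheduler.finish("alice", "recon-1")
second_round = scheduler.dispatch()
print(first_round, second_round)
```

Although the first user submitted three jobs before the second user submitted one, both users get a slot in the first round, which is the even distribution the strategy aims for.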

Rendering application
The rendering application (WebApp Rendering) manages the interactive visualization of all data ingested and generated by the processing workflow. The user interface includes the following components:
• search box, which provides the user with a search tool to query the power line database by code or name. Additionally, it displays anomalies, if any, detected in the selected 3D power line;
• 2D map navigator, based on OpenStreetMap, which displays the mapped positions of pylons, transmission towers, power line spans and anomalies;
• 3D point cloud navigator, embedded in the landing page, which allows the user to interactively manipulate the point clouds in a 3D environment. It is based on Potree, an open source WebGL-based point cloud renderer, able to handle large point datasets (Schütz, 2016). Within the navigator, the user can visualize the point clouds, with all the semantic contents added by the process (e.g. labels and anomalies), and extract additional information (profiles, distances, etc.);
• photo slider, which shows the images used as input for the photogrammetric process. In particular, by selecting a point in the 3D navigator, the image where that point is visible in the centremost position is automatically displayed.
These windows are integrated into the same page (Figure 5) and linked to each other, to interactively display the data of interest.

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W13, 2019. ISPRS Geospatial Week 2019, 10-14 June 2019, Enschede, The Netherlands.

RESULTS
To demonstrate the performance of the processing workflow, and critically discuss its open issues, two examples are discussed hereafter and referred to as MT 1 and MT 2. They are subsets of two helicopter-based photogrammetric surveys, performed over MT transmission lines (cable diameter below 1 cm). In both cases, the flight height and on-board cameras were selected to achieve a sub-cm GSD on the ground.

3D reconstruction and classification
The results of the image-based 3D reconstruction pipeline are shown in Figure 6, with two detailed views of the generated RGB point clouds (2,041,851 total points reconstructed in MT 1, 801,357 in MT 2). Although small gaps are present on the transmission lines, due to the small size and poor texture of the cables, the amount of detail reconstructed by the algorithm is sufficient to clearly identify the power line elements. Indeed, the shapes of both pylons and cable spans are continuously represented by a good number of points, whose distribution is fairly even over the entire elements. The subsequent classification step returns promising labelled results (Figure 7). A first qualitative evaluation shows that the majority of points are correctly labelled, thus demonstrating that the selected features have good potential in characterizing both power line and non-power line elements. However, a few errors are visible, such as:
• off-the-ground elements on the road (e.g. guard rails) are classified as cables, due to their elongated shape and colour;
• misclassifications between power line elements and vegetation, e.g. tree trunks classified as pylons, or pylon heads labelled as vegetation, due to their geometric similarity; furthermore, small portions of cables are interpreted as vegetation, if their reconstruction is noisy;
• misclassifications between ground and roads, e.g. shaded parts of road labelled as ground, due to the DIM noise in such textureless areas, or some small unpaved roads wrongly identified as ground, given their irregular surface.
These remarks are confirmed by a quantitative assessment, performed by comparing the classification results against the manually labelled ground truths. The recall (R) and precision (P) values (with their corresponding F1-score) are provided for each class in Table 1, together with the overall accuracy (OA) of the classifier on these two datasets.
The classes "water" and "transmission tower" are missing in both datasets, therefore they are not considered in the following.

Table 1. Overall accuracy (OA, in %), recall (R, in %), precision (P, in %) and F1-score values achieved in the two tests (the symbol "-" indicates that the class "Building" is missing in MT 1). Values in italic indicate the main issues, to be addressed in the future.

While the classification results show a high level of completeness and quality for ground and vegetation points, the performance of the classifier is poorer when predicting man-made objects. In particular, the multi-class confusion matrix in Table 2 (corresponding to MT 2) clearly points out the main mislabelled cases. The latter, as discussed above, mostly arise from misclassification issues between ground and road, vegetation and ground, cable and vegetation, and pylon and vegetation. These classification errors can cause false positives or false negatives to be generated during the anomaly detection step, or force the k-d tree search to apply a wrong clearance threshold, since ground and road feature distinct safety ranges.
[Table 2: multi-class confusion matrix for MT 2; axis: ground truth labels.]
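The quantities reported in Tables 1 and 2 are related by standard definitions, which the following sketch makes explicit. It assumes a confusion matrix with ground truth classes in rows and predicted classes in columns. This orientation is an assumption, and the small matrix is a made-up example, not data from the tests.

```python
import numpy as np


def classification_metrics(confusion):
    """Per-class recall, precision, F1 and overall accuracy.

    Assumed convention: rows are ground truth classes, columns are predictions.
    Classes with no samples or no predictions would need special handling.
    """
    confusion = np.asarray(confusion, dtype=float)
    true_pos = np.diag(confusion)
    recall = true_pos / confusion.sum(axis=1)       # TP / all ground truth points
    precision = true_pos / confusion.sum(axis=0)    # TP / all predicted points
    f1 = 2.0 * precision * recall / (precision + recall)
    overall_accuracy = true_pos.sum() / confusion.sum()
    return recall, precision, f1, overall_accuracy


# Made-up two-class example (e.g. "cable" vs "vegetation"), for illustration only.
toy_confusion = [[8, 2],
                 [1, 9]]
recall, precision, f1, oa = classification_metrics(toy_confusion)
print(recall, precision, f1, oa)
```

A low recall with a high precision for a class such as cables would point to missed cable points (potential false negatives in the anomaly detection), whereas the opposite pattern would point to false alarms.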

Anomaly detection
Starting from the labelled results, the anomaly detection step gives fairly promising results. Figure 8 shows the anomalies detected by the k-d tree nearest neighbour search, distinguishing between correctly detected alarm spots (true positives), false alarms correctly eliminated by the a-posteriori density check (false positives automatically discarded) and false anomalies not automatically discarded (false positives). Most of the false alarms are efficiently detected, since they are due to a sparse number of points erroneously labelled in the classification step. On the other hand, false positives are mainly generated by pylon misclassification errors. Indeed, when tree trunks are interpreted as pylons, or pylon heads are labelled as vegetation, the number of points detected as anomalies exceeds the threshold set in the automatic density check. To address these issues, a more accurate classification of power line pylons should be developed, differentiating between the pylon body and its head, in order to model the overall shape more accurately. Finally, a false negative is generated when small portions of the cables are labelled as vegetation and are situated close to vegetated areas: in this case, a more accurate 3D reconstruction of cables should be pursued, in order to avoid the misclassified noisy areas.

CONCLUSIONS AND FUTURE WORKS
We have reported the development of a Cloud-based solution for nationwide power line mapping, mainly from image data. The strengths and innovative aspects of the system can be summarized as follows:
• it exploits image-based 3D reconstruction algorithms to automatically derive dense point clouds over power line corridors, thus showing the potential of photogrammetry as a promising alternative to (costly) LiDAR surveys;
• it provides an all-in-one web-based pipeline that automatically manages all steps of power line mapping, from 3D data generation to clearance anomaly detection and data visualization;
• it can also process and semantically segment existing LiDAR-based point clouds, showing the reliability and flexibility of the classification method;
• it exploits Cloud-computing and Cloud-storage technologies, to upscale the power line mapping problem to a nationwide data context (i.e. long corridors of some km length).
Tests executed so far showed the good performance of the processing workflow, which was able to generate promising results for (i) 3D image-based reconstruction, (ii) point cloud classification and (iii) anomaly detection. Until now, only a few datasets were available and evaluated, whereas in the future a larger quantitative assessment will be carried out, incl. the comparison between LiDAR- and photogrammetry-derived mapping results over the same area. The main open issues that will be further investigated in the future include:
• a strategy for mathematically modelling the geometry of power line spans, in order to cope with the small data gaps evident in the cable reconstruction results. So far, no parametric modelling was performed, in order to avoid inappropriate fitting results and use only the triangulated 3D points; however, the adoption of robust models, which also consider external ambient conditions, may improve the geometry reconstruction and anomaly detection steps;
• solutions to increase the accuracy of the classification, which represents an essential prerequisite for reducing the number of false positives/negatives in the anomaly detection step. This will involve (i) increasing the size of the training dataset, also considering other classes of objects (e.g. pylon heads), (ii) differentiating the training dataset based on the geographical area and land cover and (iii) exploring the use of deep learning architectures for 3D classification (e.g. SPGraph, Landrieu and Simonovsky, 2018).
Finally, the platform will be further generalized to manage and process other types of input datasets, e.g. terrestrial mobile mapping system (MMS) data, in the form of both point clouds and panoramic images. Indeed, especially in urban contexts, MMS surveys are able to cope with the viewpoint restrictions of airborne data collection, thus reconstructing the geometry of power line elements with higher accuracy and completeness.