AUTOMATING POWERLINE INSPECTION: A NOVEL MULTISENSOR SYSTEM FOR DATA ANALYSIS USING DEEP LEARNING

Powerline infrastructure provides the backbone for the electricity supply of industrial, administrative and private sectors. Its maintenance requires regular inspections, which are still largely carried out manually. In this work, we propose an automated inspection system instead. We review current inspection processes as a baseline, give an overview of relevant inspection criteria, propose a suitable multi-modal sensor system, and discuss methods to automate the inspection tasks. In our system, we particularly focus on the high-level organization of the sensor data and inspection results to form a Digital Twin of the power line, which allows operators to browse through the recorded data in a meaningful way and review the status of their powerline from their desk.


INTRODUCTION
Currently inspection of powerlines is done mostly in aerial missions, capturing data with Lidar, RGB cameras as well as specific thermal infrared and UV sensitive sensors. During flights, findings are reported by visual observation and later analyzed and verified using the captured data. Typically, the inspection personnel only operate one or two sensor modalities per flight, and additional missions must be planned for the other modalities. In this paper, we instead propose to integrate all sensors into a single multi spectral system, such that fewer overall flights are required. The combined multisensor system along with novel analysis methods based on deep learning allows us to integrate all observations in order to create a Digital Twin of the powerline infrastructure representing its current state. The sheer volume of the captured data mandates an automated analysis, which at the same time reduces the need for subjective human interpretation of the images and provides reproducible results.
We give an in-depth review of the state of the art in powerline inspection and management in Section 2. Our proposed system covers the whole range of inspection criteria from current best practice, and we therefore also rely on the whole range of established sensor modalities for powerline inspection. Vegetation distance and right-of-way violations are handled using 3D Lidar data. Components mounted on the infrastructure, such as insulators, are detected, localized and inspected based on high resolution RGB images. Thermal signatures of various components such as clamps are further analyzed using thermal infrared sensors. Finally, corona discharges occurring along the high voltage powerlines are found using a dedicated UV sensor.
A new sensor head for mounting on a helicopter was developed to cover all these inspection criteria and is presented in detail in Section 3. It includes a Lidar scanner that records point clouds with more than 300 pt/m², five 100 MP RGB cameras providing object resolutions of 2 mm and coverage of all components from various viewpoints, four 640x512 thermal infrared sensors measuring temperature profiles along the powerline, and a UV sensitive camera recording corona discharges at 50 Hz. Additionally, a high-grade GNSS-INS unit is used for precise georeferencing of all recorded data and a thorough sensor calibration is performed on-the-fly. For the safety of the inspection flight, the entire sensor head is designed to operate at a distance of 40 m from the powerline infrastructure. Flying at 30 km/h, this setup enables us to survey 100 km of power line per day, recording tens of terabytes of raw data along the way.
The analysis of this data and generation of the Digital Twin are performed offline using methods we will present in Section 4. As a first critical step, a semantic segmentation of the 3D point cloud is computed such as to identify and separate the relevant foreground regions like conductors, insulators and pylons from the background such as terrain and vegetation. Based on 2D bounding box detections in the RGB images, relevant components such as insulators and clamps are identified and, with the help of the 3D data, also localized in space, thus forming the backbone of our digital powerline model. For individual components, appropriate inspection tasks are then carried out to identify e.g. chippings on insulators or temperature increases along clamps. In parallel, suspicious signatures in the images from the UV sensor are identified and clustered in space and can later be associated with the identified components. At all stages of the analysis, the methods heavily rely on machine learning and in particular deep learning techniques. These are driven by training data and can evolve further to adapt to novel, previously unseen data during long term operation of the overall system.
We will present an outlook on further improvements of our system, as well as on the use of the gathered data in Section 5. For example, we aim at reducing the weight, size and power consumption of the sensor-head in order to get it mounted on an autonomous long-range UAS. In parallel, the learning-based analysis methods will be extended such as to establish a lifelong learning process that can adapt to new data efficiently.

STATE OF TECHNOLOGY
Today, the monitoring of powerlines is commonly performed using helicopters flying above or beside the lines (CIGRE, 2017). Visual inspection by an observer is common practice and, depending on the observer's experience, achieves relatively good results. In addition, various assistance systems are available to detect additional findings and to map the powerline for geometric verification and documentation. Lidar systems provide geometric data of the powerline and help to determine the clearance between conductors and vegetation, which is hard to measure otherwise. Cameras are mostly used to create an orthophoto, to document the findings and to assist in the interpretation of the Lidar point cloud. Thermal cameras help to detect invisible issues and UV cameras are used to detect anomalies along the conductors and insulators.
The list of typical inspection criteria is long, varies from country to country or even from operator to operator, and sometimes very specific inspection reports are generated on customer request (CIGRE, 2017). A first kind of common inspection criteria focuses on the right of way, such as clearance between conductors and vegetation, illegal construction underneath the powerlines as well as erosion or other terrain changes in the corridor. A second group of criteria focuses on the pylon structures, monitoring e.g. bent or missing metal bars within grid structures, woodpecker damage to wooden support structures, missing warning signs etc. The third major aspect covers damages to the conductor wires such as individual broken strands of wire, that are sometimes sticking out, damages due to flashover as well as bird-caging. Frequently such damages are caused by components mounted on the powerline, and condition monitoring of such components defines the fourth major category of inspection criteria, covering e.g. insulators, vibration dampers, spacers, clamps, aerial markers and bird protection elements. Many criteria can be covered by visual inspection in the RGB domain, but additional, well established sensor modalities include Lidar sensors for right of way monitoring (Jwa et al., 2009), measurement of thermal hot spots particularly along clamps, and detection of partial discharges along high voltage power lines by UV monitoring, that can hint at a range of problems. A clear and simple one-to-one assignment even just of sensor modalities to inspection criteria is often not possible, as e.g. certain types of insulators are susceptible to damages, that show up in the UV domain, while other types of insulators are not (CIGRE, 2017). In the present work, we instead focus on more general principles and on organizing the inspection information in the form of a digital twin.
The sensor setups of systems frequently used in powerline inspection produce Lidar point clouds with densities of 30-60 pt/m² of first and last echo Lidar data and RGB imagery with 1-2 cm GSD in nadir view and 0.5-10 cm in oblique view. Thermal cameras typically achieve resolutions of 5-10 cm GSD (Pless et al., 2012; GGS GmbH, 2019). For UV inspections, viewfinders with 50 Hz live images are typically used, and once a defect has been identified, the RGB images are used for documentation. Findings on the powerline and its components are usually inspected directly by trained personnel, while others such as vegetation clearances are analyzed in the Lidar data. Lidar processing follows the typical pipeline using GNSS-INS for direct referencing, point cloud extraction and analysis (first/last echo). Distance analysis is common practice in many software systems. The data of the thermal camera is sometimes used as an overlay on the Lidar and image data to identify additional findings or to confirm visually observed issues. The image data is processed alongside the Lidar GNSS-INS processing to overlay color information onto the point cloud, to create an orthophoto as a map basis for the report, and most importantly to document reported defects.
The report for the powerline operator is generated semi-manually from this data by joining observed and extracted findings with the multi-sensor data. Asset management, relating inspection reports to previous findings, and long-term condition monitoring remain in the scope of the power line operators, and each operator has their own established practice for such tasks. While support systems are in use at some operators, others rely on old paper plans and engineering drawings created when the powerlines were built, and information inevitably gets lost once individual power lines are transferred between operators. There are no established digital twin models that are accepted as industry-wide standard representations for life cycle management of powerlines.

Sensor-head requirements
In order to fulfill the requirements for AI based finding detection, we aim to maximize the resolution and accuracy of the entire sensor setup and significantly increase them relative to the common technology. Higher image resolution, better point density, better reproduction of the pylons within the Lidar data and many more aspects will reduce the required effort to get the algorithms working and train the system for automated finding detection on the digital twin.
The entire sensor setup has to be calibrated in advance in order to avoid on-the-job calibration as much as possible, such that an entire photogrammetric preprocessing step can be omitted. Direct referencing is required, and the sensors have to be calibrated for their intrinsic parameters as well as for the sensor-to-sensor and sensor-to-IMU relations. GNSS-INS is needed in order to provide highly accurate position and attitude information with cm accuracy for georeferencing the relevant sensor data.
The length of the power line to be observed on a daily mission should be 100 km, flown on both sides to capture data with all relevant perspectives in order to have a full inventory of the elements (Birchbauer et al. 2018).
The requirement for Lidar is an average of 150 pt/m² in a single flight pass, which for the combined forward and backward missions on both sides results in about 300 pt/m² on the line itself. Thus, the geometric determination of the pylons, the conductors and the infrastructure is ensured and can be used as a basis for the adjustment of all other sensor data. Such high resolutions also allow us to identify and localize individual components such as insulators in the 3D data.
The resolution of the cameras should be 2 mm GSD on the power line in order to identify small chippings and cracks on the components, and about 1 cm GSD on the ground is required in the nadir view for the orthophotos. The camera must be able to use a very short exposure time to avoid forward motion blur. To detect issues in an automated way, the system needs a resolution comparable to that of a human observer as well as different views and redundancy to prevent a high rate of mismatches. Typical defects need to be visible in a 3x3 pixel matrix for reliable detection with automatic methods, so that at 2 mm GSD patterns of 6x6 mm on the object surface can be spotted reliably.
The thermal resolution is defined to achieve at least 10 mm GSD in order to be able to resolve anomalies on the conductors and clamps. Such a high resolution is required to have at least one or ideally more pixels that represent only the conductor or clamp, without being influenced by the background. Additional challenges for thermal inspection are the low resolution of imaging sensors for thermal cameras (usually 640x512 pixels or lower) and the effects of material properties, that overlay the thermal emission.
Partial electrical discharges on the powerline are indicators for sudden changes in the electrical field, which occur e.g. at pointy objects or sharp edges. Such discharges create electromagnetic noise and emit photon swarms or flashes in the ultraviolet spectrum. Cameras sensitive in this band can capture images and help to identify the place and intensity of the finding. The detection of discharges requires a fast capture rate, since discharges are linked to the frequency of the alternating current. Daylight is a problem, because the camera has to distinguish the weak discharge radiation from normal solar radiation in the upper ultraviolet spectrum. Therefore, a daylight filter has to be used in combination with an amplified sensor to detect the small number of photons in the remaining spectrum that are emitted by a discharge.
Data storage is a major task, and the synchronization of the sensors to get all data properly georeferenced is a key issue in the entire setup. Fast data storage is required, as several terabytes of raw data have to be saved in real time together with exact timestamps based on GNSS-INS in order to have very precise timing for all data.
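To give a feeling for the storage requirement, the following minimal sketch estimates the raw RGB image volume of a single pass along the line. The flight speed, daily line length, capture interval and camera count are taken from figures stated elsewhere in this paper; the function name and the value of 2 bytes per pixel (uncompressed 16-bit raw data) are assumptions made for illustration only, and the other sensors are ignored.

```python
def raw_image_volume_tb(line_km, speed_kmh, interval_s, n_cameras,
                        megapixels, bytes_per_pixel):
    """Estimate the raw RGB image volume of one pass along the line in terabytes."""
    flight_time_s = line_km / speed_kmh * 3600.0
    exposures = flight_time_s / interval_s        # trigger events per pass
    images = exposures * n_cameras                # all cameras fire on each trigger
    total_bytes = images * megapixels * 1e6 * bytes_per_pixel
    return total_bytes / 1e12


# 100 km of line, 30 km/h, 0.7 s capture interval, five 100 MP cameras.
# bytes_per_pixel=2 is an assumption, not a figure from the paper.
volume = raw_image_volume_tb(100, 30, 0.7, 5, 100, 2)
print(f"{volume:.1f} TB per pass")
```

Under these assumptions a single pass already produces on the order of 17 TB of RGB data, which is consistent with tens of terabytes per day once both sides of the line are flown.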
For mission planning, a terrain model and information about the pylon coordinates, heights and design are needed to calculate the best flight trajectory for the pilot. In addition, the optimal trigger points for the cameras must be calculated and, if needed, the angles and orientations of the sensors must be set up to achieve the best representation of the infrastructure for the digital twin. Limitations such as the minimal safety distance and the minimum height above ground are predefined by the powerline operator. In most cases a distance of at least 40 m has to be maintained.
To assist the pilot in steering the helicopter exactly along the planned trajectory, a flight management system (FMS) is needed that provides all information about track, position and speed.

Sensor Configuration
Based on these requirements, the different sensors have to be selected, configured and integrated into a multi-sensor head. For flexibility, the individual sensors were mounted in special brackets that allow them to be rotated and tilted, so that the best parameters could be determined during test evaluation.
The Riegl VUX LR was chosen as Lidar system in order to have a high pulse repetition and scan rate. The long-range version has a stronger laser beam that enables a better echo from the conductor itself. The laser scanner can be tilted and rotated in order to better intersect the pylon and achieve a higher point density inside it while flying parallel to the corridor. With suitable tilt and rotation, we achieve an average point density of 150 pt/m² and more per flight direction. The synchronization for the final calibration process is managed via an NMEA connection to the GPS receiver and a PPS synchronization signal.
The camera setup is based on five PhaseOne iXU1000 cameras. Their CMOS sensors provide low noise even under difficult light conditions. A shutter speed of 1/2000 s prevented forward motion effects, and the central leaf shutter also enabled precise image processing. To record the entire pylon at 1.5-2.5 mm GSD, two forward looking oblique cameras with 90 mm and 110 mm lenses have been combined. The same configuration was used for a second, backward looking row of sensors. For creating orthophotos, a nadir camera with a 50 mm lens was mounted alongside the other cameras. All cameras are triggered in a daisy chain configuration and achieve a capture interval of 0.7 seconds. The daisy chain results in synchronized exposures; the common event signal is stored in the GNSS receiver, and the GNSS timestamp and position are stored in the EXIF data header and the EXIF log.
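The relation between lens, flight distance and GSD can be checked with the usual pinhole approximation, GSD = pixel pitch x distance / focal length. The sketch below is illustrative only: the pixel pitch of 4.6 µm is an assumption that is not stated in this paper, the 40 m distance is the nominal safety distance, and a perpendicular view onto the object is assumed, so oblique views will deviate.

```python
def gsd_mm(pixel_pitch_um, distance_m, focal_length_mm):
    """Ground sampling distance in mm for a perpendicular view (pinhole model)."""
    pitch_mm = pixel_pitch_um * 1e-3
    distance_mm = distance_m * 1e3
    return pitch_mm * distance_mm / focal_length_mm


PIXEL_PITCH_UM = 4.6   # assumption: sensor pixel pitch, not given in the paper
DISTANCE_M = 40.0      # nominal distance to the powerline

for focal_length_mm in (90.0, 110.0):
    gsd = gsd_mm(PIXEL_PITCH_UM, DISTANCE_M, focal_length_mm)
    # A defect should cover about 3x3 pixels to be detected reliably.
    print(f"{focal_length_mm:.0f} mm lens: GSD {gsd:.2f} mm, "
          f"3x3 px pattern {3 * gsd:.1f} mm")
```

With the assumed pixel pitch both lenses fall into the 1.5-2.5 mm GSD range quoted above.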
A stable mounting between the IMU and sensors is key for accurate referencing. For the Lidar scanner, the lever arms between the sensor nodal point and the IMU center must be calibrated by measurements, and the boresight angles must be adjusted with calibration data from a test mission. The calibration of the cameras requires two steps: the internal calibration of focal length, principal point (PPS) and radial distortion, and the measurement of the lever arms between the entrance pupil of the lenses and the IMU center. The boresight angles can be calibrated from test missions with a sufficient number of GCPs. For the calibration, a normal photogrammetric workflow with tie-point matching and bundle block adjustment is used.
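The role of lever arm and boresight in direct referencing can be summarized by one chain of transformations: a point measured in the sensor frame is rotated by the boresight matrix, shifted by the lever arm into the IMU frame, and then moved into the world frame with the GNSS-INS attitude and position. The following sketch shows this chain with NumPy; the function names, frame conventions and numeric values are assumptions for illustration and do not describe a specific calibration result.

```python
import numpy as np


def rot_z(angle_rad):
    """Rotation matrix for a rotation about the z axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def georeference(x_sensor, boresight, lever_arm, r_imu_world, p_imu_world):
    """Map a point from the sensor frame into the world frame.

    boresight   : 3x3 rotation from sensor frame to IMU frame
    lever_arm   : sensor origin expressed in the IMU frame (metres)
    r_imu_world : 3x3 IMU attitude (IMU frame to world frame)
    p_imu_world : IMU position in the world frame
    """
    x_imu = boresight @ x_sensor + lever_arm
    return r_imu_world @ x_imu + p_imu_world


# Hypothetical values: a Lidar return 40 m ahead of the sensor.
x_world = georeference(
    x_sensor=np.array([40.0, 0.0, 0.0]),
    boresight=np.eye(3),
    lever_arm=np.array([0.2, 0.0, -0.1]),
    r_imu_world=rot_z(np.pi / 2),
    p_imu_world=np.array([1000.0, 2000.0, 300.0]),
)
print(x_world)
```

Errors in the lever arm shift all points by a constant offset in the IMU frame, whereas boresight errors grow linearly with range, which is why the latter are refined with data from a test mission.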
In order to achieve thermal data with a GSD of 10-25 mm, a set of four microbolometer cameras (Flir A65) with 50 mm lenses has been integrated. A fan of these cameras captures the entire pylon and all conductors. Like the RGB cameras, the thermal cameras were oriented forward oblique to get a good view of the pylons. The setup of the thermal cameras is synchronized with the GPS signal and provides an exact timestamp for each image.
To build a camera for detecting corona discharges, a Proxivision UV sensor with an amplified sensor back and a daylight filter was selected. In addition, an RGB camera was mounted with the same view and co-registered to the UV camera, in order to display the discharge effects in the visible bands and to make them easier to recognize in the other data (Lidar, RGB and thermal). The gain control defines the sensitivity of the UV camera. In order to keep track of the alternating current, both cameras (UV and VIS) run in 50 Hz capture mode with data handling synchronized to the GPS timestamp.
The employed GNSS-INS is the L1/L2 dual antenna system NovAtel Span with a FOG IMU. This device forms the central navigation and georeferencing system for all sensors.
The mission planning was done with a specially developed tool in order to fly "snake-lines" and to capture the powerline in an optimal way with respect to the flight path. Input parameters are the coordinates of the pylons, their height, the pylon geometry, the positions of the conductors and insulators as the most important features, as well as predefined regulations and sensor parameters. The result is an optimal flight path and a setting of the sensor rotation and tilt angles.
The Flight Management System was an adjusted version of AeroTopoL with instruments that guide the pilot exactly on track and indicate speed (35 km/h is the absolute maximum), distance to the next curve and other parameters. A moving map in the background assists the operator in supporting the flight and in anticipating specific track issues.

Calibration
In order to analyze the data as quickly and smoothly as possible, we rely on direct referencing of the data without intensive photogrammetric processing, which requires a proper calibration of the system. The Lidar data fully relies on GNSS-INS and cannot be processed without a well synchronized, exact trajectory. There are two sets of lever arms to be measured and fine calibrated. The lever arm from the GNSS antenna center to the IMU center is typically measured using a total station on the ground. It is important to have the coordinate system of the IMU well defined and measured with sub-cm accuracy.
The same procedure can be applied to the thermal cameras, since they show image information that represents the structure. The calibration of the corona camera is more difficult. The most suitable way is to co-register the UV camera to the monitoring VIS camera. That way we obtain a standard affine adjustment of the UV data to the VIS camera, which itself can be processed in the normal workflow as described before. We performed this calibration by operating the UV camera without the daylight filter in order to obtain matching points that are visible in both sensors. A better but far more laborious way is to use UV targets in an indoor environment.
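An affine adjustment between two co-registered cameras can be estimated from matched image points by linear least squares. The following is a minimal sketch with NumPy, assuming that at least three non-collinear point correspondences are available; the function names and the synthetic point coordinates are hypothetical and serve only to illustrate the estimation step.

```python
import numpy as np


def fit_affine_2d(src, dst):
    """Least-squares 2D affine transform mapping src points onto dst points.

    Returns a 2x3 matrix [A | t] such that dst ~= src @ A.T + t.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    design = np.hstack([src, np.ones((len(src), 1))])        # N x 3
    params, *_ = np.linalg.lstsq(design, dst, rcond=None)    # 3 x 2
    return params.T


def apply_affine_2d(affine, pts):
    """Apply a 2x3 affine matrix to an N x 2 array of points."""
    pts = np.asarray(pts, dtype=float)
    return pts @ affine[:, :2].T + affine[:, 2]


# Synthetic matching points (hypothetical pixel coordinates).
true_affine = np.array([[1.10, 0.05, 12.0],
                        [-0.03, 0.95, -7.0]])
uv_points = np.array([[10.0, 20.0], [300.0, 40.0], [150.0, 400.0], [500.0, 350.0]])
vis_points = apply_affine_2d(true_affine, uv_points)

estimated = fit_affine_2d(uv_points, vis_points)
print(np.round(estimated, 3))
```

With noisy real correspondences, more than the minimum number of points should be used so that the least-squares fit averages out localization errors.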
The stability of the calibration is assumed to be good enough as long as all sensors and mountings are kept together. All sensors are robustly mounted on a stable frame that does not show deformations bigger than the resolution of the IMU can measure.

ANALYSIS AND DIGITAL TWIN
Due to the amount of data recorded with our sensorhead, automatic analysis methods are mandatory to handle inspections efficiently. We consolidate the results of these automatic methods in a digital twin representation, that we create along the way. As a first step we compute a semantic segmentation of the 3D point cloud from the Lidar scanner, as we will describe in detail in Section 4.1. We also run a 2D object detector on all RGB images recorded with the cameras. The detected 2D bounding boxes are then combined with the semantically segmented 3D point cloud to create an inventory of components along the power line, which serves as the backbone of our digital twin of the infrastructure. These steps are discussed in Section 4.2. While our digital twin representation is therefore simplistic, it is already a very powerful way to deal with inspection results, as we will also show in this chapter.
We present details on three selected inspection criteria that cover different powerline components and make use of different modalities provided by our sensorhead. At the same time, they serve as blueprints for three different paradigms of organizing information with the help of our digital twin. In Section 4.3 we present details of an insulator inspection module using the RGB imagery as an example of using the digital twin representation to correlate and consolidate information that is already contained in the model. This is followed by a clamp inspection module based on thermal IR information in Section 4.4, which serves as an example for triggering targeted inspection routines based on the digital twin information. Additionally, we employ a UV inspection module that we present in Section 4.5 and which serves as a blueprint for integrating unspecific 3D inspection results into our digital twin representation to enrich the specificity of the methods.
An outline of the relations between the individual modules is shown in Figure 5, which also illustrates the information flow in a practical implementation of the presented analysis methods.

. Pilot Screen with Navigation instruments
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B4-2020, 2020 XXIV ISPRS Congress (2020 edition)

Semantic Segmentation
The first step of our processing pipeline aims at automatically classifying all the 3D points recorded with the Lidar scanner. Such semantic segmentation in 3D is a classical problem in geosciences (Weinmann et al., 2015). In our case the main goal is to identify relevant parts of the infrastructure like pylons, conductors and insulators, and to separate them from the ground, vegetation and buildings. This information is of direct use for infrastructure operators, allowing them to assess vegetation encroachment (Ituen et al., 2008), construction activities and other issues related to their right of way. Apart from that, the information is also used as a first part of our digital twin representation, which in many cases allows us to restrict the attention of the later inspection modules to the relevant parts.
To compute the semantic segmentation, our implementation uses a classical feature-based approach as known from e.g. (Weinmann et al., 2015). Our feature vector comprises Lidar attributes such as echo number and intensity as well as moments, densities and rank order statistics computed in spherical and cylindrical neighborhoods, and uses multiple search radii comparable to (Hackel et al., 2016). After this feature vector has been computed for each point, it is passed through a relatively simple neural network with 6 hidden layers for classification. Finally, we use the mean field approximation (Krähenbühl, Koltun, 2013) of a dense Conditional Random Field as regularization to remove individual outliers and misclassifications. This overall pipeline can also be replaced with alternative approaches such as deep segmentation networks (Qi et al., 2017) that are trained in a fully data-driven end-to-end fashion.
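To illustrate the kind of per-point features described above, the following sketch computes a density, two moments and one rank order statistic of the point heights in spherical neighborhoods at several radii. It is a simplified illustration under stated assumptions: the specific features, the radii and the function name are chosen for this example, Lidar attributes and cylindrical neighborhoods are omitted, and the brute-force neighbor search would have to be replaced by a spatial index for real point clouds.

```python
import numpy as np


def neighborhood_features(points, radii=(0.5, 1.0, 2.0)):
    """Per-point features from spherical neighborhoods at multiple radii.

    For every radius the features are: point density, mean height,
    height variance and median height of the neighbors.
    Returns an array of shape (N, 4 * len(radii)).
    """
    points = np.asarray(points, dtype=float)
    # Brute-force pairwise distances (O(N^2) memory), for illustration only.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    blocks = []
    for radius in radii:
        inside = dists <= radius                    # each point is its own neighbor
        volume = 4.0 / 3.0 * np.pi * radius ** 3
        block = np.zeros((len(points), 4))
        for i, mask in enumerate(inside):
            heights = points[mask, 2]
            block[i] = (mask.sum() / volume,        # density
                        heights.mean(),             # first moment
                        heights.var(),              # second central moment
                        np.median(heights))         # rank order statistic
        blocks.append(block)
    return np.hstack(blocks)


rng = np.random.default_rng(0)
cloud = rng.uniform(0.0, 5.0, size=(200, 3))   # synthetic points, not real data
features = neighborhood_features(cloud)
print(features.shape)
```

The resulting feature matrix would then be fed to a classifier, one row per point.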

Object Detection and Inventorization
Small components such as individual insulators, vibration dampers, spacers and individual clamps are hard or even impossible to detect purely from 3D point cloud data. We can use the RGB images provided by our sensorhead to identify them in 2D, but these individual 2D views are often insufficient to analyze their condition. For example, a chipping on a ceramic insulator seen in a single image might be on the occluded, far side of the component, that is only clearly visible from a different perspective. Our sensorhead therefore records multiple redundant views of each component, and in our digital twin representation we relate these views to each other as well as to the 3D point cloud in order to get a complete view of the components. This allows an automatic inspection method to exploit the multiple viewpoints and an operator to browse through the multiple images of each component in order to assess potential damages with higher confidence. The data structure representing components is the second major part in our digital twin representation.
To localize and identify the components in the 2D images, we employ deep learning based 2D object detection methods from the computer vision community. In particular, the SSD meta-architecture (Liu et al., 2016) with a ResNet backbone (He et al., 2016) appears to provide a reasonable tradeoff between accuracy and runtime, although alternatives can be considered as a drop-in replacement in our pipeline. In any case, the accurate calibration of the sensorhead provides the necessary geo-referencing of individual 2D images to relate the detections to each other and to the semantically segmented 3D point cloud. We achieve this by backprojecting the 2D bounding boxes onto the point cloud and intersecting and clustering the results (Birchbauer et al., 2020a). During this consolidation we can at the same time remove spurious false detections that are not confirmed at the corresponding locations in other images, and promote low-confidence detections that are confirmed by multiple other views. In our experience, both of these steps dramatically improve the detection accuracy, resulting in overall precision and recall values beyond 99% and localization accuracies around 10 cm in 3D space for the components on a typical powerline.
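The multi-view consolidation idea can be sketched as a voting scheme: each 3D point is projected into every georeferenced image, and it receives one vote per image in which it falls inside a detected bounding box. This is a simplified stand-in for the backprojection, intersection and clustering described above, not the method of (Birchbauer et al., 2020a); the pinhole model without skew or distortion, the function names and all numeric values are assumptions for illustration.

```python
import numpy as np


def project(points_world, K, R, t):
    """Pinhole projection of N x 3 world points; returns pixels and a visibility mask."""
    cam = points_world @ R.T + t
    in_front = cam[:, 2] > 0
    depth = np.where(in_front, cam[:, 2], 1.0)    # avoid division by non-positive depth
    focal = np.array([K[0, 0], K[1, 1]])
    uv = cam[:, :2] / depth[:, None] * focal + K[:2, 2]
    return uv, in_front


def count_supporting_views(points_world, views):
    """Count for every 3D point in how many views it lies inside a detection box.

    views: iterable of (K, R, t, boxes) with boxes as (x_min, y_min, x_max, y_max).
    """
    votes = np.zeros(len(points_world), dtype=int)
    for K, R, t, boxes in views:
        uv, in_front = project(points_world, K, R, t)
        hit = np.zeros(len(points_world), dtype=bool)
        for x0, y0, x1, y1 in boxes:
            hit |= ((uv[:, 0] >= x0) & (uv[:, 0] <= x1) &
                    (uv[:, 1] >= y0) & (uv[:, 1] <= y1))
        votes += hit & in_front
    return votes


# Hypothetical setup: two cameras, one real component and one spurious detection.
K = np.array([[1000.0, 0.0, 500.0], [0.0, 1000.0, 500.0], [0.0, 0.0, 1.0]])
candidates = np.array([[0.0, 0.0, 10.0], [2.0, 0.0, 10.0]])
views = [
    (K, np.eye(3), np.zeros(3), [(480, 480, 520, 520), (680, 480, 720, 520)]),
    (K, np.eye(3), np.array([-1.0, 0.0, 0.0]), [(380, 480, 420, 520)]),
]
votes = count_supporting_views(candidates, views)
confirmed = votes >= 2     # keep only detections confirmed by a second view
print(votes, confirmed)
```

In this toy example the first candidate is confirmed by both views, whereas the second one is only supported by a single box and would be discarded as spurious.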
After this inventorization stage, our digital twin model consists of the semantically segmented 3D point cloud along with 3D locations of detected components, where each component comes with a list of images and respective 2D locations showing it in high detail. This is also illustrated in Figure 6. This consolidation dramatically simplifies the inspection task by removing irrelevant background data and highlighting the critical infrastructure components, but due to the number of components on a typical power line, manual inspection is still prohibitive. For example, there are easily tens to hundreds of insulators on a single pylon, and there are pylons approximately every 300 m in powerlines that run for hundreds or thousands of kilometers. We therefore apply a range of targeted and highly specific automatic inspection methods to the individual components in our representation. A complete list of these ever-expanding inspection criteria is beyond the scope of this paper, and we restrict our attention to three selected examples in the following.
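The component inventory described above maps naturally onto a small data structure. The following sketch is one possible layout, assuming hypothetical class and field names; it only captures the component list with image references and omits the segmented point cloud.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Observation:
    """One 2D view of a component."""
    image_id: str
    bbox: tuple          # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class Component:
    """A detected component with its 3D location and all views showing it."""
    kind: str            # e.g. "insulator" or "clamp"
    position: tuple      # (x, y, z) in world coordinates
    observations: list = field(default_factory=list)
    findings: list = field(default_factory=list)


@dataclass
class DigitalTwin:
    components: list = field(default_factory=list)

    def nearest(self, position):
        """Return the component closest to a 3D position."""
        return min(self.components, key=lambda c: math.dist(c.position, position))


# Hypothetical inventory with two components.
twin = DigitalTwin()
insulator = Component("insulator", (0.0, 0.0, 20.0))
insulator.observations.append(Observation("img_0001", (480, 480, 520, 520)))
twin.components.append(insulator)
twin.components.append(Component("clamp", (5.0, 0.0, 18.0)))

hit = twin.nearest((4.5, 0.2, 18.1))
hit.findings.append("example finding")
print(hit.kind, hit.findings)
```

A nearest-component query of this kind is what allows unspecific 3D findings to be attached to a concrete component and its images.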

Insulator Inspection
Insulators are a particularly critical component in powerline infrastructure. On the one hand they provide electrical insulation, but at the same time they have to support the weight of the conductors between the pylons. Typical damages to e.g. ceramic insulators are chippings along the surface, caused e.g. by hailstorms, or heavy contaminations, as illustrated in Figure 7. Such damages are typically detected by visual inspection with suitably trained personnel, and in our automatic system we analogously use high resolution RGB images with a trained 2D image classifier.
In our implementation, we exploit the consolidated detections in our digital twin representation. This allows us to take crops of the RGB images around the detected insulators and pass them through a deep learning-based classification network. Various network architectures have been presented in the computer vision literature (He et al., 2016) and they again provide a tradeoff between accuracy and runtime, but in practice our digital twin representation with the consolidated list of components and images already leads to a significant reduction in runtime, allowing us to focus very much on accuracy. We therefore use relatively complex and deep network architectures, and currently base our classifier on a ResNet101 architecture (He et al., 2016), which already achieves 96% accuracy for classification of the relevant damages in individual 2D insulator images.
When combining the individual 2D image classification results into a single assessment of each component, the accuracy can be improved further, but care has to be taken in how exactly the information is combined. Simple majority voting ignores the fact that typical defects will only be visible on one side of the insulator, so that a damaged insulator may be classified as damaged in only a minority of its views. Lowering the required fraction of damaged views increases sensitivity at the cost of specificity, and we currently report a defect if more than 10% of the views of an insulator are classified as damaged.
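The difference between the fraction-based rule and majority voting is easy to see in a few lines of code. The sketch below is a minimal illustration; the function names are hypothetical and the per-view labels are assumed to be already thresholded into damaged / not damaged.

```python
def component_is_defective(view_labels, min_fraction=0.10):
    """Report a defect if more than min_fraction of the views are flagged as damaged."""
    if not view_labels:
        return False
    damaged = sum(1 for flagged in view_labels if flagged)
    return damaged / len(view_labels) > min_fraction


def majority_vote(view_labels):
    """Report a defect only if more than half of the views are flagged as damaged."""
    return sum(1 for flagged in view_labels if flagged) > len(view_labels) / 2


# A chipping that is visible in only 2 of 12 views of the insulator.
views = [True, True] + [False] * 10
print(component_is_defective(views), majority_vote(views))
```

In this example the fraction-based rule reports the defect, whereas majority voting would miss it.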

Clamp Inspection
Connections within high voltage power lines are typically achieved with screwed or pressed clamps. Due to loosening of bolts, internal corrosion or other defects, the connection quality can be weakened, which typically leads to clamps and wires heating up under load. The connection quality can therefore be assessed by measurements in the thermal infrared spectrum, and our sensorhead contains appropriate cameras to take these measurements.
In our system, we can again rely on the detected and inventorized list of clamps observed in the RGB images, as described in Section 4.2 and illustrated in Figure 8. Thanks to the consolidated component list in our digital twin, we can project the corresponding 3D positions into the thermal IR images and extract crops from there for further analysis, which dramatically reduces the search space to the relevant image areas. Such a cross-modal approach is particularly important, as it is almost impossible to detect clamps in thermal images alone, as exemplified by the sample images in Figure 8. However, the measured thermal signatures depend on many variables such as the surface condition and the current load of the power line, but also ambient temperature, wind velocity, sky and solar radiation and cloud coverage. Additionally, the relative velocity and distance to the powerlines affect the measurements. Further details on the thermal inspection module developed in our system are presented in (Komar et al., 2019).
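The cross-modal lookup amounts to projecting a known 3D component position into a georeferenced thermal image and cutting out a window around the resulting pixel. The following sketch shows this step under a simple pinhole model without distortion; the function names, the intrinsic matrix and the clamp position are hypothetical values chosen for illustration.

```python
import numpy as np


def project_point(point_world, K, R, t):
    """Project one 3D world point into the image; returns (u, v) or None if behind the camera."""
    cam = R @ point_world + t
    if cam[2] <= 0:
        return None
    uvw = K @ cam
    return uvw[:2] / uvw[2]


def crop_around(image, center_uv, half_size):
    """Cut a window around a pixel, clipped to the image borders."""
    height, width = image.shape[:2]
    u, v = int(round(center_uv[0])), int(round(center_uv[1]))
    if not (0 <= u < width and 0 <= v < height):
        return None                          # component is outside this image
    x0, x1 = max(u - half_size, 0), min(u + half_size + 1, width)
    y0, y1 = max(v - half_size, 0), min(v + half_size + 1, height)
    return image[y0:y1, x0:x1]


# Hypothetical 640x512 thermal frame and camera looking along the z axis.
thermal = np.zeros((512, 640), dtype=np.float32)
K = np.array([[2900.0, 0.0, 320.0], [0.0, 2900.0, 256.0], [0.0, 0.0, 1.0]])
clamp_position = np.array([0.5, -0.1, 40.0])   # clamp 40 m in front of the camera

pixel = project_point(clamp_position, K, np.eye(3), np.zeros(3))
patch = crop_around(thermal, pixel, half_size=16)
print(pixel, patch.shape)
```

Only the small patch around the clamp then needs to be analyzed, instead of searching the entire thermal image.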

UV Defect Detection
Particularly in high voltage power lines beyond 110 kV, partial discharges may occur at places of strong electric field changes, which are typically sharp edges or corners. Such partial discharges may hint at broken or damaged equipment along the powerline, but the ionization of air resulting from such discharges may also accelerate corrosion and cause further damage. We therefore aim at detecting so-called corona discharges using a UV sensitive camera. The images from this camera typically show bright, white blobs whenever a discharge is detected, and if several of these discharges are observed in a consistent 3D location, this hints at a problem on the power line infrastructure, as investigated in (Komar et al., 2019).
The processing steps of our algorithm for UV inspection are outlined in Figure 9 and described in detail in (Birchbauer, Kähler, 2020b). We first detect the bright blobs representing the observed partial discharges as projected into the UV images. Based on the geo-referencing, we then create hypotheses for the 3D location of the partial discharge along the viewing ray through the corresponding 2D detections. These hypotheses are validated using the blob detections in additional images taken from other viewpoints. Once the number of accumulated 2D observations reaches a critical threshold, the method reports a partial discharge at the corresponding 3D position. A typical result of this approach is shown in Figure 10. Such a result then has to be related to the remaining inspection results. We therefore search our digital twin for nearby infrastructure elements, such as the suspension point of an insulator, a clamp or just a position on the conductors, and associate the detected defect with this object. For an in-depth assessment, an operator can then easily browse through the high resolution RGB images of that component or location and analyze the cause of the problem.
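The hypothesis-and-validation idea can be illustrated with a simplified sketch. It assumes that each 2D blob detection has already been converted into a viewing ray in world coordinates, samples 3D hypotheses along one ray, and counts how many rays pass close to each hypothesis. The function names, the tolerance, the threshold and the component names are hypothetical; the actual method is described in (Birchbauer, Kähler, 2020b).

```python
import numpy as np

def point_ray_distance(p, origin, direction):
    """Shortest distance from point p to the ray origin + s * direction, s >= 0."""
    d = direction / np.linalg.norm(direction)
    s = max(np.dot(p - origin, d), 0.0)
    return np.linalg.norm(p - (origin + s * d))

def best_hypothesis(rays, seed, depths, tol):
    """Sample 3D hypotheses along rays[seed] and count the supporting rays.

    rays: list of (origin, direction) pairs, one per 2D blob detection.
    Returns (point, support); the support count includes the seed ray itself.
    """
    o, d = rays[seed]
    d = d / np.linalg.norm(d)
    best_point, best_support = None, 0
    for depth in depths:
        p = o + depth * d
        support = int(sum(point_ray_distance(p, ro, rd) <= tol for ro, rd in rays))
        if support > best_support:
            best_point, best_support = p, support
    return best_point, best_support

def nearest_component(point, components):
    """Name of the component whose 3D position is closest to point."""
    return min(components, key=lambda name: np.linalg.norm(components[name] - point))

# Hypothetical scene: four viewpoints observe a discharge at `target`,
# and a fifth detection is a spurious blob pointing elsewhere.
target = np.array([0.0, 0.0, 30.0])
centers = [np.array([x, 0.0, 0.0]) for x in (-10.0, 0.0, 10.0, 20.0)]
rays = [(c, target - c) for c in centers]
rays.append((np.array([0.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])))

point, support = best_hypothesis(rays, seed=0, depths=np.arange(5.0, 60.0, 0.5), tol=0.3)

min_observations = 3   # assumed threshold on accumulated 2D observations
components = {
    "insulator_12_suspension": np.array([0.2, 0.0, 30.1]),
    "clamp_7": np.array([5.0, 0.0, 28.0]),
}
if support >= min_observations:
    associated = nearest_component(point, components)
```

In this sketch the spurious detection gathers no support, while the four consistent rays agree on a hypothesis close to the true position, which is then associated with the nearest entry of the component list.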

Improvements on the Sensorhead
One aspect of improving the overall system concerns the sensorhead, which is to become smaller, lighter and cheaper in order to fit onto an unmanned aerial vehicle instead of the current helicopter setup. We are already starting work with the next generation of sensors, which are smaller and lighter. At the same time, we can also save weight on the mounting plates by using carbon structures within a smaller and more compact new design. We expect to be able to reduce the weight of the mounting components by 50% or more. Newer sensors also offer improved quality or reduced costs.
Another aspect of improving the data acquisition is the development of a smart sensorhead that can be directed to look at specific locations. Such a smart gimbal, which can keep points of interest in view (Birchbauer et al., 2018), can compensate for flight motion such as roll, pitch and yaw and at the same time focus on specific parts of the powerline while passing by them, providing additional views. Cameras equipped with tele lenses and adjustable focus can then replace part of our current array of cameras, while maintaining the GSD values that the inspection requires.
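The trade-off between lens choice and GSD follows from the standard pinhole relation, in which the GSD equals the pixel pitch multiplied by the object distance and divided by the focal length. The sketch below evaluates this relation; all numeric values are hypothetical and do not describe our sensors or requirements.

```python
def ground_sampling_distance(pixel_pitch_m, focal_length_m, distance_m):
    """Size of one pixel on the object, in metres, for a pinhole camera."""
    return pixel_pitch_m * distance_m / focal_length_m

def required_focal_length(pixel_pitch_m, distance_m, target_gsd_m):
    """Focal length in metres needed to reach target_gsd_m at distance_m."""
    return pixel_pitch_m * distance_m / target_gsd_m

# Hypothetical values: 4.5 micrometre pixels, 50 m distance to the component.
pitch = 4.5e-6
distance = 50.0

gsd_wide = ground_sampling_distance(pitch, 0.035, distance)   # 35 mm lens
gsd_tele = ground_sampling_distance(pitch, 0.200, distance)   # 200 mm lens
focal_for_1mm = required_focal_length(pitch, distance, 1e-3)

print(f"GSD with 35 mm lens:  {gsd_wide * 1e3:.2f} mm")
print(f"GSD with 200 mm lens: {gsd_tele * 1e3:.2f} mm")
print(f"Focal length for 1 mm GSD: {focal_for_1mm * 1e3:.0f} mm")
```

A longer focal length improves the GSD at the cost of a narrower field of view, which is why a tele lens only becomes practical once the gimbal can actively point the camera at the component of interest.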

Improvements of Automatic Analysis
Thanks to the modular pipeline and the common interface with our digital twin representation, we can easily extend or improve our analysis modules and add to our existing inspection criteria. As mentioned in Section 4, we could, for example, replace the 3D semantic segmentation module with a deep learning based framework once it offers benefits over our current approach, and our detection and classification networks can be replaced to keep up with the rapid progress of the computer vision community. Since virtually all our automatic analysis methods are learned algorithms, training data is paramount, and our models will continue to improve with additional training data. One important aspect therefore is the long-term management of our trained methods with concepts from lifelong learning. Every new flight generates additional data, but manually annotating all of it would be prohibitively expensive. Identifying the most relevant data to minimize the manual annotation effort and utilizing self-supervision to automatically pre-annotate data are just two obvious solutions that will assist the further development of the trained models.
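One common way to identify relevant data is uncertainty sampling, where the samples on which the current model is least certain are sent to the annotators first. The sketch below ranks samples by the entropy of their predicted class probabilities. It illustrates the general idea only and is not a description of our pipeline; the function names and the probabilities are hypothetical.

```python
import numpy as np

def predictive_entropy(probs):
    """Entropy of each row of class probabilities (higher means less certain)."""
    p = np.clip(probs, 1e-12, 1.0)   # avoid log(0)
    return -np.sum(p * np.log(p), axis=1)

def select_for_annotation(probs, budget):
    """Indices of the `budget` most uncertain samples, most uncertain first."""
    order = np.argsort(-predictive_entropy(probs), kind="stable")
    return order[:budget].tolist()

# Hypothetical classifier outputs for four image crops and three classes.
probs = np.array([
    [0.98, 0.01, 0.01],   # confident prediction
    [0.40, 0.35, 0.25],   # uncertain
    [0.70, 0.20, 0.10],   # moderately certain
    [0.34, 0.33, 0.33],   # almost uniform, most uncertain
])

picked = select_for_annotation(probs, budget=2)
print(picked)
```

With an annotation budget of two, the two crops with the flattest probability distributions are selected, while the confidently classified crop is left unlabeled.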

Future Use Cases of the Digital Twin
At present, the digital twin representation in our automatic analysis framework is tailored to our inspection tasks. However, many other applications that will improve asset management for powerline operators can be envisioned. An obvious case is change detection: by comparing inspection results over multiple flight epochs, previous and current observations of the same components can be viewed side by side and the evolution of components can be analyzed in detail. Other use cases arise if operators decide to digitize legacy powerlines, e.g. in preparation for repair or upgrade works. We expect that our current internal digital twin representation is capable of supporting a wide range of further use cases with little adaptation effort.
Figure 9. Outline of the UV inspection module from blob detection (left) to partial discharge detection in 3D (right).
Figure 10. Sample of partial discharges clustered in 3D.

CONCLUSIONS
We presented a novel system for automating powerline inspection, comprising a custom-built, multimodal sensorhead and automatic analysis methods based on machine learning. First, we briefly presented the state of the art in powerline inspection and derived a list of relevant inspection criteria. This also motivated our choice of sensors and the requirements for the overall sensorhead configuration. Due to the amount of data generated with these sensors, we require automatic analysis methods, and we presented samples of such methods that emulate the manual inspection tasks of current best practice. At the core, we create a digital twin representation of the powerline that allows us to organize the data in a meaningful way and to gather and combine the information from various sensors and inspection routines in a unified manner. We also gave an outlook on the further evolution of the overall system and additional use cases of the digital twin. Overall, the presented system provides a solid basis for automating powerline inspection even further and marks a major milestone towards that goal.