DIGITAL TWINNING PROOF OF CONCEPT FOR UTILITY-SCALE SOLAR: BENEFITS, ISSUES, AND ENABLERS

A Digital Twin is a virtual representation of a physical asset or system with the purpose of optimising intelligent behaviour of said physical entity. The Digital Twin is a promising tool for asset management as the virtual entity can exist and aid at every stage of a system's life. However, the infancy of the concept means implementation remains at an early stage and is particularly poorly defined within an asset management context. Practical case studies of digital twinning (the modelling process of generating and updating Digital Twins) are an important tool to ensure definitions from research are applied rigorously and to aid in their deployment with practitioners in real industrial applications. That said, there are insufficient case studies for asset management digital twinning. In particular, the digital twinning process for utility-scale solar has not been considered. Utility-scale solar asset management often suffers challenges due to the remoteness and scale of assets, which contribute to high labour costs. It could thus benefit enormously from an effective Digital Twin to increase the precision and accuracy of fault detection and the efficiency of labour for operation and maintenance (O&M) tasks. In addition, the data sharing and analysis Digital Twins provide is vital for the immature solar sector. However, digital twinning of utility-scale solar has not been well considered and presents issues around cost-effective data collection and modelling. Therefore, this paper details the current state of the art and challenges in surveying utility-scale solar, and the progress and application of Digital Twins to utility-scale solar. A novel proof-of-concept process for digital twinning of utility-scale solar is then presented, with a focus on geometric data capture for updating as-is models. Furthermore, the paper considers Digital Twin requirements and their prescription to current O&M methods in utility-scale solar. Finally, the paper highlights currently available required technology as well as future technological improvements that would benefit the proposed proof of concept.


INTRODUCTION
Photovoltaic (PV) energy is an important resource in the transition towards clean, renewable energy sources. In the UK, PV energy capacity more than doubled in the 5 years between 2014 and 2019 (5.5 GW to 13.3 GW), with utility-scale installation capacity nearly tripling (UK Government, 2021). However, PV modules are susceptible to a variety of faults that impact their efficiency (Köntges et al., 2014, Djordjevic et al., 2014). The technical and economic challenge of locating and diagnosing these faults in installations with vast numbers of individual PV modules is 'monumental' and hence many PV plants operate with insufficient monitoring capability (Bosman et al., 2020).
Stakeholders of PV plants want to maintain maximal power output by minimising faults while also achieving low operation and maintenance (O&M) costs. They must therefore use the correct maintenance strategy to balance cost (e.g. sending out a service team) against performance lost (e.g. due to faults awaiting resolution) (Peters and Madlener, 2017). When making this decision, it is important to have comprehensive monitoring and modelling to understand what impact faults are having; for example, strings of PV modules are electrically connected in parallel so any performance mismatch can adversely affect the entire system in complex ways (Bosman et al., 2020). A PV Digital Twin is thus proposed as a significant aid for deciding, applying, and evaluating this strategy and hence an important advancement.
The Digital Twin is defined as a "realistic digital representation of an asset, process or system" (Bolton et al., 2018). It is an emerging topic that combines sensors, modelling, artificial intelligence (AI) and Information Communication Technology (ICT) for the purpose of monitoring, forecasting, and collaboration of a 'physical' object (Rasheed et al., 2020). Digital Twins have been applied in the AEC community to optimise planning, decision making, and management. Their ability to dynamically update geometric information improves on static BIM methods, particularly in the operation stage (Lu et al., 2020b, Khajavi et al., 2019, Sacks et al., 2020). The vast, interoperable data they capture and utilise has led the UK's National Infrastructure Commission to propose a National Digital Twin: an ecosystem of sub-Digital Twins at all scales of infrastructure (NIC, 2017). The Centre for Digital Built Britain thus proposed the 'Gemini Principles', a series of rules and philosophies for Digital Twins within the National Digital Twin (sub-Digital Twins), and issued a call for the development of sub-Digital Twins (Bolton et al., 2018).
However, a comprehensive Digital Twin for PV plants has not been considered. In addition, the high cost and manual labour effort required to update high fidelity (module level, or greater) PV Digital Twins restricts their capacity. Current solutions cannot provide both detailed monitoring and high degrees of automation (Zefri et al., 2021, Mellit and Kalogirou, 2021). Therefore, this paper will present a cost-effective and high fidelity automated digital twinning process for a utility-scale PV Digital Twin.
The rest of this paper is organised as follows: Section 2 details relevant literature on digital twinning and utility-scale solar monitoring, Section 3 presents a digital twinning system concept for utility-scale solar, Section 4 provides a discussion on the concept and future research, then Section 5 concludes.
2. BACKGROUND
2.1 Digital Twin
2.1.1 Definition
A Digital Twin has 3 main components: the physical entity, the virtual entity, and the connections that link the two together. It is a dynamic and comprehensive system that integrates a variety of data sources (Grieves, 2011). Due to the infancy of Digital Twins in the AEC community, definitions will be briefly reviewed using key terminology presented by Jones et al. (Jones et al., 2020).
Parameters are the factors the Digital Twin has knowledge of. This may include geometric, health, functionality or environmental state data (Jones et al., 2020). The fidelity is the number of parameters used, or the accuracy of the twin. Although some works argue the twin should be near-perfect (Grieves, 2011, van der Valk et al., 2020, Singh et al., 2021), within the context of the National Digital Twin the fidelity should be suited to the Twin's purpose (Bolton et al., 2018, Jones et al., 2020). A state is a snapshot of the values of all parameters at one instant in time. Updating the virtual entity state by sensing the physical entity via a physical-to-virtual connection can be achieved by a variety of methods, which are discussed in the next sections. The opposite direction (virtual-to-physical) facilitates actuation commands. Kritzinger et al. define Digital Twins with solely manual data connections to be Digital Models and those with automatic connections from only physical-to-virtual to be Digital Shadows (Kritzinger et al., 2018). Due to the presence of 'semi-automated' methods, the omission of this distinction from the Gemini Principles, and the focus on the physical-to-virtual connection in this paper, this distinction will not be made. This is not to say the advantages of automation (cost, time, accuracy) are not recognised and clear. The act of synchronising state is known as Twinning and hence its frequency is the Twinning Rate. Many works, partly due to being outside the context of infrastructure, describe this as 'real-time'. However, the rate should depend on the constraints of the state-updating technology and the requirements of the twin.
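As an illustration of the terminology above, the following minimal sketch represents a state as a timestamped snapshot of parameter values and estimates the twinning rate from a sequence of such snapshots. The names `State`, `twinning_rate_per_day` and `fidelity`, and the example parameter keys, are assumptions made for this sketch and do not come from Jones et al. (2020).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class State:
    """Snapshot of all parameter values at one instant (hypothetical structure)."""
    timestamp: datetime
    parameters: dict  # e.g. {"module_17.temperature_c": 41.2}


def twinning_rate_per_day(states):
    """Mean number of state synchronisations per day over the observed span."""
    if len(states) < 2:
        raise ValueError("need at least two states to estimate a rate")
    ordered = sorted(states, key=lambda s: s.timestamp)
    span = ordered[-1].timestamp - ordered[0].timestamp
    if span <= timedelta(0):
        raise ValueError("states must span a positive time interval")
    # n states delimit n - 1 twinning intervals
    return (len(ordered) - 1) / (span.total_seconds() / 86400.0)


def fidelity(state):
    """Fidelity in the 'number of parameters' sense used in the text."""
    return len(state.parameters)
```

Under this reading, an IoT feed and a UAV survey are simply two sources that write states at very different rates, which is why a single 'real-time' requirement is not assumed.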

Updating the Digital Twin
Digital twinning is the process of creating and updating the digital model (virtual entity). This involves data acquisition, transmission, and model creation/updating. These steps are a foremost task in Digital Twins and vital to the success of later layers (Lu et al., 2020a). This can be achieved using an array of methods (potentially in combination), for example, manual methods (perhaps aided by RFID tags or barcodes), IoT devices, laser scanning, or image-based techniques (including photogrammetry) (Khajavi et al., 2019, Volk et al., 2014, Tang et al., 2010, Lu and Lee, 2017). The choice of method will depend on requirements on level of automation, twinning rate, and captured parameters. As will be discussed, two are of particular interest in PV digital twinning: IoT devices and geometric digital twinning.
IoT device approaches use a wired or wireless network of devices that can measure a wide variety of environmental parameters (Khajavi et al., 2019). They have been widely used in Digital Twins, for example to collect pump vibration data for fault detection in a Digital Twin of the West Cambridge Campus (Lu et al., 2020a) and temperature, humidity, and lighting data of building facades (Khajavi et al., 2019). They provide direct measurements (i.e. do not require further processing) at a high twinning rate. However, the initial economic cost and labour can be significant in these systems, and these problems scale linearly with increasing resolution (fidelity) and scale, since there is a fixed price and effort per sensor.
Geometric digital twinning involves the process of as-is geometric modelling. Geometric Digital Models (e.g. BIM) are an essential feature of most Digital Twins; this includes both 3D geometry data and semantic information (Rausch et al., 2021). Aside from traditional, labour intensive methods, there are two approaches to geometric digital twinning: laser scanning and image-based methods (Volk et al., 2014). Laser scanning can quickly produce accurate 3D point clouds, although it requires power-demanding, specialised equipment. Image-based methods use, for example, visible light, RGB-D, or thermographic cameras. They have the significant advantage of being able to detect much greater semantic information, since a laser scanner only measures distance whereas image-based techniques directly measure electromagnetic radiation (Lu and Lee, 2017). Image-based point cloud 3D reconstruction has been shown to be capable of achieving acceptable spatial accuracy for infrastructure related tasks, although normally poorer than that of laser scanning (Fathi et al., 2015).
In general, there are 4 steps for as-is geometric modelling: (1) data capture, (2) 3D reconstruction, (3) semantic processing, (4) geometric modelling (Czerniawski and Leite, 2020, Fathi et al., 2015). A fifth step is then required, (5) Digital Twin updating, which associates objects between each update. Geometric digital twinning can be performed to monitor changes of the physical entity over time, although it is difficult to perform entirely automatically on infrastructure due to the complexity of real-world environments.

Image-based 3D Reconstruction
The process of image-based 3D reconstruction is completed using Structure from Motion and Multi-View Stereo algorithms (SfM-MVS). The typical pipeline is briefly given as: data capture; feature detection and matching; alignment and calibration; dense point cloud generation (Fathi et al., 2015).
The data captured is a series of 2D images taken from different locations. Features are detected and described in these images using a feature detection algorithm, of which there are many (e.g. SIFT (Lowe, 2004), SURF (Bay et al., 2006)). An initial estimate of image pose is then calculated by matching said features, and is then optimised using Bundle Adjustment. The resultant sparse point cloud is then densified using MVS techniques (Fathi et al., 2015).
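The feature matching step can be illustrated with a minimal sketch. It assumes descriptors are plain numeric vectors and uses nearest-neighbour matching with a distance ratio test in the style of Lowe (2004); the function names and the ratio value of 0.8 are choices made for this illustration, not a description of any particular SfM package.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_features(desc_a, desc_b, ratio=0.8):
    """Return (index_a, index_b) pairs that pass the ratio test."""
    matches = []
    for i, d in enumerate(desc_a):
        # Distances from this descriptor to every descriptor in the other image
        dists = sorted((euclidean(d, e), j) for j, e in enumerate(desc_b))
        if len(dists) < 2:
            continue
        (best, j), (second, _) = dists[0], dists[1]
        # Accept only if the best match is clearly better than the runner-up
        if best < ratio * second:
            matches.append((i, j))
    return matches
```

A descriptor with two near-identical candidates in the other image is rejected by the ratio test. This is one way to see why highly repetitive scenes yield fewer usable matches.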

PV Module Monitoring Methods
In order to design a PV Digital Twin and associated digital twinning process, the methods of detecting faults must be examined. Due to aging, mechanical stresses, thermal stresses, or environmental factors, PV modules can experience various faults and inefficiencies such as: cell cracks, delamination, hot spots, shading, soiling, short circuits, and mismatched modules (Köntges et al., 2014, Djordjevic et al., 2014, Carletti et al., 2020). In order to decide when and how a fault is resolved, the fault should be localised and classified accurately in the digital twinning process. This allows the performance impact to be modelled effectively, and thus informs if, where, and how a repair should be made. There are several, potentially complementary, methods of monitoring which are described below.
2.3.1 Electrical Monitoring
By directly measuring electrical information (e.g. power, current, voltage), it is possible to detect faults. This method gives a good understanding of impact (Carletti et al., 2020); however, there is a limited ability to classify the fault (Mellit and Kalogirou, 2021), which leads to less actionable information as the cause may not be clear (Bosman et al., 2020). For example, a work using deep learning of time series data on outputs of PV inverters gave a recall score of 0.92 for predicting anomalies, but could only identify whether an anomaly, serious anomaly, or no anomaly was present (Arafet and Berlanga, 2021). Commercially available IoT based systems use a network of devices to give accurate, real-time information for the detection of faults. However, when deciding on the granularity of monitoring (i.e. how many devices are used), the cost, logistics, power supply issues and set-up effort must be weighed against the benefits of a higher resolution of monitoring, namely more accurate detection and localisation of faults (Daliento et al., 2017). Current high granularity solutions are not economically feasible for utility-scale plants (Mellit and Kalogirou, 2021).

Visual-Based Inspection
Another form of monitoring is visual-based, non-contact inspection, either by visual inspection (i.e. visible light) or thermographic inspection. Both methods can reveal many common PV module faults, with thermographic inspection being particularly efficient and effective (Carletti et al., 2020). In comparison to electrical monitoring, they can identify faults to a significantly greater precision (intra-module level) and classify faults into a wider range of classes. Furthermore, unlike electrical monitoring, they can capture the cause of faults and identify potential future issues (Denio, 2012). Thermographic inspection should adhere to specific requirements:
Electroluminescence inspection is performed by supplying power to the PV module and visually inspecting the module with a specialised sensor (such as a camera with removed infrared filter), ideally in dark conditions (Vidal De Oliveira et al., 2019). Some faults are difficult or impossible to detect visually or thermographically but can be detected using electroluminescence inspection (Alves Dos Reis Benatto et al., 2020).

Automating Visual-Based Inspection
Due to the scale and inaccessibility of assets, Unmanned Aerial Vehicles (UAVs) offer a cost-effective visual and thermographic inspection method for large-scale PV plants. They are capable of a sufficient ground sampling distance (GSD) for deep inspections (Denio, 2012, Hernández-Callejo et al., 2019). Using this technique, it is feasible to achieve 100% module coverage in an inspection, albeit over multiple flights due to UAV battery constraints (Carletti et al., 2020).
Electroluminescence can also be performed by a UAV survey. UAV inspections (visual/RGB and thermographic) still remain a significantly manual process due to the need to analyse the data. There are 4 key tasks towards full automation after data capture: (1) module detection, (2) fault detection, (3) fault classification, (4) module localisation (Daliento et al., 2017). Full automation of these will produce a significantly less labour intensive, costly, and error prone digital twinning process. The work on this is thus reviewed.
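For reference, the GSD of a nadir-pointing camera follows from similar triangles: GSD = (sensor width × flight height) / (focal length × image width in pixels). The sketch below evaluates this relation and its inverse. The function names are illustrative, and any camera values passed to them are hypothetical and not taken from the cited works.

```python
def ground_sampling_distance(sensor_width_mm, focal_length_mm, height_m, image_width_px):
    """GSD in metres per pixel for a nadir-pointing camera."""
    return (sensor_width_mm * height_m) / (focal_length_mm * image_width_px)


def max_height_for_gsd(target_gsd_m, sensor_width_mm, focal_length_mm, image_width_px):
    """Highest flight height (m) that still achieves the target GSD."""
    return target_gsd_m * focal_length_mm * image_width_px / sensor_width_mm
```

For example, a hypothetical camera with a 13.2 mm wide sensor, 8.8 mm focal length and 5472 pixel wide images flown at 30 m gives a GSD of roughly 0.8 cm per pixel. A thermographic sensor with far fewer pixels would need to fly correspondingly lower for the same GSD.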

PV Module Detection
The problem of PV module detection is object recognition; given an image, return the locations of any/all modules. Some works (Gao et al., 2015, Carletti et al., 2020) utilise the structure of PV arrays to build a grid of modules, using a Hough transform to detect vertical and horizontal lines. However, it is not clear how to handle cases of missing modules or modules not arranged in this regular grid, as in (Herraiz et al., 2020). (Dotenco et al., 2016) applies a watershed transform and morphological transforms to first detect rows, and then individual modules. This, however, was done on a small data set with a specific layout of modules. (Addabbo et al., 2017) used a template matching method and a normalised cross correlation similarity measure for thermographic UAV imagery, but only showed a limited set of results. For visual aerial imagery, (Malof et al., 2016) used both a Random Forest Classifier based on local features and a Deep Convolutional Neural Network (CNN). (Wang et al., 2018) instead used an object-based image analysis and template matching method, which has the advantage of not needing a large training set.
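To make the template matching idea concrete, the sketch below slides a small template over an image stored as a list of rows and scores each position with normalised cross correlation. It is a generic, brute-force illustration written for this text, with hypothetical function names, and is not the implementation of Addabbo et al. (2017).

```python
import math


def ncc(window, template):
    """Normalised cross-correlation of two equally sized 2D lists."""
    w = [v for row in window for v in row]
    t = [v for row in template for v in row]
    mw, mt = sum(w) / len(w), sum(t) / len(t)
    num = sum((a - mw) * (b - mt) for a, b in zip(w, t))
    den = math.sqrt(sum((a - mw) ** 2 for a in w) * sum((b - mt) ** 2 for b in t))
    return num / den if den > 0 else 0.0  # flat regions carry no evidence


def best_match(image, template):
    """Slide the template over the image; return ((row, col), score) of the best fit."""
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, -2.0
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            window = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(window, template)
            if score > best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score
```

Because the score is normalised by local mean and variance, it is insensitive to uniform brightness or temperature offsets between images. In a PV plant many positions would score almost equally well, so in practice all positions above a score threshold would be kept rather than the single best one.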

Fault Detection and Classification
The problem of fault detection is object recognition; given an image, return the locations of any/all faults. The problem of fault classification is object classification; given a fault, return the classification of the fault. Fault detection/classification can normally be achieved by statistical methods or deep learning methods. Some works only consider local hot spots, while others simply differentiate between local hot spots and global faults. Global faults cannot be detected on a PV module without the context of other modules on the site (e.g. the entire module is warmer than it should be). (Gao et al., 2015) detected local faults by a simple threshold of intensity above 3 standard deviations, while global faults were detected using a clustering algorithm due to the small data size. The detection rate of local faults, 80%, is poor and only based on a small result set. (Carletti et al., 2020) used a maxima search using a water filling algorithm to find faults without the need for a specific threshold. They also highlighted the issue of junction boxes, which can produce hotter regions on the PV module that are not defects. They removed these false positives by removing the outer section of the detected module, although this will also increase false negatives. (Dotenco et al., 2016) split the module into its component cells and performed statistical outlier tests (Grubbs' and Dixon's Q tests) to detect hot spots, overheated modules and overheated strings of modules. The advantage of statistical tests and rule based classifications is that classifications are more explainable. (Pierdicca et al., 2020, Herraiz et al., 2020, Zefri et al., 2022) use CNN based neural networks. Unlike the previous works, these utilise larger data sets. (Zefri et al., 2022) provides a much greater range of local defect classes, which gives a clearer resolution action. (Pierdicca et al., 2020) use transfer learning to improve training speed and accuracy.
In addition, (Pierdicca et al., 2020, Herraiz et al., 2020) apply their fault detection on the UAV imagery instead of on segmented modules (i.e. results of module detection are not used). This seems to unnecessarily increase the complexity of the task.
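A simple statistical detector of the kind described above can be sketched as follows. It assumes a segmented module is available as a 2D list of temperature or intensity values and flags pixels more than k standard deviations above the module mean. The function name and the default k = 3 are illustrative choices in the spirit of the 3 standard deviation threshold mentioned above, not a reproduction of any cited implementation.

```python
import statistics


def detect_hot_pixels(thermal, k=3.0):
    """Return (row, col) of pixels hotter than mean + k * standard deviation."""
    values = [v for row in thermal for v in row]
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation of the module
    threshold = mean + k * std
    return [(r, c)
            for r, row in enumerate(thermal)
            for c, v in enumerate(row)
            if v > threshold]
```

Because the threshold is derived from the module's own statistics, a uniformly overheated module produces no detections. This illustrates why global faults need the context of neighbouring modules.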

PV Module Localisation
The works previously analysed fail to resolve this challenge sufficiently. Generally, current works cannot automatically and precisely localise a PV module on a digital model of the system, and instead at best provide the GNSS coordinate of the UAV, which is insufficiently accurate in a field of densely packed modules (Mellit and Kalogirou, 2021, Zefri et al., 2021). Works that track modules between images (e.g. (Gao et al., 2015)) ensure faults are not detected twice, but still do not localise the module. Due to the highly repetitive nature of PV plants, this task is tedious and error prone for human analysis. The solution proposed by Tsanakas et al. for module localisation is photogrammetry (Tsanakas et al., 2017) (see Section 2.2). That said, several issues remain.
Firstly, there are several data capture issues that disadvantage photogrammetric algorithms:
1. Highly repetitive patterns. These cause issues with feature matching.
2. Moving reflection artifacts due to reflective PV module surfaces. Moving objects cause issues in SfM as the camera is assumed to be moving and the subject stationary.
3. Restricted image quality. Equipment is restricted to UAV capable devices with correct weight and power demands.
4. UAV height. Capturing closer to PV modules increases the percentage of imagery containing modules. While this improves the resolution of captured modules, it also increases the percentage of imagery with repetitive patterns and reflections.
In thermographic imagery particularly, Zefri et al. found consistent catastrophic failures and distortions using standard UAV inspection and photogrammetric methods (Zefri et al., 2021). Registering spectra (aligning captured RGB visual and thermographic imagery) is possible if both inspections occur in the same flight using an appropriately enabled camera. However, in addition to issues when this is not robust enough (see (Tsanakas et al., 2017)), this would force a significant cost on the process. The further problems specifically for thermographic imagery are:
1. Low resolution. Thermographic cameras have significantly lower resolution than RGB cameras and hence feature detection will detect fewer features.
2. Poor texture. Thermographic imagery has less texture and hence feature detection will detect fewer features.
3. Calibration. If the thermographic sensor re-calibrates, such as in (Cardinale-Villalobos et al., 2020), features are less likely to match because they look different between images.
4. Flight plan. For accurate temperature readings, the camera should be angled correctly to the PV module. This may contradict the desired optimal plan for reconstruction.
Fewer features lead to poorer reconstruction accuracy.
Secondly, solutions to the challenges of module detection, fault detection, and fault classification have not been considered as part of the photogrammetric localisation solution. The output and process of photogrammetry may enable different approaches to these.
Finally, the approach is not framed against the 5 geometric digital twinning steps. The final step of digital twin updating is not immediately clear.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLVI-5/W1-2022 Measurement, Visualisation and Processing in BIM for Design and Construction Management II, 7-8 Feb. 2022, Prague, Czech Republic
Figure 1 shows a PV Digital Twin design suitable for inclusion in a National Digital Twin. Based on the previous literature, a digital twinning process is proposed. The proposed digital twinning process (Data Sources to Digital Model) contains 5 modules. A low granularity electrical monitoring IoT system twins with the Digital Twin frequently; this provides important real-time data whilst remaining low cost. This IoT network includes environmental sensors which provide important context to the other monitoring mechanisms, including irradiance (using a pyranometer), air temperature, and wind speed and direction. APIs and sub-Digital Twins provide further data; examples may be a weather forecasting API or a Digital Twin of the national grid. Manual inspections can provide more specific, ad hoc information such as from electroluminescence inspection. Finally, a high fidelity, module level UAV inspection occurs at a low twinning rate. Section 3.1 details the improved UAV inspection process.

Improved UAV Inspection Method
The UAV inspection module is a geometric digital twinning method and is detailed in Figure 2. It is designed to handle the issues and challenges of aerial surveying of PV plants identified in the previous section. Furthermore, it is designed to be significantly more automated, with full automation suggested as a possibility.
3.1.1 Data Capture
Data capture involves 3 steps. This is fundamentally unchanged from the photogrammetric method in Section 2.4.3. Data can be captured as images or video and be RGB imagery or thermographic, but data fusion (aligned imagery) is not required.

3D Reconstruction
The input to this module is 2D imagery. Image selection will be performed to reduce the image count, optimising between computation time and reconstruction accuracy. This step is automated by utilising the GNSS coordinates from the UAV (prior to bundle adjustment) and the estimated height (from the flight plan) to estimate which images achieve the desired overlap. Any blurry images will also be detected and removed. Image enhancement by histogram equalisation will reduce re-calibration issues on thermographic imagery. Module detection will use the methods described in Section 2.4.1 to find modules in each image. It is proposed that a Hough transform method for detecting PV arrays (grids of PV modules) is not used, so that the process generalises better. Feature detection and feature matching occur as normal. The detected modules can be used as additional features in the SfM process. By utilising the domain-specific knowledge that the detected PV modules can be considered as planes, constraints can be added to the optimisation being solved: the planes will share a homography between a pair of images. Dense point cloud generation will then be performed as usual.
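The image selection step could, for instance, be sketched as below. This is a minimal illustration under stated assumptions: the camera points straight down, GNSS fixes have already been projected into a local metric frame, and the along-track field of view is known. The function names, the greedy strategy and the default overlap of 0.8 are hypothetical choices, not part of the proposed method's specification.

```python
import math


def footprint_length_m(height_m, fov_deg):
    """Ground length covered by one image along the flight direction (nadir view)."""
    return 2.0 * height_m * math.tan(math.radians(fov_deg) / 2.0)


def select_images(positions, height_m, fov_deg, overlap=0.8):
    """Greedily keep images spaced to give roughly the requested forward overlap.

    positions: list of (x, y) in metres, in capture order, e.g. GNSS fixes
    projected to a local metric frame (the projection is outside this sketch).
    Returns the indices of the images to keep.
    """
    if not positions:
        return []
    # Two images overlap by `overlap` when their centres are this far apart
    spacing = (1.0 - overlap) * footprint_length_m(height_m, fov_deg)
    kept = [0]
    for i in range(1, len(positions)):
        if math.dist(positions[i], positions[kept[-1]]) >= spacing:
            kept.append(i)
    return kept
```

Lowering the overlap parameter trades reconstruction robustness for computation time, which is the optimisation referred to above. A real implementation would also need to account for side overlap between flight lines.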

Semantic Modelling
Every detected module in all images will be projected into the point cloud at each corner. If the distance between two modules is within a given threshold, they can be considered to match and be placed in the same group, as they are the same module viewed in different images.
The number of groups should equal the number of individual modules inspected, and the size of each group should equal the number of times that module was observed in all images.
The localised module position is then the average module of each group, based on the projected corner positions. This stage may require thresholds on size and position in 3D space to remove spuriously detected modules or incorrectly aligned images. Fault identification and fault classification will be performed using statistical methods as described in Section 2.4.2.
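The grouping and averaging described above can be sketched as follows. The sketch assumes each detection has already been projected into the point cloud as four 3D corners in a consistent order, and it uses a simple greedy grouping on corner centroids. The function names and the greedy strategy are assumptions for illustration; the text does not prescribe a particular clustering algorithm.

```python
import math


def centroid(corners):
    """Mean of a module's projected 3D corner points."""
    n = len(corners)
    return tuple(sum(p[k] for p in corners) / n for k in range(3))


def group_detections(detections, threshold_m):
    """Greedy grouping: detections whose centroids lie within threshold_m of a
    group's running mean centroid are treated as the same physical module.

    detections: list of modules, each a list of four (x, y, z) corners.
    Returns a list of groups, each a list of detection indices.
    """
    groups, means = [], []
    for idx, corners in enumerate(detections):
        c = centroid(corners)
        for g, m in enumerate(means):
            if math.dist(c, m) <= threshold_m:
                groups[g].append(idx)
                n = len(groups[g])
                # Update the running mean of the group's centroid
                means[g] = tuple(m[k] + (c[k] - m[k]) / n for k in range(3))
                break
        else:
            groups.append([idx])
            means.append(c)
    return groups


def localise(detections, groups):
    """Average corner positions within each group (corner order assumed consistent)."""
    located = []
    for members in groups:
        avg = [tuple(sum(detections[i][j][k] for i in members) / len(members)
                     for k in range(3))
               for j in range(4)]
        located.append(avg)
    return located
```

The threshold should be well below the module spacing. Groups with unusually few members are candidates for the spurious detections mentioned above.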
3.1.4 Twin Updating
In order to update the twin, a point cloud registration is proposed using the iterative closest point (ICP) algorithm (Besl and McKay, 1992). This will allow module matching by an appropriate distance threshold.
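The updating step can be illustrated with a deliberately restricted sketch. It operates on 2D module centres rather than full point clouds and estimates only a translation, whereas the ICP algorithm of Besl and McKay (1992) also estimates rotation on 3D data. All function names are hypothetical, and the simplification is an assumption made to keep the example short.

```python
import math


def nearest(point, candidates):
    """Index and distance of the candidate closest to point."""
    j = min(range(len(candidates)), key=lambda k: math.dist(point, candidates[k]))
    return j, math.dist(point, candidates[j])


def icp_translation(source, target, iterations=20, tol=1e-9):
    """Estimate the 2D translation aligning source onto target.

    Restricted ICP variant: rotation is assumed negligible, so each iteration
    pairs every source point with its nearest target point and shifts the
    source by the mean residual.
    """
    tx, ty = 0.0, 0.0
    for _ in range(iterations):
        moved = [(x + tx, y + ty) for x, y in source]
        pairs = [(p, target[nearest(p, target)[0]]) for p in moved]
        dx = sum(q[0] - p[0] for p, q in pairs) / len(pairs)
        dy = sum(q[1] - p[1] for p, q in pairs) / len(pairs)
        tx, ty = tx + dx, ty + dy
        if math.hypot(dx, dy) < tol:
            break
    return tx, ty


def match_modules(source, target, translation, threshold_m):
    """Map each newly surveyed module index to an existing twin module index, or None."""
    tx, ty = translation
    result = {}
    for i, (x, y) in enumerate(source):
        j, d = nearest((x + tx, y + ty), target)
        result[i] = j if d <= threshold_m else None
    return result
```

Modules mapped to None after registration are candidates for newly installed or previously undetected modules. Like any ICP variant, the sketch only converges to the right answer when the initial misalignment is small relative to the module spacing.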

DISCUSSION
The proposed twinning concept provides a combination of data inputs with varying degrees of automation, cost, and granularity. It is suggested that this provides the advantages of each while minimising their disadvantages. The process also does not require a significant alteration to current monitoring methods.
In addition, the improvements to the UAV inspection aim to improve automation levels. Increasing the robustness and accuracy of the 3D reconstruction stage by utilising context specific information will allow PV module localisation to become trivial. This removes a large, tedious manual effort in fault monitoring, and also means every module can be stored in the Digital Twin for historical data analysis. Historical data on every module will enable greater data analysis for the prediction of faults and their progression. This is a key benefit of a Digital Twin and is important for understanding the aging effects of modern PV plants.
The remaining manual tasks are in the data capture stage. Once a flight plan is created, drone-in-a-box technology should allow this manual step to be removed. A UAV could be stationed at the PV plant and perform inspections on a schedule or on instruction, such as when the AI on the Digital Twin detects a change in electrical monitoring output.
Faults will be detected using statistical methods as these are more explainable than deep learning methods. For example, standardised fault definitions could be more useful in a warranty claim case.
This work has considered improving digital twinning through advances after data acquisition, using the photogrammetric method. Alternatively, improvements in the data capture process by adding intelligence to the drone could be considered. Advantages of photogrammetry include: distortion of imagery is inherently removed, it works easily with off-the-shelf equipment, it fits closely with current practices, and it captures more information, such as PV module angles and vegetation height, which is useful for PV Digital Twin functions like shading simulation or scheduling maintenance.

Further research
This work has highlighted the need for further research in several areas. Principally, the proposed novel PV geometric digital twinning process will be developed and tested on a selection of PV plants. PV Digital Twins should be developed to improve monitoring and intelligent maintenance, and also to expose data publicly to enable them to work within a National Digital Twin. This will require digital twinning advances as well as the development of AI, human interactions and machine-to-machine communication, and interoperable simulation programs.

CONCLUSION
Current monitoring methods for utility-scale solar cannot capture PV module level detail without an extreme manual effort. There is a need for a PV Digital Twin and a cost-effective, automated PV digital twinning process for the detection and localisation of module faults. In order for a PV Digital Twin to have a high fidelity and sufficient twinning rate, improvements to these processes must be made to reduce the dependency on manual localisation of faults. The key contribution of this work has been to propose an improved UAV inspection framework based on geometric digital twinning. In order to determine the effectiveness of the framework, it will be developed and tested in future work.