INVESTIGATING STANDARDIZED 3D INPUT DATA FOR SOLAR PHOTOVOLTAIC POTENTIALS IN THE NETHERLANDS

This paper presents our contribution to the development of a standardized 3D input data model for solar photovoltaic potential estimation. Presently, different input data and processing steps influence the calculation for estimating the potential of solar energy in the Netherlands. The variety in characteristics of input data and issues with temporal accuracy extracted from the national registers and databases makes it challenging to obtain a consistent and reliable result. To address this issue, we created a point cloud dataset that integrated from LiDAR point cloud and dense image matching which is complete, recent and positionally accurate. Furthermore, we made a 3D building model from the integrated point cloud and identified the effect of finer resolution in the photovoltaic potential analysis.


INTRODUCTION
Taking part in the global effort to develop an energy economy that is safe, reliable, and affordable, The Netherlands adopted the 'Energy Agreement for Sustainable Growth' in 2013 (Ministry of Economic Affairs of the Netherlands, 2016). In that energy policy, the Dutch cabinet has defined three main targets: (1) prioritize CO2 reduction; (2) optimize economic opportunities of the energy transition; (3) include energy transition targets into spatial planning policy. In general, solar energy is considered to be one of the key renewable energy sources to achieve these transition targets (Paardekooper, 2015). In his study, Paardekooper (2015) points out that presently the potential of solar energy is not well utilized in the Netherlands. However, data available to assess the potential are often inconsistent. To increase the utilization of solar energy, incorporating 3D models in GIS analysis can be used for assessing the potential of solar photovoltaic (Alam, Coors, & Zlatanova, 2013).
The main advantages of utilizing 3D models are (1) the ability to predict shadow casting, (2) visualize the vertical variations and (3) improve the communication process between users and the professionals to gain a better understanding of the presented information (Kurakula & Kuffer 2008). Crucial in creating 3D models are the properties of the input data and the processing steps. In The Netherlands, there is an extensive collection of geodata that can be used for the analysis of solar photovoltaic potential, e.g., BAG 2 , AHN3 3 , and point clouds derived from dense image matching. Each input data has its characteristics, which making it challenging for 3D modelling purposes. Kadaster as a main data provider in The Netherlands acknowledges this situation and they were interested in more detailed research in 3D input data input synchronisation in * Corresponding author 2 https://www.pdok.nl/introductie/-/article/basisregistratie-adressen-en-gebouwen-ba-1 3 https://www.pdok.nl/introductie/-/article/actueel-hoogtebestand-nederland-ahn3-addition to the already developed Dutch 3D national standard. Inspired by the noise modelling standard (see Peters et al., 2018), this study develops a standardized 3D input data model to estimate the solar photovoltaic potential for the Netherlands. Moreover, the shortcomings of the current 3D input data and its impact on 3D analysis are also investigated. In this context, we explore the opportunity to combine point cloud from LiDAR and point cloud derived from dense image matching for reconstructing a precise 3D building models.

Study area and data description
The inner city of Zwolle is selected for this study (Figure 1). This protected area of Zwolle has to deal with the line of sight regulation when applying solar photovoltaic installation (Boschman, 2017). The inner city of Zwolle is characterized by mixed residential and commercial buildings with diverse structures. Therefore it is a suitable study area for experiments with 3D data.  For the next part, a dense image matching point cloud is referred to as DIM (Dense Image Matching). The overview of the methodology applied in the current study is shown in Figure 2.

Pre-processing point cloud datasets
The pre-processing workflow is based on the explanations of data quality elements (ISO, 2013). Following ISO 19157:2013(ISO, 2013 there are six elements for spatial data quality, completeness, thematic accuracy, logical consistency, temporal quality, positional accuracy, and usability element. The elements used in this research are (1) completeness; (2) temporal quality; and (3) positional accuracy. Each element was translated into several techniques and implemented for the pre-processing step. It consists of four components and is described in section 2.2.1 -2.2.4.

Visual data check for completeness
Derived from the explanations of completeness from (ISO, 2013), completeness can be measured by "is there any unmatched data?" or "how complete the point clouds compare with the ground truth?". Therefore for this element, visual inspection by comparing two datasets was done as a first screening. Afterwards, statistic calculation with a change detection technique was carried out as part of the point cloud registration process (Section 2.2.4). Additional datasets as ground truth used Cyclomedia and Google Maps, which consist of street view imagery data.

Point density calculation
Oude Elberink & Vosselman, (2011) argued that the starting point to analyse the quality of 3D models is by determining the quality of input data. In this study, point clouds have been used as input data. Therefore point density is an important property to decide the quality of point cloud. The point density is calculated with Equation 1. A higher point density means lower values for point spacing (Esri, (n.d.)). (1) Equation 1. Formula to calculate point density from point spacing, adopted from Esri, (n.d.).

Point clouds classification
Point clouds classification is a process to assign points to predetermined classes. The classes applied were following the standard classification scheme from The American Society for Photogrammetry & Remote Sensing (2011). The point cloud datasets used are successfully classified into 4 classes: (1) unclassified; (2) ground; (3) building; (4) water. Because the main interest of this research is building, a separate dataset is created by subtracting the points classified as building.

Point clouds registration
This process is the second component to check the completeness element, the temporal quality and positional accuracy of the data.  (Myronenko & Song, 2010) and Normal-Distributions Transform (NDT) algorithm (Biber & Wolfgang, 2003). One of the widely used method to register two datasets point clouds is Iterative Closest Point (ICP) (Gelfand, Ikemoto, Rusinkiewicz, & Levoy, 2003), which was firstly introduced by Besl & Mckay (1992).
The principle of this relative positioning algorithm is to find corresponding points between two point cloud datasets ( Figure  3). In details, the algorithm steps are as follows: 1. For each point in the source (dense image matching) point cloud, find the closest point in the reference (LiDAR) point cloud. 2. Estimate the combination of rotation and translation with a mean squared error function that will best align each source point to found its match. 3. Transform the source points using the obtained transformation matrix. 4. Iterate the steps until the point clouds are aligned. The output of this algorithm provides a transformation matrix and a roughness value. This transformation matrix is used to transform the source point into a reference point, AHN3. The roughness value (mean and standard deviation) is equal to the distance of the point and the best fitting plane in the neighbouring points (Girardeau-Montaut, n.d.; Sirmacek & Lindenbergh, 2014). This value represents the distribution of the distances calculated between two point cloud datasets. According to Ahmad Fuad et al. (2018), this algorithm is the most suitable method to register point cloud datasets. Similar work for The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII- B4-2020XXIV ISPRS Congress (2020 integrating point clouds from LiDAR and dense image matching point clouds were performed in other paper (see (Kaartinen et al., 2005;Rottensteiner et al., 2014).

3D building model generation
The utilization of a 3D model for GIS analysis is widely used for many applications. Besides gaining a better understanding, it also stimulates the involvement of stakeholder (Onyimbi, Koeva, & Flacke, 2018). According to Biljecki et al., (2015), determined by the visualization aspect, the solar irradiation analysis falls into non-required visualization use cases. That is a use case that is not essential to visualize the result of GIS analysis in 3D to achieve the purpose of the use case. However, it needs 3D as its main input data.
The 3D building model was generated semi-automatic with the combination of integrated point clouds, rooflines, building polygons (BAG). The rooflines were manually digitized. Afterwards, the rooflines and building polygon were merged to segment the building polygon based on each roofline while maintaining the building ID. The result of this step is planar patches. The process was carried out using the RANSAC algorithm to segment the point clouds into planes.
RANSAC algorithm is well known to detect primitive shapes in both 2D and 3D (Schnabel, Wahl, & Klein, 2007). This algorithm was firstly introduced by (Fischler & Bolles, 1980) and consist of three parameters: (1) error tolerance to determine a point is compatible with the fitting plane or not, (2) number of subsets to try, (3) the threshold. The algorithm starts with randomly selecting a minimal subset of n points and estimating the corresponding fitting shape parameters. The remaining points are tested with the resulting candidate shape to see how many points that fit the candidate shape. After a certain number of iterations, the shape that has the largest percentage of inliers is extracted and the algorithm continues to process the remaining data.
Later on, the integrated point clouds were assigned to the planar patches. After this step, each planar patch was reconstructed. Afterwards, the planes were combined with the corresponded extruded building polygons to produce a full 3D building model.

Solar photovoltaic analysis
In general, solar photovoltaic analysis is determined from the areas with maximum solar irradiation on the rooftop. Solar irradiation is the amount of solar radiation received by the sun per unit area by a given surface area. Freitas et al. (2015) demonstrated a comparison from various approaches for solar photovoltaic analysis. The approach consists of three steps ( Figure 4).
First, as input data topographic and meteorological data is used. Topographic data such as geographic location, height information and urban elements are the main components.
Secondly, a solar radiation model with GIS analysis is used. This stage is analysing the effect of the sun over a specific geographic location with a time interval range. Thirdly is the visualization of the output. In this stage, the influence of different potential levels plays a role. Depending on the type of potential and user queries that take place as a threshold of suitability criteria, the outcome will differ.
Explained by Bódis, Kougias, Jäger-Waldau, Taylor, & Szabó, (2019); Freitas et al., (2015); Mainzer et al., (2014) there are four levels to determine the type of potential. The economic potential takes into account economic factors such as return on investment, payback time, and production revenue (Bódis et al., 2019;Mainzer et al., 2014;Paardekooper, 2015). The technical potential that takes into account the technical characteristics of the equipment, including the performance and efficiency of the photovoltaic modules (Bódis et al., 2019;Freitas et al., 2015;Mainzer et al., 2014). The third is physical potential, which is the maximum amount of solar energy in a geographical region without considering any limitations (Freitas et al., 2015). However, another term is used for this level of potential, such as theoretical (Mainzer et al., 2014) and resource (Bódis et al., 2019), but the concept is the same. The last is geographical potential, which considers the restrictions of the location (Freitas et al., 2015;Mainzer et al., 2014). In this research, we focused on two potentials. First, the physical potential to calculate the solar irradiation for the whole study area. Secondly, we calculate the geographic potential to focus on finding locations where energy can be captured.
In our study, the generated 3D building models were converted into raster because the main input of the tool is DSM. The DSM is generated with different pixel size (0.2m and 0.5m) as part of the experiment. The solar irradiation analysis was done for the whole year of 2019 with 0.5 as hour interval. The amount of solar irradiation is shown in pixel values in units of kWh/m 2 .
Afterwards, the suitability factor was defined as part of solar geographic potential estimation. In general, there are three parameters for suitability analysis of solar geographic potential. Roof orientation, the amount of solar radiation, and slope ( Figure  5). In this study area, the line of sight regulation as an additional parameter for suitability analysis is added. Figure 4. The sequential process to assess solar photovoltaic potential, adopted from (Freitas et al., 2015).
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B4-2020, 2020 XXIV ISPRS Congress (2020 edition) Figure 5. Parameters for suitability analysis of solar geographic potential.
For suitability analysis, all images were converted into binary raster images taking the criteria applied in Table 2

RESULTS
The results are explained in the following subsections. The first subsection shows the result of pre-processing point cloud datasets and in the second subsection, the result of the generated 3D building models are shown. The third subsection shows the output of the experiment, which included the 3D building model conversion and the result of solar photovoltaic potential estimation.

Pre-processing point cloud datasets
The first set of questions aimed to determine the quality of the input data derived from data quality elements, completeness, temporal quality and positional accuracy. This two dataset, later on, were classified starting with ground classification and followed by building classification. This process also removed noise from the point cloud. Class code values of 0 (never classified), 1 (unclassified), and 6 (building) will be evaluated to determine if those fit the characteristics of building rooftop. If they do not meet the criteria, then point clouds will be assigned to a class code value of 1 (Esri, n.d.). Both point clouds were classified into four classes: (1) unclassified; (2) ground; (3) building; (4) water.
Next, each point clouds were thinned to extract code class value 6 and to derive a consistent density to obtain a regularly spaced point returns. As expected, the amount of point cloud decreased to 6.223.009 and 4.619.911 points for dense image matching point cloud and LiDAR point cloud, respectively. However, after visual inspection, both point clouds contain false negative ( Figure 6). The method recognizes boats as buildings. This falsenegative caused by the definition of the smallest area size of the building during the classification process. Therefore, those points were reclassified to code class 1 (unclassified).
(a) (b) Figure 6. False-negative from thinning result from (a) DIM and (b) AHN, boat detected as building.
Afterwards, both datasets were registered using the ICP method as explained in section 2.2.4. After both datasets were registered, differences were analysed by determining the point to point distances for x, y, and z component focusing on the z value. This was done since the LiDAR point cloud has irregular points and the DIM point cloud have a regular point, so it is incomparable in x and y component. In Figure 7 and Table 4 we presented the mean distance (µ) and standard deviation (σ) from point to point distance with colour codes in the meters. The source point is DIM and the reference point is LiDAR. The red and blue colour represents source points that have a higher distance to the reference point. White colour represents the source point that is close to or overlaps.  Table 4. Mean distance (µ) and standard deviation (σ) of point to point distance calculation of two dataset.
Recognized by a large number of standard deviation in the z component, we noticed there is a difference between the two datasets. It can be assumed this is caused by unmatched areas. These unmatched areas furthermore were checked using image sources from Cyclomedia 4 and Google Maps 5 . From those resources, we noticed there are four unmatched areas because the building shape has been changed. As a result of this process, we can conclude that the difference is due to different time acquisitions.
To create the most current dataset and achieve the aim of the temporal quality element, during the registration of both point clouds we discard the LiDAR point cloud in the unmatched areas and used the DIM point cloud in those areas.

3D building model generation
The 3D building model generation successfully generated different primitive roof types, such as gabled, hipped and mansard as shown in Figure 8. The sequence of each process carried out with this technique is presented in Figure 9.
The first image Figure 9 (a), is the building polygon (BAG). The rooflines consist of ridge lines (white) and height jumps (yellow) They were assigned and merged to segment the building polygon which resulted in planar patches Figure 9 (b). The point clouds were continuously iterated with RANSAC algorithm to fit the candidate shape and the confidence parameter until they reach the consensus Figure 9 (c). Afterwards, the building polygons were merged again with the segmented roof from Figure 9 (c), to generate a full 3D building model in Figure 9 (d). (c) (d) Figure 9. The sequence of 3D model generation.

Experiment
To assess the effect of finer resolution for solar radiation analysis, we converted the 3D building model into two different pixel sizes, 0.2m and 0.5m. Also to identify whether the 3D building model is useful for this analysis we generated DSM from the integrated point clouds. This process generated four datasets, converted 3D building model in 0.2m and 0.5m pixel size and DSM in 0.2m and 0.5m pixel size. These four datasets were used as the input data for physical potential calculation.
In this subsection, the converted 3D building model and the result of the physical potential calculation are displayed. Figure 10 shows the result of the converted 3D building model and DSM. Figure 11 shows the result of the physical potential calculation for one year. The color range from blue to red represents the amount of minimum to maximum solar irradiation on the roof in pixel values in units of kWh/m2. The area receives annual maximum irradiation in the range of 1032,79 -1033,32 kWh/m 2 according to the calculation depends on the pixel size.
The final output presented in Figure 12 is the result after suitability analysis parameters in Table 2 are applied. (c) (d) Figure 10. The input data to calculate physical potential calculation with different pixel size left row is 0.2m and right row is 0.5m. Image (a) and (b) is the converted 3D building model and image (c) and (d) is the integrated point cloud converted into raster.
(a) (b) (c) (d) Figure 11. The result of the physical potential calculation with the converted 3D building model at the first row and the DSM at the second row. Pixel size 0.2m and 0.5m for picture (a), (c) and (b), (d) respectively. Figure 12. The final result after applying the suitability analysis parameters explained in Table 2.

DISCUSSION
The present study aimed to create a standardized 3D input data model to estimate the solar photovoltaic potential for the Netherlands. We explored the opportunity to combine point clouds derived from LiDAR and dense image matching. The integration of these point clouds is to achieve input data that is complete, recent and positionally accurate in alignment to ISO 19157:2013 of spatial data quality (ISO, 2013). The integrated point clouds are used as input data for semi-automatic 3D building modelling, including the building polygons and rooflines. This method is applicable to generate different roof types. Further work is currently done to develop a fully automatic method for generating a 3D building model. Challenges that need be better addressed in the future are related to the deviation position between the building polygons and the rooflines. Presently, it is difficult to decide which input data needs to be followed. In this case, we set the building polygon as a benchmark.
It can be seen from the experiment, that the result of solar radiation analysis carried with a 3D building model done following our methodology produces a much smoother result compare with the previously applied methods in Kadaster using only DSM derived from the point cloud. However, roof details such as dormers to quantify available surface area were hard to be obtained using this 3D model technique. Therefore for future work, image recognition technology could be utilized to detect objects on top of the roofs. This technology has already been applied to detect damages on the roads (Angulo, Vega-Fernández, Aguilar-Lobo, Natraj, & Ochoa-Ruiz, 2019) and rooftop segmentation (Collier et al., 2019).
The result of our experiment in Figure 10 (c) shows that using DSM with 0.2 m resolution generated directly from the integrated point clouds can detect and visualize roof details. Results show that solar radiation analysis is affected by resolution, which is in line with the result obtained from (Peronato, Rey, & Andersen, 2018). When analysing the results for the four datasets, we noticed that they are relatively constant. However, the choice of utilizing the 3D building model for solar photovoltaic analysis can be motivated by the fact that it could avoid data gaps and noise that likely happened if using DSM. For this aspect, it is interesting to introduce the user perceptions and requirements to qualify if the quality meets their needs.

CONCLUSION
This study contributes to the investigation of standard 3D input data for solar photovoltaic in the Netherlands. Furthermore, this study successfully explored the opportunity to produce a complete, recent, and positionally accurate point cloud dataset by integrating dense image matching point cloud and LiDAR.
The presented method allows constructing semi-automatic 3D building models from the integrated point clouds, building polygons and rooflines. The 3D building model supports the assessment of the solar photovoltaic potential. The further automatization of the 3D building model generation, which will be valuable to scale up the method, is presently under progress.
From the experiment conducted in Zwolle, the Netherlands, comparing two datasets with two different pixel size to see the effect of the finer resolution, the results showed that the finer pixel resolution has influence on the solar photovoltaic potential analysis. The current study provides an improved methodology to support the Kadaster in the Netherlands for estimating the solar photovoltaic potential.