A BUILDING CHANGE DETECTION METHOD BASED ON A SINGLE ALS POINT CLOUD AND A HRS IMAGE

In common remote sensing image change detection based on images from different time phases, it is difficult to handle the surface tilt caused by differing shooting angles and times, which hinders accurate registration. Some scholars have used three-dimensional data for change detection to avoid the registration problem; however, using 3D data to detect changes is very costly. To address this problem, this paper proposes a method that combines a single phase of ALS (Airborne Laser Scanning) data with an HRS (High Resolution Satellite) image. It consists of the following four steps: (1) The shadow area of the new-phase optical image is extracted and superimposed on the classified old-time point cloud data, and the disappeared building areas and the approximate building areas are then obtained from the association between shadow areas and buildings. (2) The unchanged building areas are taken as positive samples, and the area remaining after removing the approximate buildings from the image, together with the vegetation area, is taken as negative samples; the GrabCut algorithm is then used to segment the new-phase HRS image to obtain building areas. (3) The building areas obtained in the previous step are compared with the registered old-time ALS data to obtain a new building area with noise. (4) The result of the previous step is denoised to obtain the final new building area. Two datasets are used to verify the method. The detection accuracy of the disappeared buildings is over 85%, and the detection accuracy of the newly added buildings is over 70%.

* Corresponding author: Yunsheng Zhang (zhangys@csu.edu.cn)


INTRODUCTION
As a result of urbanization, many buildings are constructed or demolished each year. Automatically monitoring building changes is very important for the government. If a field survey were carried out for each building, it would inevitably incur an undesirable cost in manpower and material resources. Therefore, it is important to complete urban building change detection automatically and at a lower cost. Li (2013) pointed out that automation and real-time tracking are needed in change detection. As a result, automating building change detection has become a hot research topic.
During the past decades, various building change detection methods have been developed. Most scholars directly use the spectral and texture information of the buildings themselves on remote sensing images for change detection. For example, Tao et al. (2017) proposed a building change detection method based on the Morphological Building Index (MBI); Zhang et al. (2018) proposed a method combining pixel-level and object-level building change detection for high-resolution remote sensing images; Meng et al. (2008) proposed a building change detection method based on similarity calibration; Yu et al. (2018) proposed a multi-feature building change detection method combining MBI, texture features and edge features. In high spatial resolution images, however, a building roof shifts in different directions in different images, so change detection based directly on image comparison is sensitive to the accuracy of image registration. Some scholars have combined other data sources for building change detection. José et al. (2013) applied the support vector machine (SVM) classification algorithm to joint satellite and laser datasets for building change detection. Tian et al. (2019) fused multi-spectral images and Digital Surface Models (DSM) and used a reliability function to conduct building change detection reliably.
However, in existing methods, the accuracy of change detection that uses only the spectrum and texture of the buildings themselves on remote sensing images is generally not high, and when other data are combined, there are problems such as large data volume and complex models.
To solve the above problems, this paper proposes a method that combines the classified old-time point cloud data and the new-time optical image for building change detection. The workflow of the proposed method is shown in Figure 1. Firstly, the shadow area is extracted from the remote sensing image, and the image is then superimposed on the classified and registered point cloud data to obtain the disappeared buildings and the unchanged buildings. Secondly, the unchanged buildings are treated as positive samples, and the combination of the remaining area (after removing the approximate buildings) and the vegetation areas is treated as negative samples. After that, the GrabCut algorithm is used to classify the new-phase remote sensing image. Finally, the unchanged buildings in the classification results are removed to obtain the newly added buildings.

Figure 1. The workflow of the proposed method

METHOD

Shadow extraction
Due to obstructions, the radiation energy of the radiation source (the sun) cannot reach certain areas on the ground. These areas are the shadow areas on remote sensing images, and they are usually associated with obstructions such as buildings. Therefore, there are shadow areas near the building areas, and the positional relationship between a building area and its shadow area is related to the azimuth of the sun. Based on this observation, this paper uses the positional relationship between buildings and shadow areas to detect the disappeared buildings and, at the same time, to obtain the approximate building areas. The shadow extraction method includes the following four steps.

First, in visible remote sensing images, most of the radiant energy comes from sunlight, and the chromaticity of a shadow area should be the same as when it is directly illuminated. Therefore, neither the shadow area nor the high-brightness area will be affected by the normalized colour space (Xu et al., 2005). We thus characterize the shadow feature f1 by the difference between the original colour space and the normalized colour space of the image.

(1) (3)

Second, because of the obstructions, shadow areas show lower brightness on the optical remote sensing image, so brightness can be used as the feature f2. There are many ways to calculate brightness, such as the mean value method, the maximum value method, and calculation in the HSV space, but these methods do not take into account the sensitivity of the human eye to the three RGB components. Therefore, the following brightness calculation is designed according to the different sensitivities of the human eye to the three RGB components (Liu, 2011).

(4)

Third, gaps between leaves produce spot-like shadow areas within the vegetation area. It is therefore necessary to construct a feature of the vegetation area so that these spots can be removed and a purer shadow area obtained. In the visible light range, vegetation is generally green, so it can be characterized by the feature f3 as calculated by equation (5) (Shen, 2014).

(5)

Finally, in the comprehensive decision step, the above analysis gives the following behaviour. For feature f1, the shadow area and the highlight area present larger values. For feature f2, the shadow area and the vegetation area present smaller values, and the highlight area presents larger values. For feature f3, the vegetation area shows a larger value, and the other parts show smaller values. Therefore, the following formula is constructed as the final decision item (the three features are normalized to [0, 1] before calculation), and Otsu threshold segmentation is then used to obtain a better shadow extraction result.
Here, α, β, and λ are the weights of the three features, which are adjusted by trial and error.
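The decision step can be illustrated with a minimal sketch. The Otsu threshold below is the standard between-class variance criterion; the linear combination in `combine_features` (including its signs and the names `alpha`, `beta`, `lam`) is only an assumed stand-in chosen to match the qualitative behaviour of f1, f2 and f3 described above, and is not the decision formula of this paper.

```python
import numpy as np

def combine_features(f1, f2, f3, alpha=1.0, beta=1.0, lam=1.0):
    """Hypothetical decision map: high f1, low f2 and low f3 suggest shadow.

    The linear form and signs are assumptions for illustration only.
    """
    d = alpha * f1 - beta * f2 - lam * f3
    span = d.max() - d.min()
    # Rescale to [0, 1] so that the histogram range below is valid.
    return (d - d.min()) / span if span > 0 else np.zeros_like(d)

def otsu_threshold(values, bins=256):
    """Threshold in [0, 1] that maximizes the between-class variance."""
    hist, edges = np.histogram(values.ravel(), bins=bins, range=(0.0, 1.0))
    prob = hist.astype(np.float64) / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(prob)                  # weight of the lower class
    w1 = 1.0 - w0                         # weight of the upper class
    mu0_sum = np.cumsum(prob * centers)   # cumulative first moment
    mu_total = mu0_sum[-1]
    valid = (w0 > 0) & (w1 > 0)
    between = np.zeros_like(w0)
    between[valid] = (mu_total * w0[valid] - mu0_sum[valid]) ** 2 / (
        w0[valid] * w1[valid]
    )
    k = int(np.argmax(between))
    return edges[k + 1]

def shadow_mask(f1, f2, f3, alpha=1.0, beta=1.0, lam=1.0):
    decision = combine_features(f1, f2, f3, alpha, beta, lam)
    return decision > otsu_threshold(decision)
```

In practice the three weights would be tuned per image, as described above.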

Disappeared building extraction and positive sample selection for GrabCut
The disappeared buildings are those that exist on the old-time point cloud data, but do not exist on the new-time remote sensing images. This paper uses the method shown in Figure 2 to extract the disappeared buildings and select positive samples for the subsequent GrabCut algorithm (Rother et al., 2004).
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2022 XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France

Firstly, the registered point cloud data and the shadow extraction results of the remote sensing image are superimposed, and then, for each building, an adjustable "probe" (10 pixels long in this paper) is extended along the direction opposite to the azimuth of the sun. If the probe touches a shadow, the building is considered unchanged and is regarded as a positive sample for the GrabCut algorithm. If the probe does not touch a shadow, the building is considered to have disappeared.

Figure 2. Flow chart for the extraction of disappeared buildings and the positive sample selection of the GrabCut algorithm
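The probe test can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in this paper: the image is assumed to be north-up, the sun azimuth is assumed to be measured clockwise from north, and the function name and the per-pixel stepping scheme are hypothetical.

```python
import numpy as np

def probe_touches_shadow(building_mask, shadow_mask, sun_azimuth_deg, probe_len=10):
    """Return True if a probe cast away from the sun reaches a shadow pixel.

    Assumes a north-up image (rows grow southward, columns grow eastward)
    and a sun azimuth measured clockwise from north.
    """
    building_mask = np.asarray(building_mask, dtype=bool)
    shadow_mask = np.asarray(shadow_mask, dtype=bool)
    az = np.deg2rad(sun_azimuth_deg)
    # Unit step pointing away from the sun, expressed in (row, col).
    d_row, d_col = np.cos(az), -np.sin(az)
    rows, cols = np.nonzero(building_mask)
    h, w = shadow_mask.shape
    for step in range(1, probe_len + 1):
        r = np.rint(rows + step * d_row).astype(int)
        c = np.rint(cols + step * d_col).astype(int)
        inside = (r >= 0) & (r < h) & (c >= 0) & (c < w)
        r, c = r[inside], c[inside]
        # Only count shadow pixels that lie outside the building footprint.
        if np.any(shadow_mask[r, c] & ~building_mask[r, c]):
            return True
    return False
```

A building for which the function returns True would be kept as a positive sample, and one for which it returns False would be labelled as disappeared.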

Negative sample selection
As analyzed in Section 2.1, buildings and shadows accompany each other. Therefore, the area within a certain range of each shadow area along the direction of the sun's azimuth is the approximate area of a building. Removing these approximate building areas from the remote sensing image leaves a pure negative sample area. However, due to the presence of vegetation shadows, these approximate building areas are mixed with many vegetation areas, so the selected negative samples would contain little or no vegetation. It is therefore necessary to extract part of the vegetation separately as additional negative samples. This paper uses a visible-light vegetation index for remote sensing images, calculated by equation (9), to extract vegetation (Meyer and Neto, 2008). The index is then compared with a given threshold, as shown in equation (11), to obtain a purer vegetation area (the threshold is set to a large value so that the negative samples are pure).
Here, the two terms of the index are an excess green index and an excess red index calculated by equation (10).
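As a hedged illustration of the vegetation extraction, the sketch below computes an excess-green minus excess-red index on chromatic coordinates and thresholds it. The function names, the default coefficient 1.4 of the excess red term (one commonly quoted value) and the threshold value are assumptions; equations (9) to (11) of this paper may differ in detail.

```python
import numpy as np

def exg_minus_exr(rgb, red_coeff=1.4):
    """Excess green minus excess red on chromatic coordinates r, g, b.

    rgb: (H, W, 3) array in R, G, B order. red_coeff is an assumed value.
    """
    rgb = rgb.astype(np.float64)
    total = rgb.sum(axis=2)
    total[total == 0] = 1.0  # avoid division by zero on black pixels
    r, g, b = (rgb[..., i] / total for i in range(3))
    exg = 2.0 * g - r - b        # excess green
    exr = red_coeff * r - g      # excess red
    return exg - exr

def vegetation_mask(rgb, threshold):
    """Pixels whose index exceeds the threshold are labelled vegetation."""
    return exg_minus_exr(rgb) > threshold
```

A larger threshold keeps only strongly green pixels, which matches the aim of selecting pure vegetation samples.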

Extract buildings by GrabCut
After positive and negative samples for buildings have been selected, the GrabCut algorithm is employed to segment the image and obtain the buildings. The GrabCut algorithm is an effective image segmentation algorithm for extracting a target foreground from a complex background. It makes three improvements on the GraphCut algorithm (Ding and Zhang, 2012): (1) a Gaussian mixture model (hereinafter referred to as GMM) replaces the histogram model; (2) during GMM parameter learning and estimation, a one-time minimum estimation is replaced by an iterative algorithm that completes the energy minimization and improves the segmentation accuracy; (3) user interaction is reduced by allowing incomplete labelling. However, when the algorithm processes larger images, the iterative processing slows down segmentation. Therefore, super-pixels are used in place of single pixels to build the s-t network graph model, which maintains high segmentation accuracy while speeding up segmentation.
The GrabCut algorithm, which is based on graph cut theory, transforms computer vision problems into pixel labelling problems. Image segmentation is a typical binary-label combinatorial optimization problem in which each pixel is marked as foreground or background. The basic idea is to determine the label of each pixel by constructing an energy function of the labels and minimizing it (Qiu and Wang, 2012). The energy function is shown in equation (12), which is composed of a data constraint item and a smoothing constraint item.
Let all pixels form a vertex set, and let the foreground and the background be regarded as the endpoints s and t. Each vertex is connected to s and to t by an edge; the two edges are given different weights, and their sum forms the energy function of the vertex. The sum of the energy functions of all vertices constitutes the data constraint item in the above formula, which indicates the degree of similarity between each pixel and the two endpoints and is used to determine the likely label of the pixel. In addition, adjacent vertices are also connected by edges. The sum of the energy functions of these edges is the smoothing constraint item in the above formula, which indicates the degree of similarity between adjacent pixels and ensures the smoothness of the result. By minimizing the energy function, the vertex set is divided into two vertex sets connected to the source point s and the sink point t, respectively (Boykov and Jolly, 2001).
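The structure of such an energy can be made concrete with a small sketch that evaluates a data term plus a smoothness term for a given binary labelling. It is a simplification under stated assumptions: the data costs are supplied directly rather than derived from GMMs, and the smoothness term uses a constant Potts penalty on 4-connected neighbours instead of the contrast-sensitive weights used by GrabCut. The function name and the parameter `smooth_weight` are hypothetical.

```python
import numpy as np

def labeling_energy(data_cost, labels, smooth_weight=1.0):
    """Energy of a binary labelling: data term plus smoothness term.

    data_cost: (H, W, 2) cost of labelling each pixel 0 (background)
               or 1 (foreground).
    labels:    (H, W) integer array of 0/1 labels.
    """
    labels = np.asarray(labels, dtype=int)
    rows, cols = np.indices(labels.shape)
    data_term = data_cost[rows, cols, labels].sum()
    # Constant penalty for every pair of 4-connected neighbours that disagree.
    horizontal = (labels[:, 1:] != labels[:, :-1]).sum()
    vertical = (labels[1:, :] != labels[:-1, :]).sum()
    return float(data_term + smooth_weight * (horizontal + vertical))
```

Graph-cut methods find the labelling that minimizes this kind of energy with a minimum s-t cut instead of evaluating candidate labellings one by one.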
In this paper, building change detection is converted into a binary classification problem: the unchanged buildings are used as the foreground, and the part of the image remaining after removing the buildings, together with the vegetation area, is used as the background. The GrabCut algorithm extracts the building area from the new-time optical image, and removing the unchanged building area from the result gives the new building area. In the implementation, the unchanged buildings serve as the foreground samples of the GrabCut algorithm, and the vegetation area and the shadow area serve as the background samples. The algorithm is implemented in Python with the OpenCV package and performs five iterations to extract the building area. In addition, to improve efficiency, the optical remote sensing image is first segmented into super-pixels by SLIC (Achanta et al., 2012), and these super-pixels are then used as nodes to construct the graph model for GrabCut segmentation. The building area is obtained after segmentation.
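A minimal pixel-level sketch of this seeding scheme with OpenCV's mask-initialized GrabCut is given below. It is not the implementation of this paper: the SLIC super-pixel graph is omitted, the helper names are hypothetical, and labelling all unseeded pixels as probable background is an assumption.

```python
import numpy as np

# Label values used by OpenCV's GrabCut mask
# (cv2.GC_BGD, cv2.GC_FGD, cv2.GC_PR_BGD, cv2.GC_PR_FGD).
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3

def build_seed_mask(unchanged_buildings, background_samples):
    """Turn boolean sample masks into a GrabCut initialization mask."""
    unchanged_buildings = np.asarray(unchanged_buildings, dtype=bool)
    background_samples = np.asarray(background_samples, dtype=bool)
    # Assumption: pixels without a sample start as probable background.
    mask = np.full(unchanged_buildings.shape, GC_PR_BGD, dtype=np.uint8)
    mask[background_samples] = GC_BGD     # negative samples
    mask[unchanged_buildings] = GC_FGD    # positive samples
    return mask

def segment_buildings(image_bgr, seed_mask, iterations=5):
    """Run mask-initialized GrabCut on an 8-bit, 3-channel image."""
    import cv2  # imported lazily so the mask helper works without OpenCV
    mask = seed_mask.copy()
    bgd_model = np.zeros((1, 65), np.float64)
    fgd_model = np.zeros((1, 65), np.float64)
    cv2.grabCut(image_bgr, mask, None, bgd_model, fgd_model,
                iterations, cv2.GC_INIT_WITH_MASK)
    return (mask == GC_FGD) | (mask == GC_PR_FGD)
```

The returned boolean array marks pixels labelled as definite or probable foreground, which here corresponds to the extracted building area.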

New building detection based on GrabCut classification results
The results obtained by GrabCut segmentation include both newly added buildings and unchanged buildings. Subtracting the unchanged buildings from the classification result gives the newly added buildings. Median filtering and small-block removal are then performed on the difference result, and this denoising yields the final, more accurate new building areas.
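The subtraction and denoising can be sketched as follows. This is a minimal illustration, assuming binary masks, a 3 × 3 median window, 4-connected components and a hypothetical size threshold `min_pixels`; none of these parameter values are taken from this paper.

```python
from collections import deque

import numpy as np

def median_filter_binary(mask, size=3):
    """Median filter for a 0/1 image, which reduces to a majority vote."""
    pad = size // 2
    padded = np.pad(mask.astype(np.uint8), pad, mode="constant")
    h, w = mask.shape
    count = np.zeros((h, w), dtype=np.int32)
    for dr in range(size):
        for dc in range(size):
            count += padded[dr:dr + h, dc:dc + w]
    return count > (size * size) // 2

def remove_small_blocks(mask, min_pixels):
    """Drop 4-connected components smaller than min_pixels."""
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    keep = np.zeros((h, w), dtype=bool)
    for r0, c0 in zip(*np.nonzero(mask)):
        if seen[r0, c0]:
            continue
        queue, component = deque([(r0, c0)]), [(r0, c0)]
        seen[r0, c0] = True
        while queue:  # breadth-first flood fill of one component
            r, c = queue.popleft()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= nr < h and 0 <= nc < w and mask[nr, nc] and not seen[nr, nc]:
                    seen[nr, nc] = True
                    queue.append((nr, nc))
                    component.append((nr, nc))
        if len(component) >= min_pixels:
            rr, cc = zip(*component)
            keep[list(rr), list(cc)] = True
    return keep

def new_buildings(grabcut_buildings, unchanged_buildings, min_pixels=20):
    """Subtract unchanged buildings, then denoise the difference."""
    diff = np.asarray(grabcut_buildings, bool) & ~np.asarray(unchanged_buildings, bool)
    return remove_small_blocks(median_filter_binary(diff), min_pixels)
```

Isolated pixels are removed by the median filter, and any remaining fragments below the size threshold are removed by the component filter.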

EXPERIMENTS AND ANALYSIS
To verify the effectiveness of the proposed method, this paper selects old-time ALS data and new-time high-resolution optical remote sensing images of two regions in a certain area, detects building changes, and uses the confusion matrix as the accuracy evaluation criterion. The experimental results show that the detection accuracy of the two experimental regions reaches 89% and 84%, respectively.

Experimental data
Two datasets, shown in Figure 4, are used for the experiments. The details are as follows:
(1) ALS point cloud data: the airborne laser point cloud data were obtained in 2014, and the point cloud density is about 10 points/m².
(2) Remote sensing image: the image was obtained in 2018. It is generated by fusing WorldView3 panchromatic and multispectral images, and its resolution is about 0.31 m.
Since the point cloud data obtained in 2014 and the remote sensing image obtained in 2018 have different coordinate systems, the two datasets must be registered. The registration process is shown in Figure 3. Firstly, control points are selected in the remote sensing image and the point cloud data, and the affine transformation model between the two datasets is calculated. Secondly, the RPC parameters of the remote sensing image are corrected, and the corrected RPC parameters are then used to project the point cloud. Finally, interpolation is performed to fill in holes, which completes the registration of the point cloud data and the remote sensing image.

Figure 3. Registration process of old-time point cloud data and new-time optical remote sensing image
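The first registration step, estimating an affine model from control points, can be sketched with a least-squares fit. The sketch covers only this step and assumes 2D control point pairs; the RPC correction, point cloud projection and hole filling are not shown, and the function names are hypothetical.

```python
import numpy as np

def fit_affine(src_xy, dst_xy):
    """Least-squares 2D affine transform mapping src points onto dst points.

    Returns a 2x3 matrix [[a, b, tx], [c, d, ty]]; at least three
    non-collinear control point pairs are needed.
    """
    src = np.asarray(src_xy, dtype=np.float64)
    dst = np.asarray(dst_xy, dtype=np.float64)
    design = np.hstack([src, np.ones((len(src), 1))])  # rows [x, y, 1]
    params, *_ = np.linalg.lstsq(design, dst, rcond=None)
    return params.T

def apply_affine(matrix, xy):
    """Apply a 2x3 affine matrix to an (N, 2) array of points."""
    xy = np.asarray(xy, dtype=np.float64)
    return xy @ matrix[:, :2].T + matrix[:, 2]
```

With more than three control points the fit is overdetermined, so the residuals at the control points give a simple check of registration quality.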

Evaluation criteria
In this paper, the confusion matrix is used to evaluate the accuracy. The confusion matrix, also known as the error matrix, is a standard format for expressing accuracy evaluation and is represented by a matrix with n rows and n columns. Each column of the confusion matrix represents a predicted category, and the total of each column is the number of data instances predicted to belong to that category; each row represents the true category of the data, and the total of each row is the number of data instances of that category. The specific evaluation indicators include overall accuracy, mapping (producer's) accuracy and user's accuracy. These indicators reflect the accuracy of image classification from different aspects. The confusion matrix shows the accuracy of the classification results very intuitively, so we use it to assess the accuracy of the proposed method.
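The three indicators follow directly from the matrix layout described above (rows are true categories, columns are predicted categories). The sketch below is illustrative; it takes mapping accuracy to mean producer's accuracy, which is an assumption about the terminology, and the example matrix in the usage note is made up rather than taken from the experiments.

```python
import numpy as np

def accuracy_metrics(confusion):
    """Overall, producer's (mapping) and user's accuracy.

    confusion[i, j] counts instances of true category i predicted as j.
    """
    cm = np.asarray(confusion, dtype=np.float64)
    diag = np.diag(cm)
    overall = diag.sum() / cm.sum()
    producers = diag / cm.sum(axis=1)  # correct / row total (true count)
    users = diag / cm.sum(axis=0)      # correct / column total (predicted count)
    return overall, producers, users
```

For a made-up matrix [[40, 10], [5, 45]], the overall accuracy is 0.85, the producer's accuracies are 0.8 and 0.9, and the user's accuracies are 40/45 and 45/55.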

Experimental results
Firstly, the shadow extraction method was applied to the experimental remote sensing images; the results are shown in Figure 5 (a) and (b). The shadow results were then superimposed on the buildings obtained from the 3D ALS data, which gave the unchanged building areas and the disappeared building areas. Secondly, the vegetation area was extracted from the remote sensing images of the two experimental areas. The vegetation extraction results are shown in Figure 5 (c) and (d). After that, the vegetation areas and the disappeared building areas were treated as negative samples, the unchanged building areas were treated as positive samples, and the GrabCut algorithm was used to segment the whole image to obtain the newly added building areas.
To verify the effect of the proposed method, the real change areas were marked through human-computer interaction, as shown in Figure 6 (a) and (c), and the detection results are shown in Figure 6 (b) and (d). The results show that the unchanged building areas of the two experimental areas are accurately detected. For experimental area I, the detection of the disappeared buildings is more accurate, but large new areas near the boundary of the experimental area were not detected by the proposed method. For experimental area II, the detection of the newly added buildings is more accurate, but the detected boundary of each newly added building area is incomplete, and some disappeared buildings near the boundary were falsely detected.

The confusion matrices of the two experimental areas are given in Table 1 and Table 2, respectively. The confusion matrices show that the overall accuracy of the change detection in the two experimental areas exceeds 84%. For experimental area (a), the detection result of the newly added building area is poor, with some missed detections, which is consistent with the visual inspection of Figure 6. For experimental area (b), the detection result of the newly added buildings is also slightly worse because of the incomplete boundaries of the detected new building areas.

CONCLUSIONS
This paper proposes a novel building change detection method that combines a single-phase ALS point cloud and a satellite image. It does not require manual selection of training samples, and it uses the association between shadows and buildings to achieve high-precision extraction of disappeared buildings. The building area and the non-building area on the new-phase image are acquired, and building extraction is modelled as a foreground and background segmentation problem. The GrabCut algorithm is employed to extract buildings in uncertain areas. Experimental results show that the proposed method achieves satisfying performance, especially for the extraction of disappeared buildings, where the accuracy is up to 88%.
However, there are also some shortcomings. For example, some low-rise buildings have very small shadow areas (only a few pixels), and some buildings are surrounded by vegetation. No shadows can be found for such buildings, so they are misjudged as disappeared buildings. Therefore, in future work, LBP texture features, the MBI index and depth features will be added to improve the classification results.