HIGH-RESOLUTION SAR COHERENT CHANGE DETECTION IN URBAN ENVIRONMENT

The identification of changes in urban settlements is of great interest for damage assessment after natural disasters, cadastral mapping and monitoring urban development and illegal activities. Radar-based remote sensing from space-borne platforms is quite useful in this scenario and Synthetic Aperture Radar (SAR) data is attractive due to its wide coverage, the day and night all-weather availability, and the sensitivity to slight changes in the scene structure. In this context, the launch of the European Space Agency (ESA) constellation Sentinel-1 has played a significant role: the exact repetition of the acquisition geometry, the repeated illumination and the sensitivity to centimetric changes thanks to the C-Band (5.4GHz) radar payload make Sentinel-1 the perfect instrument to monitor urban settlements. Coherent Change Detection (CCD) techniques are able to detect even the finest change in the structure of a target, so small to be comparable with the wavelength. This sensibility is an advantage, but turns into a drawback especially in an urban environment where every subtle change may cause an unwanted detection. This paper tackles the problem of the huge amount of triggered detections over urbanized areas with a combination of a high-resolution coherent multi-change detection technique and Geospatial Information System (GIS) post-processing. The final result is a map of buildings that are changed in the scene due to relevant variation of their structure. In this contribution, the complete workflow is explained, and a preliminary validation is carried out by means of a set of images gathered by Sentinel-1 and a set of optical images over the city of Manchester.


INTRODUCTION
The ansence of cloud coverage, the possibility to sense the scene night and day, the repeated geometry of acquisition and, the most important, the high sensitivity to fine changes in the spatial structure of the scene are all peculiarity that contribute to SAR images very popular for change detection purposes. Further development in this area was made possible thanks to the Copernicus mission Sentinel-1: starting from 2014 free and open data is available opening the possibility to analyze long time series over urban areas (Washaya et al., 2018, Hakdaoui et al., 2019. The monitoring of urban settlements is an interesting topic in the remote sensing community. Tracking the rapid changing in the urban landscape and its effect on the environment (Ehrlich et al., 2018), monitoring buildings construction or demolition (Tamura, 2015) and the generation of land cover land use maps (Pan et al., 2019) are just three of the multitude of possible applications of remote sensing and change detection techniques over civilized areas (Gamba et al., 2008. A SAR is a coherent sensing system in which both the amplitude and the phase of the signal backscattered from the targets on the ground are measured. The amplitude is related to the brightness, like for optical images, while the phase is sensitive to the target's location with the accuracy of a fraction of a wavelength. Coherent change detection (CCD), exploits the phase information of the back-scattering process leading to a much higher sensitivity even to fine changes (in the order of a fraction of a wavelength) (Preiss and Stacy, 2011, Mian et al., 2019, Wahl et al., 2016. This sensibility is so high that a huge number of changes can be triggered due to slight variations of features that are not of interest: these nuisance detections can hinder the recognition of interesting changes. The urban environment is one of the most challenging ones for * Corresponding author. the CCD methods since plenty of sources of decorrelation are present: parking lots and green areas, for example, can generate a huge number of false detections. The challenges for a CCD method are not only limited to the space domain (a lot of unwanted detection in different places), but also in time since multiple changes may occur along the total observation period. In this scenario, assuming a model where only one change happens in the whole complex time series can be insufficient thus a more robust estimator must be implemented and a method to suppress all the changes that are not of any interest must be realized. The method here proposed for detecting changes affecting buildings in an urban environment exploits the sensitivity of coherent approaches while reducing at most the number of false alarm basing on the maximization of the change likelihood. The entire processing workflow is validated with a case study over the city of Manchester, UK.

METHODOLOGY
In this section the complete processing procedure is presented starting from the mandatory pre-processing of the data and ending with the GIS-based post-processing. The workflow is also depicted in Figure 1.

Pre-processing
SAR data needs an ad-hoc pre-processing: this includes focusing, amplitude calibration, fine coregistration at sub-pixel level and a debursting procedure (if the image is acquired in bursts like for Sentinel-1). If a coherent processing is needed, also the image's phase needs to be properly calibrated, since, as the phase senses the sensor-target optical path, the following elements have an impact: 1. The satellite changes position (see Figure 2). This condition generates in the interferometric data a term that is propor- Figure 1: Processing scheme: in red the pre-processing procedure, in green the proposed M-CCD and in blue the GIS-based post-processing.
tional to the local topography and must be removed. This term is expressed as: where R is the slant range (i.e. the distance from the master sensor to the target), B ⊥ i is the normal baseline, that is the orbit deviation of the i th acquisition respect to the reference in the direction perpendicular to the line of sight., θ is the local incidence angle, λ is the radar wavelength and q the height of the target over flat earth.
2. The change in the position of target or a part of it, that can be indicator of a change.
3. The variation of the optical path due to the propagation in the atmosphere.
In the standard interferometric pre-processing an external Digital Elevation Model is used to compensate for the term related to topography up to the limits represented by the orbit knowledge and the DEM resolution. The so-called Atmospheric Phase Screen (APS) perturbation must be compensated and it can be done by jointly exploiting a stack of images as extensively discussed in (Monti-Guarnieri et al., 2018, Manzoni et al., 2020.

Estimation of the residual height
The compensation of the phase term due to topography can be inaccurate in an urban environment, where there is a large height dispersion DEM resolution cell: the DEM used does not include buildings and still has finite vertical resolution leading to a residual topographical phase term that can be compensated. This residual term can generate two unwanted effects: 1. false alarms can be triggered by the coherent technique due to the variation of the target phase that is indeed due to the change in the baseline. The apparent change is due to the imperfect compensation with the known DEM; 2. a geocoding error that locates the target in a wrong geographical position, preventing the proper functioning of the GIS post-processing that requires high accuracy in the localization of the detection.
In Figure 2 the problem related to the geocoding error is depicted: the target at position P will be geolocated at position P if its elevation q is assumed to be zero. The location error on the ground would be d = q/tan(θ), which could account for tens of meters in an urban scenario. The solution is to recur to a processing able to extract the residual height q from the data and, in turn, using the local incidence angle θ to find the displacement d.
, Figure 2: The acquisition geometry of a multipass SAR. The master acquisition is denoted with M while the B ⊥ i the normal baseline concerning the master and the i th slave. With the proposed procedure it is possible to retrieve the residual height q of the target. By knowning then the local incidence angle θ it is possible to compensate the distance d (geocoding error).
In presence of a single point scatterer in the resolution cell, a single pixel of the i th SLC image can be modeled as: Where ρ is the complex reflectivity function entailing both the amplitude and the phase of the backscattered signal, B ⊥ i is the normal baseline between the master orbit and the i th slave, θ the local incidence angle and q the residual height to estimate. Since the geometry of the acquisition is known very well thanks to precise orbits available for Sentinel-1 products, the only parameter to estimate in equation 2 is the residual height. The simplest solution that doesn not require any matrix inversion is to explore a set of possible heights by cross-correlating the complex time series of a pixel with the model. This operation minimizes the L2 norm between the data and the model:q The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2020, 2020 XXIV ISPRS Congress (2020 edition) where g = [g1 g2 g3 . . . gN ] T is the complex time series of a single pixel in the scene and is the vector containing all the complex terms due to the residual topography.
The mean accuracy in the estimation of the height can be derived by a simple linear model where the unwrapped phases are employed: where φ is the Ni × 1 vector containing the time series of the phases of a single pixel in the scene, Kz is the Ni × 1 vector containing all the phase to height conversion factors as in equation 2, q is the residual height to be estimated and w is the Ni × 1 vector representing the noise. The variance on the estimate is easily derived as in any linear model with uncorrelated noise: With, for example, σ 2 w = 1, an average baseline of 30m and an average incidence angle of 35 deg we can obtain a standard deviation on the estimate σq = 4m with as low as 10 images. The requested accuracy in terms of target localization is thus reached opening the path for accurate GIS post-processing.

Single-change CCD
In this section, we will revise the method for coherent change detection concerning a single decorrelation point in the whole time series presented in (Monti-Guarnieri et al., 2018). This processing is indeed fundamental to properly introduce the novelty provided by this contribution in the next section. The change detector is based on the Generalized Likelihood Ratio Test (GLRT) and can be formulated as: Where x is the Ni × 1 vector containing the complex time series of a pixel in the scene, Ns is the number of independent samples in the neighborhood of the pixel used to estimate the coherency matrix Γx, Γ0 and ΓN c are the coherency matrices in case of no change and change at instant Nc respectively and Λ is a suited threshold for the detection. All the possible ΓN c are tested against Γ0. The absolute minimum of Z(x) is found and if it is under threshold a detection is triggered. The detector has a large number of degrees of freedom in the sense that a lot of parameters may be tuned resulting in completely different outputs: the threshold can be changed, the two polarization can be averaged to assure consensus between them, the estimation window can be made of different sizes, the number of images used and their temporal sampling can be tuned. The presented method was designed for the detection of a single decorrelation point in the stack: this condition can be fulfilled in very stable areas where a sudden decorrelation (e.g. an earthquake, see (Monti-Guarnieri et al., 2018)) happens. This model must be extended to take into account multiple decorrelations points in the history of the target.

Extended CCD for multiple changes: M-CCD
In the presence of a turbulent environment like the urban one, decorrelations are frequent in time. In the case of an unknown number of changes, the statistically optimal detector becomes rapidly unfeasible since we have to test all the possible coherency matrix with all the possible number of changes. Given a dataset composed of Ni images, the total number of coherence matrices to be tested is 2 N i −1 − 1. Keeping Ni low would decrease the quality of the estimator, but keeping it high could become unfeasible. Two sub-optimal detectors (from now on called M-CCDs) have been formulated with a significantly lower computational load: 1. Local minima M-CCD: supposing only one change in the covariance matrix it's possible to find all the local minima under a certain threshold in the GLRT history: this solution is computationally inexpensive since the number of covariance matrices to be tested is just Ni − 1.
2. Segmentation-based M-CCD: supposing again only one change in the covariance matrix the algorithm proceeds by finding the absolute minimum of the GLRT history. If it's over threshold the procedure ends with no detections, while if it's under threshold the dataset is segmented and the GLRT repeated in the two subsets now generated. The procedure continues in a recursive manner until no more absolute minima are detected under threshold. This method is computationally more intensive than the first one.
An example of detection is given in Figure 3: a simulated dataset has been created with 2 images and two changes. The first one at image 10, the second at image 18. The top figure shows the behavior of the single change CCD of section 2.3 where only the change in the middle of the stack is detected. The segmentation approach, instead, continues the search in the two subsets from image 1 to 9 and from 11 to the end of the stack. In the first subset (middle image) nothing is found under threshold, while in the second case a detection is triggered at image 18.  The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2020, 2020 XXIV ISPRS Congress (2020 edition) changes at different time instants. The number of independent samples used to estimate the covariance matrix is 5 to properly simulate the real scenario where we want to obtain the highest resolution possible and where the estimate of the sample covariance matrix has a large variance superimposed. In the framework of a multi-change detector, there is not a unique way to define what is a correct detection and what is a false alarm. A detector may trigger a change in a correct position, but at the same time miss the others: the detection is correct, but it's not complete.
In our scenario we decided to be as severe as possible: if the detector triggers all the detections in the correct positions we call that a "correct detection" while if it detects just a portion of changes it's not counted as correct detection. On the other end, a false alarm event is triggered even if only one change is detected in a synthetic dataset with no changes.

Probability of correct detection
Local minima Segmentation Figure 4: Probability of correct detection for the two sub-optimal algorithms. The M-CCD segmentation approach shows significantly better performances than the local minima approach.

Probability of false alarm
Local minima Segmentation Figure 5: False alarm rate for the two techniques. The probability of false alarm is the same for the two methods since an absolute minima is always also a local minima.
The identical Probability of False Alarm in figure 5 is expected and it is due to the fact that the absolute minimum is always also a local minimum, thus, the number of false detections is the same. The difference is instead in the probability of correct detection (Figure 4 where it is quite clear that the segmentation approach outperforms the local minima procedure. The rationale behind the differences of performances can be derived in the example shown in Figure 3: the change in the middle of the stack hides the notch in the GLRT function associated with the change happening later in the stack. This will prevent the detection of the latter changes by the local minima approach compromising in this way the probability of correct detection. For the following, we decided to keep only the second solution since it's the one showing better performances, while keeping at the same time the computational processing quite low. From now on, every time we will refer to M-CCD, it is implied that we use the segmentation-based M-CCD just explained. The final output of the CCD processing is a set of geolocated points, each one representing a triggered detection in a particular location and in a particular time instant. In order to further reduce the number of false alarms and to identify buildings affected by changes, a post-processing of such points is mandatory as explained in detail in the following section.

GIS post-processing
This procedure aims to process the detections obtained by the extended CCD method in order to identify and extract only the buildings interested by changes. It is worth noting that the procedure here described assumes the availability of geospatial dataset related to land use/land cover or buildings. Nowadays this is almost everywhere ensured thanks to the diffusion of open data policies or crowdsourced mapping initiatives such as OpenStreetMap (Jokar Arsanjani et al., 2015). In alternative, a SAR-derived mask of urbanized areas could be used opening the possibility to perform the CCD processing in areas where no geospatial dataset is available. The post-processing is based on a two-step GIS-based procedure: • Selection by location: it allows the extraction of the subset of points specifically located inside the buildings polygons, thus the subset of points (or targets, in radar jargon) that identifies changes over buildings; • Association of detections: it focuses on the identification of the buildings interested by changes (hereinafter SARdetected buildings) and it is performed by associating to each building polygon the total number of points lying inside it.
A metric that it is possible to use to classify the relevance of a change is thus the number of detections happened inside the perimeter of a building: the greater the number of points the more significant the change occurred.

Description of the dataset
In this section, a validation of the mentioned procedure is performed over the city of Manchester, UK. For the validation we used a set of 17 SAR images gathered by the constellation Sentinel-1 from June 2015 to December 2016. The description of the case study is summarized in Table 1.

Extended CCD processing
In the following we have tested different combinations of parameters: each combination is hereinafter called RUN. Each RUN with the corresponding set of parameters is depicted in Table 2. The choice of the threshold is tightly related to the number of independent looks taken for the estimation of the coherence matrix as it is clear in equation 7. The polarization can be taken as VV only, VH only or VV+VH in the sense that both are averaged for the coherence matrix estimation. The estimation window can be made of any size: a small window reaches higher resolution but then the interferometric phases are very noisy, while bigger windows ensure higher quality with the drawback of a poorer resolution.

GIS post-processing
The GIS post-processing was performed by exploiting the dataset of buildings of Manchester related to the year 2015, which is provided as open data by the national mapping agency for Great Britain (Ordnance Survey). Table 3 reports information about the detections resulting from each RUN. Considering the total number of changes detected by the extended CCD method, RUN02 obtains the largest amount of detected changes (more than 240,000) while RUN03 obtains the smallest (about 14,000). The other RUNS are also in the order of tens of thousands of detections. It is worth noting that the selection of the subset of changes located inside buildings allows to significantly reduce the total number of identified changes; in fact, considering the different RUNS, the initial data sets are reduced by percentages which range between 66% and 84%. As mentioned in subsection 2.5, for each RUN, a vector dataset of buildings with the corresponding number of detected changes was generated. Table 3 shows that the RUN02 identifies 11,313 SAR-detected buildings; compared to RUN02, the number of SAR-detected buildings obtained by RUN01 is less than a half (4,154). Even less is the amount of buildings identified by RUN04, only 724. Finally, the RUN03, although characterized by a total number of changes less than RUN04, identifies a number of SARdetected buildings that is doubled (1,675), meaning that there are many buildings with associated a very low number of changes. The discrepancy between RUN01 and RUN03 can be explained by noticing that the residual height has been estimated in the latter case as mentioned in Section 2.2, the covariance matrix has been corrected for the residual phase factor concerning the height and thus a much lower number of detections is expected. In RUN02, on the other end, a much more gentle threshold is used and the number of detection increases accordingly.
A more detailed analysis of results can be performed by taking into account the number of changes associated with each building. Figure 6 shows an example of SAR-detected buildings dataset derived from RUN01 on a limited area of Manchester.  characterize them: it is clear that most of the buildings are affected by a number of changes lower or equal to 5 (blue colored buildings).   As expected, the graph highlights that the more the χ value increases the more the number of changed buildings decreases since buildings with less than χ detections are no more count as changed.
In particular, a steep drop in SAR-detected buildings number is showed for χ values ranging between 1 and 5: considering χ equal to 3, the number of SAR-detected buildings is reduced by 50% for RUN04 and of 70% for the other RUNS; if χ is set to 5, the SAR-detected buildings are reduced of the 60% for RUN04 and of the 85% for the other RUNS. It is interesting to notice that, when the threshold χ increases, the numbers of SAR-detected changes for each RUN converge to a common consensus (see, for example, RUN01 and RUN04 in red and green respectively). This highlights that the choice of the configuration of the estimator is less and less important. If a lot of changes are inside the boundaries of a building, the change is very relevant and thus it will be detected no matter the size of the estimation window, the polarization, or the selected detection threshold (Λ).

Comparison with optical images
A proper validation cannot be done since no official information about construction sites and/or building renovation is available. Therefore, to assess the actual occurrence of changes over SAR-detected buildings, a comparison with high-resolution optical photos has been performed by means of visual interpretation. To this purpose, the historical imagery of Google Earth related to the years 2013 and 2017 was considered. A significant sample of buildings was extracted from each dataset (RUN) by considering a confidence level of 95% and a margin of error of 5%. After that, for each sample's building, the validation of changes on the optical images was performed. It is worth noting that no threshold value was imposed, thus all the buildings with at least one change were taken into account. Results of comparison, proposed in Figure 8, show that for all the considered RUNS about 60% of SAR-detected buildings are also visible in optical images (blue color bar). Besides the identification of new buildings or destroyed buildings, the SAR method proved to be able to detect also minor changes such as roofing replacements (Figure 9a) or installation of photo-voltaic panels (Figure 9b). For what concerns the disagreements, i.e. the SAR-detected changes to buildings not confirmed by optical, several different reasons can be adduced as an explanation. From the optical side, some changes can be missed due to different illumination conditions between the images; furthermore, SAR method identifies changes at biweekly or monthly level: the same is not possible for optical since only two images (years 2013 and 2017) are available; thus, some particular recurrent changes, e.g. due to mobile structures, cannot be detected in optical analysis.  Figure 8: Validation of SAR-detected changes per building by photo interpretation of optical imagery. The percentage of agreement, i.e the SAR-detected changes confirmed by optical, is about 60 % (blue color) for all the considered RUNS. Disagreements are probably due to the occurrence of parking, 22% -26% (red color), trees, 3.5% -8% (green color), and construction sites, 3% -4.5% (yellow color), nearby the SAR-detected buildings. The percentage of buildings in gray color (4.5% -7%) represents the added value provided by SAR with respect to optical.
From the SAR side, there are many conditions that can cause the non-confirmed change detections. During the comparison activity, three different kinds of conditions were considered: • the existence of parking at the top of the buildings or adjacent to them ( Figure 9c); • the presence of trees covering part of the buildings ( Figure  9d); • the existence of construction sites, and thus storage areas for materials and vehicles, nearby the buildings (Figure 9e). threshold on the number of detection falling inside the polygon representing a building (χ = 5, for example), we are able to highlight major changes that are detected by all the different RUN making less and less important the actual setup of the M-CCD. Finally, a manual comparison has been done with optical images over the same area: a good agreement (about 60%) is presented between the two datasets and, in most of the cases where there is no agreement (28% -31%), the cause of the divergence can be attributed to nearby sources of decorrelation.