AUTOMATIC TECHNIQUE FOR OBTAINING URBAN/SUBURBAN CHANGE DETECTION INFORMATION FROM HIGH-RESOLUTION SATELLITE IMAGES

Multitemporal satellite images can be used to monitor the same geographical area on the Earth and to recognize changes that have occurred. High-resolution (HR) satellite images, in particular, can provide detailed information for change detection. In the current work automatic change detection is carried out in urban and suburban areas of Crete, Greece. QuickBird images of spatial resolution 60 cm/pixel and WorldView images of resolution 30 cm/pixel, acquired 13 years apart, are utilized. The scene is classified into buildings, vegetation, water and ground once a multiindex representation has been applied to the images. Afterwards, automatic change detection is performed by pixel-per-pixel comparison of the classified multi-temporal images. The water index and the vegetation index utilized in the current work are compared with vegetation and water detection techniques from the literature. The experimental results indicate the superiority of the particular spectral indices. Moreover, the presented methodology not only specifies whether changes have arisen but also provides information regarding the types of changes. In the future, further experimentation could lead to the optimization of the proposed methodology.


In general
Urban areas are the center of human habitation as well as of social and economic activities. Regionally and globally, natural as well as human systems are significantly affected by cities, although the latter cover only a small percentage of the Earth's land surface. As urban areas rapidly expand, adjacent forest, cultivated land and water areas usually disappear. This causes problems for the environment, for ecology and for resource management. Applications such as urban expansion and planning, monitoring of the urban landscape, deforestation and damage control therefore require that land cover/use changes can be detected. Since earth observation technologies are rapidly expanding, remote sensing (RS) technologies and specifically high-resolution (HR) satellite images can be utilized for the task of change detection. In RS, change detection is the process of recognizing changes that have taken place on the Earth's surface by jointly processing two or more images acquired over the same geographical area at different times. Indeed, due to the repeat-pass orbiting of satellites, RS images can be acquired regularly over a given target area, and the processing of these multi-temporal RS images is convenient for detecting changes in urban and suburban areas. Urban phenomena can then be monitored and predicted, while disaster response and sustainable development can be supported by timely and efficient decisions. In fact, detection of urban targets, such as buildings and water bodies, or classification of urban land use/land cover can be followed by change detection to monitor the landscape (Gong et al., 2020; Marin et al., 2015; Ragia, Panagiotopoulou, 2021a).
During change detection from HR images the basic challenge is to distinguish radiometric changes from real semantic changes. In multi-temporal HR images, the spectral signature of a specific object may vary at different dates due to several factors such as different imaging conditions, mis-registration and disparity of vertical structures. In fact, radiometric changes that can occur between multi-temporal HR images may be related to their semantic meaning. Then, the different types of radiometric changes that exist in a specific problem and data set can be identified and modeled. Remote sensing HR images are highly complex mainly due to the following factors: a) the inherent geometrical complexity, the spectral nonhomogeneity, and the multiscale properties of the objects; b) possibly differing acquisition conditions of the multitemporal data. Firstly, objects which are homogeneous from a semantic viewpoint (e.g., buildings) often present spectral signatures that are inhomogeneous at HR due to the different sub-objects of which they are composed. Secondly, differences in the acquisition geometry and in shadows are caused by different sensor view angles, which leads to considerably varied object representations in the acquired images. Furthermore, in optical images spectral signatures are strongly altered by differences resulting from seasonal effects and illumination conditions. Consequently, during comparison of multitemporal images, a large set of possible radiometric changes arises, with significantly different semantic meanings and thus different causes of the change itself.

1.2 Literature on change detection using high-resolution images

Machine learning based techniques
Several works in the literature on change detection in HR images rely on machine learning (Pacifici, Frate, 2010; Volpi et al., 2013; Gupta et al., 2018; Ragia, Panagiotopoulou, 2021b). A novel automatic change detection method relying on Pulse-Coupled Neural Networks (PCNN) is examined in (Pacifici, Frate, 2010). During each algorithm iteration, the PCNN produces one wave per image and specific signatures of the scene are created, which are then compared to recognize whether a change has happened. The presented approach aims at uncovering changed subareas in the image rather than analyzing single pixels. The technique is rather fast, and an overall object accuracy of 90.7% is achieved. Textural and morphological features are utilized for supervised change detection in very HR images in (Volpi et al., 2013). The use of a nonlinear support vector machine (SVM) provides an efficient nonparametric solution to the nonlinearity of the multi-temporal signals. Additionally, the data requirements of the model are relaxed. Due to the spatial smoothing that is caused, the class separation performed by the SVM model as well as the spatial coherence of the change detection maps are significantly improved. A semantic segmentation technique for satellite images is presented in (Gupta et al., 2018). The employed architecture is similar to U-Net but the encoder building blocks are inspired by the ResNet architecture. The proposed model was optimized for a combination of dice loss, cross entropy and focal loss. Multichannel probability maps are produced as output, since a confidence score is assigned to each class by the model. Prediction is performed on four variants of the same patch. The four generated probability maps are rotated back to the original orientation of the image and averaged. The final probability map for the image is constructed by combining all the patch probability maps using bilinear interpolation. The probability map is finally converted to its RGB counterpart by taking the argmax along the channel dimension. Also, fully connected conditional random fields are applied as the final postprocessing step in order to make predictions smoother, yielding a negligible increase in the final validation metric. Another novel change detection technique is presented in (Ragia, Panagiotopoulou, 2021b). The active learning (AL) algorithm Bayesian active learning by disagreement (BALD) is applied to WorldView images of urban and suburban areas. Various cases of selecting different numbers of images for the training set of a convolutional neural network (CNN) are tested experimentally. The validation accuracy of classification as changed or unchanged obtained with the BALD algorithm proves superior to that of the random sampling algorithm. In fact, as the number of training images increases, the accuracy also increases.
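The rotate, predict, rotate back and average step described for (Gupta et al., 2018) can be illustrated with a minimal sketch. It assumes that the four variants are 90-degree rotations; `predict_fn` and `toy_model` are hypothetical stand-ins and not the network of the cited work.

```python
import numpy as np

def predict_with_rotations(predict_fn, patch):
    """Average class-probability maps over four 90-degree rotations of a patch.

    predict_fn is a hypothetical callable that maps an (H, W, C) patch to an
    (H, W, K) array of per-class probabilities.
    """
    averaged = None
    for k in range(4):
        rotated = np.rot90(patch, k=k, axes=(0, 1))
        probs = predict_fn(rotated)
        # Rotate the prediction back to the original orientation.
        restored = np.rot90(probs, k=-k, axes=(0, 1))
        averaged = restored if averaged is None else averaged + restored
    averaged = averaged / 4.0
    # Hard label map: most probable class per pixel (argmax over channels).
    return averaged, np.argmax(averaged, axis=-1)

def toy_model(patch):
    """Stand-in model (assumption): two classes scored from the first band."""
    score = patch[..., 0]
    return np.stack([1.0 - score, score], axis=-1)

patch = np.zeros((4, 4, 3))
patch[:2, :, 0] = 0.9  # the top half resembles class 1
probs, labels = predict_with_rotations(toy_model, patch)
```

Averaging over rotated variants tends to reduce orientation-dependent errors of the model at the cost of four forward passes per patch.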

1.2.1.1 Water detection

Some studies in the literature focus on water detection (Huang et al., 2015; Faridatul, Wu, 2018; Tsikdogan et al., 2017; 2019). A new machine learning based methodology for recognizing the water types from urban HR remote sensing images is developed in (Huang et al., 2015). In the particular framework, water bodies are first extracted at the pixel level and water types are then identified at the object level. The presented methodology is validated by means of GeoEye-1 and WorldView-2 images over two very large cities in China. Experiments show that the proposed technique achieves satisfactory accuracies not only for water extraction but for water type classification as well. Also, three novel spectral indices and an automated approach for the classification of the four major urban land types, namely impervious, bare land, vegetation, and water, are presented in (Faridatul, Wu, 2018). For the distinction of impervious and bare land, a modified normalized difference bare-land index (MNDBI) is proposed. Regarding the identification of vegetation and water areas, a tasseled cap water and vegetation index (TCWVI) is presented. Additionally, a shadow index (ShDI) is proposed to further improve water detection by separating water from shadows. Land covers are classified using a decision tree algorithm. The proposed methodology reaches an overall classification accuracy of 94-96% and outperforms the SVM algorithm. The machine learning based work in (Tsikdogan et al., 2019) presents a next-generation surface water mapping model, DeepWaterMapV2, which uses an improved model architecture, data set, and training setup for producing surface water maps at lower cost, with higher precision and recall than its predecessor (Tsikdogan et al., 2017). DeepWaterMapV2 is robust against a diversity of natural and artificial input disturbances, like noise, different sensor characteristics, and small clouds. The model can even "see" through the clouds when the scene is not fully blocked by clouds. Despite the fact that the model has been trained only on Landsat-8 images, it also supports data from various satellites, including Landsat-5, Landsat-7, and Sentinel-2, without any extra training or calibration.

Context and feature based image modeling techniques
Some works in the literature are based on image modeling with regard to context and features (Bovolo, 2009; Falco et al., 2013; Wen et al., 2016; Huang et al., 2017a). A parcel-based context sensitive method for unsupervised detection of changes in very high geometrical resolution images is proposed in (Bovolo, 2009). The scene, and thus changes, are modeled at different resolution levels defining multitemporal and multilevel parcels, i.e., small homogeneous regions shared by both original images. A multilevel change vector analysis is applied to each pixel of the considered images. The adaptive nature of multitemporal parcels and their multilevel representation allow the proper modeling not only of complex objects but also of borders and details of the changed areas. The work in (Falco et al., 2013) presents a change detection technique relying on morphological attribute profiles (AP). A multilevel AP is constructed for each original image and then the computation of the AP derivative (DAP) shows the regions that have been excluded at each level of the respective AP. The multilevel behavior of the DAP permits the extraction of connected regions at different levels of the profile. Afterwards, a region-based analysis is executed for each pixel to recognize the most suitable resolution level for the AP comparison. The morphological APs prove effective in modeling the spatial context information by utilizing geometrical features. In fact, the proposed methodology is superior in detecting areas whose geometrical properties were altered between the two image acquisitions, independently of spectral variations. (Wen et al., 2016) presents an innovative change detection method based on the multiindex image representation for urban scenes. The proposed technique achieves satisfactory change detection accuracy with a set of low-dimensional but semantic information indexes. A block-based approach, where the frequencies of the information indexes are considered for change detection, and a cell-based approach, where each block is further divided into a series of cells, are implemented. According to experiments, the cell-based approach significantly outperforms the block-based one for binary as well as for class-specific change detection. Hence, the spatial arrangement of the primitives in a scene is important for scene-based image interpretation. In the study presented in (Huang et al., 2017a) multi-view ZY-3 satellite data are used to produce multi-temporal orthographic images through photogrammetric derivation. A general framework for accurate urban change analysis at multiple levels, namely pixel, grid, and city block, is proposed. The results verify the accuracy of the proposed technique for monitoring fine urban changes, attaining Kappa coefficients of ~0.8 at the pixel level and a correctness of 93-95% at the grid level.
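For readers unfamiliar with the Kappa coefficient quoted above, the following minimal sketch computes Cohen's kappa from a confusion matrix. The matrix values are invented purely for illustration and are not taken from (Huang et al., 2017a).

```python
import numpy as np

def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: reference, cols: predicted)."""
    confusion = np.asarray(confusion, dtype=float)
    total = confusion.sum()
    # Observed agreement: fraction of samples on the diagonal.
    observed = np.trace(confusion) / total
    # Agreement expected by chance from the row and column marginals.
    expected = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / total**2
    return (observed - expected) / (1.0 - expected)

# Illustrative binary change / no-change matrix (made-up numbers).
confusion = [[45, 5],
             [5, 45]]
kappa = cohen_kappa(confusion)
```

In this toy case the observed agreement is 0.9 and the chance agreement is 0.5, so kappa equals 0.8.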

1.2.2.1 Water detection

Some studies in the literature focus on water detection (Galindo et al., 2009; Liu et al., 2018; Chen et al., 2020). In (Galindo et al., 2009) the problem of detecting swimming pools in QuickBird HR images of urban areas is addressed. Colour analysis is applied for water detection, approximate segmentation serves for an initial, rough localization, whilst active contour techniques are applied to refine the shape of the pools. The algorithm has been tested on both satellite and aerial images. The work in (Liu et al., 2018) presents a method for detection of open surface water in urbanized areas with GaoFen-3 (GF-3) Synthetic Aperture Radar images of high spatial resolution. Building shadows are removed by exploiting the correspondence between buildings and their shadows. The Receiver Operating Characteristic curves of the water detection results show that the presented technique increases the Probability of Detection to 98.36% and decreases the Probability of False Alarm to 1.91% in comparison with the thresholding method. The presented technique can remove building shadows and identify water with high precision in urban areas, from which water resource management could greatly benefit. In (Chen et al., 2020) a novel open surface water detection technique for urbanized areas is presented. Inequality constraints and physical magnitude constraints are used for recognizing water in urban scenes. According to experiments on spectral libraries and HR remote sensing images, with a set of proposed fixed threshold values the specific technique outperforms or gives comparable results to algorithms based on traditional water indices that need fine-tuning for optimal results. Regarding surface glint and hyper-eutrophic water, the technique is not very effective. The proposed method is also physically justified, simple to implement, and computationally efficient, thus it could be applied to large-scale water detection problems.
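The Probability of Detection and Probability of False Alarm mentioned for (Liu et al., 2018) can be computed from a predicted water mask and a reference mask as sketched below. The standard definitions TP / (TP + FN) and FP / (FP + TN) are assumed, and the masks are toy data.

```python
import numpy as np

def detection_rates(predicted, reference):
    """Return (probability of detection, probability of false alarm) for binary masks."""
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    true_pos = np.sum(predicted & reference)
    false_neg = np.sum(~predicted & reference)
    false_pos = np.sum(predicted & ~reference)
    true_neg = np.sum(~predicted & ~reference)
    prob_detection = true_pos / (true_pos + false_neg)
    prob_false_alarm = false_pos / (false_pos + true_neg)
    return prob_detection, prob_false_alarm

# Toy masks: 1 = water, 0 = non-water (illustrative values only).
reference = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0]
prob_detection, prob_false_alarm = detection_rates(predicted, reference)
```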

The present work
In the present work automatic change detection is carried out in urban and suburban areas of Crete, Greece, by utilizing multitemporal QuickBird and WorldView images. The scene is classified into buildings, vegetation, water and ground by means of multiindex representation. This work is an extension of (Panagiotopoulou, Ragia, 2021). In particular, the novel vegetation and water indices developed in (Panagiotopoulou, Ragia, 2021) are compared in the current work with the vegetation and water detection techniques of (Gupta et al., 2018) and (Tsikdogan et al. 2017; 2019), respectively. Once classification has been performed, the multi-temporal images are compared pixel-per-pixel and automatic change detection is carried out. The experimental results indicate the superiority of the specific spectral indices and of the developed change detection method.
The current work is organized into five Sections. Section 2 describes the study area and the satellite data. The methods that are followed are given in Section 3. Section 4 presents the results, while the conclusions are drawn in Section 5.

STUDY AREA AND DATA
The region under study is Georgioupoli on the island of Crete, Greece (Ragia, Krassakis, 2019). This area is located in the northern part of the island and is mostly a tourist area. WorldView images of spatial resolution 30 cm/pixel and QuickBird images of 60 cm/pixel serve as the experimental data. Table 1 gives the four spectral bands of the satellite images. The WorldView image bands 2, 3, 4 and 5 represent the blue, green, yellow and red channels, respectively. Regarding the QuickBird image, the bands 1, 2, 3 and 4 correspond to the blue, green, red and near-infrared channels, respectively.
The satellite images were acquired 13 years apart and depict two different scenes near the shore. The WorldView images show a more built-up area in comparison with the QuickBird images. The changes induced in the area during this period, mainly due to human activities, are to be recognized by the proposed methodology. Figure 1 demonstrates the QuickBird and WorldView images of "Scene 1" and "Scene 2". Bicubic interpolation by a factor of 2 has been applied to the QuickBird images to increase the spatial resolution (Bratsolis et al., 2018). Hence, all the images depicted in Figure 1 have the same spatial resolution of 30 cm/pixel. Moreover, the QuickBird images have been geometrically co-registered to the WorldView ones.
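The resampling step, which brings a 60 cm/pixel band onto a 30 cm/pixel grid, can be sketched as follows. This is only an illustration of bicubic interpolation by a factor of 2, assuming the Keys cubic convolution kernel with a = -0.5 and edge replication at the borders; it is not necessarily the implementation used in (Bratsolis et al., 2018).

```python
import numpy as np

def cubic_kernel(x, a=-0.5):
    """Keys cubic convolution kernel; a = -0.5 is a common choice (assumption)."""
    x = np.abs(x)
    inner = (a + 2) * x**3 - (a + 3) * x**2 + 1
    outer = a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return np.where(x <= 1, inner, np.where(x < 2, outer, 0.0))

def resize_axis(band, factor, axis):
    """Cubic resampling of a 2-D band along one axis by an integer factor."""
    n = band.shape[axis]
    # Output pixel centres expressed in input pixel coordinates.
    coords = (np.arange(n * factor) + 0.5) / factor - 0.5
    base = np.floor(coords).astype(int)
    shape = [1] * band.ndim
    shape[axis] = n * factor
    out = np.zeros(band.shape[:axis] + (n * factor,) + band.shape[axis + 1:])
    for offset in range(-1, 3):  # four neighbouring input samples
        idx = base + offset
        weights = cubic_kernel(coords - idx).reshape(shape)
        # Replicate edge pixels where the support leaves the image.
        samples = np.take(band, np.clip(idx, 0, n - 1), axis=axis)
        out += weights * samples
    return out

def upsample_bicubic(band, factor=2):
    """Separable bicubic upsampling: rows first, then columns."""
    band = np.asarray(band, dtype=float)
    return resize_axis(resize_axis(band, factor, axis=0), factor, axis=1)

quickbird_band = np.full((3, 4), 7.0)  # toy constant band
upsampled = upsample_bicubic(quickbird_band, factor=2)
```

Interpolation only resamples the data onto a finer grid so that the two dates can be compared pixel by pixel; it does not add detail that the 60 cm/pixel sensor did not record.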

METHODS
Firstly, the images are classified into buildings, vegetation, water and ground by means of multiindex scene representation (Wen et al., 2016). Building detection is performed with the morphological building index, implemented by the algorithm described in (Huang, Zhang, 2011; Huang et al., 2017b). With regard to vegetation detection, the vegetation index VEG of (Panagiotopoulou, Ragia, 2021) is utilized, where

B = radiance of the blue channel
G = radiance of the green channel
VEG = vegetation index

The vegetation signals are intensified through the difference between the blue and green bands. In a typical urban scene, the soil has comparatively low reflectance in the blue channel (Wen et al. 2016). So, if the green band is used to diminish the effects of buildings and water, vegetation can be detected. Experimentation shows that a multiplicative factor of 0.5 applied to the green channel is necessary. For comparison purposes, vegetation detection is also performed by means of the semantic segmentation technique presented in (Gupta et al., 2018). Multichannel probability maps are generated as output, whilst prediction is carried out on four variants of the same image patch. All the patch probability maps are merged by means of bilinear interpolation to produce the final probability map for the image. Additionally, a postprocessing step applying fully connected conditional random fields follows for smoother predictions.

For water body detection, the spectral indices of equations (2) and (3) are used (Panagiotopoulou, Ragia, 2021).

For the WorldView images:

WTR_1 = 3(G − Y)    (2)

For the QuickBird images:

WTR_2 = 3(G − R)    (3)

where

G = radiance of the green channel
Y = radiance of the yellow channel
R = radiance of the red channel
WTR_1 = water index for the WorldView images
WTR_2 = water index for the QuickBird images

The difference between green and yellow bands or between green and red bands can strengthen the water signals. As a matter of fact, in urban scenes, soil gives a reflectance peak in the yellow or red channel (Wen et al., 2016). Since buildings present spectral similarity to the bright soil, lowering the effects of both soil and buildings by utilizing the red or yellow channel can lead to water detection. The multiplicative factor 3 in equations (2) and (3) is found to be essential after experimentation. Water detection is also carried out with the CNN based technique which appears in (Tsikdogan et al., 2019). The specific model, called DeepWaterMapV2, presents robustness against a diversity of natural and artificial input disturbances, while it can even "see" through the clouds when the scene is not fully clouded. Although the particular model has been trained only on Landsat-8 images, it also supports data from other satellites without any additional training or calibration.
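As an illustration of how such water indices could be evaluated per pixel, the sketch below applies WTR = 3(G − Y) to WorldView-like bands. The QuickBird variant is assumed here to take the same form with the red band in place of yellow, as the text on green/red differences suggests. The threshold value and the toy radiances are hypothetical and not taken from (Panagiotopoulou, Ragia, 2021).

```python
import numpy as np

def water_index_worldview(green, yellow):
    """WTR_1 = 3 (G - Y), evaluated per pixel."""
    return 3.0 * (np.asarray(green, dtype=float) - np.asarray(yellow, dtype=float))

def water_index_quickbird(green, red):
    """WTR_2 = 3 (G - R); assumed counterpart for sensors without a yellow band."""
    return 3.0 * (np.asarray(green, dtype=float) - np.asarray(red, dtype=float))

# Toy radiance values (illustrative only): left column water-like, right column soil-like.
green = np.array([[0.30, 0.10],
                  [0.25, 0.12]])
yellow = np.array([[0.10, 0.20],
                   [0.05, 0.15]])

wtr1 = water_index_worldview(green, yellow)
threshold = 0.3  # hypothetical cut-off; the source does not state one
water_mask = wtr1 > threshold
```

Pixels where green radiance clearly exceeds yellow (or red) radiance receive a high index value, whereas bright soil and buildings, which peak in the yellow or red channel, are pushed towards negative values.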

RESULTS
Figures 2-3 depict the information maps of the images. Specifically, the multiindex urban representations of the QuickBird and WorldView images of "Scene 1" and "Scene 2" are demonstrated. Buildings are represented in red, vegetation in green and water in blue, while soil and roads are shown in black. In both figures the buildings have been detected by the technique in (Huang, Zhang, 2011; Huang et al., 2017b). As far as vegetation is concerned, Figure 2 shows the detection result from applying the vegetation index of (Panagiotopoulou, Ragia, 2021), while Figure 3 depicts vegetation detected by the technique in (Gupta et al., 2018). Regarding water, the water index of (Panagiotopoulou, Ragia, 2021) has been applied in Figure 2, whereas the water detection method of (Tsikdogan et al., 2019) has been utilized in Figure 3.

Figure 2. Red = buildings (Huang, Zhang, 2011; Huang et al., 2017b), Green = vegetation (Panagiotopoulou, Ragia, 2021), Blue = water (Panagiotopoulou, Ragia, 2021), Black = ground (e.g., soil and roads): Multiindex urban representation of (a) QuickBird image of "Scene 1" (b) WorldView image of "Scene 1" (c) QuickBird image of "Scene 2" (d) WorldView image of "Scene 2".

Figures 4-5 demonstrate the corresponding image-based histograms resulting from the multiindex urban representations of Figures 2 and 3. According to the information maps and the related histograms of Figure 4, in "Scene 1" after 13 years the ground has increased by 12%, the water has increased by 1% and the vegetation has increased by 11%, while the buildings are recognized to have decreased by 25%. With regard to "Scene 2", the information maps and the histograms show that after 13 years buildings have decreased by 2%, vegetation has increased by 9%, water has increased by 3%, while soil and roads have decreased by 9%. Regarding the information maps and the associated histograms of Figure 5, in "Scene 1" the following changes have occurred after thirteen years: buildings have decreased by 22%, water has increased by 1.2%, vegetation has increased by 15% and ground has also increased by 9%. As far as "Scene 2" is concerned, buildings have expanded by 3%, water has risen by 1%, vegetation has increased by 7%, whilst ground has decreased by 11%.
The previously mentioned structural changes are shown in Table 2. The percentages derived from the histograms are consistent with visual inspection of Figure 1. In fact, the changes detected through Figure 2, i.e., by means of the water and vegetation indices of (Panagiotopoulou, Ragia, 2021), are in better accordance with Figure 1 than the changes recognized through Figure 3, i.e., via the techniques from the literature (Gupta et al., 2018; Tsikdogan et al., 2019).

Figure 3. Red = buildings (Huang, Zhang, 2011; Huang et al., 2017b), Green = vegetation (Gupta et al., 2018), Blue = water (Tsikdogan et al., 2017; 2019), Black = ground (e.g., soil and roads): Multiindex urban representation of (a) QuickBird image of "Scene 1" (b) WorldView image of "Scene 1" (c) QuickBird image of "Scene 2" (d) WorldView image of "Scene 2".
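The per-class percentages discussed above can be derived from two classified maps as sketched below. It is assumed here that each reported figure is the difference in the share of image area occupied by a class between the two dates; the integer label coding and the toy maps are hypothetical.

```python
import numpy as np

# Hypothetical label coding for the classified information maps.
CLASS_NAMES = ("ground", "buildings", "vegetation", "water")

def class_shares(label_map, n_classes=len(CLASS_NAMES)):
    """Percentage of pixels per class in an integer-coded classified map."""
    counts = np.bincount(np.asarray(label_map).ravel(), minlength=n_classes)
    return 100.0 * counts / counts.sum()

# Toy 4 x 4 maps for the earlier and later date (illustrative only).
before = np.array([[1, 1, 0, 0],
                   [1, 1, 0, 0],
                   [2, 2, 3, 3],
                   [2, 2, 3, 3]])
after = np.array([[1, 0, 0, 0],
                  [1, 0, 0, 0],
                  [2, 2, 3, 3],
                  [2, 2, 2, 3]])

# Positive values mean the class covers a larger share at the later date.
delta = class_shares(after) - class_shares(before)
changes = dict(zip(CLASS_NAMES, delta))
```

Because the shares of all classes sum to 100% at each date, the changes computed this way sum to zero.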
In this work automatic change detection is carried out by pixel-per-pixel comparison of the classified multi-temporal images shown in Figures 2-3. Figures 6 and 7 demonstrate the results of the pixel-per-pixel comparison for the two scenes under consideration. Each color in the change detection map indicates a different structural change. Particularly, red, green and blue represent change in buildings, vegetation and water, respectively. Additionally, yellow denotes simultaneous change in buildings and vegetation, while magenta shows concurrent change in buildings and water. Turquoise denotes coincident change in vegetation and water. No change or ground is shown in black. Hence, the proposed automatic change detection method not only indicates whether changes have taken place but additionally determines the type of changes. The left column is derived from the usage of the indices of (Panagiotopoulou, Ragia, 2021) and the right column results from the indices of (Gupta et al., 2018; Tsikdogan et al., 2019).
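One possible reading of this color coding is sketched below: for every pixel whose class differs between the two dates, the red, green and blue channels are switched on if buildings, vegetation or water, respectively, is the class before or after the change. This interpretation and the integer label coding are assumptions, not details stated in the source.

```python
import numpy as np

# Hypothetical label coding for the classified maps.
GROUND, BUILDING, VEGETATION, WATER = 0, 1, 2, 3

def change_map(before, after):
    """RGB change map from two classified maps of equal shape."""
    before = np.asarray(before)
    after = np.asarray(after)
    changed = before != after
    rgb = np.zeros(before.shape + (3,), dtype=np.uint8)
    # Channel 0 (red) = buildings, 1 (green) = vegetation, 2 (blue) = water.
    for channel, cls in enumerate((BUILDING, VEGETATION, WATER)):
        involved = changed & ((before == cls) | (after == cls))
        rgb[..., channel] = np.where(involved, 255, 0)
    return rgb

before = np.array([[BUILDING, VEGETATION],
                   [GROUND, WATER]])
after = np.array([[VEGETATION, VEGETATION],
                  [BUILDING, GROUND]])
rgb = change_map(before, after)
```

With additive RGB mixing, a building-to-vegetation transition appears yellow, a building-to-water transition magenta and a vegetation-to-water transition turquoise, while unchanged pixels stay black.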
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2022 XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France

Change detection is also performed by means of the unsupervised technique presented in (Celik, 2009). K-means clustering is applied to feature vectors which are extracted by projecting local data onto an eigenvector space produced by principal component analysis. Figure 8 demonstrates the change detection results for both scenes. The technique in (Celik, 2009) indicates only whether changes have occurred or not. Thus, in Figure 8 white denotes change while black means that no change has taken place. The specific algorithm takes as input the images of Figure 1, whereas the proposed change detection technique, whose results are illustrated in Figures 6 and 7, needs as input the information maps of Figures 2 and 3. The change detection results of the two methodologies, the proposed one and that in (Celik, 2009), are not in full agreement. This can be expected due to the different nature of the applied methodologies. The performance of the methods is evaluated qualitatively. Additional experimentation would be needed in future work, with new image datasets having ground truth annotation, to carry out a quantitative evaluation of the methods' performance.
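The general idea of combining principal component analysis with k-means, as summarised above for (Celik, 2009), can be sketched as follows. This is a simplified, single-band illustration under several assumptions (3 x 3 blocks, three retained components, a deterministic two-cluster initialisation) and not the reference implementation of the cited work.

```python
import numpy as np

def pca_kmeans_change(img1, img2, block=3, n_components=3, n_iter=20):
    """Binary change mask from two co-registered single-band images."""
    assert block % 2 == 1, "an odd block size keeps the window centred"
    diff = np.abs(np.asarray(img1, dtype=float) - np.asarray(img2, dtype=float))
    h, w = diff.shape

    # 1) Eigenvector space from non-overlapping block x block patches.
    hb, wb = h - h % block, w - w % block
    blocks = (diff[:hb, :wb]
              .reshape(hb // block, block, wb // block, block)
              .swapaxes(1, 2)
              .reshape(-1, block * block))
    mean = blocks.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(blocks, rowvar=False))
    basis = eigvecs[:, np.argsort(eigvals)[::-1][:n_components]]

    # 2) Feature vector per pixel: its neighbourhood projected onto the basis.
    padded = np.pad(diff, block // 2, mode="reflect")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (block, block))
    local = windows.reshape(h * w, block * block)
    features = (local - mean) @ basis

    # 3) Two-cluster k-means, started from the calmest and the busiest window.
    energy = local.mean(axis=1)
    centers = features[[np.argmin(energy), np.argmax(energy)]].copy()
    for _ in range(n_iter):
        dist = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = features[labels == k].mean(axis=0)

    # The cluster with the larger mean absolute difference is taken as "changed".
    flat = diff.ravel()
    means = [flat[labels == k].mean() if np.any(labels == k) else -np.inf
             for k in range(2)]
    return (labels == int(np.argmax(means))).reshape(h, w)

# Toy example: a bright 4 x 4 square appears in the second image.
img1 = np.zeros((12, 12))
img2 = np.zeros((12, 12))
img2[4:8, 4:8] = 100.0
mask = pca_kmeans_change(img1, img2)
```

Unlike the class-based comparison of information maps, such a mask only separates changed from unchanged pixels, which matches the white/black output described for Figure 8.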
Figure 6. Change detection from the QuickBird image to the WorldView image of (a) "Scene 1" (b) "Scene 2" after 13 years have passed: Red = buildings, Green = vegetation, Blue = water, Yellow = buildings and vegetation, Magenta = buildings and water, Turquoise = vegetation and water, Black = no change or ground (proposed technique). The classified multi-temporal images of Figure 2 have been utilized.
Figure 7. Change detection from the QuickBird image to the WorldView image of (a) "Scene 1" (b) "Scene 2" after 13 years have passed: Red = buildings, Green = vegetation, Blue = water, Yellow = buildings and vegetation, Magenta = buildings and water, Turquoise = vegetation and water, Black = no change or ground (proposed technique). The classified multi-temporal images of Figure 3 have been utilized.

CONCLUSIONS
In the present work high-resolution satellite images from the QuickBird and WorldView satellites are used for automatic change detection in urban and suburban areas of Georgioupoli, on the island of Crete, Greece. There is a time difference of 13 years between the satellite images. During processing, all images have the same spatial resolution of 30 cm/pixel. Firstly, the image scene is classified into buildings, vegetation, water and ground (i.e., soil and roads) via multiindex scene representation. Building detection is performed by means of the morphological building index from the literature. Vegetation and water detection are performed by spectral indices from the literature. For comparison purposes, different water and vegetation detection techniques from the literature are also applied. Then, automatic change detection is carried out by pixel-per-pixel comparison of the classified multi-temporal images. The change detection methodology of this work indicates whether changes have taken place and also gives specific information concerning the kind of changes. The proposed method could prove useful during routine urban monitoring as well as contribute to content-based image retrieval and complex pattern recognition. In future experiments the proposed change detection method, along with the best performing spectral indices, will be validated on a greater number of urban and suburban areas. Also, public domain image datasets with ground truth annotation will be utilized for better quantitative comparisons.