BUILT-UP AREA DETECTION FROM HIGH-RESOLUTION SATELLITE IMAGES USING MULTI-SCALE WAVELET TRANSFORM AND LOCAL SPATIAL STATISTICS

Recently, built-up area detection from high-resolution satellite images (HRSI) has attracted increasing attention because HRSI can provide more detailed object information. In this paper, a multi-resolution wavelet transform and a local spatial autocorrelation statistic are introduced to model the spatial patterns of built-up areas. First, the input image is decomposed into high- and low-frequency subbands by wavelet transform at three levels. Then the high-frequency detail information in three directions (horizontal, vertical and diagonal) is extracted, followed by a maximization operation to integrate the information in all directions. Afterward, a cross-scale operation is implemented to fuse the different levels of information. Finally, a local spatial autocorrelation statistic is introduced to enhance the saliency of built-up features, and an adaptive threshold algorithm is used to detect the built-up areas. Experiments are conducted on ZY-3 and Quickbird panchromatic satellite images, and the results show that the proposed method is very effective for built-up area detection.


INTRODUCTION
In recent years, built-up area detection from high-resolution satellite images (HRSI) has attracted increasing attention because HRSI can provide more detailed object information; therefore, finer-scale built-up areas can be detected and more accurate boundaries can be obtained. However, built-up areas are compound geographical objects consisting of different types of man-made structures, and the textural and structural features in HRSI become both clearer and more complex with increased spatial resolution. This makes it more challenging to accurately detect built-up areas in HRSI than in medium- and low-resolution images.
Many methods have been proposed to model the textural and structural patterns for built-up area detection in HRSI. A built-up presence index (PanTex) was constructed from anisotropic rotation-invariant textural measures derived from the gray-level co-occurrence matrix (GLCM) to describe the textural features of panchromatic satellite data for the discrimination of built-up areas (Pesaresi et al., 2008). However, PanTex is suited to satellite images with a resolution of about 5 m (e.g. SPOT-5), rather than to higher-resolution images.
With the improvement of spatial resolution, it becomes more difficult to accurately detect built-up areas due to their spectral confusion and spatial complexity. Much effort has been made to overcome this issue. Local feature points based on Gabor filters were used to locate buildings, followed by spatial voting to detect urban areas (Sirmacek and Ünsalan, 2010). To better locate built-up areas, corner points and straight lines were employed to indicate the existence of building features, and the spatial voting algorithm was also used to model their spatial distribution (Tao et al., 2013; Chen et al., 2016; Ning and Lin, 2017). However, using only local corner or line features is not sufficient to discriminate between built-up and non-built-up areas in complex scenes. Also, spatial voting is a global algorithm, and its computing time increases sharply when the number of feature points or lines is large (Li et al., 2015).
In this paper, we introduce a multi-resolution wavelet transform and local spatial statistics to model the spatial patterns of built-up areas in HRSI. By multi-resolution wavelet decomposition, the high-frequency subbands representing the detail information are extracted and fused to construct a saliency map, which is then further modulated and enhanced by the Getis-Ord statistic. Based on the derived saliency map, an adaptive threshold technique is used to detect the built-up areas.

Feature Representation Based on Wavelet Transform
In this study, the wavelet transform (WT), a well-known tool in signal processing, is used to model the spatial textures and structural features of built-up areas in HRSI. The input image can be decomposed into a low-frequency approximation and its high-frequency detail information at a coarser spatial resolution. Given an image f(x, y), the decomposition process at level L can be expressed as follows (İmamoğlu et al., 2012):

$$[A_L, H_L, V_L, D_L] = \mathrm{WT}_L(f(x, y)) \quad (1)$$

where the integer L is the decomposition level; $A_L$ is the low-frequency approximation component; and $H_L$, $V_L$ and $D_L$ represent the high-frequency detail coefficients in three different directions (horizontal, vertical and diagonal, respectively).
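The decomposition described above can be illustrated with a minimal sketch. The paper does not state which wavelet is used, so a Haar wavelet is assumed here; the 1/4 scaling, the pure-Python list representation, and the function names `haar_dwt2` and `wavelet_pyramid` are likewise illustrative assumptions rather than the authors' implementation.

```python
def haar_dwt2(image):
    """One level of a 2-D Haar decomposition (assumed wavelet, 1/4 scaling).

    `image` is a list of equal-length rows with even height and width.
    Returns (A, H, V, D), each half the size of the input.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    A, H, V, D = [], [], [], []
    for r in range(0, rows, 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            p, q = image[r][c], image[r][c + 1]          # top-left, top-right
            s, t = image[r + 1][c], image[r + 1][c + 1]  # bottom-left, bottom-right
            a_row.append((p + q + s + t) / 4.0)  # low-frequency approximation
            h_row.append((p + q - s - t) / 4.0)  # top minus bottom: horizontal detail
            v_row.append((p - q + s - t) / 4.0)  # left minus right: vertical detail
            d_row.append((p - q - s + t) / 4.0)  # diagonal detail
        A.append(a_row)
        H.append(h_row)
        V.append(v_row)
        D.append(d_row)
    return A, H, V, D


def wavelet_pyramid(image, levels=3):
    """Repeat the decomposition on the approximation, keeping (H, V, D) per level."""
    details = []
    approx = image
    for _ in range(levels):
        approx, H, V, D = haar_dwt2(approx)
        details.append((H, V, D))
    return approx, details
```

For an 8×8 input, `wavelet_pyramid(image, levels=3)` yields detail subbands of size 4×4, 2×2 and 1×1. A practical implementation would more likely rely on an array library and an established wavelet package.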
In our model, a 3-level decomposition by WT is applied to the input image, and the detail information at the three levels is extracted to generate feature maps. It should be noted that 9 maps are derived by this procedure, because there are three directions at each level. To integrate this information across multiple levels and multiple directions, a feature fusion method is further introduced to obtain a single feature map using two mathematical operations. More specifically, the fusion method is implemented as follows.
$$I_L(x, y) = \max\{H_L(x, y), V_L(x, y), D_L(x, y)\} \quad (2)$$

Next, to take advantage of the multi-scale features of built-up objects, the high-frequency detail information at the three levels is combined into one feature map by the following across-scale addition operation:

$$I_1 \oplus I_2 \oplus I_3 \quad (3)$$

where $\oplus$ stands for across-scale addition, which first interpolates $I_L$ (L = 1, 2, 3) to the same size as the original input image and then performs a point-to-point arithmetic addition.
At this point, an integrated feature map has been generated, which enables the built-up areas to stand out from their background. It can therefore also be regarded as a saliency map, in the sense of visual saliency in computer vision.
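The two fusion operations (the per-level maximum over directions and the across-scale addition) might be sketched as follows. The interpolation method is not specified in the text, so nearest-neighbour upsampling is assumed here; the function names are hypothetical, and the maximum is taken over the raw coefficients exactly as the description states.

```python
def max_over_directions(H, V, D):
    """Pixel-wise maximum of the three detail subbands at one level."""
    return [
        [max(h, v, d) for h, v, d in zip(h_row, v_row, d_row)]
        for h_row, v_row, d_row in zip(H, V, D)
    ]


def upsample_nearest(fmap, rows, cols):
    """Resize a feature map to rows x cols by nearest-neighbour lookup (assumed)."""
    src_rows, src_cols = len(fmap), len(fmap[0])
    return [
        [fmap[r * src_rows // rows][c * src_cols // cols] for c in range(cols)]
        for r in range(rows)
    ]


def across_scale_add(level_maps, rows, cols):
    """Bring every level to the input size, then add point by point."""
    fused = [[0.0] * cols for _ in range(rows)]
    for fmap in level_maps:
        resized = upsample_nearest(fmap, rows, cols)
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += resized[r][c]
    return fused
```

With per-level maps `I1`, `I2`, `I3` obtained from `max_over_directions`, the fused map for a `rows` × `cols` input would be `across_scale_add([I1, I2, I3], rows, cols)`. A smoother interpolation (for example bilinear) could be substituted without changing the structure.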

Feature Enhancing Using Local Spatial Statistic
In order to enhance the saliency in built-up areas while suppressing it in non-built-up areas, this paper introduces the well-known Getis-Ord statistic to model and modulate the spatial distribution of saliency values. The Getis-Ord statistic was originally designed to measure spatial autocorrelation in spatial statistics (Ord and Getis, 1995). Using the Getis-Ord statistic, the saliency map can be modulated and the contrast between built-up and non-built-up areas further enhanced, which is beneficial for segmenting the built-up areas with a threshold-based algorithm.
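As a rough illustration of this step, the sketch below assumes the standardized form of the statistic given by Ord and Getis (1995), $G_i^*(d) = \big(\sum_j w_{ij}(d) x_j - W_i^* \bar{x}\big) / \big(\sigma \sqrt{(n S_{1i}^* - W_i^{*2})/(n-1)}\big)$, where $x_j$ are the saliency values, $\bar{x}$ and $\sigma$ are their global mean and standard deviation, and $n$ is the number of pixels. With binary weights over a $(2d+1) \times (2d+1)$ window, $W_i^*$ and $S_{1i}^*$ both equal the number of pixels in the window. Clipping the window at the image border and the name `getis_ord_gstar` are assumptions made for this example.

```python
import math


def getis_ord_gstar(saliency, d):
    """Standardized Getis-Ord G_i* with binary weights in a (2d+1)x(2d+1) window."""
    rows, cols = len(saliency), len(saliency[0])
    n = rows * cols
    values = [v for row in saliency for v in row]
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    out = [[0.0] * cols for _ in range(rows)]
    if std == 0.0:
        return out  # constant map: no spatial structure to measure
    for r in range(rows):
        for c in range(cols):
            # Window bounds, clipped at the image border (assumption).
            r0, r1 = max(0, r - d), min(rows, r + d + 1)
            c0, c1 = max(0, c - d), min(cols, c + d + 1)
            w = (r1 - r0) * (c1 - c0)  # W_i* = S_1i* for binary weights
            local_sum = sum(
                saliency[i][j] for i in range(r0, r1) for j in range(c0, c1)
            )
            denom = std * math.sqrt((n * w - w * w) / (n - 1))
            if denom > 0.0:  # zero when the window covers the whole image
                out[r][c] = (local_sum - w * mean) / denom
    return out
```

Large positive values mark pixels surrounded by high saliency, and negative values mark pixels surrounded by low saliency. The direct double loop is kept for readability; for windows as large as those reported in the experiments, an integral image or box filter would be the more practical choice.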

Built-up Area Segmentation Using Otsu Algorithm
Many methods have been proposed for image thresholding. Among them, the Otsu algorithm is an adaptive threshold technique based on the criterion of maximum between-class variance, which makes it well suited to binary classification. Therefore, this algorithm is used to select the optimal threshold for built-up area segmentation based on the derived saliency map.
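The threshold selection can be sketched as below. The histogram resolution (256 bins) and the names `otsu_threshold` and `segment` are assumptions for illustration; the criterion itself is the between-class variance described above.

```python
def otsu_threshold(values, bins=256):
    """Return the threshold that maximizes the between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # constant input: nothing to separate
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    total = len(values)
    sum_all = sum(k * h for k, h in enumerate(hist))
    best_k, best_score = 0, -1.0
    count0, sum0 = 0, 0.0
    for k in range(bins - 1):
        count0 += hist[k]          # pixels in class 0 (bins 0..k)
        sum0 += k * hist[k]
        count1 = total - count0    # pixels in class 1 (bins k+1..)
        if count0 == 0 or count1 == 0:
            continue
        mean0 = sum0 / count0
        mean1 = (sum_all - sum0) / count1
        # Between-class variance, up to the constant factor 1 / total**2.
        score = count0 * count1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_score, best_k = score, k
    return lo + (best_k + 1) * width  # upper edge of the last class-0 bin


def segment(saliency, threshold):
    """Label pixels at or above the threshold as built-up (1), others as 0."""
    return [[1 if v >= threshold else 0 for v in row] for row in saliency]
```

In use, the enhanced saliency map would be flattened to obtain the threshold, for example `t = otsu_threshold([v for row in saliency for v in row])`, and then passed to `segment(saliency, t)`.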

EXPERIMENTS AND RESULTS
To verify the validity of the proposed method, experiments were conducted on two image datasets with different resolutions. The first dataset is from the Chinese ZY-3 satellite, which was launched on January 9, 2012; its panchromatic band, with a resolution of 2.1 m, was used. The other is composed of panchromatic Quickbird images with a resolution of 0.61 m. To quantitatively evaluate the accuracy of built-up area detection, three commonly used indices are used: precision P, recall R and F-measure. More specifically,

$$P = \frac{TP}{TP + FP} \quad (5)$$

$$R = \frac{TP}{TP + FN} \quad (6)$$

$$F = \frac{2PR}{P + R} \quad (7)$$

where TP and FP denote the numbers of true built-up area pixels and non-built-up area pixels in the extracted built-up areas, respectively; FN denotes the number of true built-up area pixels in the extracted non-built-up areas; and F is a composite indicator of the precision P and the recall R.
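The pixel-based evaluation can be illustrated with a short sketch that counts TP, FP and FN from a binary detection mask and a binary ground-truth mask. The F-measure is assumed here to take the common harmonic-mean form $F = 2PR/(P+R)$; the function name `evaluate` and the convention of returning 0.0 when a denominator is zero are choices made for this example.

```python
def evaluate(detected, truth):
    """Return (precision, recall, f_measure) for two binary masks of equal size."""
    tp = fp = fn = 0
    for det_row, truth_row in zip(detected, truth):
        for det, ref in zip(det_row, truth_row):
            if det and ref:
                tp += 1   # built-up pixel correctly detected
            elif det:
                fp += 1   # non-built-up pixel detected as built-up
            elif ref:
                fn += 1   # built-up pixel that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

For instance, with `detected = [[1, 1, 0], [0, 1, 0]]` and `truth = [[1, 0, 0], [1, 1, 0]]`, there are two true positives, one false positive and one false negative, so all three indices equal 2/3.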
In this experiment, the window size s (s = 2d+1) in the Getis-Ord statistic is the only parameter that needs to be set. It determines the extent, or scale, over which local spatial autocorrelation information is used to calculate the saliency of each pixel, and therefore affects the final detection result. We tested different values of s on each image, and the results show that the F-measure first increases and then decreases as s increases from a small window size (e.g. s = 3). Taking the first image (i.e. ZY-3-1 in Table 1) as an example, the change of F-measure with s is presented in Figure 1; it reaches its peak value of 0.8829 at s = 39. In the same way, we obtained the optimal F-measures for all the test images, as shown in Table 1. Overall, the F-measures are high for both the ZY-3 and Quickbird image data, which indicates that the proposed model is effective for the test data.
To show the detection results more intuitively, the results for some of the test images are shown in Figure 2 and Figure 3, where the first column contains the original images, the second column the ground truths, and the last column the automatic detection results. Comparing each detection result with its ground truth shows that the detected areas closely approximate the ground truths. Moreover, although the test images include complex textures and man-made structures, the detected areas are still complete, with well-defined shapes, which is beneficial to subsequent raster-to-vector conversion in real applications.
The experimental results show that the proposed method is very effective for built-up area detection.

[Figure columns: Original Images, Ground Truths, Detection Results]
In future work, we will focus on how to automatically determine the optimal parameter (i.e. the window size s) in the Getis-Ord statistic operation. In addition, more image data covering different scenes are necessary to further verify the robustness of the proposed method, and comparative experiments are also needed to assess its performance.
For the spatial weight matrix of the Getis-Ord statistic, our model uses symmetric binary weights, with ones assigned to all locations within distance d of pixel i and zeros otherwise. Here, an s×s (s = 2d+1) rectangular neighborhood is used to calculate $G_i^*(d)$ for each pixel i, where d is a distance in pixels.

Figure 1. The change of F-measure with s for the ZY-3-1 image

Figure 2. Built-up area detection results for part of the ZY-3 images

Table 1. Accuracy evaluation results for the test images (header row: Sensor ID P R F)