A DATA FIELD METHOD FOR URBAN REMOTELY SENSED IMAGERY CLASSIFICATION CONSIDERING SPATIAL CORRELATION

Spatial correlation between pixels is important information for remotely sensed imagery classification. The data field method and spatial autocorrelation statistics have been used to describe and model the spatial information of local pixels. The original data field method represents the spatial interactions of neighbourhood pixels effectively. However, because it focuses on measuring the grey-level change between the central pixel and its neighbourhood pixels, it exaggerates the contribution of the central pixel to the whole local window. Geary's C has also been shown to characterise and quantify the spatial correlation between each pixel and its neighbourhood pixels well, but the extracted objects are badly delineated because of the distracting salt-and-pepper effect of isolated misclassified pixels. To correct this defect, we introduce the data field method for filtering and noise limitation. Moreover, the original data field method is enhanced by treating each pixel in the window in turn as the central pixel when computing statistical characteristics between it and its neighbourhood pixels. The last step employs a support vector machine (SVM) to classify the combined features (e.g. the spectral feature and the spatial correlation feature). To validate the effectiveness of the developed method, experiments are conducted on different remotely sensed images containing multiple complex object classes. The results show that the developed method outperforms the traditional method in terms of classification accuracy.


INTRODUCTION
High-spatial-resolution remotely sensed imagery is regarded as an important data source in urban applications, as it contains not only a large amount of spectral information but also rich spatial information. The latter is one of the characteristic features of images because of its robustness. Thus, representing and modelling spatial structure becomes a vital step in high-spatial-resolution remotely sensed imagery interpretation and information extraction. Methods of remotely sensed imagery classification using spatial autocorrelation characteristics are well established. Spatial statistics are calculated on the variability in digital numbers or brightness values within local windows, in addition to the values of the original spectral bands (Purkis, 2006). Image texture can be described in two dimensions: the first concerns its tonal primitives, and the second describes the spatial dependence or interaction between the primitives of an image texture (Haralick, 1979). Image spatial autocorrelation can be used to describe the interaction between pixels (Craig, 1979; Campell, 1981). The Getis index, when used in remote sensing image processing, not only calculates spatial dependence but also describes the impact of the neighbourhood pixels on the central pixel (Wulder, 1998). Moran's I and Geary's C are relatively more powerful than the Getis index in characterizing complex spatial arrangements of objects and features in the classification of remotely sensed images (Myint, 2007). Among these general spatial autocorrelation statistics, Geary's C describes the whole spatial distribution pattern well in terms of the local difference between pixels of an image. This is due to its sensitivity to edge information, which leads to better extraction of heterogeneous regions.
Unfortunately, the result contains many small patches. To address this, we introduce the data field method to filter and limit the noise. By considering each data object as a particle with mass in the data space, the data field method can describe the complex interaction among data objects (Li, 2005). In addition, we enhance the original data field method by treating each pixel in the window in turn as the central pixel when computing statistical characteristics between it and its neighbourhood pixels. The remainder of this paper is organized as follows. Section II introduces the data field method and Geary's C statistic and discusses the similarities and differences between them. The image classification method used in this paper is presented in Section III. In Section IV, a comparative study is made between the proposed method and some traditional methods. Section V concludes this paper.

Geary's C Statistic
Geary's C statistic is a local indicator measuring spatial dependence for each pixel. The standardized Geary's C statistic is defined as (Geary, 1954):

$$C = \frac{(N-1)\sum_{i}\sum_{j} w_{ij}\,(X_i - X_j)^2}{2W\sum_{i}(X_i - \bar{X})^2}$$

where
$C$ = value of Geary's C
$N$ = the number of spatial units indexed by $i$ and $j$
$X$ = variable of interest
$\bar{X}$ = the mean of $X$
$w_{ij}$ = matrix of spatial weights
$W$ = the sum of all $w_{ij}$

Here $w_{ij} = 1$ if point $j$ is within the local window of point $i$; otherwise $w_{ij} = 0$. The value of C lies between 0 and 2. A value of 1 means no spatial correlation. Values lower than 1 indicate increasing positive spatial correlation, while values higher than 1 indicate negative spatial correlation. Geary's C is used to calculate the degree of local spatial autocorrelation in this paper.
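The statistic can be illustrated with a short sketch. The code below assumes the standard form of Geary's C, C = (N-1) Σ w_ij (X_i - X_j)^2 / (2W Σ (X_i - mean)^2), with binary window weights; the function names `gearys_c` and `window_weights`, and the choice to return 1 for a constant window, are illustrative assumptions rather than part of the method described here.

```python
def window_weights(rows, cols, radius=1):
    """Binary weights: w_ij = 1 if pixel j lies in the square window of the
    given radius around pixel i (and j != i), otherwise 0."""
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    return [
        [
            1 if a != b and max(abs(a[0] - b[0]), abs(a[1] - b[1])) <= radius else 0
            for b in coords
        ]
        for a in coords
    ]


def gearys_c(values, weights):
    """Geary's C for a flat list of pixel values and a weight matrix."""
    n = len(values)
    mean = sum(values) / n
    w_sum = sum(sum(row) for row in weights)
    num = sum(
        weights[i][j] * (values[i] - values[j]) ** 2
        for i in range(n)
        for j in range(n)
    )
    den = sum((v - mean) ** 2 for v in values)
    if w_sum == 0 or den == 0:
        # Assumption: a constant window (or one with no neighbours) is
        # reported as "no spatial correlation".
        return 1.0
    return (n - 1) * num / (2.0 * w_sum * den)


# A 1 x 4 strip of pixels, each linked to its immediate neighbours.
w = window_weights(1, 4)
print(gearys_c([0, 0, 1, 1], w))  # 0.5: similar neighbours, positive correlation
print(gearys_c([0, 1, 0, 1], w))  # 1.5: alternating values, negative correlation
```

The two toy strips show the behaviour stated above: values below 1 for spatially clustered grey levels and values above 1 for alternating ones.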

Data Field
Inspired by the short-range nuclear force field theory in physics, the data field used in image processing is a method that takes each pixel in the image as a data object with mass (Wu, 2012). The potential produced at point x by an object y is

$$\varphi_x(y) = m \times \exp\left(-\left(\frac{\lVert y - x \rVert}{\sigma}\right)^{k}\right)$$

where
$\varphi_x(y)$ = potential value
$\lVert y - x \rVert$ = the distance between object y and point x
$m$ = the mass of object y
$\sigma$ = the influence factor that controls the interaction range
$k$ = the distance index

The mass relates to the grey level value. In the local region, each pixel interacts with other pixels, and the magnitude of the interaction is determined by the corresponding potential value. The spatial distribution or topological structure of a data field mainly depends on the influential range of interaction among data objects and is little affected by the form of the potential function and the index k of the distance term. In real applications, the Gaussian potential function (i.e., the nuclear-like potential function with k = 2), which represents a short-range field, is often adopted to model the distribution of data fields because of its good mathematical properties. According to the 3σ law of the Gaussian function, the influential range of object interaction is usually defined as

$$R = \frac{3\sigma}{\sqrt{2}}$$

where
$R$ = influential range of a data object

When the distance is more than R, the influence of the object is so weak that it can be neglected.
In a data space with more than one data object, the potential value of any position can be obtained by superposition. Given a data field produced by a data set D = {x1, x2, ⋯, xn} in space Ω ⊆ R^p, the potential at any point x ∈ Ω can be calculated as

$$\varphi(x) = \sum_{i}\sum_{j} \rho_{ij} \times \exp\left(-\left(\frac{\lVert x - x_{ij} \rVert}{\sigma}\right)^{k}\right)$$

where
$\varphi(x)$ = potential value
$\lVert x - x_{ij} \rVert$ = the distance between object x and point $x_{ij}$
$\rho_{ij}$ = the mass of object $x_{ij}$
$\sigma$ = the influence factor of the potential function
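As a minimal sketch of the superposition described above, the following code sums Gaussian-type potentials from several data objects and evaluates the influential range. It assumes the usual data field form mass × exp(-(distance/σ)^k) and the range R = 3σ/√2 for k = 2; the function names and the parameter name `sigma` for the influence factor are illustrative choices.

```python
import math


def potential(x, objects, sigma=1.0, k=2):
    """Superposed potential at point x produced by (position, mass) pairs."""
    total = 0.0
    for pos, mass in objects:
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, pos)))
        total += mass * math.exp(-((dist / sigma) ** k))
    return total


def influence_range(sigma):
    """Influential range under the 3-sigma rule for the Gaussian case (k = 2)."""
    return 3.0 * sigma / math.sqrt(2.0)


# Two hypothetical objects with masses 2.0 and 1.0.
objects = [((0.0, 0.0), 2.0), ((1.0, 0.0), 1.0)]
print(potential((0.0, 0.0), objects))

# An object placed exactly at distance R contributes about 1% of its mass.
R = influence_range(1.0)
print(potential((0.0, 0.0), [((R, 0.0), 1.0)]))
```

At distance R the exponent equals -4.5, so the contribution falls to roughly 0.011 of the mass, which is why objects beyond R can be neglected.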

Comparison Between Geary's C and Data Field
Comparing the two methods above, we can see that their equations take almost the same form. The comparison is given in Table 1. Table 1 shows that both the data field method and Geary's C statistic describe the local spatial correlation between objects in terms of two parts: the attribution correlation and the spatial weights. $\rho_{ij}$ is to the data field method what $(X_i - X_j)^2$ is to Geary's C statistic. Although these two are calculated in different ways, they share the same goal of measuring the attribution correlation of objects. The same holds for the computation of the spatial weights. Overall, as the comparison of the equations of the data field method and Geary's C statistic shows, the two methods work in essentially the same way and can be summarised by a common expression in which
F = value of spatial correlation in the local window
x, y = the indices of two positions in the local window
ρ(x), ρ(y) = the attribution correlation of the two positions
W = the spatial weights

Furthermore, we enhance the original data field method by treating each pixel in the window in turn as the central pixel when computing statistical characteristics between it and its neighbourhood pixels. Thus each pixel has the same impact on the feature value of the window, and the local spatial structure is described and modelled more completely. Figure 1 shows how the original data field and the enhanced data field describe a local window.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLI-B7, 2016 XXIII ISPRS Congress, 12-19 July 2016, Prague, Czech Republic

Figure 1. Different description ways of the original data field and the enhanced data field: a) original data field; b) enhanced data field

Figure 1a) shows how the potential value at point x33 is calculated in the original data field. When the enhanced data field method is used to describe the local spatial correlation, as in Figure 1b), the interaction among all 24 neighbourhood pixels is taken into account, instead of only the 8 directly adjacent pixels, when calculating the value of the central pixel x33. This weakens the contribution of the central pixel to the whole window and avoids presenting a one-sided local spatial structure. Then, given a data field produced by a data set D = {x1, x2, ⋯, xn} in space Ω ⊆ R^p, the potential at any point x ∈ Ω can be calculated as

where
φ(x) = potential value
||x-xij|| = the distance between object x and point xij
ρij = the mass of object xij
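The difference between the two descriptions can be sketched in code, under assumptions that go beyond what the text specifies. The sketch assumes that the mass of a pixel pair is the absolute grey-level difference, that the Gaussian potential (k = 2) is used, and that the enhanced feature averages, over every pixel of the window, the potential between that pixel and its adjacent pixels inside the window. All function names are hypothetical, and this is not claimed to be the exact formulation of the enhanced data field.

```python
import math


def pair_potential(g1, g2, dist, sigma=1.0):
    # Assumption: mass = absolute grey-level difference of the two pixels.
    return abs(g1 - g2) * math.exp(-((dist / sigma) ** 2))


def neighbours(r, c, rows, cols):
    """Yield the (up to 8) adjacent pixel positions that lie inside the window."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                yield rr, cc


def focal_potential(window, r, c, sigma=1.0):
    """Potential between pixel (r, c) and its adjacent pixels."""
    rows, cols = len(window), len(window[0])
    return sum(
        pair_potential(window[r][c], window[rr][cc], math.hypot(rr - r, cc - c), sigma)
        for rr, cc in neighbours(r, c, rows, cols)
    )


def original_feature(window, sigma=1.0):
    """Only the central pixel acts as the focal pixel."""
    return focal_potential(window, len(window) // 2, len(window[0]) // 2, sigma)


def enhanced_feature(window, sigma=1.0):
    """Every pixel in the window acts as the focal pixel in turn (averaged)."""
    rows, cols = len(window), len(window[0])
    total = sum(focal_potential(window, r, c, sigma)
                for r in range(rows) for c in range(cols))
    return total / (rows * cols)


flat = [[10] * 5 for _ in range(5)]
corner = [row[:] for row in flat]
corner[0][0] = 200  # grey-level change away from the centre of the window
print(original_feature(corner), enhanced_feature(corner))
```

In this toy window the original feature is zero because the central pixel matches its 8 neighbours, while the enhanced feature is positive because the change in the corner is also counted. This illustrates, under the stated assumptions, how treating every pixel as the central pixel avoids a one-sided description of the window.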

COMBINATION OF GEARY'S C AND DATA FIELD
An experiment is performed to show how the spatial statistical information obtained by the Geary's C statistic and by the data field method affects imagery classification. For the evaluation of the classification, we used the UC Merced Land Use Dataset, which is shown in Figure 2. As we can see, boundaries are well extracted. In other words, Geary's C does well in distinguishing different objects. However, there are still many patches and much noise, which lead to low accuracy.

Figure 3. Geary's C feature

Figure 4 represents the data field feature of Figure 3. It shows that the data field method can filter and limit the noise effectively.

EXPERIMENT
To validate the effectiveness of the proposed algorithm for texture feature representation and the classification of high-spatial-resolution remotely sensed imagery, the proposed method was evaluated with QuickBird datasets of Wuhan City. The Gram-Schmidt Pan Sharpening method, as implemented in the Environment for Visualizing Images (ENVI) program, was used to merge high-spatial-resolution panchromatic images with high-spectral-resolution multispectral images to improve the spectral quality of the test image. Some other spatial information extraction methods, including local Moran's I (L_I), local Geary's C (L_C) and local Getis (L_G), were employed for performance comparison. In addition, we employed the support vector machine (SVM) to integrate the spectral and spatial features effectively. Since all the methods above have an influential range, repeated experiments were performed with multiple window sizes to select the proper parameter for local spatial autocorrelation analysis.

Data Set
The test image, a three-band natural-color image, is shown in Figure 5. It comprises 889 rows and 1039 columns with a spatial resolution of 0.61 m. As can be seen, the test image contains many kinds of typical ground objects in urban areas (different types of buildings, water, grass, trees, roads, and shadow). The number of samples in the training set and test set for the different objects is shown in Table 2. Among the six land cover classes, building and water cover most of the image. Thus, more training and test samples were selected for building and water than for the other objects. Figure 6 shows the classification results using four different feature extraction methods: spectral bands with L_I, spectral bands with L_G, spectral bands with L_C, and spectral bands with the proposed method (D_C). The classification accuracies (overall accuracy and Kappa coefficient) achieved by the different texture extraction algorithms with their optimal parameters are presented in Table 3. From Table 2, we can draw the following conclusions:

Classification Results
(1) The classification accuracy achieved by local Geary's C is equal to that of local Getis and slightly higher than that of local Moran's I.
(2) The data field method can reduce small patches and thereby improve the classification accuracy of the local Geary's C statistic. However, the overall accuracy increases by only 0.98%.
(3) Window size affects classification accuracy. Because the spatial correlation is calculated using moving windows across the whole image, every method has its own optimal window size.
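The two accuracy measures used in the comparison, overall accuracy and the Kappa coefficient, can both be computed from a confusion matrix. The following is a minimal sketch using their standard definitions; the function name and the 2 × 2 matrix are hypothetical illustrations and are not results from this study.

```python
def overall_accuracy_and_kappa(confusion):
    """confusion[i][j] = number of test samples of reference class i
    that the classifier assigned to class j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: product of row and column totals for each class.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(n)
    ) / (total * total)
    if expected == 1.0:
        # Degenerate case: all samples fall in a single class.
        return observed, 1.0
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix with 100 test samples.
oa, kappa = overall_accuracy_and_kappa([[45, 5], [10, 40]])
print(oa, kappa)  # overall accuracy 0.85, Kappa approximately 0.7
```

Kappa discounts the agreement expected by chance, so it is lower than the overall accuracy whenever some agreement would arise from the class proportions alone.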

CONCLUSIONS
In this paper, a data field-based method is introduced to filter and reduce the patches caused by the Geary's C statistic. The data field is inspired by the short-range nuclear force field in physics, and the potential value in the data field serves as a measure of the grey-scale changes in remotely sensed images. Compared with related methods, experimental results show that images can be classified effectively and efficiently using the new technique. A possible future research direction is to study the physical properties of the data field method, such as the superposition principle, and to extend its application to other domains.