Remote Sensing Image Classification of GeoEye-1 High-Resolution Satellite

Networks play the role of a high-level language in artificial intelligence and statistics, because they are used to build complex models from simple components. In recent years, Bayesian Networks, one kind of probabilistic network, have become a powerful data mining technique for handling uncertainty in complex domains. In this paper, we apply Bayesian Networks Augmented Naive Bayes (BAN) to the texture classification of high-resolution satellite images and propose a new method for constructing the network topology structure in terms of the training accuracy obtained on the training samples. The experiments use GeoEye-1 satellite images. Experimental results demonstrate that BAN outperforms the Naive Bayes Classifier (NBC) in overall classification accuracy. Although the method is time-consuming, it promises to be an attractive and effective one in the future.


INTRODUCTION
Image classification still has a long way to go, although it has been studied for almost half a century. Researchers have achieved a great deal in this domain, but a large gap remains between theory and practice. Image classification is therefore an interesting, open area and a bottleneck problem for photogrammetry and remote sensing. New methods from the artificial intelligence domain can, however, be absorbed into image classification so that the strengths of each field offset the weaknesses of the other, which opens up new prospects. Consequently, this paper applies one such method, Bayesian networks (Friedman, N., 1997), to the image classification domain. In general, Bayesian networks represent the joint probability distribution and domain (or expert) knowledge in a compact way, and they provide a comprehensive method of representing relationships and influences among nodes (or feature variables) with a graphical diagram. By virtue of these advantages, we explore a new approach to the texture classification of high-resolution satellite images, with the aim of advancing the automation and intelligence of photogrammetry and remote sensing.

In 1988, Pearl et al. proposed the concept of Bayesian Networks, a powerful tool for inference under conditions of uncertainty. In the beginning, however, Bayesian Networks were not considered as classifiers, until it was discovered that the Naive Bayesian Network, a very simple kind of Bayesian Network that assumes the features are independent given the class attribute (node), is surprisingly effective (Langley, P, 1992). From then on, some researchers started to explore Bayesian Networks as classifiers more deeply. The "naive" independence assumption of the Naive Bayesian Network does not hold in many cases, so some researchers wondered whether performance would improve if the "strong and unrealistic" independence assumption among features were relaxed (Yu Xin, 2005; D. Heckerman, 1995). This paper therefore proposes a new method for constructing the topology structure of Bayesian Network Augmented Naive Bayes (BAN). BAN addresses the aforementioned assumption because it allows arbitrary relations (arcs) among features, and these relations can be obtained in terms of the training accuracy on the training data (samples). In addition, to validate the feasibility and effectiveness of BAN, we apply it to the texture classification of high-resolution satellite images.

This paper is organized as follows. In section 2, we review some basic concepts of Bayesian Networks Augmented Naive Bayes (BAN), and in section 3 we introduce the mathematical model of BAN in detail. In section 4, we test BAN on high-resolution satellite images (GeoEye-1). Finally, section 5 describes the experiments and draws some conclusions.

BAYESIAN NETWORKS FOR TEXTURE CLASSIFICATION OF HIGH-RESOLUTION SATELLITE IMAGES
In this section, we briefly introduce some basic concepts of Bayesian Networks and then apply them to the texture classification of high-resolution satellite images.

Bayesian Networks
This contribution has been peer-reviewed. doi:10.5194/isprsarchives-XL-4-325-2014

A Bayesian Network is one kind of effective inference method in artificial intelligence and expert systems. In a Bayesian network, the nodes represent the variables and the arcs represent probabilistic relationships among the connected variables. In Figure 1, when an arc points from node $X_i$ to node $X_j$, node $X_i$ is the parent node of node $X_j$ and, conversely, node $X_j$ is the child node of node $X_i$; as a rule, we denote the parent set of node $X_i$ by $P_a(X_i)$.
Accordingly, we can compute the joint probability of all the variables based on the definition of Bayesian Networks.
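As a reminder of what this definition implies, the joint probability of a Bayesian network factorizes over the parent sets. The block below writes this standard factorization with $P_a(X_i)$ denoting the parent set of node $X_i$; the symbol $n$ for the number of nodes is an assumption introduced here for illustration.

```latex
\[
P(X_1, X_2, \ldots, X_n) \;=\; \prod_{i=1}^{n} P\bigl(X_i \mid P_a(X_i)\bigr)
\]
```

A node without parents contributes its marginal probability $P(X_i)$ to the product.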
In the classification domain, a Bayesian Network is usually called a Bayesian Network Classifier (Cheng, J., 1999). To relax the independence condition of the Naive Bayes Classifier (NBC), some researchers proposed the Tree Augmented Naive Bayes Classifier (TAN), which extends Naive Bayes by allowing the feature nodes to form a tree-like topology structure. However, a tree topology structure cannot always express the inherent relations among features either. Therefore, this paper applies the Bayesian Network Augmented Naive Bayes (BAN) classifier, which extends the TAN classifier by allowing the features to form an arbitrary graph rather than just a tree (Friedman, N., 1997).

Bayesian Network learning includes two steps: topology structure learning and parameter learning (Cheng, J., 2001). Learning the topology structure means finding the relationships among the features, and parameter learning means estimating the parameters of the assumed probability density (distribution) from training samples with known class labels. Structure learning is more difficult than parameter learning (D. Heckerman, 1997) and has remained an open problem since Bayesian Networks were proposed. Many methods have therefore been proposed for learning a Bayesian Network from data, and they fall into two kinds. One is the scoring-based learning algorithm (Jiebo Luo, 2005), which finds the structure that maximizes a Bayesian, MDL or Kullback-Leibler (KL) entropy scoring function (D. Heckerman, 1995); the other is the CI-based algorithm (Yu Xin, 2007), which relies on conditional independence tests such as the Chi-squared test and the mutual information test.

In this paper, we propose a new method that acquires the topology structure of BAN in terms of the training accuracy on the training samples. First, a certain topology structure is given as the initial structure and the relevant parameters are estimated from the training samples. The training samples are then treated as test samples and classified, and the resulting overall accuracy is called the training accuracy. Next, all possible network topology structures are searched and the corresponding training accuracy is obtained for each. The structure with the maximum training accuracy is considered the best one for fitting the training samples (YU XIN, 2008).
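The search described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that "all possible structures" means every acyclic set of arcs among the features (the arcs from the class node to each feature are implicit), and the names `is_acyclic`, `all_feature_structures` and `best_structure`, as well as the `training_accuracy` callback, are hypothetical.

```python
from itertools import combinations, product

def is_acyclic(n_nodes, arcs):
    """Return True if the directed graph given as (parent, child) arcs has no cycle."""
    children = {i: [] for i in range(n_nodes)}
    indegree = [0] * n_nodes
    for parent, child in arcs:
        children[parent].append(child)
        indegree[child] += 1
    # Kahn's algorithm: repeatedly remove nodes that have no incoming arcs.
    queue = [i for i in range(n_nodes) if indegree[i] == 0]
    removed = 0
    while queue:
        node = queue.pop()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return removed == n_nodes

def all_feature_structures(n_features):
    """Yield every acyclic set of arcs among the features."""
    pairs = list(combinations(range(n_features), 2))
    # Each unordered pair is either unconnected (0), i -> j (1), or j -> i (2).
    for states in product((0, 1, 2), repeat=len(pairs)):
        arcs = []
        for (i, j), state in zip(pairs, states):
            if state == 1:
                arcs.append((i, j))
            elif state == 2:
                arcs.append((j, i))
        if is_acyclic(n_features, arcs):
            yield tuple(arcs)

def best_structure(n_features, training_accuracy):
    """Return the structure with the highest training accuracy, and that accuracy."""
    best_arcs, best_acc = None, -1.0
    for arcs in all_feature_structures(n_features):
        acc = training_accuracy(arcs)  # hypothetical callback: fit, then classify the training set
        if acc > best_acc:
            best_arcs, best_acc = arcs, acc
    return best_arcs, best_acc
```

Under this assumption the search space grows very quickly: there are 25 acyclic structures for three features, 543 for four, and 1,138,779,265 for seven, which is consistent with the observation that the exhaustive search is time-consuming.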

Mathematic Model and Inference of BAN
Suppose $X_s$ is an arbitrary feature in a BAN and, for the sake of simplicity, denote the parent set of $X_s$ by $X_p$ (i.e. $P_a(X_s)$). As usual, a capital letter such as $X$ denotes a random variable, and a particular value of that random variable is denoted by the corresponding lower-case letter, in this case $x$.
Commonly, it is reasonable to assume that $X_p \sim N(\mu_p, D_{pp})$. The joint-normal probability density can then be specified in the standard way (Cui, Xizhang, 2001).
According to Bayes' rule, we can compute the conditional probability of $x_s$ given $x_p$ from the corresponding conditional density multiplied by $dX_s$, where $dX_s$ denotes an extremely small step and is regarded as a constant in the computation. Thus, based on formulas (1) and (9), we can get the joint probability of all variables from the following formula.

$$P(X_1, X_2, \ldots, X_n, C_i) = P(C_i) \prod_{s} P(X_s \mid X_p, C_i)$$

Here $C_i$ denotes the class label (variable) and $i = 1, \ldots, m$ indexes the relevant class.
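The conditional density used in this step follows from a standard property of the joint normal distribution. As a sketch, assume that $X_s$ and $X_p$ are jointly normal with mean $(\mu_s, \mu_p)$ and a covariance matrix partitioned into the blocks $D_{ss}$, $D_{sp}$, $D_{ps}$ and $D_{pp}$; these symbol names are assumptions chosen for illustration and may differ from the numbered formulas of the paper.

```latex
\[
X_s \mid X_p = x_p \;\sim\; N\!\left(\mu_{s|p},\, D_{s|p}\right),
\qquad
\mu_{s|p} = \mu_s + D_{sp} D_{pp}^{-1} (x_p - \mu_p),
\qquad
D_{s|p} = D_{ss} - D_{sp} D_{pp}^{-1} D_{ps}
\]
\[
P(x_s \mid x_p) \;\approx\; f(x_s \mid x_p)\, dX_s
= \frac{1}{\sqrt{2\pi D_{s|p}}}
  \exp\!\left(-\frac{(x_s - \mu_{s|p})^2}{2 D_{s|p}}\right) dX_s
\]
```

Because $dX_s$ is the same constant for every class, it does not change which class attains the largest posterior probability.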

The Classification Scheme

Experimental area
GeoEye-1, launched in September 2008, is the latest in a series of commercial high-resolution Earth observation satellites. With a ground sample distance (GSD) of 0.41 m for the panchromatic band, GeoEye-1 offers the highest resolution yet available to the spatial information industry. For commercial users, however, image products are down-sampled to 0.5 m GSD (Clive S. Fraser, 2009). To validate the feasibility of applying BAN to the texture classification of high-resolution satellite images, the experiments use a GeoEye-1 image of Beijing, China, acquired in 2010. The image is 1906 × 1816 pixels in size and covers an area of about 3.5 square kilometres. It contains five classes, namely houses, roads, grass, hills and rivers, as shown in Figure 2.

Experimental results
The classification accuracy is calculated from the confusion matrix, which contains information about the correct classifications and misclassifications of all classes. To evaluate the effectiveness of BAN, classification results were computed with both BAN and the Naive Bayes Classifier (NBC) and compared in terms of overall classification accuracy. The experimental results are shown in Table 1.
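For reference, the overall classification accuracy is the sum of the diagonal of the confusion matrix divided by the total number of samples. The sketch below illustrates this with made-up labels; the function names and the integer class coding (0 to `n_classes - 1`) are assumptions for illustration, not part of the paper.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Rows index the true class, columns index the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(true_labels, predicted_labels):
        cm[t, p] += 1
    return cm

def overall_accuracy(cm):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    return np.trace(cm) / cm.sum()

# Illustrative labels only (not data from the experiment).
cm = confusion_matrix([0, 0, 1, 1, 2], [0, 1, 1, 1, 2], n_classes=3)
accuracy = overall_accuracy(cm)
```

The off-diagonal entry `cm[0, 1]` counts samples of class 0 that were assigned to class 1, which is how the matrix records misclassifications between specific class pairs.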

CONCLUSIONS AND FUTURE WORK
A Bayesian Network is a directed acyclic graphical model. In this paper, Bayesian Network Augmented Naive Bayes (BAN) is used for the texture classification of a high-resolution satellite image (GeoEye-1). Experimental results show that BAN outperforms NBC. However, searching all possible network topology structures takes a great deal of time. In addition, seven kinds of texture features are extracted from each classification unit, and the appropriate window size of that unit for high-resolution satellite images needs further study.

Figure 1. An Example of BAN Applied in the Classification (nodes $C$ and $X_1$ to $X_7$)
Figure 1 is an example of BAN applied in the classification domain. The node $C$ denotes the class label, and $X_1, \ldots, X_6$ and $X_7$ denote the texture features that are extracted from the image classification unit. Thus, according to the above definition of Bayesian networks, we can compute the following probabilities.
a new $(n+1)$-dimensional normal random vector $X$ with corresponding mean vector $X$

Texture Extraction and Description
Texture features have long been very important in the image classification domain, and many approaches have been proposed so far. Usually, these methods can be divided into two types: statistical texture features, such as the skewness statistic ($X_1$), the information entropy ($X_2$), and the inverse difference moment based on the gray-level co-occurrence matrix ($X_3$); and structural (transform-based) texture features, such as the mean of the LL sub-image ($X_4$) and the standard deviations of the LH sub-image ($X_5$) and HL sub-image ($X_6$) at the first decomposition level of the Symlets wavelet transform (Yang S, 2002), together with the fractal feature ($X_7$).
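Three of these features can be computed directly from the gray levels of a classification window. The sketch below is illustrative only: it assumes the window is a 2-D array of non-negative integer gray levels, uses a single horizontal neighbour offset for the co-occurrence matrix, and leaves out the wavelet and fractal features; the function names and the offset choice are assumptions, not the paper's settings.

```python
import numpy as np

def skewness(window):
    """Third standardized moment of the gray levels (0 for a constant window)."""
    v = window.astype(float).ravel()
    centered = v - v.mean()
    std = centered.std()
    return 0.0 if std == 0 else float(np.mean(centered ** 3) / std ** 3)

def entropy(window, levels=256):
    """Shannon entropy (in bits) of the gray-level histogram."""
    hist = np.bincount(window.ravel(), minlength=levels).astype(float)
    p = hist[hist > 0] / hist.sum()
    return float(-np.sum(p * np.log2(p)))

def glcm(window, levels, dy=0, dx=1):
    """Normalized gray-level co-occurrence matrix for the offset (dy, dx) >= 0."""
    rows, cols = window.shape
    ref = window[:rows - dy, :cols - dx]   # reference pixels
    nbr = window[dy:, dx:]                 # their neighbours at the given offset
    m = np.zeros((levels, levels), dtype=float)
    np.add.at(m, (ref.ravel(), nbr.ravel()), 1.0)  # count each gray-level pair
    return m / m.sum()

def inverse_difference_moment(p):
    """Sum of p(i, j) / (1 + (i - j)^2) over the co-occurrence matrix."""
    i, j = np.indices(p.shape)
    return float(np.sum(p / (1.0 + (i - j) ** 2)))
```

A window of constant gray level gives an inverse difference moment of 1 and an entropy of 0, while strongly alternating gray levels push the inverse difference moment down, which is why the feature is read as a measure of local homogeneity.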

The complete scheme for applying BAN to the texture classification of high-resolution satellite images is summarized below (YU XIN, 2008).
① Training and testing samples of each class are chosen from the whole image;
② Seven kinds of texture features are extracted from each classification unit;
③ Some network topology structure is arbitrarily selected as the initial structure, and the parameters $\tilde{\mu}_c$ and $\tilde{D}_{cc}$ of each class (in formula (6)) are estimated from the training samples by formulas (7) and (8). The training samples are then regarded as testing samples and classified, which gives the initial training accuracy;
④ All possible network topology structures are searched and the relevant parameters are learned from the training samples, which gives the training accuracy corresponding to each topology structure;
⑤ The topology structure with the best training accuracy is singled out as the final training result;
⑥ In terms of the training result, the posterior probability of a new sample $X$ (with unknown class label) is computed;
⑦ Statistical analysis of the classification results.
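Parameter estimation and the posterior computation for one fixed topology structure might look like the following sketch. It is not the paper's implementation: it assumes that, within each class, the features are jointly normal, that a structure is given as one tuple of parent feature indices per feature (the class node is an implicit parent of every feature), and that a sample is assigned to the class with the largest posterior; all function names are hypothetical.

```python
import numpy as np

def fit_class_params(X, y):
    """Estimate the prior, mean vector and covariance matrix of each class."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[int(c)] = (len(Xc) / len(X), Xc.mean(axis=0), np.cov(Xc, rowvar=False))
    return params

def log_conditional(x, s, parents, mean, cov):
    """Log density of feature s given its parent features, under joint normality."""
    mu, var = mean[s], cov[s, s]
    if parents:
        p = list(parents)
        d_sp = cov[s, p]                              # covariance of s with its parents
        w = np.linalg.solve(cov[np.ix_(p, p)], d_sp)  # D_pp^{-1} D_ps
        mu = mu + w @ (x[p] - mean[p])                # conditional mean
        var = var - w @ d_sp                          # conditional variance
    return -0.5 * (np.log(2.0 * np.pi * var) + (x[s] - mu) ** 2 / var)

def log_posteriors(x, params, parent_sets):
    """Unnormalized log posterior of every class for one sample x."""
    return {
        c: np.log(prior) + sum(log_conditional(x, s, ps, mean, cov)
                               for s, ps in enumerate(parent_sets))
        for c, (prior, mean, cov) in params.items()
    }

def classify(x, params, parent_sets):
    """Assign x to the class with the largest posterior (an assumed decision rule)."""
    scores = log_posteriors(x, params, parent_sets)
    return max(scores, key=scores.get)

def training_accuracy(X, y, parent_sets):
    """Fit on the training samples, then classify those same samples."""
    params = fit_class_params(X, y)
    predictions = np.array([classify(x, params, parent_sets) for x in X])
    return float(np.mean(predictions == y))
```

With every parent set empty, for example `[(), (), ()]` for three features, the sketch reduces to a Gaussian Naive Bayes classifier, so the same code can serve as the NBC baseline. The constant step $dX_s$ is omitted because it is identical for all classes.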

Figure 3. The Topology Structure of BAN (nodes $C$ and $X_1$ to $X_7$)

Figure 4. Classification Image Based on the BAN

Table 1. The Comparison Results of Two Methods
Table 1 compares the accuracy of the two methods for different numbers of training samples, where N denotes the number of training samples of the five classes. The best mean overall classification accuracy is 86.2% (BAN). As expected, BAN gives better classification results than NBC. Figure 3 shows the network topology structure of BAN when the training accuracy is best and the number of training samples is 200. Figure 4 shows the classification image based on BAN.