PURIFICATION OF TRAINING SAMPLES BASED ON SPECTRAL FEATURE AND SUPERPIXEL SEGMENTATION

Remote sensing image classification is an effective way to extract information from large volumes of high-spatial-resolution remote sensing images. Generally, supervised image classification relies on abundant and high-precision training data, which are often manually interpreted by human experts to provide ground truth for training and evaluating the classifier. Remote sensing enterprises have accumulated many manually interpreted products from earlier, lower-spatial-resolution remote sensing images in the course of their routine research and business programs. However, these manually interpreted products may not match very high resolution (VHR) images properly because of differences in acquisition date or spatial resolution between the two data sets, or because of their small coverage area, which hinders their suitability for training classification models. We face similar problems in our laboratory at 21st Century Aerospace Technology Co. Ltd (21AT for short). In this work, we propose a method to purify the interpreted product so that it matches newly available VHR image data and provides the best training data for supervised image classifiers in VHR image classification. The results indicate that the proposed method can efficiently purify the input data for future machine learning use.


INTRODUCTION
In recent decades, great progress has been made in developing and launching satellites, which makes it easy to access high spatial resolution remote sensing images. Fine spatial optical sensors with metric or sub-metric resolution, such as QuickBird, Ikonos, Worldview, as well as the Chinese Beijing series, allow more detailed and accurate information extraction. Data from these sensors enable advanced applications, such as urban mapping, precision agriculture, environmental monitoring, and military applications (Verpoorter et al. 2014 and Guan et al. 2017). Among these applications, image classification is one of the most vital phases of remote sensing image information extraction. Generally, supervised image classification relies on abundant and high-precision training data, which are often manually interpreted by human experts to provide ground truth for training and evaluating the classifier (Huang et al. 2015). In addition, in most cases, labelling training samples takes a great deal of time, and the labelled samples no longer apply when the images change. In this sense, providing exhaustive ground truth for large remote sensing images is often not possible. Hence, there is an urgent demand for a time-saving and accurate sample labelling framework.
In this context, researchers have proposed semi-supervised classification to deal with insufficient training samples by taking the unlabelled samples into consideration (Camps-Valls et al. 2007). Meanwhile, active learning, which aims to minimize the cost of the training sample labelling process, has received increasing attention in recent years (Demir et al. 2011, Di et al. 2012, Patra et al. 2014, Persello et al. 2011 and Persello et al. 2012). Both semi-supervised classification and active learning can work with few training samples; however, they also require the development of specific classifiers. Thus, semi-supervised classification and active learning are well suited to scientific studies in which training samples are limited. However, for actual production projects in remote sensing enterprises, it seems unrealistic to develop a new classification algorithm. Taking advantage of existing data and traditional classifiers is vitally important for the timely and efficient completion of such production projects. In the current literature, there exist studies exploiting crowdsourced OpenStreetMap (OSM) data as training samples for high-resolution remote sensing classification, where OSM data seem to be a time-saving and cost-effective way to provide labelled data for image classification (Arsanjani et al. 2013 and Johnson et al. 2016). However, due to the unprofessional production process and the absence of data quality control, OSM data could contain misleading errors. To the best of our knowledge, few papers have discussed insufficient labelling in actual production projects.
To date, remote sensing enterprises have accumulated many manually interpreted products from earlier, lower-spatial-resolution remote sensing images in the course of their routine research and business programs. These interpreted products describe the land cover by assigning each patch a class label, while each patch may contain multiple land cover classes. Inspired by the exploitation of OSM data as training samples in classification, our work attempts to purify the interpreted products so that they are competent as training samples for remote sensing classification (Verpoorter et al. 2016). Owing to actual production needs, the interpreted products may contain misleading errors where a patch contains multiple classes. Moreover, these manually interpreted products may not match the VHR image properly because of differences in acquisition date or spatial resolution between the two data sets, or because of their small coverage area, which hinders their suitability for training classification models. We face similar problems in our laboratory at 21st Century Aerospace Technology Co. Ltd (21AT for short).
In this work, we propose a method to purify the interpreted product so that it matches newly available VHR data and provides the best training data for supervised image classifiers in VHR classification. Specifically, interpreted products are processed based on spectral features and superpixel segmentation to generate training samples. The remainder of this paper is organized as follows. Section II describes the study sites and data sets. Section III presents the methodology of our work. Section IV describes the experimental results and analysis. Finally, Section V concludes our work.

DATASETS
As shown in Figure 1, a high-resolution RGB image acquired by the Beijing-2 satellite over Haidian District in Beijing, China, is used in the experiment. The RGB image was pre-processed by stitching and colour balancing according to the specific requirements of the actual production project. Beijing-2 is a satellite constellation of three satellites, each carrying a 1 m resolution panchromatic sensor and a 4 m resolution multispectral sensor. More details of the Beijing-2 satellite parameters are listed in Table 1. This satellite constellation has been operated by 21AT since July 10, 2015 (Wen et al. 2017). Twenty First Century Aerospace Technology Co., Ltd. is a Beijing-based high-tech enterprise and is the first commercial Earth observation satellite operator and service provider in China. The satellite constellation can provide daily targeting capability anywhere on Earth. The interpreted product was manually interpreted from previously acquired coarser-resolution images at 21AT. The interpreted product of Haidian District is shown in Figure 2.

... samples for high resolution image classification. In order to reduce the misleading mistakes introduced by the coarse interpreted product, it is crucial to match the interpreted product accurately with the high resolution image to be classified. Therefore, our work proposes a purification framework, which exploits the interpreted product to generate training samples for high resolution classification, as illustrated in Figure 3. It includes the following two steps:
(1) Superpixel segmentation and decision fusion between the interpreted product and the high resolution remote sensing image.
(2) Reassignment of the class label of each pixel in the interpreted product according to three designed spectral analysis indexes.

Superpixel segmentation
To relieve the boundary offset between the interpreted product and the high spatial resolution image, our study employs superpixel segmentation to delineate the boundaries according to the high spatial resolution image. Firstly, a superpixel segmentation algorithm is applied to the RGB image, dividing the image into a series of superpixels. In this work, simple linear iterative clustering (SLIC) superpixels (Achanta et al. 2010) are employed. Suppose an image is to be divided into k superpixels; pixels are then clustered according to their colour similarity and proximity in the image plane. Two parameters, the number of desired superpixels and the weighting factor between colour and spatial differences, need to be defined when applying the SLIC segmentation to the image. Subsequently, the class boundary of the interpreted product is refined according to the superpixel segmentation image by decision fusion, which is realized by majority voting.

Dark vegetation shows spectral characteristics similar to those of water in the Beijing-2 image in our study, so it is important to ensure the attribute correctness of the labels that are to be used as training samples for classification. To exclude misleading labels between dark vegetation and water, the VWD is presented to verify that the class in the interpreted product matches the VHR image.
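The superpixel-based decision fusion described above can be illustrated with a minimal sketch. It assumes that a superpixel map has already been computed (for instance with an SLIC implementation such as the `slic` function in scikit-image, which is not called here), and the function name `fuse_labels_by_majority`, the toy class names, and the tie-breaking behaviour are hypothetical choices for illustration rather than details given in this paper.

```python
from collections import Counter, defaultdict


def fuse_labels_by_majority(labels, segments):
    """Give every pixel the most frequent interpreted label of its superpixel.

    labels   -- 2D list of class labels taken from the interpreted product
    segments -- 2D list of superpixel ids with the same shape as `labels`
    """
    # Count how often each class label occurs inside each superpixel.
    votes = defaultdict(Counter)
    for label_row, segment_row in zip(labels, segments):
        for label, segment in zip(label_row, segment_row):
            votes[segment][label] += 1

    # Pick the winning label per superpixel. The paper does not state a
    # tie rule; this sketch simply takes whatever most_common returns first.
    winner = {seg: counts.most_common(1)[0][0] for seg, counts in votes.items()}

    # Write the winning label back to every pixel of the superpixel.
    return [[winner[seg] for seg in segment_row] for segment_row in segments]


# Toy example: one "vegetation" pixel sits inside a superpixel that is
# otherwise labelled "water", which mimics a small boundary offset.
labels = [
    ["water", "water", "vegetation"],
    ["water", "vegetation", "vegetation"],
]
segments = [
    [0, 0, 1],
    [0, 0, 1],
]
fused = fuse_labels_by_majority(labels, segments)
```

In this toy case the stray pixel in superpixel 0 is relabelled to the majority class, so the refined class boundary follows the superpixel boundary derived from the image. For full-size images a vectorized implementation would be preferable.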
(3) Aquatic Plants Index (API)

API = 2 × G − (R + B)    (4)

where R, G, and B denote the red, green, and blue band values of a pixel. In the interpreted product, water surfaces covered with aquatic plants may be marked as water. In fact, aquatic plants belong to vegetation according to their spectral characteristics. Therefore, it is necessary to filter out the aquatic plants from water to ensure the best separability of vegetation and water.
It is found in our study that the API can effectively discriminate aquatic plants from water. By applying the above index-based label refinement to the interpreted product, labels with uncertainty are corrected to the more appropriate class.
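As a rough illustration of the index-based refinement, the following sketch evaluates API = 2 × G − (R + B) per pixel and moves water-labelled pixels with a high index to vegetation. The threshold value, the direction of the comparison, the class names, and the function names are assumptions made for this example; the paper does not report the decision rule or threshold it uses.

```python
def aquatic_plants_index(r, g, b):
    """Aquatic Plants Index of one pixel: API = 2*G - (R + B)."""
    return 2 * g - (r + b)


def relabel_aquatic_plants(pixels, labels, threshold=0):
    """Move water-labelled pixels with API above `threshold` to vegetation.

    pixels    -- list of (R, G, B) tuples
    labels    -- list of class labels from the interpreted product
    threshold -- hypothetical cut-off; it would need tuning on real data
    """
    refined = []
    for (r, g, b), label in zip(pixels, labels):
        # Assumption: a green-dominated pixel labelled as water is
        # treated as aquatic plants and therefore as vegetation.
        if label == "water" and aquatic_plants_index(r, g, b) > threshold:
            refined.append("vegetation")
        else:
            refined.append(label)
    return refined


# Toy example with made-up digital numbers.
pixels = [(40, 60, 80), (50, 120, 40), (60, 110, 50)]
labels = ["water", "water", "vegetation"]
refined = relabel_aquatic_plants(pixels, labels)
```

Only the second pixel changes class here: it is labelled as water but is strongly green, whereas the first pixel has an index of zero and keeps its water label.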

[Figure 4 panels: Region 1, Region 2, Region 3, each showing the interpreted product and the purified labels]

EXPERIMENTAL RESULTS AND ANALYSIS
Three test images are used to validate the proposed automatic training sample labelling framework. In the experiments, the labels purified by the proposed method are compared with the manually labelled interpreted product in terms of intra-class purity and inter-class separability. The test regions with the corresponding interpreted product and purified labels are displayed in Figure 4. As can be seen from Figure 4, the class labels are more accurate after purification and are therefore better qualified to serve as training samples for pixel-based classification. Water surfaces covered with aquatic plants are all corrected to vegetation. In addition, vegetation around buildings, which is confused with built-up areas in the interpreted product, is excluded from the built-up areas. Besides, bare soils that result from the difference in image acquisition time are adjusted to match the Beijing-2 image.

The quantitative evaluation of the experimental results follows. In our study, intra-class variance is adopted to describe the intra-class purity, as variance is an important indicator of data dispersion. The larger the variance, the greater the spread of the data. In other words, if the variance decreases after purification, the intra-class purity has increased. The intra-class purity for the interpreted product and the purified labels is listed in Table 2. It is evident that intra-class purity increased significantly after purification for all three test regions.

Besides, the Jeffries-Matusita (JM) distance and the transformed divergence are calculated to measure the inter-class separability. These values range from 0 to 2.0 and indicate how well the selected training samples are statistically separated. Larger values of the JM distance and transformed divergence represent higher inter-class separability. Table 3, Table 4 and Table 5 present the inter-class separability of the interpreted product and the purified labels for the three test regions. As can be seen from these tables, the inter-class separability of the interpreted product is rather poor, with the JM distance and transformed divergence for vegetation versus built-up areas and vegetation versus water lower than 1.0. However, after purification, the inter-class separability is improved significantly, as all the values are raised above 1. Moreover, the transformed divergence for water-building is even increased to 1.99, which implies that these two classes are statistically separable.
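The evaluation measures above can be sketched for a single spectral band as follows. This is a simplified illustration under stated assumptions: each class is modelled as a one-dimensional Gaussian with non-zero variance, whereas an evaluation on RGB data would normally use the multivariate forms with covariance matrices. The function names and the sample values are hypothetical and are not taken from Tables 2 to 5.

```python
import math
from statistics import fmean, pvariance


def band_stats(values):
    """Mean and population variance of one class in one band."""
    return fmean(values), pvariance(values)


def jm_distance(sample_a, sample_b):
    """Jeffries-Matusita distance (0 to 2) for two 1-D Gaussian classes."""
    mean_a, var_a = band_stats(sample_a)
    mean_b, var_b = band_stats(sample_b)
    var_sum = var_a + var_b
    # Bhattacharyya distance between two univariate normal distributions.
    bhattacharyya = (
        0.25 * (mean_a - mean_b) ** 2 / var_sum
        + 0.5 * math.log(var_sum / (2.0 * math.sqrt(var_a * var_b)))
    )
    return 2.0 * (1.0 - math.exp(-bhattacharyya))


def transformed_divergence(sample_a, sample_b):
    """Transformed divergence (0 to 2) for two 1-D Gaussian classes."""
    mean_a, var_a = band_stats(sample_a)
    mean_b, var_b = band_stats(sample_b)
    divergence = (
        0.5 * (var_a - var_b) * (1.0 / var_b - 1.0 / var_a)
        + 0.5 * (1.0 / var_a + 1.0 / var_b) * (mean_a - mean_b) ** 2
    )
    return 2.0 * (1.0 - math.exp(-divergence / 8.0))


# Made-up green-band values: a "water" class before purification (one
# aquatic-plant pixel mixed in) and after purification, plus vegetation.
water_before = [30.0, 32.0, 31.0, 29.0, 80.0]
water_after = [30.0, 32.0, 31.0, 29.0]
vegetation = [80.0, 85.0, 78.0, 82.0]

variance_before = pvariance(water_before)
variance_after = pvariance(water_after)
jm_before = jm_distance(water_before, vegetation)
jm_after = jm_distance(water_after, vegetation)
td_after = transformed_divergence(water_after, vegetation)
```

With these toy values, removing the mislabelled pixel lowers the intra-class variance of water and raises its JM distance to vegetation, which mirrors the direction of change reported for the purified labels.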

CONCLUSIONS
In this paper, a method that exploits an interpreted product and a remote sensing image to generate training samples is proposed. To make the interpreted product pure enough to serve as training samples for high resolution image classification, superpixel-based decision fusion and index-based label refinement are applied successively. To verify the effectiveness of the proposed purification framework, three test images and their corresponding interpreted products are used to generate training samples for high resolution image classification. Meanwhile, intra-class purity and inter-class separability are employed to evaluate the quality of the purified training samples. The experimental results illustrate the superiority of the purified labels over the original interpreted product in terms of both these quantitative measures and visual interpretation. Further research lies in the use of the purified training samples for high resolution image classification.

Figure 1. Pre-processed Beijing-2 image over Haidian District with test regions marked as A, B, and C.

In this study, three test regions are selected to validate the proposed framework for the purification of the interpreted product according to remote sensing images, with the sub-images covering study areas of about 4.2 km×3.1 km, 4.4 km×3.9

Figure 2. The corresponding interpreted product of Haidian District

... the class label for each pixel of the interpreted product, and s the superpixel to which the pixel belongs. This step aims to eliminate misleading pixels, which arise from boundary offsets of the same object between the interpreted product and the remote sensing image.

Figure 4. Test regions and the corresponding interpreted product and purified labels map

Table 1. Parameters of Beijing-2 satellite

..., which contains three classes: vegetation, built-up area, and water.

Table 2. The intra-class purity for the interpreted product and purified labels of three test images

Table 5. The inter-class separability for Region 3