DECISION TREE REPOSITORY AND RULE SET BASED MINJIANG RIVER ESTUARINE WETLANDS CLASSIFICATION

Increasing urbanization and industrialization have led to wetland losses in the estuarine area of the Minjiang River over the past three decades. Increasing attention has been given to producing wetland inventories using remote sensing and GIS technology. Because training sites and training samples are inconsistent, traditional pixel-based image classification methods cannot achieve comparable results across different organizations. Object-oriented image classification shows great potential to solve this problem, and Landsat moderate-resolution remote sensing images are widely used to fulfill this requirement. First, standardized atmospheric correction and spectrally high-fidelity texture feature enhancement were conducted before implementing the object-oriented wetland classification method in eCognition. Second, we performed multi-scale segmentation, taking the scale, hue, shape, compactness and smoothness of the image into account to obtain appropriate parameters; using a bottom-up region-merging algorithm starting from the single-pixel level, the optimal segmentation scale for different types of features was determined. Then, the segmented objects were used as classification units to calculate spectral information such as the mean, maximum, minimum, brightness and normalized values. Spatial features of the image objects (area, length, compactness and shape index) and texture features (mean, variance and entropy) were used as classification features of the training samples. Based on the reference images and the field sampling points, typical training samples were selected uniformly and randomly for each type of ground object. The value ranges of the spectral, texture and spatial characteristics of each feature type in each feature layer were used to create the decision tree repository.
Finally, with the help of high-resolution reference images, random sampling was used to conduct the field investigation; the classification achieved an overall accuracy of 90.31% and a Kappa coefficient of 0.88. The classification method based on decision tree threshold values and the rule set developed from the repository outperforms the traditional methodology. Our decision tree repository and rule set based object-oriented classification technique is an effective method for producing comparable and consistent wetland data sets.


INTRODUCTION
Satellite images are one of the most common data sources for wetland monitoring and mapping. Traditional wetland mapping relies primarily on visual interpretation and pixel-based classification. Because training sites and training samples are inconsistent, supervised classification is inefficient and inaccurate at identifying shallow and small water bodies, which play a vital role in improving wetland mapping accuracy, and it yields data sets that are not comparable across organizations. Shallow and small water bodies are often confused with mountain shadows, building shadows, or wet vegetation of low cover density. The problems of "same object with different spectra" and "same spectrum with different objects" are serious, leading to misclassification and erroneous results (Iryna et al., 2011).
Object-oriented classification requires high-quality wetland training sites and samples. It can take full advantage of the shape and texture information provided by satellite data and thereby improve wetland classification accuracy. It also requires a reasonable threshold value for each characteristic parameter of each wetland type in order to improve classification accuracy effectively. These requirements provide the motivation for this paper.
In the first section, this paper uses standardized, atmospherically corrected and spectrally high-fidelity texture-enhanced Landsat 7 ETM+ images as the data source for wetland training sites and training samples in the study area. A series of comparative experiments was conducted to find the best segmentation scale. The second section is mainly concerned with characteristic parameter selection. The Brightness, Greenness and Wetness layers, together with the NDVI, MNDWI, DVI and NDBI spectral index layers, the texture feature layers, and the original image bands, were stacked into a single file for processing in eCognition.
The repository was designed as a collection of metadata, classification rules, tasks, and threshold value generators, as described in Section 4. Based on the sample data in the knowledge base, this study took the wetland type as the target variable and each feature parameter as a test variable. The decision tree model was used to mine each feature parameter of the remote sensing data to obtain a more standardized threshold, and a threshold-based classification rule set was built up step by step from the classification results.
In the results and discussion section, small water bodies and wetland vegetation are extracted from the vegetation class at the scale-1 layer. On this layer, a short-wave infrared value ≦ 16 is used to extract small water bodies from the vegetation layer. An accuracy assessment, application scenarios, and a discussion of applying these rules and values to Landsat 8 OLI image classification in the study area are also given. This is followed by a concluding section.

Study area
The Min River is the largest river in Fujian Province, south-eastern China, that flows independently into the sea (the East China Sea). The river's total length is about 349.21 miles, and its drainage basin covers about 23549.14 square miles, about half of Fujian Province's total area. The lower reach of the Min River runs from the mouth of the Anrenxi in Minqing County to Changmen in Lianjiang County; the channel is about 71 miles long and generally 0.25 to 1.25 miles wide, and the riverbed slope is below 0.01%. The study area covers the tidal channel at the mouth of the Min River, from Shangjie in Minhou County to Changmen in Lianjiang County (Figure 1). Here the main stream of the Min River is 36.67 miles long and flows through Cangshan District, Gulou District, Taijiang District, Mawei District, Changle City and Lianjiang County. At Shangjie the Min River divides into the Beigang and Nangang branches. The Beigang (or Tai River) flows on the north side through Fuzhou city to Maweigang, while the Nangang (or Wulong River) flows around the south of Nantai Island, receives the Dazhang River, and passes through the Xiadou gorge to Mawei. After the confluence of the north and south branches, the river turns north-east and passes through the Minan Canyon. The width of the river narrows from 0.6 to 0.38 miles, and at Tingjian Town, Lianjiang County, it separates into two waterways: the south waterway flows along the south side of Langqi Island and enters the East China Sea, while the north branch flows past Guantou and Changmen and passes through the Wuzhu, Tangdou and Hujiang waterways to enter the East China Sea.
In addition, the tides in this area are regular semi-diurnal tides, with a tidal range of 4.1 m (at Guantou).

Satellite data
The satellite data used in this paper comprise two scenes, acquired in 1999 and 2017 by the Landsat Enhanced Thematic Mapper Plus (ETM+) and the Operational Land Imager (OLI) respectively, and freely available from the United States Geological Survey (Table 1). The scenes have more than 6 multispectral bands with a spatial resolution of 30 m × 30 m and one panchromatic band with a spatial resolution of 15 m × 15 m.
Landsat 8 OLI includes all the bands of ETM+, and the band characteristics are basically the same as those of Landsat 7 ETM+, which allows the two data sets to maintain high comparability and consistency (Xu et al., 2013). To avoid atmospheric absorption features, the OLI bands were re-adjusted. The largest adjustment is to OLI Band 5 (0.845-0.885 µm), which excludes the water vapor absorption feature at 0.825 µm. The OLI panchromatic band (Band 8) is also narrower, which makes it possible to better distinguish vegetated from non-vegetated features on panchromatic images. In addition, there are two new bands: the blue band (Band 1; 0.433-0.453 µm), mainly applied to coastal observation, and a short-wave infrared band (Band 9; 1.360-1.390 µm), which includes a strong water vapor absorption feature and can be used for cloud detection (Table 1).
Before the experiment, the FLAASH model was used to radiometrically calibrate the images and apply atmospheric correction to obtain more accurate surface reflectance information. ENVI 5.1 software was then used to crop the study area. Finally, the images were fused with the Pansharp2 algorithm in PCI software (Tan et al., 2007); the fused image data have both high spatial resolution and rich multispectral information. This pre-processing produced the basic classification data for the experiment.
Table 1. Characteristics of the Landsat Enhanced Thematic Mapper Plus (ETM+) and Operational Land Imager (OLI) scenes used in the study.

Wetland sampling data and auxiliary data
Two types of sampling data were employed as the initial input of the repository in this study. The first came from the historical wetlands inventory and field sampling data (Figure 2).

METHODOLOGY
The image analysis methodology for wetland classification in the current study is depicted in Figure 3. The object-oriented classification method comprehensively considers the spectral and spatial characteristics of image objects; using different segmentation algorithms, adjacent pixels in the image are grouped into meaningful regions.

Image pre-processing
FLAASH atmospheric and radiometric correction was applied in ENVI 5.1 to remove atmospheric noise from the satellite images and obtain more accurate surface reflectance values. The images were then fused with the Pansharp2 algorithm in PCI software (Tan et al., 2007); the fused image data have both high spatial resolution and rich multispectral information. These two pre-processing steps produced the basic classification data for the experiment. The tasseled cap transformation, also known as the K-T transformation, was used to derive the Brightness, Greenness and Wetness components. Brightness represents the total radiation energy level of the target object; Greenness reflects the contrast between the visible and near-infrared bands and represents vegetation growth; Wetness reflects moisture characteristics. These components are independent of each other and can be used to separate vegetation, moisture and soil features.

Segmentation method
Multi-scale image segmentation is the basis of object-oriented classification. It is the process of grouping adjacent pixels with similar features, such as brightness, color and texture, into "objects". The choice of segmentation scale is very important: it directly affects the number, size and shape of the generated object polygons and the accuracy of the classification information (Frohn et al., 2011). Multi-scale segmentation is a commonly used segmentation algorithm in eCognition software. It is a bottom-up region-merging method: a threshold is set for the image objects being segmented, and objects are then merged according to image features such as type, color and texture until that threshold is reached. Because different geographic entities have different scales, it is very important to choose different segmentation scales for different features. This is the key to multi-scale segmentation (Ming et al., 2008).
Because the ground features in the study area are complex, two segmentation scales, 50 and 2, were finally determined through multiple tests. The shape parameter was set to 0.2, so the corresponding spectral (color) parameter was 0.8; the compactness parameter was set to 0.5, so the corresponding smoothness parameter was 0.5. The purpose of using two scales is to extract features layer by layer in order to improve classification accuracy. At the scale of 50, vegetation, deep water, bare sandbars and buildings were mainly extracted; the vegetation objects obtained at scale 50 were then segmented further to extract shallow water bodies, vegetation adjacent to water, and other small target features. In short, the objects segmented with the above parameters have clear boundary contours and high internal homogeneity, and are well separable and representative.
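The relationship between the paired weights (shape 0.2 and color 0.8, compactness 0.5 and smoothness 0.5) can be illustrated with a minimal sketch. It follows the general form in which multiresolution segmentation criteria are commonly described and is not a reproduction of eCognition's internals; the function names, the heterogeneity inputs and the squared-scale stopping rule are assumptions made for illustration.

```python
def fusion_cost(h_color, h_compact, h_smooth, shape=0.2, compactness=0.5):
    """Weighted heterogeneity increase for merging two candidate objects.

    h_color, h_compact and h_smooth are hypothetical, precomputed increases
    in spectral, compactness and smoothness heterogeneity.
    """
    color = 1.0 - shape              # spectral weight, 0.8 for shape = 0.2
    smoothness = 1.0 - compactness   # smoothness weight, 0.5 here
    h_shape = compactness * h_compact + smoothness * h_smooth
    return color * h_color + shape * h_shape


def should_merge(cost, scale):
    # Assumption: merging is allowed while the cost stays below scale squared.
    return cost < scale ** 2


cost = fusion_cost(10.0, 4.0, 2.0)
print(cost, should_merge(cost, 50), should_merge(cost, 2))
```

Under this reading, a larger scale tolerates more heterogeneity per object and so yields larger objects, which is why the coarse layer is used for broad classes and the fine layer for small targets.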

Object-oriented classification
In the classification process, each image object is finally interpreted according to the rule set and the threshold values of its features, using the decision tree algorithm and the classification repository developed in this study.

Accuracy verification
Accuracy verification in this paper is based on training samples and high-resolution images, which are used to check the classification results. The indicators used include the confusion matrix, overall classification accuracy, Kappa coefficient, commission error, omission error, and the producer's and user's accuracy of each type of ground object.
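The indicators listed above can all be derived from the confusion matrix. The following is a minimal sketch; the two-class matrix is made-up illustrative data, not the result reported in this paper, and the row/column convention (rows are reference classes, columns are mapped classes) is an assumption.

```python
def accuracy_report(matrix):
    """matrix[i][j] = samples of reference class i assigned to mapped class j."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    diag = sum(matrix[i][i] for i in range(n))
    row_tot = [sum(matrix[i]) for i in range(n)]
    col_tot = [sum(matrix[i][j] for i in range(n)) for j in range(n)]

    overall = diag / total
    # Chance agreement expected from the marginal totals.
    expected = sum(row_tot[i] * col_tot[i] for i in range(n)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    producer = [matrix[i][i] / row_tot[i] for i in range(n)]  # 1 - omission error
    user = [matrix[i][i] / col_tot[i] for i in range(n)]      # 1 - commission error
    return overall, kappa, producer, user


# Hypothetical matrix for two classes (e.g. water, non-water).
overall, kappa, producer, user = accuracy_report([[45, 5], [10, 40]])
print(overall, kappa, producer, user)
```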

REPOSITORY CONSTRUCTION
In this paper, the repository was designed as a collection of classification rules, tasks (segmentation, classification) and processes; it includes feature characteristic concepts, characteristic parameters, wetland type definitions, the classification scheme, and the spatial relationships among categories. For example, a nearest-neighbor topological relation rule was used to overcome the difficulty of distinguishing wetland vegetation from non-wetland vegetation when medium-resolution satellite data are used for land use and land cover classification. The repository was also used to derive threshold values for feature characteristic parameters such as greenness, NDVI and NDWI, which are vital to classification accuracy.
The sample library and spectral library also indicated an urgent need to construct the classification repository, which contains the components described below.

Definition and Classification scheme
Due to the inherent high complexity of the wetland ecosystem, there is currently no unified wetland definition and classification scheme.

Characteristic Parameters
The feature characteristic parameters of image objects derived from the repository include terrain parameters, spectral parameters, and shape and texture parameters.

Spectral characteristic parameters:
The spectrum is the most direct and important interpretation element for identifying image objects. The spectral parameters include the mean values of the multispectral layers of the original image and the brightness, greenness and wetness components obtained from the tasseled cap transformation. In the formulas below, $\rho$ denotes the reflectance of the named band. NDVI combines the near-infrared and red bands of the image to represent vegetation information (Zhang et al., 2014). Its formula is:
$$\mathrm{NDVI} = \frac{\rho_{NIR} - \rho_{Red}}{\rho_{NIR} + \rho_{Red}}$$
The NDVI range is [-1, 1], and NDVI reflects vegetation coverage.
The MNDWI, the modified normalized difference water index, is the normalized ratio of the green and mid-infrared bands of the image. Its formula is:
$$\mathrm{MNDWI} = \frac{\rho_{Green} - \rho_{MIR}}{\rho_{Green} + \rho_{MIR}}$$
The MNDWI range is [-1, 1]. The MNDWI is used to extract water and can better distinguish between shadows and water bodies (Xu, 2005).
The SAVI accounts for changes in the optical characteristics of the background and corrects the sensitivity of NDVI to the soil background (Xu, 2010). Its formula is:
$$\mathrm{SAVI} = \frac{(\rho_{NIR} - \rho_{Red})(1 + L)}{\rho_{NIR} + \rho_{Red} + L}$$
Compared with NDVI, SAVI adds the soil adjustment coefficient L, which is determined according to the actual situation and ranges from 0 to 1.
The NBR is the normalized ratio of the near-infrared and mid-infrared bands of the image (Wang, 2013). Its formula is:
$$\mathrm{NBR} = \frac{\rho_{NIR} - \rho_{MIR}}{\rho_{NIR} + \rho_{MIR}}$$
NBR highlights conditions before and after vegetation change and can improve the extraction accuracy of exposed surfaces caused by changes in the amount of vegetation.
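The four indices described above can be computed per pixel or per object mean with a few lines of code. This is a minimal sketch using the standard definitions of these indices; the function names are illustrative, the default L = 0.5 is a commonly used value rather than one stated in this paper, and which Landsat band serves as the "mid-infrared" band for MNDWI and NBR is left to the caller.

```python
def _norm_diff(a, b):
    """Normalized difference (a - b) / (a + b), returning 0.0 for a zero sum."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(nir, red):
    return _norm_diff(nir, red)


def mndwi(green, mir):
    return _norm_diff(green, mir)


def nbr(nir, mir):
    return _norm_diff(nir, mir)


def savi(nir, red, L=0.5):
    # L is the soil adjustment coefficient (0 to 1); 0.5 is an assumed default.
    return (nir - red) * (1 + L) / (nir + red + L)


# Hypothetical reflectance values for a vegetated object.
print(ndvi(0.5, 0.1), mndwi(0.1, 0.02), savi(0.5, 0.1), nbr(0.5, 0.2))
```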

Shape and texture feature parameters:
Texture features can greatly improve classification accuracy. For describing texture, the grey level co-occurrence matrix (GLCM) is the most commonly used and widely recognized classical statistical method (Bie et al., 2009). The shape index was used to distinguish objects with different shapes and to further extract wetland types from similar image objects (Wang et al., 2008). The geometric and texture features of objects help resolve the problems of "same object with different spectra" and "different objects with the same spectrum", which cannot be solved with spectral parameters alone.
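To make the GLCM texture measures concrete, the sketch below builds a symmetric, normalized co-occurrence matrix for one pixel offset and derives the mean, variance and entropy from it. It is an illustration under simple assumptions (a small grey-level image given as nested lists, a single horizontal offset) and is not the implementation used by eCognition.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized grey level co-occurrence matrix for one offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("image too small for the chosen offset")
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Return (mean, variance, entropy) of a normalized GLCM."""
    levels = len(p)
    cells = [(i, p[i][j]) for i in range(levels) for j in range(levels)]
    mean = sum(i * v for i, v in cells)
    variance = sum((i - mean) ** 2 * v for i, v in cells)
    entropy = -sum(v * math.log(v) for _, v in cells if v > 0)
    return mean, variance, entropy


# Hypothetical 2 x 2 image with two grey levels.
print(glcm_features(glcm([[0, 0], [1, 1]], levels=2)))
```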

Spatial relational feature and its parameter:
The distributions of wetlands and water bodies are related through spatial containment, spatial similarity and spatial adjacency. Extracting water bodies is therefore important for extracting wetlands accurately. This paper mainly uses the inter-class relationship and link relationship algorithms provided by eCognition to extract wetland vegetation around water. First, all water bodies need to be extracted accurately; spectral feature parameters can then be used to distinguish wetland vegetation from non-wetland vegetation among the image objects. Class-related features and linked-object features refer to the classification results of other objects in the image object hierarchy. Using these feature parameters, neighboring ground objects of similar type can be related effectively, which greatly improves the classification accuracy of wetland vegetation.
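A class-related adjacency feature of this kind can be sketched on a small label grid: for one object, it measures the share of its border that touches objects already classified as water. This is only an illustration of the idea; the grid representation, the 4-neighbor border definition and the choice to count the image edge as border are assumptions, not the definition used inside eCognition.

```python
def relative_border_to(labels, classes, obj_id, target="water"):
    """Share of obj_id's border length that touches objects of class `target`.

    labels:  2-D grid of object ids.
    classes: dict mapping object id to its current class name.
    """
    rows, cols = len(labels), len(labels[0])
    shared = border = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != obj_id:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                r2, c2 = r + dr, c + dc
                if not (0 <= r2 < rows and 0 <= c2 < cols):
                    border += 1  # assumption: the image edge counts as border
                    continue
                other = labels[r2][c2]
                if other == obj_id:
                    continue
                border += 1
                if classes.get(other) == target:
                    shared += 1
    return shared / border if border else 0.0


# Hypothetical scene: object 2 is water, objects 1 and 3 are vegetation.
labels = [[1, 1, 2],
          [1, 1, 2],
          [3, 3, 3]]
classes = {1: "vegetation", 2: "water", 3: "vegetation"}
print(relative_border_to(labels, classes, 1))
```

A vegetation object with a value greater than zero is adjacent to water, which is the condition the text relies on to separate wetland vegetation from other vegetation.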

Training sample
Highly reliable training samples are the key to establishing a decision tree model for wetland classification, as they directly affect the quality of the rule set. The training samples are the wetland extraction results for 1999, based on relevant map data and field wetland survey data for the study area in 1999: wetlands were first extracted from the 1999 ETM+ image using the object-oriented method together with visual interpretation, and then compared with the historical map data. When selecting training samples, the selected objects should be pure, representative and typical of their type, and the number of training samples should be suitably large; in this study the number of samples selected varies from 50 to 300.

Decision Tree
The land cover types in the study area are shallow water, deep water, construction land, non-wetland vegetation, shadow, wetland vegetation (or wet vegetation) and sandbar. When extracting ground features with the object-oriented decision tree method, most scholars first separate water from non-water and then extract other features. However, extracting water this way also extracts many non-water objects such as city shadows and mountain shadows, and removing them afterwards loses many shallow water objects. The elimination approach is therefore a double-edged sword. This study tried a variety of extraction sequences, and repeated experiments showed that the ground objects cannot all be extracted at once; they need to be extracted step by step.
Based on the sample data in the knowledge base, this study took the wetland type as the target variable and each feature parameter as a test variable. The decision tree model was used to mine each feature parameter of the remote sensing data to obtain a more standardized threshold, and a threshold-based classification rule set was built up level by level.
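How a decision tree turns labeled samples into a threshold on one feature can be shown with a single split. The sketch below searches for the cut point that minimizes the weighted Gini impurity; the sample values are invented for illustration, and the paper does not state which splitting criterion its decision tree model uses, so Gini is an assumption.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_threshold(values, labels):
    """Return (threshold, impurity) for the best split `value >= threshold`."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for k in range(1, len(pairs)):
        lo, hi = pairs[k - 1][0], pairs[k][0]
        if lo == hi:
            continue
        t = (lo + hi) / 2  # candidate cut halfway between neighboring samples
        left = [lab for v, lab in pairs if v < t]
        right = [lab for v, lab in pairs if v >= t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical NDVI samples for two classes.
ndvi_values = [0.05, 0.10, 0.20, 0.30, 0.45, 0.60]
sample_types = ["non-vegetation"] * 3 + ["vegetation"] * 3
print(best_threshold(ndvi_values, sample_types))
```

Repeating this search on each child node, with every feature parameter as a candidate, produces the kind of layered threshold rules that the repository stores.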

Rule Set and best classification threshold
The rule set shown in Figure 4, which was validated by the field investigation and the accuracy assessment indexes, is based on the decision tree model. The feature parameters from the decision tree model provide valuable references for image classification. The repository provides the best segmentation scale for each kind of ground feature; the stratified sub-scale method is better at separating fine image objects, such as small water bodies, from their parent image objects, here the vegetation class. At the scale of 5, vegetation, built-up area, deep water and sand beach were extracted, while only at the scale of 1 could the small water bodies be extracted completely.
The threshold of 0.25 was obtained by importing the selected samples into the decision tree model; the feature parameter NDVI ≧ 0.25 is used to divide ground features into vegetation and non-vegetation on the scale-5 layer. If the NDVI threshold is too large, vegetation extraction is incomplete: vegetation mixed with small water bodies is missed, which directly affects the accurate extraction of water bodies. If the threshold is too small, some urban construction areas are extracted as vegetation, and it becomes more difficult to separate these urban areas from vegetation later, because their NDVI range overlaps with that of areas with low vegetation coverage.
The non-vegetation layer was first divided into deep water and other land types using MNDWI ≧ 0. Because the spectral information of water mixed into urban areas overlaps with that of urban shadows, the water extraction threshold is set relatively high. A short-wave infrared band value ≧ 35 is then used to extract sand, and the rest of the land is classified as building land. This building land still contains hill shadows, building shadows and a few water bodies that were not extracted in the first step. MNDWI ≧ -0.08 is used to extract this small part of the water bodies, and 0 ≦ green band ≦ 8 is then used to extract the shadows from the building land. The remainder is construction land.
Then the small water bodies and the wetland vegetation are extracted from the vegetation class at the scale-1 layer. On this layer, a short-wave infrared value ≦ 16 is used to extract small water bodies from vegetation. Water absorbs strongly in the short-wave infrared region, so this method can extract fine water bodies within the vegetation class.
Finally, because vegetation adjacent to water has a special spatial relationship with water bodies, wetland vegetation is extracted only after all water bodies have been extracted. In this study, vegetation adjacent to water is extracted from the vegetation layer using the inter-class relationship feature Rel. border to water (all water bodies) > 0. In this way the wetland vegetation is extracted.
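The sequence of thresholds described above can be written as a small rule cascade. This is a sketch of the logic as stated in the text, not the eCognition rule set itself; the dictionary keys, function names and class labels are hypothetical, and the short-wave infrared and green band values are assumed to be in the same (unspecified) units as the thresholds 35, 16 and 8.

```python
def classify_coarse(obj):
    """Coarse-layer rules; obj holds per-object mean feature values."""
    if obj["ndvi"] >= 0.25:
        return "vegetation"
    if obj["mndwi"] >= 0:
        return "deep water"
    if obj["swir"] >= 35:
        return "sand"
    # Remaining "building land" still contains some water and shadow.
    if obj["mndwi"] >= -0.08:
        return "water"
    if 0 <= obj["green"] <= 8:
        return "shadow"
    return "construction land"


def classify_fine(obj):
    """Fine-layer rules, applied only to sub-objects of the vegetation class."""
    if obj["swir"] <= 16:
        return "small water"
    # Must be evaluated after all water bodies have been extracted.
    if obj["rel_border_to_water"] > 0:
        return "wetland vegetation"
    return "non-wetland vegetation"


# Hypothetical object values.
print(classify_coarse({"ndvi": 0.1, "mndwi": -0.3, "swir": 40, "green": 30}))
print(classify_fine({"swir": 25, "rel_border_to_water": 0.2}))
```

The order of the tests matters: each rule only sees objects that every earlier rule rejected, which mirrors the step-by-step extraction the text argues for.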

Application scenario and accuracy assessment
The rule set and classification thresholds developed from the 1999 ETM+ image were successfully used to classify wetland types in the 2017 OLI image of the study area, as shown in Figure 5. A series of comparative accuracy assessments was conducted using GPS-based sample points collected in the field and samples from high-resolution images; the accuracy assessment of the classification result is shown in Table 3. In the classification results obtained from the decision tree and the rule set, the patches and boundaries are clear. The "salt-and-pepper" effect is reduced, and a large area of water in the study area is extracted, which improves wetland extraction accuracy. According to the accuracy evaluation, the 2017 classification based on the object-oriented decision tree knowledge base and rule set achieves an overall accuracy of 90.31% and a Kappa coefficient of 0.88. The object-oriented method based on the decision tree knowledge base and rule set therefore has high precision in wetland extraction.

Shallow water and the classification accuracy
When the rule set created from the 1999 Landsat ETM+ imagery is applied to the 2017 Landsat 8 OLI image, the classification results show that the rule set can also extract the 2017 wetland land cover and delineate the wetland boundary efficiently. However, automatic wetland extraction based on a unified rule set requires a high degree of similarity in the data. The Landsat data used in this paper have generally the same band spectral characteristics, and both images underwent the same pre-processing.
The classification accuracy of water bodies, vegetation and buildings is the highest, although some small water bodies are fragmented. The classification accuracy of vegetation adjacent to water and of sandbars is relatively low; because vegetation adjacent to water is complex, it is easily confused with the surrounding vegetation. In short, the object-oriented method combining the decision tree knowledge base and rule set obtained satisfactory results for the automatic extraction of wetland information in the study area. However, how to effectively remove the various shadow effects on water bodies remains to be studied further.

CONCLUSION AND DISCUSSION
This paper proposed a classification method based on decision tree threshold values and a rule set developed from the repository; the various characteristics of wetlands in remote sensing images were analyzed and extracted at hierarchical scales. Extracting shallow water at a small scale effectively avoids the influence of some urban shadows, which greatly reduces the workload of manual correction and improves the accuracy and reliability of the classification.

Figure 1. Location of the study area, depicting the river network and the bifurcation and confluence of the lower reaches of the Min River. a) Shangjie; b) Xiadou; c) Minan; d) Changmen; e) Guantou; f) Meihua.

The historical wetlands inventory and field sampling data were collected from the local authority in 1999; the second type of sampling data came from the satellite image classification process. The sampling design for the other training and test sample data was based on object-oriented satellite image information extraction combined with visual interpretation of the 1999 ETM+ images. Wetland maps from high-resolution data, such as Google imagery, were also used as test sample data in the study. Auxiliary data, such as digital topographic data and DEM data, are also required for the study; the elevation and slope data derived from the DEM were used to assist in the extraction of terrain classes.

Figure 2. Different types of wetlands. a) Sand; b) and c) Adjacent water vegetation (shrubs and herbs); d) Water body

Figure 3. Technique flowchart of wetlands classification

Figure 4. The rule set of wetlands classification

Table 2. The definition and classification of wetland types of the lower Minjiang River.