RANDOM FOREST AND OBJECT-BASED CLASSIFICATION FOR FOREST PEST EXTRACTION FROM UAV AERIAL IMAGERY

Forest pests are among the most important factors affecting forest health. However, because it is difficult to delineate infested areas and to predict how an infestation will spread, partial control and extermination measures have not been effective enough so far, and infested areas continue to expand. Spatial information technology is therefore in high demand. Examining the spatial distribution of an infestation is an effective way to establish timely and appropriate control strategies: infested areas are mapped periodically and as early as possible, and the spread of the infestation is predicted. As UAV photography becomes more popular, UAV images have become much cheaper and faster to acquire, and they are well suited to monitoring forest health and detecting pests. This paper proposes a new method to effectively detect forest pests in UAV aerial imagery. We first segment an image into many superpixels and then compute 12-dimensional statistical texture features for each superpixel, which are used to train the classifier and classify the data. Finally, we refine the classification results with some simple rules. The experiments show that the method is effective for extracting forest pest areas from UAV images.

* X. Hu is with the School of Remote Sensing and Information, Wuhan University, Wuhan 430079, China, and also with the Collaborative Innovation Center of Geospatial Technology, Wuhan University, Wuhan 430079, China (e-mail: huxy@whu.edu.cn).


INTRODUCTION
Increasing pest infestation has caused severe damage to large areas of forest in China. It is important to monitor and detect pest areas as early as possible to prevent the pests from spreading. However, traditional methods for detecting pest areas in forests usually rely on laborious and time-consuming field sampling (Lavoie et al., 2005; Pellerin et al., 2008). Because spatial information technology offers advantages such as wide coverage, rapidity, simultaneity, and economy, interest in remote sensing among professionals as well as the general public has greatly increased in recent years. In the field of forest damage caused by disease and insect pests, various analysis methods based on spatial information have been shown to be usable, and this area is being actively studied. More and more remote sensing techniques and algorithms (such as segmentation, clustering, feature expression, and supervised classification) have been applied to the detection of forest pests in recent years (Coops et al., 2006; Heurich et al., 2010; Ortiz et al., 2013). Multi-spectral infrared (IR) imagery derived from high-flying aircraft (Yu et al., 2006; Medlin et al., 2000), commercial satellites, or multi-temporal public datasets such as the ASTER or Landsat Thematic Mapper satellite systems (Franklin et al., 2003; Lawes et al., 2008; Hais et al., 2009; Wu and Zeng, 2008; Su and Ni, 1995) has also been used extensively for conservation and forest restoration monitoring (Milton et al., 2005; Ecker et al., 2008). However, imagery from high-flying aircraft and commercial satellites is relatively expensive or complicated to obtain. The rapid development of unmanned aerial vehicles (UAVs) now offers a cheap and fast way to acquire high-resolution images of forest areas, which are the basis for the subsequent image processing that detects pest areas. The proposed approach uses UAV imagery to extract pest areas rapidly and effectively by means of image processing and analysis algorithms. This paper is organized as follows. Section II describes the proposed method, followed by the pest area extraction results and analysis. Section III concludes this study and identifies some aspects for improvement.

METHOD AND RESULT
The workflow of the proposed approach for forest pest area extraction is shown in Figure 1. The major steps are segmentation of the image, feature extraction from the segmented objects, and training and classification using the random forest classifier. Initially, an image is segmented into superpixels so that descriptors of each superpixel can be computed to form a feature vector for classification. Training data must be selected to represent the variety of pest and non-pest segments. The random forest algorithm is then applied to discriminate pest regions from non-pest regions. The key to the method is that training and classification operate on superpixels, which preserve more texture information than individual pixels.

Image Segmentation
Image segmentation is the process of dividing an image into regions such that each region is nearly homogeneous. Because many pest areas in forest imagery consist of only a few pixels, the segments must not be too large. We therefore segment images into superpixels. There are many approaches to generating superpixels, each with its own advantages and drawbacks that may make it better suited to a particular application. For example, if adherence to image boundaries is of paramount importance, the graph-based method of Felzenszwalb and Huttenlocher (2004) may be an ideal choice. However, if superpixels are to be used to build a graph, a method that produces a more regular lattice, such as that of Shi and Malik (2000), is probably a better choice. While it is difficult to define what constitutes an ideal approach for all applications, we believe the following properties are generally desirable (Achanta et al., 2012): 1. Superpixels should adhere well to image boundaries. 2. When used to reduce computational complexity as a preprocessing step, superpixels should be fast to compute, memory efficient, and simple to use.

Next, in the assignment step, each pixel $i$ is associated with the nearest cluster center whose search region overlaps its location.
Once each pixel has been associated with the nearest cluster center, an update step adjusts the cluster centers to be the mean $[l, a, b, x, y]^T$ vector of all the pixels belonging to the cluster. The $L_2$ norm is used to compute a residual error $E$ between the new cluster center locations and the previous cluster center locations. The assignment and update steps can be repeated iteratively until the error converges. The result of the SLIC segmentation is shown in Fig. 2.
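The assignment and update steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function name `slic`, the compactness weight `m`, the grid interval computed as the square root of (number of pixels / k), and the stopping tolerance `tol` are all introduced here for illustration, the input is assumed to be already converted to CIELAB, and the gradient-based seed perturbation is omitted.

```python
import math

def slic(pixels, k, m=10.0, max_iter=10, tol=1e-3):
    """pixels: H x W nested list of (l, a, b) tuples (assumed CIELAB).
    Returns (labels, centers); names and defaults are illustrative."""
    h, w = len(pixels), len(pixels[0])
    step = max(1, int(math.sqrt(h * w / k)))  # assumed grid interval S
    # Initialization: sample centers [l, a, b, x, y] on a regular grid.
    centers = []
    for y in range(step // 2, h, step):
        for x in range(step // 2, w, step):
            l, a, b = pixels[y][x]
            centers.append([l, a, b, float(x), float(y)])
    labels = [[-1] * w for _ in range(h)]
    for _ in range(max_iter):
        dist = [[math.inf] * w for _ in range(h)]
        # Assignment: each center searches only a window around itself.
        for ci, (cl, ca, cb, cx, cy) in enumerate(centers):
            for y in range(max(0, int(cy) - step), min(h, int(cy) + step + 1)):
                for x in range(max(0, int(cx) - step), min(w, int(cx) + step + 1)):
                    l, a, b = pixels[y][x]
                    dc2 = (l - cl) ** 2 + (a - ca) ** 2 + (b - cb) ** 2
                    ds2 = (x - cx) ** 2 + (y - cy) ** 2
                    d = math.sqrt(dc2 + ds2 * (m / step) ** 2)
                    if d < dist[y][x]:
                        dist[y][x] = d
                        labels[y][x] = ci
        # Update: move each center to the mean [l, a, b, x, y] of its pixels.
        sums = [[0.0] * 5 for _ in centers]
        counts = [0] * len(centers)
        for y in range(h):
            for x in range(w):
                ci = labels[y][x]
                if ci < 0:
                    continue
                l, a, b = pixels[y][x]
                for j, v in enumerate((l, a, b, x, y)):
                    sums[ci][j] += v
                counts[ci] += 1
        error = 0.0
        for ci, c in enumerate(centers):
            if counts[ci] == 0:
                continue
            new = [s / counts[ci] for s in sums[ci]]
            # Residual: L2 distance between old and new center locations.
            error += math.hypot(new[3] - c[3], new[4] - c[4])
            centers[ci] = new
        if error < tol:
            break
    return labels, centers
```

Restricting each center's search to a local window is what keeps the clustering roughly linear in the number of pixels; a production pipeline would more likely rely on an existing, optimized SLIC implementation.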

Feature Extraction
The extraction of texture features for each superpixel is shown in Fig. 3. Our experiments demonstrate that the texture information expressed by these simple statistical values of the RGB channels can distinguish forest pest areas from other areas. The features are also cheap to compute, so the algorithm is able to process a large number of images.
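As a rough illustration of such a descriptor, the sketch below builds a 12-dimensional vector from the RGB values of one superpixel. The text does not list the twelve statistics, so the choice here (mean, standard deviation, minimum, and maximum for each of the three channels) is purely an assumption, as is the function name `superpixel_features`.

```python
from statistics import mean, pstdev

def superpixel_features(pixels):
    """pixels: list of (r, g, b) tuples belonging to one superpixel.
    Returns 12 values: [mean, std, min, max] for R, then G, then B.
    The choice of these four statistics is an assumption for illustration."""
    features = []
    for channel in zip(*pixels):  # yields the R values, then G, then B
        features.extend([mean(channel), pstdev(channel),
                         min(channel), max(channel)])
    return features
```

For example, `superpixel_features([(10, 20, 30), (20, 20, 50)])` yields one 12-element vector that can be used directly as a row of the training matrix.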

Training and Classification by Random Forest
We train a random forest model on the 12-dimensional features computed from the training data and use the model to classify the test images from the same 12-dimensional features extracted from them.
Random Forest is an ensemble learning method for classification that constructs a multitude of decision trees at training time and outputs the class that is the mode of the classes predicted by the individual trees. It was proposed by Ho (1995) and developed further by Breiman (2001). Each decision tree of the RF is trained on a bootstrapped sample of the original training data.
At each node of every decision tree, the best split is chosen from among a randomly selected subset of the input parameters and used for node splitting (Liaw and Wiener, 2002). Each tree uses only a portion of the input samples (typically about two-thirds) for training, while the remaining roughly one-third (referred to as out-of-bag (OOB) samples) are used to validate the accuracy of the prediction. In general, RF increases the diversity among the decision trees by randomly resampling the data with replacement and by randomly changing the parameter subsets for node splitting at each node of every decision tree. Random forest has been widely applied in tracking (Lepetit and Fua, 2006) and object recognition tasks (Moosmann et al., 2006; Winn and Criminisi, 2006). We choose random forest because it is invariant to monotonic transformations of the input variables and robust to outlying observations. Moreover, as noted by Winn and Shotton (2006), it is much faster in training and testing than traditional classifiers (such as an SVM). Finally, it also enables different cues to be "effortlessly combined".
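The two-thirds / one-third proportions quoted above follow from bootstrap sampling. As a brief worked check, assuming each tree draws $n$ samples with replacement from a training set of size $n$ (the usual bootstrap setting; the symbol $n$ is introduced here for illustration):

```latex
P(\text{sample } i \text{ is out-of-bag})
  = \left(1 - \frac{1}{n}\right)^{n}
  \;\xrightarrow[n \to \infty]{}\; e^{-1} \approx 0.368,
\qquad
P(\text{sample } i \text{ is in-bag}) \approx 1 - e^{-1} \approx 0.632 .
```

Each tree therefore sees roughly 63% of the distinct training samples, and about 37% remain available for OOB validation, which is consistent with the rule of thumb above.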
The main steps of Random Forest are described in Table 2 (Du et al., 2015). It is a combination of tree predictors in which decision trees are constructed using resampling with replacement; the inducer randomly samples the attributes and chooses the best split among those variables rather than the best split among all attributes. The class label of an unknown instance is assigned by majority voting. Owing to important advantages such as the ability to handle a very large number of input attributes and a low time cost, Random Forest has attracted wide interest among researchers in remote sensing image classification (Waske and Braun, 2009; Gislason et al., 2004; Qi et al., 2012; Samat et al., 2014).
Input: DTI (a decision tree inducer), T (the number of iterations), S (the training set), r (the sampling ratio), N (the number of attributes used in each tree)
Train: for t = 1 to T
    Get sample $S_t$ from S with replacement using r;
    Build classifier $M_t$ using the inducer, which randomly samples N of the attributes and chooses the best split.
Classification: a new instance is classified by the classifiers $M_t$ ($t = 1, \dots, T$), and the final label is obtained by majority vote.
Table 2. Algorithmic steps of Random Forest.
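The steps in Table 2 can be sketched with the Python standard library alone. This is an illustrative sketch rather than the classifier used in the experiments: the function names, the Gini impurity split criterion, the depth limit, and the default of using the square root of the attribute count are assumptions, and the attribute subset is redrawn at every node, following the description in the text. Class labels are assumed not to be tuples, since tuples mark internal nodes here.

```python
import random
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, y, n_attrs, rng, depth=0, max_depth=5):
    # Leaf: pure node or depth limit reached, return the majority label.
    if depth == max_depth or len(set(y)) == 1:
        return Counter(y).most_common(1)[0][0]
    best = None
    for j in rng.sample(range(len(X[0])), n_attrs):  # random attribute subset
        for thr in sorted(set(row[j] for row in X))[:-1]:
            left = [i for i, row in enumerate(X) if row[j] <= thr]
            right = [i for i, row in enumerate(X) if row[j] > thr]
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, j, thr, left, right)
    if best is None:  # no attribute in the subset can split this node
        return Counter(y).most_common(1)[0][0]
    _, j, thr, left, right = best
    grow = lambda idx: build_tree([X[i] for i in idx], [y[i] for i in idx],
                                  n_attrs, rng, depth + 1, max_depth)
    return (j, thr, grow(left), grow(right))

def predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are tuples
        j, thr, left, right = node
        node = left if row[j] <= thr else right
    return node

def train_forest(X, y, T=25, r=1.0, n_attrs=None, seed=0):
    rng = random.Random(seed)
    n_attrs = n_attrs or max(1, int(len(X[0]) ** 0.5))
    forest = []
    for _ in range(T):
        # Sample S_t from S with replacement, using sampling ratio r.
        idx = [rng.randrange(len(X)) for _ in range(max(1, int(r * len(X))))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 n_attrs, rng))
    return forest

def predict_forest(forest, row):
    # Majority vote over the T classifiers.
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]
```

In this setting, `X` would hold one 12-dimensional feature vector per training superpixel and `y` the corresponding pest / non-pest labels.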

Experimental Results and Analysis
Some examples of pest extraction can be seen in Figure 4, which illustrates the intermediate segmentation results and the final outputs of the proposed method. The final outputs are refined from the random forest classification results using the rule that a pixel whose red value is smaller than its green value or its blue value does not belong to a pest area. The experimental results show that the proposed approach can effectively detect all of the visually salient pest areas in the UAV forest imagery. The overall advantages of our method are that 1) it does not require multitemporal images and 2) it can process images quickly because the program takes superpixels as input and calculates simple but effective texture features.
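The refinement rule stated above (a pixel whose red value is smaller than its green or blue value is not a pest pixel) can be written as a short post-processing step. The sketch below assumes the classification result is available as a per-pixel Boolean mask; the function name `refine_pest_mask` and the nested-list data layout are hypothetical.

```python
def refine_pest_mask(image, mask):
    """image: H x W nested list of (r, g, b) tuples.
    mask: H x W nested list of bools (True = classified as pest).
    Returns a new mask in which pixels with r < g or r < b are cleared."""
    return [[is_pest and r >= g and r >= b
             for (r, g, b), is_pest in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]
```

Because the rule only clears pixels and never adds them, it can reduce false positives but cannot recover pest areas missed by the classifier.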

CONCLUSION
In this study, a random forest and object-based classification method for forest pest detection in UAV aerial imagery is proposed. A 12-dimensional descriptor is extracted from the pixels of each superpixel and used as the input vector to build the random forest classifier. A variety of images were tested, and the results show the effectiveness of the proposed method. The developed method provides a general framework for detecting pest areas in a wide variety of UAV images by sampling, feature extraction, training, and classification.
The results demonstrate the potential of UAV aerial imagery for monitoring forest pest infestation. The presented approach has a strong economic advantage over the traditional ground-based forest pest detection workflow, since it dramatically reduces time and financial cost. Because of the limited amount of data, we trained our model on only tens of images and have not tested the proposed algorithm under different conditions. Future work includes training and testing on larger datasets with different lighting conditions and employing a wider range of features. More complex probabilistic models (such as conditional random fields) could also be incorporated into the algorithm to take semantic information into account.

Figure 1. Workflow of the proposed approach.

3. When used for segmentation purposes, superpixels should both increase the speed and improve the quality of the results. Taking the above factors into consideration, our proposed method performs image segmentation with the simple linear iterative clustering (SLIC) algorithm (Achanta et al., 2012). SLIC clusters pixels in the combined five-dimensional (5-D) space of color and image plane to efficiently generate compact, nearly uniform superpixels. The zero-parameter version of the SLIC algorithm is used, which chooses an adaptive compactness factor (Achanta et al., 2012). SLIC is simple to use and understand. By default, the only parameter of the algorithm is $k$, the desired number of approximately equally sized superpixels. For color images in the CIELAB color space, the clustering procedure begins with an initialization step where $k$ initial cluster centers $C_i = [l_i, a_i, b_i, x_i, y_i]^T$ are sampled on a regular grid. The centers are moved to seed locations corresponding to the lowest gradient position in a $3 \times 3$ neighborhood.

Figure 2. Superpixels segmented by SLIC: the image (top) and the patches (bottom) extracted from the upper image.

Figure 3. Extraction of features.

Figure 4. Examples of pest extraction by our method. The top images are the test images, the middle images show the corresponding superpixels segmented by SLIC, and the bottom images are the forest pest extraction results of our approach; the purple areas are pest areas.
The entire algorithm is summarized in Table 1.
/* Initialization */
[...]
for each cluster center $C_k$ do
    Compute the distance $D$ between $C_k$ and $i$.
[...]
until $E \le$ threshold