MULTI-TEMPORAL CLASSIFICATION AND CHANGE DETECTION USING UAV IMAGES

In this paper, different methodologies for the classification and change detection of UAV image blocks are explored. The UAV is not only the cheapest platform for image acquisition but also the easiest platform to operate in repeated data collections over a changing area such as a building construction site. Two change detection techniques have been evaluated in this study: the pre-classification and the post-classification algorithms. These methods are based on three main steps: feature extraction, classification and change detection. A set of state-of-the-art features has been used in the tests: colour features (HSV), textural features (GLCM) and 3D geometric features. For classification, a Conditional Random Field (CRF) has been used: the unary potential was determined using the Random Forest algorithm, while the pairwise potential was defined by the fully connected CRF. In the performed tests, different feature configurations and settings have been considered to assess the performance of these methods in such a challenging task. Experimental results showed that the post-classification approach outperforms the pre-classification change detection method. This was assessed using the overall accuracy: post-classification reaches an accuracy of up to 62.6%, whereas pre-classification change detection reaches 46.5%. These results represent a first useful indication for future works and developments.


INTRODUCTION
The increasing rate of urban growth in recent years has immensely transformed urban landscapes all over the world. Urban growth leads to congestion of the immediate surroundings and causes adverse effects, including pollution and other processes that directly or indirectly contribute to global warming (Laidley, 2016). Because of this concern, change detection studies of urban systems have become an integral part of the urban and regional planning domains (Xu, Vosselman, & Oude Elberink, 2015). Change detection is one of the important image analysis techniques, as it provides information about how an area has changed over a specific time interval. It is mainly used for monitoring and controlling land cover and land use changes, for city management and for updating the geographical information of a given area (Liu et al., 2003). The acquisition of high resolution satellite images is currently the most common way to approach change detection projects. However, satellite images suffer from some limitations: there is no direct control over their quality, they are affected by weather conditions and, last but not least, they are not flexible in terms of resolution and acquisition time. New technologies such as Unmanned Aerial Vehicles (UAVs) could therefore have a big impact on the development of change detection techniques thanks to their flexibility in data acquisition. UAVs have become a common platform in the geomatics field for the acquisition of very high resolution images (Nex & Remondino, 2014), and have proven to be suitable for urban change detection down to the building level (Qin, 2014). Compared with traditional airborne sensors, UAVs offer advantages such as the possibility of acquiring data over a small area at an affordable cost, and they require lower costs for staff, as explained by Xuan (2011), and for hardware (though they require a certified pilot in most countries).
Most monitoring activities require data to be captured repeatedly in order to obtain multi-temporal information. This kind of data can be easily generated using UAVs. These platforms can easily deliver updated images of an area in rapid development (e.g. where construction takes place every day). The multi-temporal integration of these images can be used to monitor the progress of the site through a change detection analysis. However, the manual generation of such change maps is time consuming and not compatible with practical needs. An automated approach for change detection using UAV images is therefore necessary. In this regard, this paper presents the first tests performed on this task. The classification and change detection presented here use the DSM and orthophoto from different epochs as input. Conventional post-classification and pre-classification change detection techniques, as well as well-known state-of-the-art features, have been considered in order to better understand the difficulties of this challenging task. The Conditional Random Field (CRF) model, which is known for its ability to smooth classification results, has been used for classification. The unary potential of the CRF was defined using a supervised Random Forest (RF) classifier trained to distinguish four classes. A Fully Connected CRF was then used to define the pairwise potential of the CRF.
The dataset used in this research was collected at very high resolution (5 cm Ground Sampling Distance) over a building construction site. Eight different epochs have been considered. The DSM and orthophoto were generated using the Pix4D software and had already been registered following the procedure proposed by Aicardi et al. (2016).

Change detection
In the literature, many change detection methods have been reported over the last three decades (Afify, 2011; Frauman and Wolff, 2006). Algebra change detection is a pixel-based method in which changes are detected pixel by pixel. Despite its simplicity, algebra change detection has some limitations, such as the difficulty of defining "from-to" class changes and the need for a careful threshold selection. On the other hand, classification-based change detection involves some kind of classification, either of the separate images or of a combination of images. Among the classification-based methods, post-classification change detection is one of the most commonly used. Post-classification consists of classifying the images captured in different epochs, overlaying the classified images and analysing the class changes from one epoch to another (El-Hattab, 2016). Post-classification can be supervised or unsupervised, depending on the availability of reference data for the area to be analysed (H. Liu & Zhou, 2010). Supervised change detection has the advantage of providing both qualitative (change map) and quantitative (change statistics) information for the analysed images, but unsupervised change detection is sometimes preferred because of the lack of data to be used as base information or reference (Ghosh, Mishra, & Ghosh, 2011; Leichtle, Geiß, Lakes, & Taubenböck, 2017). Although post-classification change detection suffers from the propagation of errors from the classification output, it is still the method used by many researchers (Afify, 2011; Liu and Zhou, 2010; Wu et al., 2017). Post-classification change detection techniques have the advantage of providing "from-to" change information, i.e. from which class to which class a pixel has changed. This information can then be presented in a change matrix showing what has changed between two dates (Théau, 2012). Post-classification requires sufficient training samples during the training of the classifier in order to achieve a good classification accuracy. The final accuracy of the change detection depends on the accuracy of the classified images used as input (Lu, Mausel, Brondízio, & Moran, 2004). The pre-classification change detection method also belongs to the classification-based techniques. It consists of analysing the changes between the features (Peiman, 2011), followed by the classification of those changes. Frauman and Wolff (2006) explain that the quality of the output of the pre-classification change detection technique depends mostly on the quality of the images used as input.
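The overlay step of post-classification change detection and the resulting change matrix can be illustrated with a minimal Python sketch. The class order, the function names and the two tiny label maps are hypothetical and serve only as an illustration; this is not the implementation used in the paper.

```python
import numpy as np

# Hypothetical class order (illustration only).
CLASSES = ["asphalt", "building", "vegetation", "bare soil"]

def from_to_map(labels_t1, labels_t2, n_classes):
    """Encode each pixel's (from, to) class pair as a single integer code."""
    return labels_t1 * n_classes + labels_t2

def change_matrix(labels_t1, labels_t2, n_classes):
    """Count pixels per (from, to) pair; rows = first epoch, columns = second epoch."""
    codes = from_to_map(labels_t1, labels_t2, n_classes).ravel()
    counts = np.bincount(codes, minlength=n_classes * n_classes)
    return counts.reshape(n_classes, n_classes)

# Two made-up classified epochs (2 x 3 pixels); values index into CLASSES.
epoch_a = np.array([[0, 0, 3],
                    [2, 3, 3]])
epoch_b = np.array([[0, 1, 1],
                    [2, 3, 1]])

matrix = change_matrix(epoch_a, epoch_b, len(CLASSES))
changed = epoch_a != epoch_b  # boolean mask of pixels whose class changed
```

The diagonal of `matrix` counts unchanged pixels, while each off-diagonal entry counts one "from-to" transition; with four classes there are 16 possible entries.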

Conditional Random Field
The conditional random field is a popular classification/segmentation technique that takes contextual information into consideration with the aim of producing better classification results (Li & Yang, 2016). The lack of contextual information in the classification process often leads to noisy classified images. The conditional random field can be divided into two parts: (i) the unary potential and (ii) the pairwise potential. The unary potential is the term that represents the relationship between the pixel label and the observed data, while the pairwise potential is the term that defines the relationship between the pixel label, its neighbours and the observed data. The general CRF can be defined as:
$$E(\mathbf{x}) = \sum_{i} \phi_i(x_i) + \sum_{i<j} \phi_{ij}(x_i, x_j) \quad (1)$$
where the unary potential is given by $\phi_i(x_i)$ and the pairwise potential is defined by $\phi_{ij}(x_i, x_j)$.

Unary Potential.
The unary potential is the term that provides the relationship between the label of the pixel and its observation data. It can be computed for each pixel and defines the probability of a label being assigned to a particular pixel. The unary potential is usually reported in the form of a negative log likelihood (see Equation 2), as it represents the conditional probability density that is used to minimize the function:
$$\phi_i(x_i) = -\log P(x_i \mid \mathbf{y}) \quad (2)$$
where $P(x_i \mid \mathbf{y})$ is the probability of label $x_i$ given the observed data $\mathbf{y}$.
In this research the unary potential was defined using Random Forest, as it is regarded as a robust classifier and gave good classification results in a similar application in Yang and Förstner (2011). Random Forest is able to handle large datasets with a high computational load and still produce good results (Chehata et al., 2009; Sun et al., 2017). In different studies this algorithm has been compared with other classifiers, such as SVM and maximum likelihood, to mention just a few, and was found to perform better (Feng, Liu, & Gong, 2015; J. Liu et al., 2016; Sesnie et al., 2010; Sun et al., 2017).
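As an illustration of the negative log likelihood form of the unary potential, the following Python sketch converts per-pixel class probabilities into unary energies. It assumes that the classifier outputs one probability per class and pixel (for a Random Forest, for instance, the fraction of tree votes); the function name, the clipping constant and the probability values are hypothetical.

```python
import numpy as np

def unary_from_probabilities(probs, eps=1e-10):
    """Turn (H, W, C) class probabilities into unary energies -log(p).

    eps is a hypothetical safeguard that avoids log(0) for classes
    that received no votes.
    """
    clipped = np.clip(probs, eps, 1.0)
    return -np.log(clipped)

# Made-up probabilities for a 1 x 2 pixel image and four classes.
probs = np.array([[[0.70, 0.10, 0.10, 0.10],
                   [0.25, 0.25, 0.25, 0.25]]])

unary = unary_from_probabilities(probs)
# Without a pairwise term, the best label is simply the one with the lowest energy.
labels = unary.argmin(axis=-1)
```

A confident pixel (first one) has one clearly lowest energy, whereas an ambiguous pixel (second one) has equal energies for all classes; the latter is where the pairwise term has the most influence.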

Pairwise potential.
The output from the unary potential contains a lot of noise due to the lack of contextual information. The pairwise potential, which makes use of contextual information, is then used to smooth the classification output. The pairwise potential defines how a pixel is related to its neighbouring pixels (Cao, Zhou, et al., 2016). This relationship can be short range, as in the 4-connected or 8-connected CRF, or longer range, as in the fully connected CRF. In the Fully Connected CRF the pixel label is defined by finding the relationship between the pixel of interest and all the other pixels of the image.
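In this paper, the pairwise potential was defined by the Fully Connected CRF (Krähenbühl and Koltun, 2011). As explained by Krähenbühl and Koltun (2011), the pairwise potential for the fully connected CRF is defined as:
$$\phi_{ij}(x_i, x_j) = \mu(x_i, x_j) \sum_{m} w^{(m)} k^{(m)}(\mathbf{f}_i, \mathbf{f}_j) \quad (3)$$
where $\mu$ is the label compatibility function, $w^{(m)}$ is the weight of the Gaussian function, $\mathbf{f}_i$ and $\mathbf{f}_j$ are the feature vectors of pixels $i$ and $j$, and $k^{(m)}$ is the Gaussian kernel consisting of a smoothness kernel and an appearance kernel (Krähenbühl & Koltun, 2011), as shown in equations 4 and 5 respectively:
$$k^{(2)}(\mathbf{f}_i, \mathbf{f}_j) = \exp\left(-\frac{\lVert p_i - p_j \rVert^2}{2\theta_\gamma^2}\right) \quad (4)$$
$$k^{(1)}(\mathbf{f}_i, \mathbf{f}_j) = \exp\left(-\frac{\lVert p_i - p_j \rVert^2}{2\theta_\alpha^2} - \frac{\lVert I_i - I_j \rVert^2}{2\theta_\beta^2}\right) \quad (5)$$
The smoothness kernel is used to remove small groups of pixels that appear isolated from other class labels, and the appearance kernel groups nearby pixels that have the same colour, as they are supposed to belong to the same class.
In the two equations above, $p_i$ and $p_j$ are position vectors, $I_i$ and $I_j$ are colour vectors, $\theta_\alpha$ and $\theta_\beta$ are parameters used to control the degree of nearness and similarity, $\theta_\gamma$ is the corresponding positional parameter of the smoothness kernel, and $w^{(1)}$ and $w^{(2)}$ are the weights used to combine the two kernels.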
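To make the role of the two Gaussian kernels more concrete, the Python sketch below evaluates the pairwise potential for a single pair of pixels, assuming the Gaussian form given by Krähenbühl and Koltun (2011) and a Potts label compatibility (zero for equal labels, one otherwise). All function and parameter names are hypothetical, the default values are placeholders, and the mapping of numerical settings to the individual theta parameters is an assumption.

```python
import numpy as np

def _sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sum((a - b) ** 2))

def smoothness_kernel(p_i, p_j, theta_gamma):
    """Gaussian on pixel positions only."""
    return np.exp(-_sq_dist(p_i, p_j) / (2.0 * theta_gamma ** 2))

def appearance_kernel(p_i, p_j, c_i, c_j, theta_alpha, theta_beta):
    """Gaussian on pixel positions and colours."""
    return np.exp(-_sq_dist(p_i, p_j) / (2.0 * theta_alpha ** 2)
                  - _sq_dist(c_i, c_j) / (2.0 * theta_beta ** 2))

def pairwise_potential(label_i, label_j, p_i, p_j, c_i, c_j,
                       w1=1.0, w2=1.0,
                       theta_alpha=40.0, theta_beta=5.0, theta_gamma=3.0):
    """Weighted sum of both kernels, applied only when the labels differ.

    The Potts compatibility and all default values are assumptions
    made for this illustration.
    """
    mu = 0.0 if label_i == label_j else 1.0
    k = (w1 * appearance_kernel(p_i, p_j, c_i, c_j, theta_alpha, theta_beta)
         + w2 * smoothness_kernel(p_i, p_j, theta_gamma))
    return mu * k

# Same label: no penalty, whatever the features are.
same = pairwise_potential(1, 1, (0, 0), (1, 0), (10, 10, 10), (10, 10, 10))
# Different labels but identical position and colour: maximum penalty w1 + w2.
diff = pairwise_potential(0, 1, (5, 5), (5, 5), (10, 20, 30), (10, 20, 30),
                          w1=0.5, w2=0.5)
```

Evaluating every pixel pair in this way is quadratic in the number of pixels; Krähenbühl and Koltun (2011) avoid this with an efficient approximate mean-field inference, which this sketch does not attempt to reproduce.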

Accuracy assessment
In this study, the accuracy of the method was assessed using the Overall Accuracy (OA) together with the Intersection over Union (IoU) score. OA and IoU were computed using equation 6 and equation 7 respectively:
$$\mathrm{OA} = \frac{\sum_{c} TP_c}{N} \quad (6)$$
$$\mathrm{IoU} = \frac{TP}{TP + FP + FN} \quad (7)$$
where TP (true positive) denotes the pixels that were correctly classified, FP (false positive) the negative pixels that were incorrectly classified as positive, and FN (false negative) the positive pixels that were incorrectly classified as negative. In equation 6, $TP_c$ is the number of true positives of class $c$ and $N$ is the total number of pixels.
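A minimal Python sketch of how OA and per-class IoU can be derived from a confusion matrix is given below. The function names and the small label vectors are hypothetical; the sketch follows the usual definitions of both scores and is not taken from the authors' implementation.

```python
import numpy as np

def confusion_matrix(truth, pred, n_classes):
    """Rows = ground truth class, columns = predicted class."""
    codes = truth.ravel() * n_classes + pred.ravel()
    counts = np.bincount(codes, minlength=n_classes ** 2)
    return counts.reshape(n_classes, n_classes)

def overall_accuracy(cm):
    """Correctly classified pixels divided by all pixels."""
    return np.trace(cm) / cm.sum()

def iou_per_class(cm):
    """TP / (TP + FP + FN) for every class; NaN for classes that never occur."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp  # predicted as the class but belonging to another
    fn = cm.sum(axis=1) - tp  # belonging to the class but predicted as another
    denom = tp + fp + fn
    return np.divide(tp, denom, out=np.full_like(tp, np.nan), where=denom > 0)

# Made-up ground truth and prediction for six pixels and three classes.
truth = np.array([0, 0, 1, 1, 2, 2])
pred = np.array([0, 1, 1, 1, 2, 0])

cm = confusion_matrix(truth, pred, 3)
oa = overall_accuracy(cm)   # 4 of 6 pixels are correct
iou = iou_per_class(cm)     # 1/3, 2/3 and 1/2 for classes 0, 1 and 2
```

Unlike OA, the IoU penalises both false positives and false negatives of each class, which makes it more informative for classes covering few pixels.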

Dataset
The orthophoto and DSM generated from UAV images acquired over a construction area in Lausanne (Switzerland) were used. The analysed area is approximately 32,830 m². The images were acquired in eight different epochs with a ground sampling distance of about 5 cm and processed using the Pix4D software for orthophoto and DSM generation. Four different classes have been defined: asphalt (road, railway), buildings (roof and concrete), vegetation and bare soil. Using the ENVI 5.3 + IDL 8.5 software, the ground truth for all the epochs and the changes between adjacent epochs were digitized by visual inspection. The algorithms were implemented in Matlab R2016a. Figure 2 shows the orthophoto image and the corresponding ground truth of epoch 2.

Methodology
Two change detection techniques were applied in this paper. The first approach was the pre-classification change detection technique and the second was the post-classification change detection technique. In the following sections a detailed description of the implemented methods is given.

Feature extraction.
For classification purposes, features were extracted from both the orthophoto and the DSM. Spectral and textural features were extracted from the orthophoto, while geometric features were extracted from the DSM. In detail, the spectral features were the HSV features and the textural features were GLCM features, while several geometric features were extracted from the DSM. HSV stands for hue, saturation and value, a colour space that according to Wu et al. (2015) gives better results in image classification than the RGB colour space. Textural features carry information about the spatial distribution of tonal variations within an image, and can be categorized as fine, coarse, smooth, rippled, mottled, irregular or lineated, as described by Haralick et al. (1973). The common method for textural feature extraction is the grey level co-occurrence matrix (GLCM) (Mohanaiah et al., 2013). The geometric features extracted from the DSM are linearity, planarity and the normalised DSM (nDSM). Planarity and linearity are computed from the eigenvalues within a local neighbourhood (Chehata et al., 2009), while the nDSM was simply inferred from the local minimum height values in the DSM, as the considered area was relatively flat and this approximation showed good results.

Change detection.
In the pre-classification approach, after feature extraction, the change between the features was computed, and the resulting change map was then classified using the Conditional Random Field model.

Random forest.
Random forest was used to define the unary part of the CRF. After some tests, 50 trees were found to be optimal.
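The eigenvalue-based geometric features mentioned under feature extraction can be sketched in Python as follows. The sketch assumes the commonly used definitions linearity = (λ1 − λ2)/λ1 and planarity = (λ2 − λ3)/λ1, with λ1 ≥ λ2 ≥ λ3 the eigenvalues of the covariance matrix of the 3D points in a local neighbourhood; the function name, the neighbourhood and the test points are hypothetical, and the exact settings used in the paper are not known.

```python
import numpy as np

def eigen_features(points):
    """Linearity and planarity of an (N, 3) array of x, y, z points.

    Assumed definitions: linearity = (l1 - l2) / l1 and
    planarity = (l2 - l3) / l1, with l1 >= l2 >= l3.
    """
    pts = np.asarray(points, dtype=float)
    cov = np.cov(pts, rowvar=False)            # 3 x 3 covariance matrix
    eigvals = np.linalg.eigvalsh(cov)[::-1]    # sorted in descending order
    eigvals = np.clip(eigvals, 0.0, None)      # remove tiny negative round-off
    l1, l2, l3 = eigvals
    if l1 <= 0.0:
        return 0.0, 0.0                        # degenerate neighbourhood
    return (l1 - l2) / l1, (l2 - l3) / l1

# A flat 5 x 5 patch (e.g. a roof or road): planarity close to 1.
xx, yy = np.meshgrid(np.arange(5.0), np.arange(5.0))
flat_patch = np.column_stack([xx.ravel(), yy.ravel(), np.zeros(xx.size)])
lin_flat, pla_flat = eigen_features(flat_patch)

# Points along a straight edge: linearity close to 1.
edge = np.column_stack([np.arange(10.0), np.zeros(10), np.zeros(10)])
lin_edge, pla_edge = eigen_features(edge)
```

On a DSM, such a function would be applied to the heights in a sliding window around each cell, so that large flat regions such as roofs, roads and bare soil all obtain a high planarity.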
Of the eight epochs, the first four were used to train the classifier and the remaining four were used to test it. This was done to simulate a real case of fast monitoring, where no new ground truth is available and the classification relies on older information.

Fully connected CRF.
The results of the Random Forest classifier were then refined using the fully connected CRF. The parameters of the smoothness and appearance kernels of the fully connected CRF were defined by testing several configurations on the training epochs, and the parameters providing the best accuracies were then selected. In particular, the positional standard deviation was set to 40 and the colour standard deviation to 5. The weight of the fully connected CRF was also tuned, and a value of 0.5 was found to give better results.

For the post-classification method, the Random Forest was likewise trained on the first four epochs, leaving the last four epochs for testing, so that the results of the two methods are comparable.

Change detection.
In the post-classification approach, after the classification of the four testing epochs, the results of two adjacent epochs were compared and the change was detected from the overlaid data.

RESULTS
Different tests were performed on both methods for a preliminary assessment of their performance. Different feature configurations were considered as well. Since each epoch was classified into four classes, the change detection map was expected to contain up to 16 different classes.

Pre classification
The first experiment for the pre-classification method used only 2D features from the orthophoto, and the second experiment used both 2D and 3D features together. The results of the first experiment are shown in Figure 5, while the corresponding accuracies are reported in Table 1. In Figure 5 the different changes are coded with different colours. In particular, rd, veg, bld and bs refer to the road, vegetation, building and bare soil classes respectively. Changes in land use are given (in the second line of the legend) as a combination of these four classes (e.g. bs-bld).
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2, 2018. ISPRS TC II Mid-term Symposium "Towards Photogrammetry 2020", 4-7 June 2018, Riva del Garda, Italy.
The results of the second experiment are shown in Figure 6, followed by the corresponding accuracy assessment in Table 2. As largely expected, the pre-classification method using both 2D and 3D features gives better accuracy than the same method using only 2D features.

Post classification change detection
In this method, each epoch has been classified independently. As in the first case, two different feature configurations have been considered. One example (Epoch 5) of the classification with only 2D features is depicted in Figure 7, together with the corresponding accuracies. In this figure, the intermediate results provided by the Random Forest alone (i.e. the unary term) and their improvement thanks to the Fully Connected CRF are shown too. The use of 3D features in the classification process largely increases the accuracy, as shown in Figure 8. The final classification results of all the epochs are reported in Figure 9 with their respective ground truth. These results suggest that in most epochs the classification is usually very accurate for vegetation and roads, while bare soil and buildings (mostly built in concrete) are very often confused because of their similar feature responses.
The OA and IoU values for each epoch are reported in Table 3, both for the unary term alone and after the fully connected CRF. The IoU of each class considered in the classification is given in Table 4 as well. In Table 6, the IoU of each of the 16 change classes is reported for all three change detections. Depending on the epochs considered in the change detection, the results presented in Figure 6 show that the approach is to some extent able to detect the changes correctly, although the accuracy is still very low.

DISCUSSION
The experiments performed using different sets of features confirmed that the combined use of 2D and 3D information can improve the classification and change detection results. The classification accuracy increased by up to 11% when both sets of features were used. Random Forest delivers noisy results in all the performed tests, while the Fully Connected CRF reduces the noise to a large extent, resulting in a classified map with much smoother boundaries. However, the results are generally quite poor in all the performed tests. The pre-classification change detection seems unable to capture the variability of the classes in the different epochs, reporting low accuracies overall and completely missing changes between classes with similar spectral values (such as bare soil and buildings). More promising results have been obtained with the post-classification change detection, which outperformed the pre-classification method, achieving an Overall Accuracy 21% higher than that of the other method. Despite the still modest results, the post-classification method can detect all 16 possible combinations of changes, which makes it more suitable than the pre-classification method (which detected only 5 types of change). As expected, the training performed on the first 4 epochs provides better results in the first testing epochs (epoch 5 and epoch 6), but the quality degrades in the later epochs. In this regard, this training configuration was an additional challenge for the performed classifications. The adopted features seem insufficient for a reliable classification of the data. Although vegetation is the easiest class to detect in the different epochs, it also suffers from a seasonal variability that can often affect the detection of changes (e.g. different colours in winter and summer acquisitions). The different illumination conditions and the use of new materials in the construction site in the later epochs represent an additional unsolved challenge for this change detection. The relatively small radiometric differences between bare soil, buildings and roads, the continuous changes in the terrain (due to piles and the displacement of ground) and the large planar shapes of large regions (e.g. roofs, roads, bare soil) often made the correct distinction of the changes very challenging.

CONCLUSIONS AND FUTURE WORKS
In this paper two different approaches for change detection have been designed and tested. The developed approaches were based on the CRF, adopting a Random Forest classifier as the unary term and a fully connected implementation for the smoothing term. Eight different epochs were considered in the performed tests: the first four epochs were used entirely for training and the others for testing. This challenging configuration was chosen to be closer to the practical case of repeated and fast construction site monitoring. The achieved results were quite modest for both methods, although the post-classification strategy seems to be the only promising one for future improvements. This approach is able to detect all the possible types of change, although with a very variable accuracy. The Fully Connected CRF helps to improve the boundaries of the classified regions, removing the majority of the noise in the classification. The very high resolution of the data makes it possible to capture small details in the scene: this probably represents the biggest challenge faced in this work. The seasonal variability of the vegetation, the different illumination conditions and the presence of new materials in the later epochs represent additional challenges. In this regard, the features used are insufficient to capture the large heterogeneity of the scene. A more extended set of features, as well as the use of CNN approaches in the unary term, could be the next directions to take to improve the results. Different ways of training the classifier will also be tested: in particular, a few samples from the epoch to be classified will be added in order to see whether this improves the final results.
Different change detection methods have been described, including image differencing, image ratioing, principal component analysis, change vector analysis and post-classification. Change detection algorithms are usually categorised into two main types (Dinand et al., 2013): (i) algebra change detection, which includes image differencing, image ratioing, image regression, vegetation index differencing, change vector analysis and background subtraction techniques, and (ii) classification-based change detection, which includes post-classification comparison, spectral temporal analysis, unsupervised change detection and hybrid change detection.

Figure 2: orthophoto image with corresponding ground truth.
Pre-classification change detection: The pre-classification change detection technique was the first method to be implemented. This technique first extracts the features, then derives the changes from the features of two consecutive epochs and finally classifies these changes in order to provide a change map, as shown in Figure 3.
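The step that derives the changes from the features of two consecutive epochs can be sketched in Python as below. How the change between features is computed is not specified here, so the simple per-pixel differencing used in this sketch is an assumption, as are the function names and the random feature stacks.

```python
import numpy as np

def feature_change(features_t1, features_t2):
    """Per-pixel difference between two (H, W, F) feature stacks.

    Simple differencing is an assumption made for this illustration.
    """
    if features_t1.shape != features_t2.shape:
        raise ValueError("both epochs must have the same shape")
    return features_t2.astype(float) - features_t1.astype(float)

def change_magnitude(diff):
    """Euclidean norm of the feature difference at every pixel."""
    return np.linalg.norm(diff, axis=-1)

# Made-up feature stacks: 4 x 4 pixels with 3 features each.
rng = np.random.default_rng(0)
f1 = rng.random((4, 4, 3))
f2 = f1.copy()
f2[1, 2] += np.array([0.5, 0.0, -0.5])  # simulate a change in a single pixel

diff = feature_change(f1, f2)
mag = change_magnitude(diff)
```

In a pre-classification workflow, a stack such as `diff` (rather than the features of a single epoch) would be passed to the classifier, which then assigns a change class to every pixel.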

Figure 3: flow chart showing the steps involved in preclassification change detection.

Post-classification change detection: Post-classification change detection was the second method to be implemented. This method involves feature extraction and the classification of each epoch using the CRF, followed by change detection from the classified images. The workflow for the post-classification change detection is shown in Figure 4.

Figure 4: flow chart showing the steps involved in post-classification change detection.

Please note that the "change in class" reported in both Figure 3 and Figure 4 also covers the cases where the change is a geometric change within the same class (e.g. a new floor is added to the same building region). The same features described for the first method were also used in the post-classification change detection technique in order to allow a fair comparison. Using these features, each epoch was classified separately with a CRF model. As in the previous method, the unary part of the CRF model was defined by Random Forest and the pairwise part was defined by the fully connected CRF.

Figure 5: pre classification change detection using only 2D features.

Figure 6: pre classification change detection using 2D and DSM features.

Figure 9: classification of the last four epochs with the CRF model.
Figure 10: post-classification change detection map.

Table 2: pre-classification accuracy using 2D and DSM features.

Table 3: accuracy assessment using 4 epochs for training.

Table 4: IoU accuracy in percentage for each class.

Table 6: IoU accuracy in % for each change class.