APPLICATION OF DEEP LEARNING OF MULTI-TEMPORAL SENTINEL-1 IMAGES FOR THE CLASSIFICATION OF COASTAL VEGETATION ZONE OF THE DANUBE DELTA

Land cover is a fundamental variable for regional planning, as well as for the study and understanding of the environment. This work proposes a multi-temporal approach relying on a fusion of radar multi-sensor data and information collected by the latest sensor (Sentinel-1), with a view to obtaining better results than traditional image processing techniques. The study site is the Danube Delta. The spatial approach relies on new spatial analysis technologies and methodologies: deep learning applied to multi-temporal Sentinel-1 data. We propose a deep learning network for image classification that exploits the multi-temporal characteristic of Sentinel-1 data. The model we employ is a Gated Recurrent Unit (GRU) network, a recurrent neural network that explicitly takes the time dimension into account via a gating mechanism to perform the final prediction. The main quality of the GRU network is its ability to retain only the important part of the information coming from the temporal data, discarding irrelevant information via a forgetting mechanism. We use this network structure to classify a series of 20 Sentinel-1 images acquired between 9.10.2014 and 01.04.2016. The results are compared with those of a Random Forest classification.


INTRODUCTION
Land cover is a fundamental variable for regional planning, as well as for the study and understanding of the environment. This topic has become a key element of most inventory maps and monitoring inventories of environmental phenomena. The active or passive remote sensors used for various applications related to the detection, analysis, mapping and definition of land cover changes and vegetation monitoring cover a very broad domain of the electromagnetic spectrum. Remote-sensing technologies can deliver data on habitat quantity (amount, configuration) and quality (e.g., structure, distribution of individual plant species, habitat types and/or communities, persistence) (He et al., 2011) across a range of spatial resolutions and temporal frequencies (Wulder et al., 2004).
The new generation of radar imagery has been available for several years now. The Sentinel-1 satellite radars collect data using two polarization configurations, thus providing, in principle, greater potential than their predecessors for the inventory and monitoring of land cover changes and vegetation on finer scales. Change detection in SAR images has received increasing attention in recent years owing to the imaging characteristics of SAR, such as all-time, all-weather and large-area coverage. Hence, these sensors can be used to map and measure changes in the state of land cover or habitat quantity/quality (Nagendra et al., 2013; Niculescu et al., 2016, 2017), as well as to generate categorical products (thematic maps). Such data can be used in combination with discretely collected field data (Newton et al., 2009) to test hypotheses relating to biodiversity change (e.g., on species-habitat relationships). The use of multi-temporal radar satellite remote sensing has mainly focused on the assessment of changes in both habitat quantity and quality within categorized land-use classes. Alongside remote sensing, quantitative spatial-analytical approaches to conservation have arisen from applied landscape ecology studies aimed at understanding the relationships between spatial patterns of land-cover change and ecological, biophysical and/or socio-economic processes (Mairota et al., 2015).
For image classification, one possible way to address the categorization task is to use deep learning algorithms, for instance a convolutional neural network (CNN). Deep learning is an important branch of machine learning that tries to learn abstract concepts by simulating the cognitive mechanism of the human brain and to explore latent patterns by building deep architectures (Arel et al., 2010). When data are fed into a deep network, the features are learned layer by layer, and the output of one layer is taken as the input of the next layer (Bengio et al., 2013). The CNN is inspired by the receptive fields in the neural cortex; it is a multilayer neural network suitable for processing 2-D data such as videos and images. Deep-learning-based methods, which have brought improvements in many research fields, have been widely applied to natural image classification, object recognition, and natural language and text processing. The remote sensing community has also started to apply deep CNNs to image classification tasks. However, the majority of research using deep CNNs in the remote sensing community has focused on high-resolution images.
Classification of these high-resolution images is similar to object recognition in computer vision, and the remarkable improvements achieved by deep networks in object recognition have also been shown in these applications (Sharma et al., 2017). Due to their remarkable performance, these methods are used to analyze high-resolution remote sensing (HRRS) images, and have achieved more impressive results than the traditional shallow methods for scene classification (Castelluccio et al.; Hu et al., 2015; Zhang et al., 2016; Zhao and Du, 2016; Luo et al., 2017; Wang et al., 2017; Cheng et al., 2016). Gong et al. (2016) used deep learning to achieve change detection in SAR images. They selected samples based on a pre-classification, without using a difference image. Deep learning was then used to learn high-order features and classify the SAR images. Deep learning has thus shown promising performance and accurate results in classification problems.
In the second part of this work, the results of the Deep Learning classification are compared with the outcomes of a Random Forest classification. A random forest (RF) classifier is an ensemble classifier that produces multiple decision trees, using a randomly selected subset of training samples and variables; it makes its prediction with a set of Classification and Regression Trees (CART) (Breiman, 2001). This classifier has become popular within the remote sensing community due to the accuracy of its classifications. Furthermore, it can be successfully used to select and rank those variables with the greatest ability to discriminate between the target classes (Belgiu and Dragut, 2016). The trees are created by drawing a subset of training samples with replacement (a bagging approach). This means that the same sample can be selected several times, while others may not be selected at all. About two thirds of the samples (referred to as in-bag samples) are used to train the trees, while the remaining one third (referred to as out-of-bag samples) are used in an internal cross-validation technique for estimating how well the resulting RF model performs (Breiman, 2001). The RF classifier has been used to map land cover classes (Colditz, 2015; Haas and Ban, 2014; Stefanski et al., 2013; Tsutsumida and Comber, 2015), to map boreal forest habitats (Räsänen et al., 2013), to map biomass (Frazier et al., 2014), to identify trees (Wang et al., 2015), to map tree canopy cover and biomass using uni-temporal and multi-temporal Landsat 8 imagery (Karlson et al., 2015), and to map the remediated ecosystems of the Danube Delta using multi-temporal Sentinel-1 and Sentinel-2 images (Niculescu et al., 2017).
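The in-bag / out-of-bag split described above can be illustrated with a minimal sketch. It is not the authors' code: the sample count and the helper name `bootstrap_split` are assumptions made for illustration. Drawing n samples with replacement leaves each sample in the bag with probability 1 - (1 - 1/n)^n, which tends to 1 - 1/e (about 0.632), the "about two thirds" quoted in the text.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement (bagging).

    Returns the drawn indices (with repeats) and the out-of-bag indices,
    i.e. the samples that were never drawn.
    """
    drawn = [rng.randrange(n_samples) for _ in range(n_samples)]
    in_bag = set(drawn)
    out_of_bag = [i for i in range(n_samples) if i not in in_bag]
    return drawn, out_of_bag


rng = random.Random(0)          # fixed seed so the sketch is repeatable
n_samples = 10_000              # hypothetical number of training pixels
drawn, oob = bootstrap_split(n_samples, rng)

in_bag_fraction = len(set(drawn)) / n_samples
oob_fraction = len(oob) / n_samples
print(f"unique in-bag: {in_bag_fraction:.3f}, out-of-bag: {oob_fraction:.3f}")
```

Each tree of the forest would be trained on its own `drawn` indices and evaluated on its own `oob` indices, which is what makes the out-of-bag error an internal validation estimate.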

Vegetation of the Study area
The Danube Delta, Romania's youngest landmass, is a fluvial-maritime floodplain lying on two floristic provinces, the lower Danube (ponto-sarmatic) and the Black Sea (euxinic) (Borza, 1960; Ciocârlan, 1994). The diverse geomorphology, soils, and hydrological conditions favour the proliferation of a large number of aquatic, semi-desert, and saline habitats. At the international level, almost all habitats are considered very important. By the same token, each habitat is part of a unique nature conservation network. The flora in the Danube Delta Biosphere Reserve (both Romanian and Ukrainian sectors) is specific to a steppe bioregion with a temperate climate, featuring almost 1,400 species of vascular plants (Hanganu et al., 2002), of which five species (1 subspecies) are endemic (0.51% of the total number). The delta's marine zone is geomorphologically characterized by the presence of parallel sandy beach barriers separated by shallow depressions. Most beach barriers are narrow and low, measuring several tens to a few hundreds of meters wide and lying 1.0-1.5 meters above sea level. The depressions between them are relatively wide; many of them are hundreds to several thousands of meters across. Three complexes occur in which the barriers are wider and the depressions narrower: the Sărăturile complex, the Caraorman complex and the Letea complex. The crests of major beach barriers (for instance, Buhaz, Palade, and Crasnicol) are 1-1.5 meters above sea level. They are out of the reach of flooding, and are often even too high to be influenced by the saline groundwater. The terrain consists of shifting sands and pastureland featuring Bermuda grass (Cynodon dactylon), silky wind grass (Apera spica-venti ssp. maritima), corn brome (Bromus squarrosus) and roundhead bulrush (Holoschoenus vulgaris). The beach barrier soil, at intermediate elevation, is still moderately saline.
The vegetation on these saline calcaric arenosols consists of a moderately salt-tolerant pasture of alkali grass (Puccinellia convoluta), P. distans, Apera spica-venti ssp. maritima and redtop (Agrostis gigantea ssp. pontica). Further on, past this Puccinellia convoluta zone, the increasing influence of fresh water flooding (up to three months a year) decreases the saline content. Agrostis gigantea ssp. pontica, rush (Juncus gerardi) and reed (Phragmites australis) are characteristic of this dynamic habitat, with alternating fresh water flooding and moderate saline levels. The next, lower zone, which floods for three to six months per year, is covered by sedge marshes, with reed mace and some reeds. The depressions themselves, with a flooding period of over six months per year, are covered by reed marshes with some sedge, growing in peat beds. Some of the younger depressions are still in the process of being filled up with reed peat. Small lakes occur in their centre. These lakes are the last remnants of the lagoon. Reeds dominate the plaur in these small lakes. High saline levels, indicated by glasswort (Salicornia patula) and seepweed (Suaeda prostrata), are rare in this area and occur only in a few isolated depressions within beach barriers that are not flooded by fresh water.

Data set
We used the following satellite images in this study: 20 Sentinel-1 images acquired between 9.10.2014 and 01.04.2016 (table 1). The Sentinel-1 data were acquired in a time series that covered the entire growth season of 2015 and part of 2016. This enabled us to determine the influence of the time dimension and of the polarimetric dimension (VV and VH polarizations are available) on the characterization and classification of the vegetation in the coastal area of the Danube Delta.

GRD image calibration is vital for viewing the maximum amount of information on an image. In our research, the σ0 value is extracted using the Calibration Tools of the OrfeoToolbox software, which provides us with the backscattering coefficient of the area. These values depend on the targets illuminated by the beam, on ground roughness and moisture and, in the end, on the vegetation density.

Deep Learning
Recently, recurrent neural network (RNN) approaches have demonstrated their quality in the remote sensing field for producing land use maps from time series of optical images (Ienco et al., 2017) and for recognizing vegetation cover status from Sentinel-1 radar time series (Minh et al., 2018). Motivated by these recent works, we decided to evaluate the quality of RNNs for our task. We chose to use the Gated Recurrent Unit (GRU) introduced by Cho et al. (2014), coupled with an attention mechanism (Britz et al., 2017). Attention mechanisms are widely used in automatic signal processing (language or 1-D signals), and they make it possible to combine the information extracted by the GRU model at the different timestamps. The input of a GRU unit is a sequence $(x_{t_1}, \dots, x_{t_N})$ where a generic element $x_{t_i}$ is a multidimensional vector that refers to the corresponding date in the time series. In our case, $x_{t_i}$ corresponds to a vector with two components (the VV and VH polarizations) for a particular date.
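To make the input structure concrete, the following minimal sketch runs a single GRU cell over one pixel's time series of 20 (VV, VH) pairs. It is an illustration under stated assumptions, not the TensorFlow implementation used in this work: the weights are random rather than trained, the hidden size of 4 and the backscatter values are hypothetical, and the update rule follows one common convention from Cho et al. (2014), in which the update gate z weights the previous state.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Single GRU cell with untrained (random) weights, for illustration only."""

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # One (W, U, b) triple per gate: update z, reset r, candidate state h.
        self.params = {
            gate: (rand_matrix(n_hidden, n_in, rng),
                   rand_matrix(n_hidden, n_hidden, rng),
                   [0.0] * n_hidden)
            for gate in ("z", "r", "h")
        }

    def _affine(self, gate, x, h):
        W, U, b = self.params[gate]
        return [wx + uh + bi
                for wx, uh, bi in zip(matvec(W, x), matvec(U, h), b)]

    def step(self, x, h_prev):
        z = [sigmoid(a) for a in self._affine("z", x, h_prev)]   # update gate
        r = [sigmoid(a) for a in self._affine("r", x, h_prev)]   # reset gate
        gated = [ri * hi for ri, hi in zip(r, h_prev)]           # partly "forgotten" state
        h_cand = [math.tanh(a) for a in self._affine("h", x, gated)]
        # Convex combination of the previous state and the candidate state.
        return [zi * hp + (1.0 - zi) * hc
                for zi, hp, hc in zip(z, h_prev, h_cand)]

    def run(self, sequence):
        """Return the hidden state after each date of the sequence."""
        h = [0.0] * self.n_hidden
        outputs = []
        for x in sequence:
            h = self.step(x, h)
            outputs.append(h)
        return outputs


rng = random.Random(42)
n_dates = 20
# Hypothetical backscatter values for one pixel: one (VV, VH) pair per date.
pixel_series = [(rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0))
                for _ in range(n_dates)]

cell = GRUCell(n_in=2, n_hidden=4, rng=rng)
H = cell.run(pixel_series)
print(len(H), len(H[0]))
```

The list `H` holds one feature vector per acquisition date, which is the per-date output that an attention layer would then combine.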
The output returned by the GRU model is a sequence of feature vectors learned for each date, $(h_{t_1}, \dots, h_{t_N})$, where each $h_{t_i}$ has the same dimension $d$. Their matrix representation $H \in \mathbb{R}^{N \times d}$ is obtained by vertically stacking these vectors. The attention mechanism combines the vectors $h_{t_i}$, that is, the information returned by the GRU unit at each of the different timestamps. The attention formulation we used, considering a sequence of learned feature vectors $(h_{t_1}, \dots, h_{t_N})$, is the following:
$$v_a = \tanh(H W_a + b_a)$$
$$\lambda = \mathrm{SoftMax}(v_a u_a)$$
$$\mathrm{rnn\_feat} = \sum_{i=1}^{N} \lambda_i h_{t_i}$$
The matrix $W_a \in \mathbb{R}^{d \times d}$ and the vectors $b_a, u_a \in \mathbb{R}^{d}$ are parameters learned during training. These parameters make it possible to combine the vectors contained in matrix $H$. The purpose of this procedure is to learn a set of weights $(\lambda_1, \dots, \lambda_N)$ that weight the contribution of each timestamp ($h_{t_i}$) through a linear combination. The SoftMax(·) function (Ienco et al., 2017) is used to normalize the weights $\lambda$ so that their sum is equal to 1. The RNN model learns a new representation of the input sequences, but it does not make any prediction by itself. To this end, a SoftMax layer is used again on top of the learned features rnn_feat to perform the final multi-class prediction. The Deep Learning method has been implemented in Python with the TensorFlow library.
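The attention pooling described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the TensorFlow code used in this work: the function name `attention_pool`, the toy matrix `H` (N = 3 dates, d = 2 features) and the parameter values are assumptions chosen only to show the shapes involved, and in practice `W_a`, `b_a` and `u_a` would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(H, W_a, b_a, u_a):
    """Combine N per-date feature vectors (rows of H) into one vector.

    H is N x d, W_a is d x d, b_a and u_a have length d.
    Returns the attention weights (length N) and rnn_feat (length d).
    """
    d = len(b_a)
    # v_a = tanh(H W_a + b_a), an N x d matrix
    v_a = [[math.tanh(sum(h[k] * W_a[k][j] for k in range(d)) + b_a[j])
            for j in range(d)]
           for h in H]
    # One scalar score per date: v_a u_a, a vector of length N
    scores = [sum(v[j] * u_a[j] for j in range(d)) for v in v_a]
    lam = softmax(scores)
    # rnn_feat = sum_i lambda_i * h_{t_i}
    rnn_feat = [sum(lam[i] * H[i][j] for i in range(len(H)))
                for j in range(d)]
    return lam, rnn_feat


# Toy values (hypothetical): 3 dates, 2 features per date.
H = [[0.1, 0.2], [0.4, -0.3], [0.0, 0.5]]
W_a = [[1.0, 0.0], [0.0, 1.0]]
b_a = [0.0, 0.0]
u_a = [1.0, -1.0]

lam, rnn_feat = attention_pool(H, W_a, b_a, u_a)
print("weights:", [round(w, 3) for w in lam])
print("rnn_feat:", [round(f, 3) for f in rnn_feat])
```

Because the weights sum to 1 and are all positive, each component of `rnn_feat` is a weighted average of the corresponding component across dates; a final SoftMax classification layer would take `rnn_feat` as its input.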

Random Forest
As a second step, we performed synthetic Random Forest classifications on the full set of Sentinel-1 radar images. Random Forest is an ensemble learning technique that builds upon multiple decision trees. Each decision tree is built using a subset of the original training data and is evaluated on the remaining training samples. New objects are assigned to the class that is predicted by the most trees (figure 1). Each decision tree is produced independently, without any pruning, and each node is split using a user-defined number of features (Mtry), selected at random. By growing the forest up to a user-defined number of trees (Ntree), the algorithm creates trees that have high variance and low bias (Breiman, 2001). Two parameters therefore need to be set in order to produce the forest: the number of decision trees to be generated (Ntree) and the number of variables to be selected and tested for the best split when growing the trees (Mtry). Theoretical and empirical research has highlighted that classification accuracy is less sensitive to Ntree than to the Mtry parameter (Ghosh et al., 2014; Kulkarni and Sinha, 2014).
According to Rodriguez-Galiano et al. (2012), the classifier has three main advantages for land cover classification from remote-sensing images: (i) it reaches higher accuracies than other machine-learning classifiers; (ii) it has the ability to measure the importance level of the input images; (iii) it makes no assumptions about the distributions of the input images (cited by Hütt et al., 2016). We used the following parameters for the Random Forest algorithm: 200 trees, a maximum tree depth of 25 and a minimum number of samples in each node of 25.
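As an illustration, the parameter values quoted above could be passed to scikit-learn's `RandomForestClassifier` roughly as follows. This is a sketch under assumptions, not the configuration actually run for this study: the data are synthetic stand-ins (40 features, i.e. 20 dates with 2 polarizations each), the "minimum number of samples in each node" is mapped to `min_samples_split` (reading it as `min_samples_leaf` would be the other plausible interpretation), and `max_features="sqrt"` stands in for an unspecified Mtry.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for per-pixel features: 20 dates x 2 polarizations.
n_pixels, n_dates, n_classes = 600, 20, 3
y = rng.integers(0, n_classes, size=n_pixels)
# Class-dependent offset plus unit-variance noise (purely illustrative).
X = y[:, None] + rng.normal(size=(n_pixels, n_dates * 2))

clf = RandomForestClassifier(
    n_estimators=200,        # Ntree: 200 trees
    max_depth=25,            # maximum depth of each tree
    min_samples_split=25,    # assumed mapping of "minimum samples in each node"
    max_features="sqrt",     # Mtry: assumption, not stated in the text
    oob_score=True,          # internal out-of-bag accuracy estimate
    random_state=0,
)
clf.fit(X, y)
print(f"OOB accuracy on synthetic data: {clf.oob_score_:.3f}")
```

The out-of-bag score gives a quick internal estimate of accuracy without a separate validation set, although an independent reference sample remains necessary for reporting map accuracy.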
Concerning the Random Forest classifier, we use the publicly available implementation supplied by the Scikit-Learn Python machine learning library.
The results show very good classification performance for the two algorithms: 96.2% mean accuracy for Deep Learning and 94.3% for Random Forest. The mapping accuracies were summarized using confusion matrices (figures 2 and 3) and statistics including user's, producer's and overall accuracy and Cohen's kappa. Radar data provide information especially on plant physiognomies. This analysis supplies information on polarimetric data in relation to the geometric characteristics of the physiognomies of the plants growing in the delta, and enables us to draw conclusions about ways to distinguish among the various plant physiognomies.
Finally, the F-measure was calculated (table 2). The F-measure is the harmonic mean of precision and recall; for each class, this indicator summarizes the proportion of pixels that are well classified. We can first note that these baseline scores are quite high, demonstrating the relevance of the temporal dimension for land-cover classification. In addition, note that, due to the significant number of classes in the reference map, the Kappa scores are quite high and a small increase of the score can correspond to a major difference in the classification.
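The accuracy statistics mentioned above can all be derived from a confusion matrix, as in the following minimal sketch. The 3-class matrix is hypothetical and is not taken from figures 2 and 3; the function names are likewise illustrative. Rows are assumed to hold the reference classes and columns the predicted classes, so that precision corresponds to user's accuracy and recall to producer's accuracy.

```python
def per_class_scores(cm):
    """Return (precision, recall, F-measure) for each class.

    cm[i][j] is the number of samples of reference class i predicted as class j.
    """
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))   # column sum
        reference_k = sum(cm[k])                        # row sum
        precision = tp / predicted_k if predicted_k else 0.0   # user's accuracy
        recall = tp / reference_k if reference_k else 0.0      # producer's accuracy
        denom = precision + recall
        f_measure = 2.0 * precision * recall / denom if denom else 0.0
        scores.append((precision, recall, f_measure))
    return scores


def overall_accuracy(cm):
    total = sum(sum(row) for row in cm)
    return sum(cm[k][k] for k in range(len(cm))) / total


def cohen_kappa(cm):
    """Agreement corrected for the agreement expected by chance."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = overall_accuracy(cm)
    p_expected = sum(
        sum(cm[k]) * sum(cm[i][k] for i in range(n)) for k in range(n)
    ) / total ** 2
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical confusion matrix (rows: reference, columns: prediction).
cm = [
    [50, 5, 0],
    [10, 30, 5],
    [0, 5, 45],
]

scores = per_class_scores(cm)
oa = overall_accuracy(cm)
kappa = cohen_kappa(cm)
for k, (p, r, f) in enumerate(scores):
    print(f"class {k}: precision={p:.3f} recall={r:.3f} F={f:.3f}")
print(f"overall accuracy={oa:.3f} kappa={kappa:.3f}")
```

Because kappa subtracts the chance agreement implied by the row and column totals, it is always lower than the overall accuracy whenever some agreement is expected by chance.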

Figure 1: Performance of the Random Forest algorithm

Figure 2: Confusion matrix of the Deep Learning classification

Table 2: F-measure of Deep Learning and Random Forest

The F-measure outcomes for the two algorithms show very good results for all classes of reed: 'reed vegetation on salinized soils' (0.69 for Deep Learning and 0.74 for Random Forest), 'pure reed vegetation' (0.73 for DL and 0.67 for RF), 'reed on open plaur' (0.93 and 0.92), 'reed on compact plaur' (0.73 and 0.67) and 'reed on compact plaur' (cut reed) (0.97 and 0.96). The classes 'Dunes' and 'Dunes vegetation' present mediocre F-measure values for both algorithms (figure 6).

Table 1: Sentinel-1 imagery used in this study

Since it was first launched in April 2014, the Sentinel-1 satellite has allowed specialists to monitor the earth's surface day and night regardless of weather conditions, and has transmitted high-resolution space images free of charge. The Sentinel-1 SAR mission is part of the Copernicus Programme, the European Earth Observation Programme previously called GMES (Global Monitoring for Environment and Security), of the European Space Agency. Placed on an orbit at an altitude of 693 km, Sentinel-1 operates in four data collection modes: the StripMap (SM) mode, the Interferometric Wide swath (IW) mode, the Extra-Wide swath (EW) mode and the Wave (WV) mode. Each mode provides different products with respect to spatial resolution and imaging swath. Sentinel-1 images are captured in C band (5.5 cm), and they may exhibit single HH or VV polarization or dual HH+HV or VV+VH polarization. The data used in our research were collected in the IW mode. This mode includes three sub-swaths, namely IW1, IW2 and IW3, which correspond to cyclical antenna deviations. This mode provides GRD (Ground Range, Multi-Look, Detected) and SLC (Single Look Complex) images made up of the three IW sub-swaths (MDA, 2011). The GRD images are multi-look images (five looks for the IW mode) with less speckle noise and coarser spatial resolution. Although the SLC products have finer resolution, it is difficult to use them directly because of the phase information, which is of little use here and in certain cases hinders the extraction of additional information.
Classification of a Sentinel-1 image series (20 images) using two different machine learning algorithms (Deep Learning and Random Forest) was implemented, evaluated, and compared. The methods used have shown real value for the characterization of the major vegetation types and for the precise delimitation of several types of vegetation formations from the Sentinel-1 time series images. The classification procedures applied to the 20 images are reproducible, which allows their implementation over vast territories. Both methods rely on only a few input parameters and provide accurate classification results. Thus, Random Forest and Deep Learning can be regarded as simple yet accurate approaches. The classification accuracies were increased by adding spatial features to the original polarimetric coherency feature. In particular, the multi-feature combination shows a potential capability to distinguish classes with similar polarimetric responses but different textural and spatial features, such as all the classes of reed: reed vegetation on salinized soil, pure reed vegetation, reed on open plaur, reed on compact plaur and reed on compact plaur (cut reed). Higher overall classification accuracy was obtained with the Deep Learning technique, which combines the temporal polarimetric features. Increasing the size of the training sample can effectively improve the classification accuracy. Although Deep Learning outperforms Random Forest in terms of accuracy in all the experiments, its efficiency is lower than that of Random Forest because it is highly affected by feature dimensionality. The main advantage of using a GRU over Random Forest is that it makes it possible to build a hierarchy of local and sparse features derived from spectral and temporal profiles, while Random Forest builds a global transformation of features. Both algorithms are robust and can be used for vegetation classification from remotely sensed data. The performance of Random Forest is on par with other machine learning algorithms (such as Deep Learning), but it is much easier to use and more forgiving. On the other hand, compared to Random Forest, Deep Learning does not need the extraction and selection of hand-crafted features. This advantage, together with its success in the signal processing field, has motivated researchers in the remote sensing community to investigate its usefulness for remote sensing image analysis.