ENHANCING UAV COASTAL MAPPING USING INFRARED PANSHARPENING

Ecosystems must now cope with the effects of climate change, such as rising sea levels. These major changes have a direct impact on the coastal fringe. However, in recent years, coastal ecosystems such as saltmarshes have proven their adaptive capacity. Unmanned Aerial Vehicles (UAV) are an inexpensive and easily deployable alternative for monitoring these geomorphological and ecological systems. They have been perfected over the years, making it possible to achieve high or even very high (VH) spectral and spatial resolution. This facilitates the detection of changes at VH temporal and spatial resolution, such as coastline evolution or seasonal monitoring of plant communities. The red-green-blue (RGB) camera is the basic equipment of low-cost UAVs, whereas many studies have demonstrated the value of infrared sensors for vegetation or water detection. In this original study, a pansharpening method has been developed to generate red-edge (RE) and near-infrared (NIR) channels at the VH resolution of the RGB imagery. Out of the three pansharpening algorithms tested, Gram-Schmidt showed the highest correlation (0.61 and 0.63 for the RE and NIR channels respectively), followed by nearest neighbor diffusion and, finally, principal component spectral sharpening. The maximum likelihood (ML), support vector machine (SVM) and convolutional neural network (CNN) classifiers were used to discriminate the main objects of the study area. The classification results revealed that, at the classifier scale, ML outperforms the others with an overall accuracy (OA) of 80.75%. At the spectral band scale, the RE obtains the best performance, with an OA of 80.04% with ML and 78.34% with SVM.


Global Change
Coastal habitats are increasingly facing global sea level rise, storm intensification and changes in land cover and local land use (IPCC, 2009). The increasing anthropization of the coastal fringe makes our societies vulnerable to constantly rising seas. Facing this threat, human societies can rely on coastal ecosystems, such as seagrasses, saltmarshes and mangroves, which constitute an important biodiversity reserve. They also play a major role in the adaptation of territories to the effects of climate change, through their capacity to trap atmospheric CO2 (Pendleton et al., 2012) and to capture sediments (French et al., 1993). This protective service (Reef et al., 2018) contributes to the elevation of the foreshore. A better understanding of their evolution requires habitat mapping at the spatial scale of the processes involved (submeter resolution). Although satellite (James et al., 2020) and manned aerial sensors provide very high (VH) spatial resolution optical imagery, their temporal resolution remains too coarse to monitor subtle variations in eco-geomorphological dynamics. An alternative platform combining VH spatial and temporal resolution is therefore needed to rigorously capture the seamless coastal fringe.

Unmanned Aerial Vehicle for Coastal Management
Unmanned aerial vehicles (UAV) have been successfully used to investigate coastline evolution (Green et al., 2015) and habitat mapping at very high spatio-temporal resolution. In the temporal and spatial monitoring of coastal habitats, UAVs can be made operational quickly thanks to their easy deployment and implementation (Mury et al., 2019). Data collection allows the monitoring of different habitats and species threatened by the erosion of the foot of the dune. Specific environmental monitoring data help territory managers to set up adapted management programs. However, most low-cost UAVs only carry a red-green-blue (RGB) camera, which limits their spectral capability to correctly discriminate key features such as vegetation or water. The combination of infrared (IR) with RGB information has already improved coastal geomorphological monitoring (Aubry et al., 2012) and ecological characterization. However, such IR information requires a UAV-dedicated sensor whose spatial resolution usually does not reach that of the basic RGB sensor.

Pansharpening for Hyperspatial Monitoring
Many satellite constellations that have emerged in recent years are capable of tracking environmental changes at VH spatial resolution. In addition, satellites such as the hyperspectral Worldview3 satellite have, most of the time, sensors in the invisible spectrum. Multispectral (MS) sensors have a coarser resolution than the panchromatic sensors fitted to the same satellites. For example, Pleiades-1 images are delivered with a panchromatic band at 0.50 m and four spectral bands, RGB + near-IR (NIR), with a 2 m × 2 m spatial resolution. To preserve MS information at high spatial resolution, pansharpening methods have been developed, making it possible to increase the spatial resolution of MS images by merging them with panchromatic images (Meng et al., 2019). Pansharpening methods also make it possible to add spectral bands to images that are devoid of them and thus extend their coverage into the invisible spectrum. Often used for satellite images, pansharpening is increasingly used for the fusion of manned aerial and satellite images (Siok et al., 2020). Fusion at centimetric scale can also be applied between satellite or manned aerial images and UAV images (Jenerowicz et al., 2017). UAVs are mostly fitted with basic RGB cameras of increasingly high resolution. Nevertheless, MS UAVs are increasingly used in coastal environment research. However, their MS image resolution problems are identical to those encountered in satellite and aerial imaging.
In this study, an original UAV pansharpening methodology has therefore been developed to produce red-edge (RE) and NIR wavebands at the higher RGB resolution. Three pansharpening methods have been evaluated: Gram-Schmidt (GS), nearest neighbor diffusion (NND) and principal component (PC). Three supervised machine learning classifiers have been used: maximum likelihood (ML), support vector machine (SVM) and a convolutional neural network (CNN). Two questions have been addressed: (1) What is the best UAV pansharpening method? (2) What is the added value of the pansharpened RE and NIR data for traditional RGB habitat mapping?

Study Site
The study site, named Guimorais' tombolo, links Besnard's rocky island to Meinga's cliff (48°41'34.17''N, 1°56'49.32''W; Figure 1). It offers a wide diversity of landscapes (beach, dune, saltmarsh) shaped by the meteo-marine forcings. The dominant marine currents have contributed to the creation of the tombolo by carrying sediments which, over the years, have become fixed between the two rocky tips (Mahmoud, 2015). The saltmarsh is located at the bottom of Rothéneuf's harbor, where the protection offered by the tombolo's dune to the north allows it to develop. Among the main plants, Halimione portulacoides grows on the upper saltmarsh, whereas Suaeda maritima grows on the lower part. The dune is colonized by endemic plant species of the temperate region: Ammophila arenaria (yellow dune) and mosses (grey dune).

Ground-truth Acquisition
Ground-truth data collection took place on October 16, 2020. A set of 13 targets and 30 photoquadrats was placed according to a predefined grid and accurately geolocated in the French national RGF93 datum, Lambert 93 projection, with a DGNSS Topcon hiper V receiver (yellow and red points respectively in Figure 1). The geographic coordinates (XYZ) of each target and photoquadrat were post-processed with the open-source software RTKlib (Takasu and Yasuda, 2009). Photoquadrats (0.5 × 0.5 m) were photographed with an Olympus TG4 camera (4 608 × 3 456 pixels). Seven ground-truth classes representative of the study site have been extracted from the photoquadrats: upper and lower saltmarsh vegetation, grey and yellow dune plant, road, dry and wet sand (Table 1).

Wet sand: wet sand of grain size 0.06-2 mm
Table 1. Description of the seven ground-truth classes.

Unmanned Aerial Vehicle Survey
The planned aerial survey was run between 12:30 and 1:00 pm (UTC+1) on October 16, 2020 with a DJI Phantom 4 Pro V2 (P4V2), equipped with an embedded 20-megapixel RGB camera (4 864 × 3 648 pixels) and enhanced with a 1.2-megapixel two-channel (RE and NIR in this study) multispectral Parrot Sequoia+ (1 280 × 960 pixels for each camera). The two Sequoia+ channels are centered on the RE (735 nm) and NIR (790 nm) wavelengths. The acquisition flight lasted 30 minutes. The UAV flew over the study area at 50 m above local terrain. Images were acquired with 80% front overlap and 70% side overlap.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2021 XXIV ISPRS Congress (2021 edition)

Photogrammetry Reconstruction
The photogrammetric reconstruction was computed with Pix4Dmapper® using the structure from motion photogrammetry process. Three very high resolution (VHR) geolocated orthomosaics have been created: one from the 530 P4V2 RGB images (0.01 m pixel size) and two from the 1 542 Parrot Sequoia+ RE and NIR MS images (0.06 m pixel size).

Pansharpening Algorithms
Three pansharpening algorithms have been evaluated with ENVI® software to increase the spatial resolution of the low-resolution multispectral Parrot Sequoia+ imagery (Figure 2) using a panchromatic image computed from the P4V2 imagery: nearest neighbor diffusion (NND) pansharpening, principal component (PC) spectral sharpening, and Gram-Schmidt (GS) pansharpening. From the geolocated RGB photogrammetric reconstruction, a panchromatic (PAN) image was generated for the needs of pansharpening, which requires a single VHR band as input (Figure 2). A formula (1) based on the three RGB channels has been applied, where fix converts floating-point pixel values into integer pixel values, C1 = red channel, C2 = green channel, and C3 = blue channel.
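The generation of a single PAN band from the three RGB channels can be illustrated with a minimal sketch. The channel weighting used in the study is not reproduced in this text, so the example assumes an unweighted mean of C1, C2 and C3, and assumes that fix truncates toward zero; the function name rgb_to_pan is hypothetical.

```python
import numpy as np

def rgb_to_pan(rgb):
    """Collapse a (rows, cols, 3) RGB array into one integer PAN band.

    Assumption: equal weights for C1 (red), C2 (green) and C3 (blue);
    the weighting actually used in the study is not stated here.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    mean = rgb.mean(axis=2)
    # np.trunc stands in for a float-to-integer "fix" (truncation toward zero).
    return np.trunc(mean).astype(np.int16)

# Tiny 1 x 2 "orthomosaic" with 8-bit digital numbers.
demo = np.array([[[10, 20, 31], [255, 255, 254]]], dtype=np.uint8)
pan = rgb_to_pan(demo)
```

Any other weighting (for example a luminance-style weighted sum) would only change the line that computes the mean.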

Figure 2. Work chart of the UAV imagery multispectral orthomosaic pansharpening.

Gram-Schmidt Pansharpening:
GS pansharpening assumes that the panchromatic band has a higher resolution than the bands to be fused (Laben et al., 2000). This method retains both the VH spatial resolution offered by the panchromatic band and the spectral information of the MS dataset. The GS algorithm starts by simulating a PAN image at the MS resolution. A GS transformation is then applied to the simulated PAN image and the MS dataset. The simulated PAN image at 0.06 m is replaced by the PAN image at 0.01 m. Finally, a reverse GS transformation is performed to obtain an MS dataset at VH spatial resolution.
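The substitution logic of these steps can be sketched in a compact component-substitution form, in which each MS band receives the PAN detail scaled by its GS projection coefficient (covariance with the simulated PAN divided by the variance of the simulated PAN). This is a simplified illustration, not the ENVI® implementation: it assumes that the MS bands have already been resampled to the PAN grid and that the simulated PAN is a plain average of the MS bands. The function name gs_like_sharpen is hypothetical.

```python
import numpy as np

def gs_like_sharpen(ms_up, pan):
    """Simplified Gram-Schmidt-style sharpening.

    ms_up: (bands, rows, cols) MS bands already resampled to the PAN grid.
    pan:   (rows, cols) high-resolution PAN band.
    """
    ms_up = np.asarray(ms_up, dtype=np.float64)
    pan = np.asarray(pan, dtype=np.float64)
    # Simulate a low-detail PAN band (assumption: plain band average).
    sim_pan = ms_up.mean(axis=0)
    # Match the real PAN to the simulated one (same mean and standard deviation).
    pan_adj = (pan - pan.mean()) * (sim_pan.std() / pan.std()) + sim_pan.mean()
    # Spatial detail that the substitution injects into every band.
    detail = pan_adj - sim_pan
    var = sim_pan.var()
    out = np.empty_like(ms_up)
    for b, band in enumerate(ms_up):
        # GS projection coefficient of the band onto the simulated PAN.
        gain = np.mean((band - band.mean()) * (sim_pan - sim_pan.mean())) / var
        out[b] = band + gain * detail
    return out

rng = np.random.default_rng(0)
ms_demo = rng.uniform(0, 1, size=(2, 6, 6))   # stand-ins for RE and NIR
pan_demo = rng.uniform(0, 255, size=(6, 6))   # stand-in for the RGB-derived PAN
sharp = gs_like_sharpen(ms_demo, pan_demo)
```

Because the injected detail has zero mean, the band means are preserved, which is one simple way to check that the spectral content has not been shifted.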

Nearest Neighbor Diffusion Pansharpening:
The NND pansharpening algorithm is based on the statistical contribution of nearby MS pixels in order to preserve spectral integrity while accumulating spatial quality based on PAN VHR pixels (Sun et al., 2014).

Pansharpening Assessment:
Root mean square error (RMSE) and correlation coefficient were calculated to quantify the pansharpening methods' accuracy with 1 000 sample points (Sarp, 2014).

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(P_i - O_i\right)^2} \qquad (2)$$
where P = Sequoia pixel value at low resolution, O = Sequoia pixel value at high resolution, n = number of observations.
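As an illustration of this assessment, the two metrics can be computed from paired sample values with the standard definitions of RMSE and of the Pearson correlation coefficient. The sample values below are invented for the example and are not taken from the study.

```python
import math

def rmse(p, o):
    """Root mean square error between paired samples p and o."""
    n = len(p)
    return math.sqrt(sum((pi - oi) ** 2 for pi, oi in zip(p, o)) / n)

def pearson_r(p, o):
    """Pearson correlation coefficient between paired samples p and o."""
    n = len(p)
    mp, mo = sum(p) / n, sum(o) / n
    cov = sum((pi - mp) * (oi - mo) for pi, oi in zip(p, o))
    sp = math.sqrt(sum((pi - mp) ** 2 for pi in p))
    so = math.sqrt(sum((oi - mo) ** 2 for oi in o))
    return cov / (sp * so)

# Hypothetical reflectance values at four sample points.
low_res = [0.10, 0.20, 0.30, 0.40]    # original Sequoia band
high_res = [0.12, 0.18, 0.33, 0.37]   # pansharpened band
error = rmse(low_res, high_res)
corr = pearson_r(low_res, high_res)
```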

Classification Algorithms
Three pixel-based machine learning classifiers have been assessed on the best pansharpened spectral dataset using ENVI® software: maximum likelihood (ML), support vector machine (SVM), and convolutional neural network (CNN). For each class, 1 000 training pixels and 1 000 validation pixels were randomly extracted. Four combinations of spectral predictors have been tested: RGB, RGB+RE, RGB+NIR, RGB+RE+NIR. The overall accuracy was computed from a confusion matrix to estimate the classification accuracy.
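The accuracy computation can be sketched as follows, using the usual convention that rows of the confusion matrix hold the reference (validation) classes and columns the predicted classes. The overall accuracy is the share of correctly labelled pixels, and the producer accuracy is the share of reference pixels of each class that are correctly labelled. The labels below are invented for the example.

```python
def confusion_matrix(truth, predicted, n_classes):
    """Rows = reference classes, columns = predicted classes."""
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(truth, predicted):
        m[t][p] += 1
    return m

def overall_accuracy(m):
    """Fraction of all validation pixels on the matrix diagonal."""
    total = sum(sum(row) for row in m)
    correct = sum(m[i][i] for i in range(len(m)))
    return correct / total

def producer_accuracy(m):
    """Per-class fraction of reference pixels that were correctly labelled."""
    return [m[i][i] / sum(m[i]) for i in range(len(m))]

# Hypothetical validation labels for three classes.
truth = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(truth, predicted, 3)
oa = overall_accuracy(cm)
pa = producer_accuracy(cm)
```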

Maximum Likelihood Classifier:
ML is a very common classifier in remote sensing because it is inexpensive in terms of processing time. This algorithm assumes that the values of each class are normally distributed and computes the probability that a given pixel belongs to each predefined class. Each pixel is then assigned to the class for which this probability is highest (Table 2).

Function                 Parameter
Probability threshold    Single value
Data scale factor        1.00
Table 2. Description of the parameters with the Maximum Likelihood (ML) classifier.
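The decision rule of a Gaussian maximum likelihood classifier can be sketched as follows. This is a generic illustration with equal class priors and no probability threshold, not the ENVI® implementation; the function names and the two-band training values are hypothetical.

```python
import numpy as np

def fit_gaussian_ml(samples, labels):
    """Per-class mean vector and covariance matrix from training pixels."""
    stats = {}
    for c in np.unique(labels):
        x = samples[labels == c]
        stats[int(c)] = (x.mean(axis=0), np.cov(x, rowvar=False))
    return stats

def predict_gaussian_ml(stats, pixels):
    """Assign each pixel to the class with the highest Gaussian log-likelihood."""
    classes = sorted(stats)
    scores = []
    for c in classes:
        mean, cov = stats[c]
        diff = pixels - mean
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        # Log-likelihood up to a constant: -0.5 * (ln|cov| + squared Mahalanobis distance)
        maha = np.einsum("ij,jk,ik->i", diff, inv, diff)
        scores.append(-0.5 * (logdet + maha))
    return np.array(classes)[np.argmax(scores, axis=0)]

# Two hypothetical, well-separated classes described by two bands.
rng = np.random.default_rng(1)
a = rng.normal([0.2, 0.3], 0.02, size=(50, 2))
b = rng.normal([0.6, 0.7], 0.02, size=(50, 2))
train = np.vstack([a, b])
train_labels = np.array([0] * 50 + [1] * 50)
model = fit_gaussian_ml(train, train_labels)
pred = predict_gaussian_ml(model, np.array([[0.2, 0.3], [0.6, 0.7]]))
```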

Support Vector Machine:
SVM is a powerful nonprobabilistic classifier that depends on two parameters: the kernel and the maximum margin of the pixels sample (Table 3).

Function                                Parameter
Kernel type                             Radial basis function
Gamma kernel                            0.20
Penalty parameter                       100.0
Pyramid levels                          0
Classification probability threshold    0
Table 3. Description of the applied parameters with the Support Vector Machine (SVM) classifier.
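For reference, the radial basis function kernel is commonly written as below, where x_i and x_j are the spectral vectors of two pixels and γ is the gamma parameter of Table 3. The worked value assumes a squared distance of 1 between the two pixels, chosen only for illustration.

$$K(x_i, x_j) = \exp\left(-\gamma \, \lVert x_i - x_j \rVert^2\right), \qquad \gamma > 0$$

$$\gamma = 0.20, \ \lVert x_i - x_j \rVert^2 = 1 \ \Rightarrow \ K = e^{-0.2} \approx 0.82$$

A larger gamma makes the kernel decay faster with spectral distance, while the penalty parameter (100.0 here) controls how strongly misclassified training pixels are penalized.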

Convolutional Neural Network:
CNNs are part of the growing family of artificial neural networks used in remote sensing. The ENVI® software CNN algorithm is based on the "U-net" architecture (Ronneberger et al., 2015). A series of linear combinations is applied to the input raster. The succession of convolutions and deconvolutions allows the model to extract features from the data and to label the pixels accordingly. The final dense layer, created at the output of the process, produces a class activation map, which is then used to compute the final classification by assigning to each pixel the feature class with the highest value above a given threshold (Figure 3). The model was trained for 25 epochs for each spectral predictor combination (RGB, RGB+RE, RGB+NIR and RGB+RE+NIR).
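The final labelling step, in which each pixel takes the class with the highest activation provided that it exceeds a threshold, can be sketched as follows. The array layout, the threshold value and the nodata code of -1 are assumptions made for the example; they do not describe ENVI® internals.

```python
import numpy as np

def label_from_activation(activation, threshold, nodata=-1):
    """Turn a (classes, rows, cols) class activation map into a label raster."""
    activation = np.asarray(activation, dtype=np.float64)
    best = activation.argmax(axis=0)        # index of the strongest class per pixel
    best_value = activation.max(axis=0)     # its activation value
    # Pixels whose strongest activation is not above the threshold stay unlabelled.
    return np.where(best_value > threshold, best, nodata)

# Two classes over a 1 x 2 pixel raster (hypothetical activations).
act = np.array([[[0.9, 0.2]], [[0.1, 0.3]]])
labels = label_from_activation(act, threshold=0.5)
```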

RESULTS AND DISCUSSION
The best pansharpening algorithm among GS, PC and NND was first identified and then used to perform the classifications. The best classifier was then determined by comparing the overall accuracy (OA) of the ML, SVM and CNN classifiers.

Best Pansharpening Method
The pansharpening results showed that GS outperformed the other methods (NND and PC) for this dataset (Figure 4). They demonstrate that GS pansharpening significantly increases the spatial resolution of both bands without degrading their spectral characteristics (Figure 5). Data fusion adds information from the IR spectrum and provides a complete spatial and spectral VHR dataset for the detection and characterization of geomorphological and ecological objects in coastal management (Ibarrola-Ulzurrun et al., 2017).


Scene-scale Classification Accuracy:
ML is the first classifier tested on the GS-pansharpened dataset (Table 4 and Figure 7). Relative to the RGB overall accuracy (OA), classification performance increased by 4.33% with the NIR channel, by 5.49% with the RE, and by 6.2% with their combination. For the SVM classification, the contribution of the spectral bands has also been evaluated (Table 4 and Figure 8). Similarly, relative to the RGB basis, classification accuracy increases with the addition of spectral information: the RE and NIR channels increase the accuracy by 5.13% and 2.3% respectively, and the full combination by 2.64%. The CNN classification results, calculated with the confusion matrix, do not reveal a significant positive contribution from the predictors (Table 4 and Figure 9). The OA of the RGB is 45.95%. Adding the RE and NIR predictors to the already low RGB baseline decreases the OA by 18.79% and 27.31% respectively.
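The OA values implied by these gains can be back-calculated with simple arithmetic. The sketch below assumes that the reported gains are absolute percentage-point differences from the RGB baseline; under this reading, the ML values reproduce the 80.04% OA reported for RE from the 80.75% OA of the full combination. The derived values are an illustration and are not copied from Table 4.

```python
# Reported anchors (percent OA) and gains relative to RGB (percentage points).
ml_full = 80.75          # ML, RGB+RE+NIR
ml_gain = {"RGB+NIR": 4.33, "RGB+RE": 5.49, "RGB+RE+NIR": 6.20}
svm_re = 78.34           # SVM, RGB+RE
svm_gain = {"RGB+RE": 5.13, "RGB+NIR": 2.30, "RGB+RE+NIR": 2.64}
cnn_rgb = 45.95          # CNN, RGB
cnn_gain = {"RGB+RE": -18.79, "RGB+NIR": -27.31}

def table(rgb_oa, gains):
    """OA per predictor combination, from the RGB baseline and the gains."""
    out = {"RGB": round(rgb_oa, 2)}
    out.update({k: round(rgb_oa + g, 2) for k, g in gains.items()})
    return out

# The RGB baselines are recovered by subtracting a known gain from a known OA.
ml = table(ml_full - ml_gain["RGB+RE+NIR"], ml_gain)
svm = table(svm_re - svm_gain["RGB+RE"], svm_gain)
cnn = table(cnn_rgb, cnn_gain)
```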

With the CNN, the complete combination of spectral bands does not provide additional information to discriminate the habitats of the study site, since the OA is 18.51%. The redundancy of information in the IR spectrum confuses the classification results of the SVM classifier when the 4 spectral predictors are tested at the same time (Belluco et al., 2006). At the scale of the spectral band, the RE provides information at a wavelength of about 735 nm. Its contribution is positive for both the ML and SVM classifiers. The results of the classification with the CNN are unsatisfactory and unusable. The architecture of ENVI's CNN "U-net" algorithm was designed to detect objects within images through segmentation rather than to perform continuous classifications. It appears that deep learning classification performs best when applied to objects rather than pixels. These CNN results can also be explained by the lack of training data, which are essential to achieve good results. The number of training pixels chosen was sufficient for the ML and SVM classifiers but insufficient for the CNN (Längkvist et al., 2016).

Habitat Scale Classification Accuracy:
The contributions of the RE and NIR channels were examined at habitat scale using the three classifiers (ML, SVM and CNN).
The classification performance in the IR can be explained by the absorptance and reflectance of the chlorophyll pigments in plant leaves. Species present in the high saltmarsh, such as Halimione portulacoides, which is characterized by abundant leaves, play a large role in this classification (Carter et al., 2001). The discrimination of plant species between the high and low saltmarsh has a marked seasonal character. The UAV survey carried out in autumn 2020 marks the beginning of plant deflowering (the end of the flowering period). Low saltmarsh vegetation is less well extracted whatever the classifier. The contributions of the RE and NIR channels to the lower saltmarsh vegetation, grey dune plant and yellow dune plant classes are less significant than for the previously discussed classes. For these three classes respectively, the changes in producer accuracy (PA) with RE and NIR are -0.85% and -1.05%, +5.65% and +6.25%, and -1.48% and -3.48% for the ML classification, and +0.20% and +4.40%, +2.10% and +4.40%, and +1.60% and +1.40% for the SVM classification. Mixtures of siliceous and calcareous sediments with vegetation negatively impact the contribution of the RE and NIR bands.
Considering that the CNN classification at the OA scale is unsatisfactory, the PA results in Figure 12
Figure 12. Barplot of the producer accuracy (PA) contributions of the infrared predictors relative to the RGB basis at class level (upper and lower saltmarsh, grey and yellow dune plant, road, dry and wet sand), computed for each class with the convolutional neural network (CNN) classifier.

CONCLUSION
This new study shows that UAV-mounted cameras with IR sensors are essential for characterizing coastal ecosystems at VH temporal and spatial resolution in the context of the rapid evolution imposed by climate change. Following photogrammetric reconstructions, pansharpening algorithms (GS, NND, PC spectral sharpening) enabled the low-spatial-resolution RE and NIR bands of the Sequoia+ orthomosaics (1 280 × 960 pixels for each camera) to be sharpened using the VHR RGB bands of the camera mounted on the Phantom 4 Pro V2 UAV (4 864 × 3 648 pixels).
The pansharpening results showed that GS pansharpening performed best in this study with correlation coefficients of 0.61 and 0.63 for the RE and NIR bands respectively outperforming the other 2 methods. Three classifications were applied on the pansharpened dataset corresponding to RGB+RE+NIR at spectral level and 0.01m spatial resolution. ML, SVM and CNN classifiers were tested on the dataset from 7 classes identified by the ground-truth acquisition: upper and lower saltmarsh vegetation, grey and yellow dune plant, road, dry and wet sand.
The ML classifier stands out from the others with an OA of 80.75% for the complete combination of predictors. At band scale, the RE is the most informative single added band for both the ML and SVM classifiers. The CNN did not provide good classification results due to the lack of training data. In future studies, a segmentation could be applied before the classification for better discrimination of coastal ecosystems.