EELGRASS MAPPING IN ATLANTIC CANADA USING WORLDVIEW-2 IMAGERY

Eelgrass (Zostera marina L.) is a marine angiosperm plant that grows throughout coastal areas in Atlantic Canada. Eelgrass meadows provide numerous ecosystem services, and while they have been acknowledged as important habitats, their location, extent, and health in Atlantic Canada are poorly understood. This study examined the effectiveness of WorldView-2 optical satellite imagery to map eelgrass presence in Tabusintac Bay, New Brunswick (Canada), an estuarine lagoon with extensive eelgrass coverage. The imagery was classified using two supervised classifiers: the parametric Maximum Likelihood Classifier (MLC) and the non-parametric Random Forests (RF) classifier. While Random Forests was expected to produce higher classification accuracies, it was shown not to be much better than MLC. The overall validation accuracy was 97.6% with RF and 99.8% with MLC.


INTRODUCTION
Seagrasses are angiosperm plants that grow in brackish and saltwater systems found in coastal areas across the globe. There are approximately 60 species of seagrass worldwide. Atlantic Canada is within the Temperate North Atlantic bioregion, which has only five species: Ruppia maritima L., Zostera marina L., Zostera noltii Horneman, Cymodocea nodosa Asch, Halodule wright Asch, with Z marina, being the most common (Short et al., 2007a(Short et al., , 2007b. Eelgrass beds are the ecological base of many nearshore marine ecosystems (Heck, Orth, 2006) and are highly sensitive to environmental fluctuations, making them possible indicators of nearshore ecosystem health and the effects of climate change and other anthropogenic influences on marine and estuarine ecosystems (Thom et al., 2013). In facing numerous human-driven influences, eelgrass ecosystems have been declining globally for at least the last 100 years (Waycott et al., 2009). Throughout Atlantic Canada, declines of 30% to 95% have been reported for several bays in the recent decade (DFO, 2009).
To properly monitor eelgrasses and to study the impacts of anthropogenic disturbances on their distribution, it is important to have a reliable method of accurately mapping the extent of eelgrass beds (Hogrefe et al., 2014). Acoustic methods have been used to create high-quality sonographs on a scale relevant to eelgrass mapping (Kenny et al., 2003), but acoustics data are acquired with very expensive equipment that cannot be used under adverse weather conditions or require specialist knowledge to be processed. Also, acoustic surveys are made of transects that may provide insufficient coverage on temporal and spatial scales and require interpolation methods. The same applies to the bathymetric lidar data (Webster et al., 2015, Collins et al., 2016. Optical imagery can cover the entirety of a study area and has been extensively used to map benthic habitats (Orth, Moore, 1984). While aerial photographs have been used with success for more than 20 years (Mumby et al., 1997), the use of optical satellite images has been ongoing only since the launch of the * Corresponding author Landsat MSS satellite in 1972 (Lyons et al., 2012). Satellite imagery provides a larger level of coverage at a smaller cost when compared to aerial photographs , Hossain et al., 2015 and does not require interpolation like other methods. Most published studies that use optical satellite imagery have occurred only recently in Canada (O'Neill, Costa, 2013, Reshitnyk et al., 2014Stantec, 2014;2016;Barrell et al., 2015). Temperate water poses additional challenges for mapping eelgrass compared to tropical and sub-tropical waters because they tend to have lower clarity, which allows for lower resolution between features and low light penetration, which allows for mapping at shallower depths. (O'Neill, Costa, 2013, Reshitnyk et al., 2014. In most of the aforementionned studies, the classifier that was employed is the standard supervised Maximum likelihood classifier (MLC). Such a classifier has the inconvenience of requiring a normal data distributionbut not with Random Forests (RF), which is a non-parametric supervised classifier (Breiman, 2001). RF was showed to outperform MLC in several land cover studies (Pal, 2005, Gislason et al., 2006, Waske, Braun, 2009, LaRocque et al., 2014. The goal of this study is to compare RF and MLC to map eelgrass bed distribution in Tabusintac Bay, New Brunswick (Canada), using one WorldView-2 (WV-2) image acquired at low tide. One of the challenges of mapping eelgrass beds in Tabusintac Bay is that it is an estuarine lagoon having a temperate climate where light penetration is generally low. To assess the potential change in the eelgrass distribution in Tabusintac Bay over time, the resulting map will be compared to a previous map produced by the interpretation of air photographs (Mahoney, Hanson, 2008).

Study area
Tabusintac Bay (Figure 1) is a 25 km 2 shallow estuary and lagoon system located on the northeastern coast of New Brunswick (N 47° 20' 21", W 64° 55' 42"). The adjacent region hosts a diverse assemblage of habitats, including estuarine flats, peat bogs, saline ponds, and wetlands, while the lagoon area itself is dominated by extensive eelgrass (Zostera marina) beds that support many species of commercial and recreational importance. The wave energy within the estuary is generally low, thanks to an extensive and dynamic bar of sand dunes and islands which shelters the area from wave action from the Gulf of St. Lawrence. The bottom consists of fine-grained sediments like mud and sand, making it an ideal substrate for eelgrass growth. The estuary is generally shallow at 0 to 2 m deep, though a 2 to 5 m deep channel cuts through the lower portion of the region, which allows boat access to the Northumberland Strait (Webster et al., 2015, Stantec, 2014, Stantec, 2016. The site has been identified under the Ramsar Convention on wetlands as an international area of importance due to high levels of waterfowl visitation and uses during spring and fall migration periods, with some of these species depending directly on eelgrass for food (Ramsar, 2001). The dune system is also the home to one of the largest common tern colonies in Atlantic Canada and has been the site of piping plover (Charadrius melodus) nests (IBA Canada, 2013).  (Stantec (2014) and used in this study.
Field data were collected during a two-day field survey (Stantec, 2014) between September 24 th and 26 th , 2014, coinciding with the acquisition of the image (Figure 1). The study area was divided into an equally sized grid of 1 km wide-tessellated hexagons, and a single area of interest was randomly selected within each hexagon. This division allowed for the whole study area to be sampled while maintaining randomness ( Figure 1). A total of 55 sites were surveyed. At each site, GPS coordinates were recorded using a WAAS-enabled chart plotting unit (Garmin GPSmap® 531s, Garmin International Inc., Olathe, Kansas, USA), and pictures were taken using a downward-facing underwater Deep Blue Pro© drop video camera (Deep Blue Pro Splash Cam, Ocean Systems Inc., Everett, WA, USA) attached to a 0.25 m 2 quadrat frame. Additional samples were collected by Stantec (2014) in a specific area of interest, located in the southwestern portion of the study area.

Pre-Classification Image Processing
All the image processing was performed in PCI Geomatica® software (PCI Geomatics, 2018), except the atmospheric correction, which was done in ENVI® 5.1 software (Exelis Visual Information Solutions, 2013). The digital numbers (DNs) of the input image were converted into Top of the Atmosphere (TOA) reflectance. First, DNs were converted to radiance by applying Equation 1 that uses band-specific calibration gain and offset parameters, which were extracted from the image metadata (Digital Globe, 2010): The TOA radiance was then converted into TOA reflectance, which represents the radiation that has travelled from the sun through the atmosphere, reflected off the surface of the earth and interacts with the sensor. This conversion uses Equation 2 and compensates for the relative position of the sun and the earth, at the zenith: where ρλ = TOA reflectance (dimensionless) Eλ = Solar spectral irradiance (W. μm -1 ) Lλ = TOA spectral radiance (W. m -2 . srad -1 . μm -1 ) dES = distance earth-sun (m) z = solar zenith angle (rad) TOA reflectance has an atmospheric component that is caused by the effect of light reflection and scattering from atmospheric particles and by the atmosphere itself and needs to be removed.  Zhang (2002) to downscale the multispectral bands to 0.5 m spatial resolution by combining the panchromatic band and the multispectral bands. High spatial resolution is key for capturing the spatial variability and patchiness of eelgrass meadows due to the potentially complex distribution of eelgrass beds in the lagoon. A land mask was created to limit the image classification to underwater features The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2020, 2020 XXIV ISPRS Congress (2020 edition) using the Normalized Difference Water Index (NDWI), which was calculated, as in McFeeters (1996).
Given that the aim of this study is to map eelgrass beds which are chlorophyllous elements in the scene, it is advantageous to consider vegetation indices in order to exploit the higher reflectance in the green and near-infrared bands of chlorophyllous elements compared to non-chlorophyllous elements. While near-infrared radiation is expected to be readily absorbed by water, there are portions of the study area that include exposed eelgrass at low tide which can reflect this radiation. The vegetation indices considered in this study are listed in Table 1.
(1) R = red band; G = green band; NIR = near-infrared band  (2014); (2) Has been used for the "eelgrass absent" class Table 2. Number of training areas with the corresponding area, the pixel count and the number of GPS validation sites for each class, used in this study When training areas were delineated, care was made to reduce spectral variance in each class while at the same, having enough pixels to represent each class effectively. Training areas were used to compute the class spectral signatures. These spectral signatures were then used to assess the separability between classes through the Jeffries-Matusita (J-M) distance of each class pair (Richards, Jia, 2006).

Classification
The first classifier that was tested is the commonly used Maximum Likelihood Classifier (MLC). This classifier was performed using the MLC algorithm of PCI Geomatica 2018®. MLC is a parametric classifier assuming a Gaussian distribution of grey level values for each class and the same probability of occurrence for each class in the image. It classifies each pixel x in class i by maximizing the following discriminant function (Strahler, 1980) where gi(x) = discriminant function for class i and pixel x p(i) = a priori probability for class I X = grey level value of pixel x in each input image Mi = mean vector for class i ∑i = covariance matrix for any class  = determinant of the covariance matrix   − = inverse of the covariance matrix I (X-Mi) t = transposed matrix of (X-Mi) k = number of input images used in the classification The second classifier is Random Forests (RF) is a non-parametric classification algorithm (Breiman, 2001). Originally designed for use by the machine learning community, this algorithm is becoming increasingly popular for remote sensing applications showing numerous advantages when compared to other classical classification methods, such as MLC (Pal, 2005, Gislason et al., 2006, Waske, Braun, 2009, LaRocque et al., 2014. RF classifier generates a series of decision trees, which are predictive models that use a set of binary rules to calculate a target value. The complexity of the decision tree is directly related to the number of sources of data, in our case, the number of image layers being used. Each tree uses a randomized subset of the input data, and each of the categories produced by each tree is slightly different. In the RF algorithm, hundreds of trees are produced, and the pixels are classed based on the agreement between trees. Also, RF is not sensitive to noise or over-classifying and gives an estimate of the importance of each input image for the classification (Gislason et al., 2006, Waske, Braun, 2009). The specific algorithm used for this study was developed in the R programming language (R Development Core Team, 2016). It is based on the "Random Forest" code written by Horning (2010) and adapted by the third author. The RF code we used has two versions: all-polygon and sub-polygon. The all-polygon version uses 100% of the training areas to define class training areas, while the sub-polygon version randomly selects a user-defined number of training area pixels from each class. Following Byatt et al. (2018), we used the all-polygon version because it gives a higher mapping accuracy using all the training areas in the classification method. Another advantage of using RF is that the algorithm outputs a variable importance plot. This plot displays the weighted mean decrease in error that indicates how much an individual variable input is used by the RF classifier to make its prediction.

Classification and Validation Accuracy Assessment
Classification accuracy was assessed first by comparing Areas of Interest (AOI)s with the equivalent class in the imagery. This comparison was performed under the form of a "confusion matrix" or "error matrix", where each cell expresses the number of pixels classified inside the class defined by the training areas (Congalton, 1991). The confusion matrix allows computing individual class User's and Producer's accuracies and their related errors (omission and commission). The User's class accuracy corresponds to the probability that a pixel of the classified image is in the correct class, the associated number of misclassified pixels being pixels classified in the incorrect class (error of omission). The Producer's accuracy measures the probability that a reference pixel is effectively well classified, the associated number of misclassified pixels being pixels that belong to another class (error of commission). The overall accuracy is the average of individual class User's or Producer's accuracies, weighted by the size of the class in the classified or reference image. The kappa coefficient is defined as a weighted measure of agreement between the numbers of well-classified pixels, where a value close to 0% corresponds to a classification that is no better than what could be expected by chance and a value closer to 100% indicates a good classification accuracy (Cohen, 1960).  Table 3. J-M distances computed for the eight original bands of the WV-2 image. Figure 2 shows the mean TOA reflectance for each class as a function of the WV-2 band, as extracted from the images using the corresponding training areas. All the classes followed a general trend that matches how we would expect the light to be absorbed in water, with higher absorption (weaker reflectance) in the longer wavelengths, and gradually less absorption (stronger reflectance) in the shorter wavelengths. The Deep water class had the lowest reflectance in the red-edge, NIR-1, and NIR-2 bands, compared to the other two classes. This is likely because the Sand Floor and Eelgrass classes were delineated over shallower areas where light penetration was higher. The Sand Floor class followed a similar decreasing reflectance pattern from the coastal to the NIR bands, though the reflectance was generally higher. From the coastal to yellow wavelengths, the Eelgrass class distinctly had the lowest reflectance among all the classes, including the Deep water class. Among all the vegetation indices considered in the study, only DVI, RVI and GDVI all showed good separability between classes (Figure 3). DVI and RVI were both calculated using the NIR-1 and Red bands, while GDVI used both the NIR and Green bands in its calculation. The NIR-1 band had a good spectral separability between the Deep water class and the other classes, while the case of the red band, the Sand Floor class was most separated from the others. (Figure 3). In the case of the green band, there was relatively little separability between the classes.

Classification
When only the original eight bands were used, the overall classification accuracy was 98.55% with the MLC classifier and 99.86% with the RF classifier. The corresponding kappa coefficients were 97.65% and 99.80%, respectively. The individual class Producer's and User's accuracies for both classifiers were also very high and are never below 95%, as shown in the confusion matrices (Tables 4 and 5). The Sand floor class has the lowest User's and Producer's accuracies, and the Deep water class had the highest accuracies, the Eelgrass class having intermediate accuracies, though the differences were minor (Tables 4 and 5). The comparison between the detailed confusion matrices of both tables shows that there was a higher number of well-classified pixels and a lower number of misclassified pixels for the RF classifier (Table 5) than for the MLC classifier (Table 4). However, we only show here the MLCclassified image produced because of the minimal difference between both images (Figure 4).  Table 5. Confusion matrix (in terms of number of pixels) and associated accuracies when the RF classifier is applied to the 8-band images The RF algorithm has the advantage of giving a variable importance plot that ranks the input features as a function of their The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII- B3-2020, 2020XXIV ISPRS Congress (2020 importance in the classification ( Figure 5). The most important bands for the classification were the green (510 -580 nm), red (630 -690 nm) and coastal (400 -450 nm) bands, while the least important bands were the near-infrared 1 (770 -895 nm), blue (450 -510 nm), and yellow (585 -625 nm) bands (Figure 5a). The green band allowed some spectral separability between the Deep Water and Eelgrass classes ( Figure 3) and was the band that most distinguishes the two. The green and coastal bands have the lowest rates of attenuation in shallow water, especially when compared to the longer wavelength bands.  With the addition of vegetation indices to the image combination (Table 6), the overall classification accuracy and the kappa coefficient with MLC were very low (42.65% and 0%, respectively). According to the confusion matrix of Table 6, the misclassification in the case of the MLC because all the pixels were classified in the Eelgrass class. This misclassification demonstrated how MLC could not be effectively used with a variety of different data types. By contrast, the RF classifier produced higher classification overall accuracy (99.83%) and the kappa coefficient (99.70%) than MLC, but they were not higher than when RF used only the eight original bands (Table 5).
The variable importance plot (Figure5b) showed that, among all the vegetation indices, the most important ones were the Normalized Red (NR) and the Green Difference Vegetation (GDVI). This was likely because both used the green band in their computation, although another green-based vegetation index (NR) did not exhibit a large separability between classes. The other vegetation indices did not seem to be very important in the RF image classification (Figure 5b). 42,65 0 0 Table 6. Confusion matrix (in terms of number of pixels) and associated accuracies when the MLC is applied to the 8-band images and the related vegetation indices

Field Validation
This validation and the subsequent analysis are done on the MLC-classified image based on the eight bands given the excellent classification accuracies ( Table 7). The validation was done by comparing it to the GPS validation sites. We achieved excellent values for the overall accuracy (96.36%) and the kappa coefficient (91.00%). The related confusion matrix shows that only a few sites were poorly classified (

Comparison with the eelgrass bed extent of 2008
The eelgrass bed distribution map obtained by classifying the 8 original bands of the WV-2 image was compared to the one of Mahoney and Hanson (2008) established from aerial photographs.  Figure 6 also shows a difference for the sandbar area that could be due to changes in the shape and extent of the sandbar over time, as described by CBCL Ltd (2014).

DISCUSSIONS
Our results showed that WV-2 images acquired at low tide are suitable for mapping eelgrass beds in shallow bays such as in Tabusintac. Indeed, the WV-2 sensor has a number, width, and overlap of spectral bands that are suitable for eelgrass bed and benthic mapping. The RF outputs shows that the green (506-586 nm) and coastal bands (396-458 nm) were amongst the most important input data in the classification ( Figure 5). With the MLC classifier applied only to the 8-band WV-2 image, our overall classification accuracy (98.55%) was higher than those obtained by Stantec (2016)  While RF was expected to produce higher classification accuracies, it was shown not to be better than MLC when only the eight bands of the WV-2 image were used. This result is contradictory to other studies comparing MLC and RF (i.e., LaRocque et al., 2014;Byatt et al., 2018). One of the major discrepancies with these previous studies is the diversity in the input data. In our analysis, we only utilized optical data that have a normal distribution, while LaRocque et al. (2014) and Byatt et al. (2018) used a more complex dataset made of optical imagery, radar imagery, and topographic data, with some data having a non-normal distribution such as the radar data.
We expected that the inclusion of vegetation indices would improve the overall classification accuracy, as it is often the case in land studies, but there did not seem to be an advantage of using these indices, especially with the MLC classifier. With RF, among all the vegetation indices considered in the studies, the most important ones were two vegetation indices that use the green band in their computation, i.e., the Normalized Red (NR) and the Green Difference Vegetation (GDVI), although NR does not exhibit a large separability between classes by contrast to GDVI. The lack of improvement with the vegetation indices is likely because our study deals with water-related ecosystems, and these indices were developed for land studies to highlight the presence of green vegetation or to assess vegetation water content.
The 2014 WV-2-derived map was compared to a map produced from the interpretation of aerial photographs carried out by Mahoney and Hanson (2008). The comparison showed that in some parts of the bay, there was an increase of approximately 6 km 2 of eelgrass, but a decrease of approximately 4 km 2 in other parts of the bay, resulting in a net increase of 2 km 2 for the whole bay. This result agrees with Leblanc et al. (2019), who showed a positive trend in the total eelgrass area in Tabusintac by analysing a temporal series of Landsat images between 1984 and 2017. Tabusintac Bay seems to experience a different evolution in eelgrass beds compared to several bays in the Atlantic Canada, where inter-annual declines ranging from 30% to 95% have been reported for the last decade (DFO, 2009).

CONCLUSIONS
This study tested the use of two supervised classifiers applied to a WV-2 image for mapping eelgrass beds in Tabusintac to assess their respective effectiveness. The classification accuracy of eelgrass maps using WV-2 imagery was not improved by using the RF classifier when compared to MLC. Also, the addition of vegetation indices in the classification did not improve the classification accuracies. The classified image was then compared with the eelgrass distribution map that was established from air photo interpretation (Mahoney, Hanson, 2008). Such comparison shows a net increase of the eelgrass bed area in Tabusintac Bay, which agrees with the results of Leblanc et al. (2019), who mapped changes in eelgrass bed areas between 1984 and 2017 in Tabusintac Bay using a time series of Landsat images.
This study only tested two classifiers (MLC and RF) applied to imagery acquired by a commercial satellite WV-2, and there is the need for future research to examine and assess freely available satellite images such as Sentinel-2. Also, we did not correct the images for sun glint effects described in Zimmerman and Dekker (2006) and the water column effects. There is the need to test if algorithms such as the one of Hedley et al. (2005) can be used for removing the sun glint effect. For the water column correction, it would be advantageous to test the use of models such as the one of Lyzenga (1978Lyzenga ( , 1981 in the absence of bathymetric data or the one of Sagawa et al. (2010), which requires bathymetry data.