HOW EFFICIENT CAN SENTINEL-2 DATA HELP SPATIAL MAPPING OF MUCILAGE EVENT IN THE MARMARA SEA?

With the recurrence in 2021 of the mucilage event in the Marmara Sea, which is triggered by many different anthropogenic, climatic and microbiological factors, the importance of water quality in the seas has come to the fore again. To present the spatial distribution of the mucilage, a feasibility study has been carried out with point-based water quality measurements and remote sensing data. In-situ measurements are collected routinely within the scope of the Integrated Marine Pollution Monitoring Program (DEN-IZ), which is conducted in cooperation with the Ministry of Environment, Urbanisation and Climate Change and the Scientific and Technological Research Council of Turkey Marmara Research Center (TUBITAK-MAM). In this preliminary study, 16 in-situ measurements taken on 29 April 2021 in the Gulf of Gemlik, 5 of which were from water containing mucilage, were used. Then, univariate regression analyses were performed in two different scenarios (i.e. 5 mucilage points and all in-situ points) with Sentinel-2 satellite imagery and in-situ water quality measurements for 2 different parameters (i.e. chlorophyll-a (Chl-a) and turbidity). According to R² and accuracy assessment measures (f- and t-statistics, etc.), the most suitable models were determined for the two scenarios and two parameters. Finally, the performance of the selected models was tested with two other sets of in-situ measurements and satellite images (dated 22 and 27 April) taken on dates close to the data set used, and it was concluded that the models created with 16 points were successful for both Chl-a and turbidity estimation in this preliminary study.


INTRODUCTION
The Marmara Sea is an inland sea and forms a connection between the Black Sea and the Aegean Sea (part of the eastern Mediterranean), which are two large semi-enclosed basins, through the Bosporus and Dardanelles Straits. In the Sea of Marmara, the brackish Black Sea water (~18.0 psu) forms the upper layer and flows to the Mediterranean Sea, while the highly saline water (~38.5 psu) from the Mediterranean fills the basin and flows in the opposite direction (Besiktepe et al., 1994). These different salinity levels form a distinctive salinity system, which causes stratification and anoxic bottom water (Yilmaz et al., 2019). Because of its higher salinity, the Mediterranean water forms the bottom layer, and the Black Sea water, which is eutrophic in nature, forms the upper layer of the Marmara Sea (Unlulata et al., 1990; Balcioglu, 2019).
The Marmara Sea has a sensitive and eutrophic ecosystem; however, it is polluted by various sources such as rapid population growth and industrial activities (Pekey et al., 2004; Yilmaz et al., 2019). For monitoring the pollution and its effects on seas and coastal waters, the Ministry of Environment, Urbanisation and Climate Change has conducted an Integrated Marine Pollution Monitoring Program (DEN-IZ) in cooperation with the Scientific and Technological Research Council of Turkey - Marmara Research Center (TUBITAK-MAM) since 2014. The scope of this program is to carry out in-situ measurements and analyses of water quality parameters at the designated stations in all seas (i.e. Black Sea, Marmara Sea and the Straits, Mediterranean and Aegean Sea), and then to report the results and evaluations periodically (Url-1).
Mucilage, which results from planktonic and benthic algal blooms, is a well-known phenomenon and has been observed in different seas, particularly in the Adriatic Sea, since the 18th and 19th centuries (Pompei et al., 2003). The mucilage phenomenon was observed in the Marmara Sea for the first time in the 1990s. Although it occurs occasionally at non-periodic intervals, the most intense events took place in 2007 and 2021. The main potential causes of mucilage are listed as anthropogenic effects (domestic, agricultural, industrial wastes) (Benedetti-Cecchi et al., 2015), climatic effects (De Lazzari et al., 2008) and microbiological activities resulting from these effects (Flander-Putrle and Malej, 2008).
In this study, the mucilage event that dominated the public agenda in Turkey in 2021 was investigated using remote sensing data together with in-situ water quality measurements made within the framework of a joint study initiated with the DEN-IZ program. A feasibility study has also been initiated, in the most recent period of the DEN-IZ program, for the integration of remote sensing data into the program, which is carried out at periodic intervals and is based only on point-based measurements and analyses. The aim of this feasibility study is to map not only point values but also the spatial distribution of water quality parameters in the Sea of Marmara, since remote sensing can estimate the spatial extent of a parameter from a limited number of local measurements and thus better characterize the water body. Therefore, the main purpose of the study is to present preliminary findings on the spatial distribution of mucilage and of the water quality parameters measured within the scope of the DEN-IZ program in the Gulf of Gemlik, which was selected as the case study area, using remote sensing data, and then to examine the applicability of the approach to other areas with an acceptable regression model.

STUDY AREA
As it is known, water quality levels deteriorate (from mesotrophic to eutrophic) in areas where domestic, industrial and agricultural wastes are discharged (Tufekci et al., 2010). In the Marmara Basin, which covers 2.96% of Turkey's surface area, high levels of domestic and industrial wastewater pollution are observed especially in the Gulf of Izmit and the Gulf of Gemlik, as well as in Istanbul and Kocaeli, where the population, urbanization and industrialization are intense. In the case study area, the Gulf of Gemlik, not only the urbanization and/or industrialization, but also the increasing agricultural activities and the pollution caused by the Karsak Stream flowing into the gulf create a significant problem (Teksoy et al., 2019).
Located in the southwest of the Sea of Marmara (Figure 1), the Gulf of Gemlik is 2-6 km wide in front of the Gemlik district in the east of Tuzla Point and 12-24 km wide between Trilye and Bozburun in the west. The average and maximum depths in the Gulf are 59 and 107 m, respectively. The regional winds, which play a dominant role in the dynamics of this semi-enclosed sea, are mostly controlled by the surrounding mountains and blow from the northwest in winter and predominantly from the northeast during the rest of the year. With a drainage area of 27 600 km² and an average water flow of 158 m³/s, the Karasu River is the most important geographical element of the region and carries 0.5-5.5 tons of suspended solids to the sea daily, depending on climatic conditions (Unlu and Alpar, 2006).

Materials
In the study, a Sentinel-2A/MSI (S2A) Level 2 satellite image dated 29 April 2021 was used for the estimation of water quality parameters, namely chlorophyll-a (Chl-a) and turbidity, in the Gulf of Gemlik. The characteristics of Sentinel-2 data, which are distributed under an open data access policy, are given in Table 1. As atmospheric correction is a key limiting factor of satellite-based water quality monitoring, the reliability of results derived from water-leaving reflectance depends on the quality of the atmospheric correction (Warren et al., 2019). In this study, Sentinel-2 Level 2 products (bottom of atmosphere, BOA) distributed by ESA/Copernicus and atmospherically corrected with the SEN2COR package were used, since this method, as used in many similar studies, was found to be more suitable for the correction of inland water bodies compared to coastal areas (Toming et al., 2016; Warren et al., 2019).
There are 16 in-situ measurements taken on the same date as the satellite image acquired on April 29, 2021. The distribution of the in-situ measurements is given in Figure 1. Statistical measures (i.e. minimum, maximum, mean and standard deviation) of the in-situ measurements dated April 29, 2021 in the coastal waters of the Gulf of Gemlik are given in Table 2. Five of these samples, shown in red in Figure 1, were taken specifically from water areas that contain mucilage. Therefore, regression analyses were performed in two different scenarios to examine the effect of the number of sampling points on the prediction model results: using only the five mucilage samples and using all samples.

Methodology
In this study, an empirical approach was used to establish a relationship (e.g. linear or non-linear regression) between the water-leaving radiance measured by the sensor (i.e. spectral reflectance values of individual bands or band combinations) and in-situ water quality measurements. Such an approach can be fully or semi data-driven and requires adequate in-situ water quality measurements.
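As an illustration of such an empirical relationship, the following minimal Python sketch fits a univariate linear model between a band combination and an in-situ parameter. The reflectance and Chl-a values are hypothetical and are not the study's data; the green/blue ratio is used only as one example of a band combination, and the helper name `fit_line` is an assumption of this sketch.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical surface reflectances at five stations (not the study's data).
green_b3 = [0.040, 0.052, 0.061, 0.047, 0.070]
blue_b2 = [0.038, 0.041, 0.043, 0.040, 0.046]
chl_a = [1.2, 2.9, 4.1, 2.0, 5.3]  # hypothetical in-situ Chl-a values

# Predictor: per-station ratio of the green and blue band reflectances.
ratio = [g / b for g, b in zip(green_b3, blue_b2)]

intercept, slope = fit_line(ratio, chl_a)
predicted = [intercept + slope * r for r in ratio]
```

Once fitted at the sampling stations, the same intercept and slope can be applied to the band ratio of every water pixel to obtain a spatial map of the parameter.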

First, the spectral reflectance properties of two different water bodies (mucilage and clear water) were evaluated. As shown in Figure 2, reflectance spectra collected from clear water areas as well as mucilage-containing areas in particular show very different spectral profiles in the VNIR region.
As seen, the mucilage reflectance increases from blue to green and then shows a local minimum around the red band, which can be explained by the presence of chlorophyll pigment. The spectrum then flattens towards the Red-edge (RE) and Near-infrared (NIR) wavelengths (Hu et al., 2022).

Correlation Analysis:
Although satellite images cannot measure all aspects of the physical-chemical and biological properties of a water body, many studies in the literature have shown a correlation between optically active components and the spectral response of the measured water (Kavurmaci et al., 2013; Gholizadeh et al., 2016b; Batur and Maktav, 2018; Sagan et al., 2020). However, as noted in the literature, correlations between in-situ measurements and spectral reflectance values can be complex and nonlinear, especially for Case 2 water bodies, as these parameters respond differently to various spectral wavelengths (Gholizadeh et al., 2016a; Hafeez et al., 2019; Topp et al., 2020). Therefore, regression models were created using linear and 2nd-order polynomial functions, and the most suitable prediction model was chosen by statistical evaluation.
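The comparison of linear and 2nd-order polynomial models can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the predictor and turbidity values are hypothetical, and `polyfit` and `r_squared` are helper names introduced here, not functions used in the study.

```python
def polyfit(x, y, degree):
    """Least-squares polynomial coefficients [c0, c1, ...] via normal equations."""
    m = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y] for the monomial basis.
    aug = [
        [sum(xi ** (r + c) for xi in x) for c in range(m)]
        + [sum(yi * xi ** r for xi, yi in zip(x, y))]
        for r in range(m)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[r][m] / aug[r][r] for r in range(m)]


def r_squared(x, y, coeffs):
    """Coefficient of determination of a fitted polynomial."""
    mean_y = sum(y) / len(y)
    pred = [sum(c * xi ** k for k, c in enumerate(coeffs)) for xi in x]
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical band-ratio predictor and in-situ turbidity (not the study's data).
band_ratio = [1.05, 1.18, 1.27, 1.33, 1.42, 1.52]
turbidity = [0.8, 1.1, 1.9, 2.0, 2.6, 3.9]

r2_linear = r_squared(band_ratio, turbidity, polyfit(band_ratio, turbidity, 1))
r2_quadratic = r_squared(band_ratio, turbidity, polyfit(band_ratio, turbidity, 2))
```

On the fitting data, a 2nd-order polynomial can never have a lower R² than the linear model nested within it, so R² alone cannot justify the extra term; this is why an additional statistical evaluation is needed when choosing between the two.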

Accuracy Assessment:
As is known, the accuracy of mapping bio-optically active parameters such as chlorophyll-a and turbidity is largely dependent on the bio-optical equation developed. In general, the accuracy of the regression models to be developed is evaluated with accuracy metrics (standard error, f- and t-statistics, etc.). In this study, several statistical accuracy metrics were taken into account, and the accepted models were then tested using two other images of the study area acquired on nearby dates, along with in-situ measurements.
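To evaluate the performance of the models used in regression analysis, the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE), given in Eq. 1 and Eq. 2, were used:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \tag{1}$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \tag{2}$$
where $y_i$ is the measured value, $\hat{y}_i$ is the estimated value and $n$ is the number of measurements.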
Lower RMSE and MAE values imply higher accuracy of a regression model.
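The two error measures can be computed as in the short Python sketch below, which follows the standard definitions of RMSE and MAE. The function names and the sample values are illustrative assumptions and do not come from the study.

```python
import math


def rmse(measured, estimated):
    """Root Mean Square Error between measured and estimated values."""
    n = len(measured)
    return math.sqrt(sum((y - yh) ** 2 for y, yh in zip(measured, estimated)) / n)


def mae(measured, estimated):
    """Mean Absolute Error between measured and estimated values."""
    n = len(measured)
    return sum(abs(y - yh) for y, yh in zip(measured, estimated)) / n


# Hypothetical in-situ values and model estimates (not the study's data).
measured = [1.0, 2.0, 3.0]
estimated = [1.0, 2.0, 5.0]

error_rmse = rmse(measured, estimated)
error_mae = mae(measured, estimated)
```

Because RMSE squares the residuals before averaging, a single large error raises it more than it raises MAE, so reporting both gives a rough indication of whether a few stations dominate the error.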

RESULTS AND DISCUSSION
Regression analyses were performed with linear and/or 2nd-order polynomials using many different spectral band/ratio combinations, taking into account relevant spectral regions and/or the most common equations in the literature. In the analysis, the linear models used to retrieve these two parameters gave relatively better results when the f-test and the R² measure, used to compare the fit of different linear models, were taken into account.
From the empirical regression modeling, Figure 3 and Figure 4, respectively, show the best regression models for the estimation of the concentration of Chl-a and turbidity from Sentinel-2A.
The statistics of the regression models (for five mucilage samples and all samples) selected in the preliminary analysis for two parameters are given in Table 3.
First, the green band (B3) was observed to be important in detecting Chl-a from both sampling datasets. For 16 samples, the best model for the retrieval of Chl-a used the ratio between the green (B3) and the blue (B2) bands, with R² = 0.58. For only five mucilage samples, the difference between the green (B3) and red (B4) bands correlated better due to the local minimum around the red band, possibly indicating live algae (Figure 2). When this model is visually compared with Figure 1, it can be seen that the Chl-a concentration in Figure 3a is higher, especially in the mucilage regions.
The t value of this model is low but acceptable (i.e. greater than +2 or less than -2); however, the p-value, which indicates the significance of the result with respect to the null hypothesis (typically p ≤ 0.05 is considered statistically significant), was above this threshold for the selected regression model (0.104). On the other hand, the R² of this model was slightly higher than that of the other model.
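The t value discussed above can be illustrated with the following Python sketch, which computes the t-statistic of the slope in a simple linear regression using the standard formula (slope divided by its standard error, with n - 2 degrees of freedom). The data and the function name `slope_t_statistic` are hypothetical and are not taken from the study.

```python
import math


def slope_t_statistic(x, y):
    """t-statistic of the slope in the simple linear regression y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # Standard error of the slope, using n - 2 degrees of freedom.
    se_slope = math.sqrt(ss_res / (n - 2) / sxx)
    return slope / se_slope


# Hypothetical predictor and response at five stations (not the study's data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

t_value = slope_t_statistic(x, y)
passes_rule_of_thumb = abs(t_value) > 2
```

With only five samples there are three degrees of freedom, for which the two-sided 5% critical value of the t-distribution is about 3.18 rather than 2. A t value slightly above 2 can therefore satisfy the rule of thumb while still giving a p-value above 0.05, which is consistent with the situation described for the five-sample model.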
On the other hand, in the other model performed with all samples, the higher concentration was seen mostly in the lower coastal parts of the gulf and in areas where mucilage is intense (Figure 3b). In the estimation of turbidity, the visible bands (blue (B2), green (B3) and red (B4)) and RE (B5) bands are observed to yield the best results. Visually, it was seen that the spatial distribution of turbidity is compatible with the Chl-a distribution (Figure 4b).
Although p-values were obtained at an acceptable level, it was observed that rather different R² accuracies were obtained (0.98 and 0.33) in the empirical regression modeling used for the prediction of turbidity for the two data sets (Table 3). In other words, the 16-sample model was found to be less accurate than expected in turbidity retrieval. On the other hand, it was seen that the turbidity model created with five mucilage samples was particularly successful in showing the linear features (traces) of the mucilaginous phenomenon (Figure 4a).
Although the 16-sample model used to predict turbidity did not perform well, it was still taken into account in the validation phase; the other model was not considered realistic due to the small number of samples used. Therefore, in this preliminary study, to validate the accepted models for the two parameters, these models were applied to two other Sentinel-2 test images (April 22, 2021 and April 27, 2021, with 5 and 10 in-situ measurements, respectively) acquired close in time to the dataset used in the analysis.
The RMSE and MAE values calculated between the in-situ measurements and the accepted models applied to the two Sentinel-2 test images are given in Table 4. As seen in Table 4, despite its low R² value, the turbidity model gave a lower RMSE than the Chl-a model on both test dates. Considering the ranges of the measured values, it was concluded that the results obtained were better than expected.
Table 4. Validation results with in-situ measurements of accepted models applied to two Sentinel-2 test images.

CONCLUSION
This preliminary study attempts to model Chl-a concentrations and turbidity in complex inland waters such as the Gulf of Gemlik in the Marmara Sea using Sentinel-2 images and to relate them to the mucilage phenomenon. Although these two parameters are considered to be important water quality parameters that can be accurately retrieved from satellite reflectance measurements, especially in Case 1 waters, the situation is different in Case 2 waters such as the Gulf of Gemlik.
Different band correlations including various combinations of VNIR bands were tested for the retrieval of turbidity and Chl-a parameters. Specifically, the green band (B3) was observed to be important in detecting Chl-a from both sampling datasets and also turbidity. In general, it was determined that the statistics of the model selected for Chl-a fit better than those of the turbidity model. In other words, the accepted turbidity model achieved the lowest agreement (R² = 0.33), but in the validation of this model, quite low RMSE values were obtained on both Sentinel-2 test images.
The spatial distribution maps of the two parameters were found to be visually significant, although the model with five mucilage samples was not statistically very realistic as the smaller number of samples gave a result that may not be strong enough to demonstrate the relationship.
It is clear that measurements and analyses made on a single date will not be sufficient in cases where ecological variables differ significantly due to untreated wastewater discharges, such as in this study area. Also, there are some general problems/issues already mentioned in the literature, but some critical ones still need to be reiterated. These are (i) the dynamic nature of water bodies and optical complexity of inland waters, (ii) the need to have a precise atmospheric correction of the satellite images, (iii) the sensitivity of the models to local environmental conditions, which causes them not to be automatically replicated to other regions, (iv) the requirement of large water quality sample sizes in models, (v) the spectral resolution of the sensors, (vi) the need for high signal-to-noise ratio (SNR), (vii) similar spectral properties of mucilage and floating matters (such as macro debris, microplastics).
In conclusion, the model findings, which were obtained in only one region of the Marmara Sea, show that the model is not yet robust and sufficient. Therefore, it is planned to continue these studies with more frequent in-situ measurements in 2022 within the scope of the DEN-IZ program to develop an optimal algorithm(s) for the accurate estimation of bio-optical water quality parameters in the region. In addition, the seasonal and annual trends and changes of not only the 2 parameters used in this study, but also other optically active parameters (such as Secchi disk depth, salinity, total suspended materials, etc.) will be evaluated with the new in-situ measurements made and/or to be made. Besides, the use of Landsat 8 as an additional satellite will provide enhanced overpass opportunities and therefore may constitute an operational approach in providing regular observations of the seasonal behavior of these parameters. The success of this program will help to establish an operational satellite-based water quality monitoring system not only in the Sea of Marmara, but also in all coastal areas of Turkey.