GEOGRAPHICALLY WEIGHTED REGRESSION APPROACH FOR SHALLOW WATER DEPTH ESTIMATION USING MULTISPECTRAL SATELLITE IMAGERIES

Shallow water depth is essential for coastal planning, monitoring, and research. Bathymetry data is mostly produced from hydrographic survey using echosounder. The generic result from those measurements is discrete values while the desired output is a continuous depth model. To fill the gaps in the sounding data, we use Satellite Derived Bathymetry (SDB) approach with Geographically Weighted Regression (GWR). This study aims to investigate the feasibility of GWR to model bathymetry of shallow water in the eastern part of Indonesia. We explore the correlation between the number of training data and the predicted result. Two different satellites images are used, namely: Sentinel-2A and Landsat 8 OLI/TIRS with 10 and 30 m resolutions respectively. For the experiment, in-situ data are set into training and validation in three different ratios. The model is developed using adaptive GWR approach in which the parameter of regression would adapt the local data set within different kernel sizes. Finally, we compute RMSE (Root Mean Square Error), R2, and TVU (Total Vertical Uncertainty) to assess the quality of our model. In general, Sentinel-2A produces more detailed information due to higher resolution than Landsat 8 OLI/TIRS. Sentinel-2A also obtains more accurate results based on RMSE values. The percentage number of the estimated depth that fulfils TVU requirements is up to 83%. These assessment quality results give us an insight that the SDB approach using GWR is promising. Thus, the GWR method may be able to provide an estimate of bathymetry for many coastal areas in Indonesia.


INTRODUCTION
The advent of Satellite Derived Bathymetry (SDB) approaches to model sea-water depth could contribute to bathymetry mapping using in-situ measurements. The sea-water depth is determined by hydrographic survey using echosounder instrument installed on the ship. The ships need to sweep the area within a specific sounding line and frequency. Mapping the whole water area is expensive and takes a lot of time. Moreover, in the shallow water area, this method is challenging due to the difficulty of the ship to perform the survey. The safety issues from both vessels and underwater objects become a burden, especially in observing some particular shallow water with coral reefs formations. Meanwhile, shallow water depth information is important for coastal development planning, modelling, monitoring, and coastal research. Thus, multispectral remote sensing approaches to estimate depth have been proposed as alternative methods, especially to fill the gaps between sounding line from the survey.
The Indonesian Geospatial Information Agency (BIG) performs a bathymetry survey and use the obtained data as the main resource to build a bathymetry model. As the only institution responsible for marine and coastal mapping in Indonesia, BIG is required to deliver marine maps in large quantities. The survey in shallow water area is mostly performed with Single Beam Echosounder (SBES). The results are depth points, but the end products of the depth model are a raster-based model. Another alternative is to use SDB approach to fill the gaps obtained from sounding survey and produce the shallow water depth model as raster. Therefore, BIG needs to undertake necessary research to understand the lowest level of error which can be obtained through the implementation of SDB algorithm. Moreover, to understand whether SDB could give benefit to BIG's marine maps relative to conventional SBES measurements is also critical.
This study is focusing on the implementation of Geographically Weighted Regression (GWR) method by concerning the number of training data needed for SDB approach and the quality of the model as described in the literature (Bramante, Raju, & Sin, 2013;Hamylton, Hedley, & Beaman, 2015;Manessa, Haidar, Hartuti, & Kresnawati, 2018;Mishra, Narumalani, Rundquist, & Lawson, 2013;Sagawa, Yamashita, Okumura, & Yamanokuchi, 2019;Said, Mahmud, & Hasan, 2017). Detail explanations are given in the next section. In this research, we defined shallow water area depth ranging from 0 to 35 meter. We tested the performance of GWR to the proportion of training data set that used to determine the SDB model. Furthermore, we also compared the performance of GWR model between two satellite images, namely Sentinel-2A and Landsat 8 OLI/TIRS.

RELATED WORK AND MOTIVATIONS
The principal of optical satellite remote sensing bathymetry is the total amount of radioactive energy reflected from water column which is a function of water depth. In addition, the attenuation of light in the water column is a function of wavelength where the shorter wavelength attenuates less than the longer wavelength. The bottom reflectance can be transformed into depth values after removing the atmospheric scattering, surface reflection, and inwater scattering components (Vinayaraj, Raghavan, & Masumoto, 2016).
The utilization of remote sensing for mapping the water depth had been introduced by Polcyn in 1969. He used ratio algorithms of a pair of bands to find the correlation with water depth (Polcyn & Rollin, 1969). A decade further, Lyzenga modified the algorithm to include the effect of scattering in the water and the internal reflection at the water surface (Lyzenga, 1978) and he continuously upgraded the method (Lyzenga, 1985;Lyzenga, Malinas, & Tanis, 2006). Some research had been done by modifying Lyzenga methods by adding a spatial component in the Lyzenga's equation to improve the accuracy (Kanno, Koibuchi, & Isobe, 2012;Kanno, Tanaka, Kurosawa, & Sekine, 2013;Kanno & Tanaka, 2012).
Similar approaches for Indonesian water have also been conducted, but with other regression methods (Dewi et al., 2019;Manessa et al., 2016;Wicaksono, 2015). However, we argue that study on another approach is still necessary to find the optimum algorithm and model with a good data fit between in-situ data and the related SDB model.

MATERIALS AND METHODS
In this section, we introduce the study area and data sets that we used for this research. General workflow is introduced in Figure  1 and the details of each step are explained in the following subsections. In general, our study starts from data acquisition, satellite image correction, separation of in-situ data set, estimating shallow water depth, to quality assessment.

Study Area and Data
The research was conducted on the small islands in the south part of Morotai Island, Maluku Utara, Indonesia. For the experiments, the area was divided into three Area of Interest (AoI) as shown in Figure 2. The first AoI (AoI-1), named Zum Zum Island, covers about 7 km 2 . The second and third AoI are in Galo Galo village and covers roughly 5 km 2 each. This area was chosen due to the availability of in-situ data and the clear water condition. Also, it has a variation of depth, both shallow and deep water.
The nearest tides station from the study area is in Juanga, about 7 km from AoI-1 and 15 km from AoI-2 and Aoi-3. Juanga tide varies from -0.361m to 1.905m (Center for Geodesy and Geodynamic Control Network, 2020). By considering the short distance between the AoIs and Daruba, we assume that tides in the study area are similar to Daruba. The waters itself have a clear condition and contains sand and corals.
In-situ data depth was acquired from Single Beam Echosounder (SBES) survey held by Indonesian Geospatial Information Agency. The survey was performed in October 2018 with sounding line distance 200 m and sounding frequency 1 s. The final product was depth points with reference to Mean Sea Level (MSL). The depth ranges for AoI-1, AoI-2, and AoI-3 respectively are 3.092 m -35.222 m, 3.042 m -27.072 m, and 3.002 m -29.732 m below the MSL. This study used two multispectral images that have variation in spatial resolution (

Satellite Image Correction
The radiance value includes four components, that is atmospheric scattering, surface reflection, in-water volume scattering, and bottom reflection. The estimation of depth is performed by using bottom reflectance value. This value can be obtained after removing the other three components. As deeper the water as weaker the signal attenuates through it. Therefore, deep water does not include bottom reflectance component due to its depth (Vinayaraj et al., 2016). Thus, the correction can be obtained by using the average radiance at deep water area. The area of the deep water should be specified first. The easiest way to do this was by looking at the images visually. The darker the images, the deeper the depth. Or, systematically calculates the minimum radiance value. The corrected spectral value is given by (Green, Mumby, Edwards, & Clark, 2000): where ( ) is corrected spectral value of band at th point, ( ) is the spectral value in shallow water, and ( ) is the spectral value in the deep water.

Multispectral images
In-situ data sets The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2020, 2020 XXIV ISPRS Congress (2020 edition)

Training and Validation Data
The in-situ data become the reference in the validation process. Then, we need to separate those data into training and validation data sets. Training data is then used to calculate the coefficient regression and the rest is used to assess the quality of the results.
The problem is about how to find the optimum amount of in-situ data needed to build the optimum model. Therefore, in this study we split the in-situ data in each AoI into three training data sets based on the percentage of all data, namely: 75%, 50%, and 25%. The remaining in-situ data that is not included in the prediction would be used for validation purposes.

Depth Estimation Modelling
The corrected spectral values from images are correlated with reference depth values from SBES using a regression algorithm to derive shallow water depth. Several statistical approaches can be used, such as simple linear regression and multiple linear regression or global model. In general, these methods calculate the regression parameters value first, then this value is used in the equation to predict the depth value in each pixel. However, these parameters of regression are still homogeneous, meaning that they are the same for every pixel of the image. That is called global model. However, the regression technique generally depends on the density of the data used. So, when the density is varied, the result would not indicate the spatial relationship between data distribution and the parameters. To address this issue, Brunsdon, Fotheringham, & Charlton (1996) proposed a method called Geographically Weighted Regression (GWR). GWR is a relative simple technique that extends the traditional regression framework by allowing local rather than global parameters to be estimated (Fotheringham, Charlton, & Brunsdon, 1998).
The variables in this study consist of the depth value from SBES, called reference depth, and the spectral value of each band in each pixel. The parameters to be determined was the coefficient of regression. The predicted value to be obtained was the depth value in each pixel images. The global regression model is given by (Fotheringham et al., 1998): and the GWR model is given by (Binbin et al., 2019): where is the known depth, is the intercept, is the coefficient of regression, ( , ) is the coordinates of the nth point, and ( ) is corrected spectral value at the pixel of th point. The difference between equation (2) and (3) lies in the value. In the global model, the coefficient of regression did not express the local heterogeneity as in the GWR model. It means the value in the global model had the same value at each pixel. Meanwhile, GWR model is a weighted regression model that computes coefficient for each pixel. The coefficients are determined using a moving kernel and pixels close to the centroid are assigned higher weights than the pixels away from the centroid (Vinayaraj, 2017). In general, the weighting schemes are classified as discrete and continuous depending on the function used. The most important thing to be considered was the spatial coverage of the kernel (bandwidth) because it will take effect on the weighting.
The bandwidth can be set as fixed or adaptive. Fixed GWR used only one size of bandwidth in each kernel. Meanwhile, in Adaptive GWR, the size of the kernel is vary based on the density of the reference depths where it becomes smaller when the reference points are denser and vice versa (Vinayaraj et al., 2016). Due to the uneven distribution of the reference points, this study used Adaptive GWR (AGWR) method to predict the depth. For data processing, we used GW model in R package, detail of GW model can be found in several previous studies (Binbin et al., 2019;Gollini, Lu, Charlton, Brunsdon, & Harris, 2015;Lu, Harris, Charlton, & Brunsdon, 2014).

Estimated Depth Validation
To evaluate the quality of the implementation, the estimated depth values produced from the AGWR models were compared with the in-situ depth validation data sets. Here we used a standard approach to measure the accuracy of quantitative data by estimating the Root Mean Square Error (RMSE): where dref and dpred are in-situ and predicted depth values respectively at the same horizontal position in the validation data, and n is the number of validation data used in this step.
To address the variance of the estimated and reference (validation) values, we compute the coefficient of determination (R 2 ). In addition, IHO S-44 (International Hydrographic Organization, 2008) provides a specification to compute Total Vertical Uncertainty (TVU). TVU specifies the maximum allowable values with 95% confidence level under a certain order. The computation is actually for echosounder or bathymetric LIDAR measurement. We used this approach to evaluate our model from IHO standard perspective. When we obtained the TVU values from our model, we then counted how many data in our model that have values below these related TVU values. As for the maximum allowable TVU was calculated as (International Hydrographic Organization, 2008): where a represents that portion of uncertainty that does not vary with depth, b is a coefficient which represents that portion of the uncertainty that varies with depth, and d is the depth from the model. This study focused on shallow water depth below 100 m, so our model was categorized as Order 1 where the constant value a = 0.5 m and b = 0.013.

RESULTS AND DISCUSSION
From the experiment, we found that performing GWR on Sentinel-2A provided a better result than Landsat 8 OLI/TIRS images (see Table 2).  As for the variance, R 2 values range from 0.95 to 0.99 for all AoI and all combination ratios of training data implying that the estimated values fit well with the reference values. However, when we checked the quality of our SDB model by calculating the TVU values, we obtained varying values of TVU (see Table  3). In this case, the proportion of training data still resulted in a linear trend. The larger the proportion of training data used, the higher the percentages of TVU which meet the requirement of 95% confidence level. In general, the SDB model from Sentinel  Up to 15 m, it was obvious that by the increase of the depth, the accuracy of SDB models become worst. This may be due to the capability of sensors in capturing information in the water area is limited by the increasing depth (Geyman & Maloof, 2019). In contrast, at depth range 15 to 25 m, the SDB model obtained a good accuracy result. It was not clear to us why the accuracy of the model become better by the increasing depth. Even though, it is stated that SDB model can extract depth information until 30 m in clear water (Evagorou, Mettas, Agapiou, Themistocleous, & Hadjimitsis, 2019), further evaluation on the consistency of this GWR algorithm is required.
The cross-profile location is indicated as black line in Figure 6. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2020, 2020 XXIV ISPRS Congress (2020 edition) To further evaluate the quality of our model, we generated cross profiles for the two AoIs as in Figure 5. We can see that AGWR model applied on Sentinel-2A and Landsat 8 OLI/TIRS images produced a fit line in shallow water area (less than 10 m depth), while in the deeper area both lines were shifted. However, from Figures 5a and 5c, we can see that Sentinel-2A profile (green line) has a similar pattern with the profile of SBES measurement (blue line). Figure 6 presents the visualisation of the SDB model from both Sentinel-2A and Landsat 8 OLI/TRS. In general, both images have a similar depth pattern; however, Sentinel-2A images provide more detailed depth information due to the fact that it has a higher spatial resolution. Sentinel-2A with 10 m resolution produced more detailed depth information compared to Landsat 8 OLI/TIRS with 30 m resolution. The results also demonstrated that the use of higher resolution improved the accuracies of the model because when the spatial resolution decrease, it gives effect to the spatial heterogeneity of images. Despite the fact that both images produced similar patterns, the differences are clearly seen in some places, for example Figure 6a,d grid cells A1 and A2. Sentinel-2A detected the boundary between water and land while Landsat 8 OLI/TIRS recognized it as water, for example in Figure 6b,e grid cells B2, and B3.

CONCLUSION AND FURTHER WORK
In this study, we performed SDB modelling with AGWR approach with different sets of training data and satellite images. The statistical and visual evaluation presented in this study shows the capabilities of AGWR to estimate depth in clear shallow water. From RMSE and TVU values, we argue that AGWR is promising as a method to derive depth information from satellite images and can further be applied in other areas in Indonesia.
However, improvements are still needed for the consistency of the SDB model, for instance a careful check on the datasets used in this research. Information on the accuracy of the in-situ data is important to have a better understanding of the datasets. Moreover, the pre-processing step of the satellite images should be carried out in a more comprehensive way, for example by comparing different atmospheric correction approaches.
Finally, based on our research, it was obvious that higher resolution images produced better results. Hence, future study using higher resolution data sets is necessary to be able to result in more detailed information in the shallow water area.