Auto-adaptive Multi-Level Seafloor Recognition and Land Sea Classification (AMSRLC) in Reef-Island Zones Using ICESat-2 laser altimetry

The world’s first photon-counting laser altimetry satellite, the Ice, Cloud and land Elevation Satellite-2 (ICESat-2), which was launched in 2018, has proven to have a certain bathymetric capability, which provides a new means for the surveying of island and reef zones. However, how to accurately extract and separate land surface signal photons, sea surface signal photons, and seafloor signal photons in these areas has not yet been resolved. In this paper, we propose a validated auto-adaptive multi-level classification algorithm (AMSRLC), which can realize automatic recognition and classification of seafloor, sea surface, and land surface photons in island and reef zones. A overlapping histogram, a slope-adaptive search ellipse, and a water depth adaptive local signal-to-noise ratio are respectively used to extract flat sea surface signals, island and reef surface signals with slope changes, and seafloor signals that weaken with water depth. The overall classification indicators OA, AA, and Kappa reached 0.993, 0.973, and 0.987 respectively. The algorithm can effectively detect various signals with high detection accuracy.


INTRODUCTION
In September 2018, NASA launched the second-generation laser altimetry satellite, Ice, Cloud and land Elevation Satellite-2 (ICESat-2). Its Advanced Topographic Laser Altimeter System (ATLAS) is the first multi-beam laser altimetry system with single-photon sensitive detectors (Markus et al. 2017) in the world. Because a green laser (532 nm) is used, ICESat-2 has a certain bathymetric ability. According to the published data, ICESat-2 can measure ~30 m seafloor (Lee et al. 2021), and can thus undertake shallow water surveys in island and reef zones, which cannot be achieved by ship-based acoustic systems.
Many scholars have conducted in depth research on the validation of ICESat-2 ATLAS bathymetry and analysis of ATLAS's bathymetric mapping performance. The bathymetric accuracy of ICESat-2 was then validated and the method of refraction correction was described in detail by Parrish et al.(2019), who reported that the accuracy of the bathymetry reached a root-mean-square error (RMSE) of 0.43-0.6 m over a 1-m grid resolution.
With the above research foundation, some scholars have attempted to integrate ICESat-2 and optical remote sensing images, such as Sentinel-2, for large-scale bathymetry inversion. These studies used various models (linear band model, band ratio model, support vector regression model) to collocate the measurement of both the depth (H) and radiometric properties captured by ICESat-2 and multiband image sensors, and expanded the bathymetry to the entire region covered by the multiband image sensor acquisition. However, most of the methods selected seafloor photons manually (Armon et al. 2020;Parrish et al. 2019; Thomas et al. 2021) or extracted them together with the water surface photons (Ma et al. 2019;Ma et al. 2020) in the ICESat-2 data processing. Actually, ICESat-2 data from island and reef zones have distinct characteristics. However, * Corresponding author accurate extraction and separation of land surface signal photons, sea surface signal photons, and seafloor signal photons are still a question to resolve. In this study, an auto-adaptive multi-level seafloor recognition and land sea classification (AMSRLC) was proposed to precise identification and separation of different types of photons.

MATHEMATIC
The ICESat-2 data from island and reef zones present distinct characteristics, with a multi-layer vertical structure and a signal strength that decreases with increasing water depth. One part of the energy of the laser pulse penetrates the sea surface to the seafloor, while the other part of the energy is reflected by the sea surface, which results in the multi-layer vertical structure. Due to the attenuation effect of water on the laser signal, as the water depth increases, a smaller number of photons can reach the satellite detectors from the seafloor, which results in the weaker signal strength with deeper seafloor.

Sea Surface Signal Photon Detection
The sea surface is typically smoother than the seafloor and the island and reef surface. For a general case, the sea surface signal has the characteristics of a high peak, small spread, and obvious characteristics in the histogram of photons, i.e., above and below the sea surface peak are the reef/island photon peak and the seafloor photon peak. Affected by the reflectivity and terrain slope, their peak height is much lower than the sea surface peak, and the signal peak is wider (See Figure 2.).

Figure 2.
A typical histogram of photons for an island/reef zone

Pre-denoising of Photons
In order to balance the unevenness of the photon density between the along-track distance and the vertical elevation of ICESat-2, the elevation coordinate of the photons be multiplied by the coefficient a (Eq. (1)), and the OPTICS clustering algorithm is used to retain the signal photons of the first 10% with a higher density.
where represents the photon distance along the track; represents the elevation coordinate of the point cloud; and represents the conversion factor for the elevation direction. For the extraction of sea level signals, the empirical value of is approximately [80,120]; ′ is the distance along the track after transformation; and ′ is the elevation after transformation.

Linear Regression Model with Segmented Statistical
Values to Accurately Fit the Sea Surface According to the segment range in the ATL03 data, the maximum elevation statistic for each segment is obtained, and these statistics are then used as input to fit the sea surface.
represents the center of the distance along the track of the k-th segment, and represents the maximum elevation of the signal photons in the k-th segment. We use Eqs. (4)-(5) to solve the fitting coefficient .

=
(4) = ( ) −1 (5) The initial fitting often includes some island/reef photons and seafloor photons. Therefore, iterative steps are needed to eliminate the non-sea-surface photons, to achieve a better fitting effect. The following three steps are performed.
Step1: Calculate the fitting residual between the photons and the fitting line, as shown in Eq. (6).
represents the maximum elevation of the k-th segment, ′ represents the fitting elevation of the k-th segment, and ∆ represents the fitting residual of each segment.
Step2: Eliminate the photons for which the residual error exceeds three times the standard deviation. In addition, in order to more efficiently eliminate signal points from above the sea surface, the points 1m above the fitted sea surface are also eliminated.
Step3: Re-fit the retained photons to obtain a new fitting line. These steps are perform two to three times to obtain a stable fitting line. The fitted line then needs to be moved down a distance 2 ⁄ , to correct the sea surface elevation (Eq. (7)). ℎ ′ = ℎ − 2 ⁄ (7) where ℎ represents the fitted sea surface elevation, represents the average range of the sea surface signal in elevation, and ℎ ′ represents the fitted sea surface after correction. According to the flatness of the sea surface, varies between 0.2 m and 2 m.

2.1.3
Identification of the Sea Surface Photons A more accurate sea surface position can be obtained by the linear regression model, and the sea surface photons can be accurately searched around its position. We split the photons into small segments in the along-track direction, establish the histogram with 50% overlap in the elevation of each segment, and use the density information to discriminate the range of the sea surface in elevation. The signal photons of each segment are then selected.
Step1: Establishment of the Histogram and Calculation of the Photon Density.
A histogram with 50% overlap can better describe the local distribution characteristics of photons, and can more accurately reflect the changes in photon density in the elevation direction. The number of photons in the bins of the histogram are counted, and the photon density is calculated. For each bin of the histogram, the photon density is determined by the ratio of the number of photons and the area of the bins (Eq. (8)).
ℎ represents the length of the bins, ℎ represents the height of the bins, and represents the total number of photons in the range. The selection of ℎ is closely related to the distribution of the photons. For sea surface photons, ℎ should not exceed 0.2 m, due to the small spread. In general, 0.1 m is appropriate. Step2: Overlapping Window Strategy Split Along Track. The along-track direction is split into segments, for which the size of the segments is set to 100 m for the AMSRLC algorithm. In order to ensure that the sea surface signal peak in the histogram of the current segment is more significant, the size of the moving window is set to three times that of the segment, and the movement step is set to the size of the segment (Figure 3). When identification of the sea surface photons is achieved, only the signal photons contained in the filter window should be selected. The search for sea surface photons continues until the entire study area is traversed. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2022 XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France Step3: Identification of the Sea Surface Photons. The algorithm first finds the sea surface signal peak in the histogram. The algorithm determines whether this is the sea surface signal peak with the fitting line. If the height difference is within 1 m, it is judged that it is the sea surface signal peak. When there are multiple signal peaks, the AMSRLC algorithm considers the height difference between each peak and the fitted sea surface, and the signal peak closest to the fitted sea surface line is regarded as the sea surface peak.
Next, the algorithm determines the boundary of the signal peak to extract the sea surface photons in this segment. According to a previous study (Degnan et al. 2001), the number of photons in the bins will conform to a Poisson distribution. Therefore, with the fitted sea surface elevation as the center, a buffer zone with a radius of 0.5 m is established, and the mean value and standard deviation of the density in the area are calculated. The signal range is then determined, based on these parameters. The selected range can be expressed by Eq. (9): ℎ ∈ [ − 0.5, + 0.5] (9) where represents the position of the determined signal peak.
The density threshold can be calculated by Eq. (10): ℎ ℎ = − 0.5 (10) Starting from the peak of the histogram, the AMSRLC algorithm searches upwards and downwards. When the photon density is lower than ℎ ℎ , the search ends, to determine the upper boundary and the lower boundary of the signal peak.
For all the segments, the sea surface photons in each segment are extracted in this way, to complete the extraction of all the sea surface photons. After the sea surface signal extraction is completed, the signal photons above the sea surface are determined as island and reef photons, and the underwater part is determined as underwater photons. The subsequent signal extraction is then performed.

Seafloor Signal Photon Detection
The detection of seafloor signals is more complicated, for the following reasons: 1) the photon background noise rate and the SNR of the photon signal gradually decrease with the increase of the water depth; and 2) the topography of the seafloor can change greatly, which increases the complexity of the signal extraction.
In order to adapt to these characteristics mentioned above, we use the local SNR to detect the seafloor photons.

Detecting Seafloor Photons Using the Local SNR
For each photon , the sum of the k-nearest neighbors (KNN) distance is calculated, which is the sum of the distances of the knearest photons. The calculation formula for the KNN distance is shown in Eq. (11) (Xu 2017). Before the calculation, the elevation direction is scaled using Eq. (1) to balance the imbalance of the photon density between the distance along the track and the vertical distance.
The noise rate of photons is not homogeneous, but decreases with the increase of the water depth. In order to solve this problem, we use the local SNR to select the signal photons. The gray bins represent the noise bins, where the greater the noise rate, the darker the color of the grid. The black bins represent the signal bins. As the depth increases, the background noise rate becomes smaller and smaller.

Neighborhood Range and Noise Area
Firstly, the underwater photons are divided into small bins, for which the length and height are denoted by and , respectively, and the number of photons falling in each bin is counted. For each photon, the eight grid areas around the bin where it falls are the neighborhood range of the photon (Figure 4).
In the neighborhood bins, the AMSRLC algorithm selects the three bins with the smallest number of photons as noise bins, and determines them as noise areas.

Calculation of the Local SNR
The photons falling in a noise area are selected as the noise photons in the neighborhood of the center grid, and the average value of their sum of the KNN distance is counted as the feature value of the local noise photons in the area.
where 1 represents the central bin area, 2 represents the noise bin area, and represents the number of photons falling in 2 . The calculation of the signal threshold is closely related to the feature value of the local noise photons ist 1 . Now we consider two central photons A and B, for which the distribution of their photons is similar, the distance between the photons around B is two times than that of A. The area occupied by neighbors of photon B is four times than that of photon A. Therefore, we consider that, when the ratio of the sum of the KNN distances of the two photons is , then the density of the two points is 2 . Based on this, the local signal threshold in the center bin can be expressed by Eq. (13). The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B2-2022 XXIV ISPRS Congress (2022 edition), 6-11 June 2022, Nice, France where represents the set SNR within the water depth range.

SNR Threshold Setting According to the Water Depth
Affected by the attenuation effect of water on the laser signal, as the water depth increases, a weaker photon signal is returned seafloor, and the SNR is also weaker. In order to adapt to this feature, for different water depth regions, different SNRs are set to achieve the best extraction effect. The water depth range is divided into three areas, namely, water depth Zone-I, water depth Zone-II, and water depth Zone-III (see Fig. 10). The water depth range of water depth Zone-I is usually < 5 m, and the SNR setting is usually > 10. The water depth of water depth Zone-II is roughly within 5-15 m, and the SNR is usually set to 3 ≤ ≤ 10. Water depth Zone-III refers to the area with a water depth of > 15 m, for which the SNR is set to < 3 . The division of the water depth area and the selection of the SNR can be adjusted according to the actual situation. Figure 6. Diagram of the water depth zone division. As the water depth increases, the photon SNR becomes lower.

Piecewise Linear Fitting Method to Eliminate Outliers
After the seafloor photon extraction through the above steps, there will still be some photons remaining, and although they have a strong SNR, this part of the signal belongs to the misidentify photons and needs to be eliminated. The method of local fitting is used to eliminate the misidentify photons. In the range where the distance along the track is ∆ , all the signal photon coordinates ( , ) of the photons in the range are fitted, and the photons that deviate more than 3 m from the fitted straight line are eliminated until the fitted line stabilizes. The fitting area is shifted by ∆ /2 until the entire area is traversed (Neumann et al. 2018). The remaining photons are the final seafloor photons.

Island and Reef Photon Detection
The island and reef area is the upper part of the sea surface signal, and the photon detection needs to select the island and reef signals within the area photons. Island and reef signal photon detection is similar to that in land areas, and there are many mature algorithms (Brunt et al. 2014;Herzfeld et al. 2017;Jiang et al. 2015). Here, a novel two-step method using adaptive local density is adopted (Xie et al. 2022).
Firstly, an adaptive search ellipse is used to calculate the photon density feature. When the signal photons present a certain slope, the algorithm searches for the slope within the ellipse around the signal, and calculates the characteristic value of the photon density at the optimum slope, according to the optimum slope searched.
Secondly, based on the calculated density characteristic value of each photon, we use Otsu's thresholding method to obtain the threshold value of the photons for segmentation, and we initially select the noise photons as those photons with a value less than the segmentation threshold value. Then, according to the statistical characteristics of the noise photons, the signal threshold is determined, and the photons larger than the signal threshold are selected as the island and reef signal photons. ℎ ℎ = + 3 • (14) where represents the mean value of the feature value of the noise photons, and represents the variance of the feature value of the noise photons.
For a photon whose density characteristic value is greater than the threshold value, it is judged to be an island signal photon.

Performance Assessment
In this paper, the overall accuracy (OA), average accuracy (AA), and Cohen's kappa coefficient (kappa) are used. The calculation formulas for these indicators are as follows: where represents the number of categories, indicates that category is recognized as category , and represents the number of photons.
For the single-category indicators, the precision (P), recall (R), and F-measure (F) are adopted. These indicators are calculated as follows: where TP represents the true signal photons that are correctly detected, FP represents the noise photons that are misclassified as signal photons, and FN represents the true signal photons that are not correctly detected.

RESULTS
In order to verify the effectiveness of the algorithm, a dataset (filename: ATL03_20190822155703_08490401_002_01.h5) was selected. From the Table II and Table III   The reference data and the result of the AMSRLC algorithm for Dataset are shown in Figure 8. The difficulty of the data classification lies in processing the photons at the intersection of land and sea, so the data for area 1 and area 2 of Figure

CONCLUSION
In this article, we have proposed the auto-adaptive multi-level seafloor recognition and land sea classification (AMSRLC) algorithm, to solve the problem of denoising classification of single-photon data in island and reef zones. The algorithm is divided into three main steps. Firstly, the photon data are predenoised, and then the linear regression model is used to fit the sea surface accurately. Subsequently, the photon data are processed in segments along the orbital direction, and the sea surface signal is detected by the use of the histogram with the fitted line of the sea surface. Secondly, the seafloor photons are identified below the sea surface, for which the local SNR is proposed to detect the seafloor photons. Thirdly, a slope-adaptive method is used to calculate the photon density of each photon, to detect the island and reef photons.
The verification results showed that the kappa coefficient can reach more than 0.95 in all three types of data, indicating that the proposed AMSRLC algorithm can handle single-photon data of different types well. In terms of the single-class accuracy, the detection of sea surface photons or island/reef photons is easier than that of seafloor photons, and the F-measure can reach more than 0.99. For seafloor photons, the processing accuracy is slightly reduced, but the F-measure can still reach 0.97. Judging from the processing results, the algorithm can effectively detect various signals with a high degree of detection accuracy.