UPDATES OF ‘AW3D30’ ALOS GLOBAL DIGITAL SURFACE MODEL IN ANTARCTICA WITH OTHER OPEN ACCESS DATASETS

In 2016, the first processing of semi-global digital surface models (DSMs) using the entire archive of stereo imagery acquired by the Panchromatic Remote sensing Instrument for Stereo Mapping (PRISM) onboard the Advanced Land Observing Satellite (ALOS) was successfully completed. The dataset was freely released to the public at 30 m grid spacing as the ‘ALOS World 3D 30m (AW3D30)’, which was generated from the original version processed at 5 m or 2.5 m grid spacing. The dataset has since been updated with additional calibrations to improve its absolute and relative height accuracies. However, the most significant update needed to improve data usability is the filling of void areas, which correspond to approx. 10% of the semi-global coverage and are mostly due to cloud cover. In 2020, we completed the filling process for all areas except Antarctica by using other open-access digital elevation models (DEMs) such as the Shuttle Radar Topography Mission (SRTM) DEM, the Advanced Spaceborne Thermal Emission and Reflection Radiometer Global DEM (ASTER GDEM), and ArcticDEM. In this paper, we report on the filling of the remaining voids in Antarctica by using other open-access DEMs such as the Reference Elevation Model of Antarctica (REMA) DSM, the TerraSAR-X add-on for Digital Elevation Measurement (TanDEM-X, TDX) 90m DEM, and the ASTER GDEM, to complete the void-free semi-global AW3D30 dataset.


INTRODUCTION
A terrain elevation map is one of the essential datasets for many geoscience applications, e.g., ortho-photo processing, infrastructure design, disaster monitoring, and natural resources surveys. In 2016, the first processing of semi-global digital surface models (DSMs) using the entire archive of triplet stereo imagery acquired by the Panchromatic Remote sensing Instrument for Stereo Mapping (PRISM) onboard the Advanced Land Observing Satellite (ALOS) was successfully completed (Takaku et al., 2016). The dataset was named 'ALOS World 3D (AW3D)' and has 5 m or 2.5 m grid spacing, derived from the optical triplet stereo imagery of 2.5 m resolution. The accuracy of the DSM was confirmed to be 5 m (rms) in the vertical and also 5 m (rms) in the horizontal. We then generated a low-resolution version with 1 arc-sec (approx. 30 m on the equator) grid spacing (i.e., AW3D30), which is open to the public free of charge. The dataset has since been updated with additional calibrations to improve its data quality as well as its accuracy (Takaku et al., 2017, Takaku et al., 2018). In 2019, we proceeded to another significant update, the filling of void areas, which correspond to approx. 10 % of the semi-global land coverage. The void areas are mainly distributed in the equatorial zone and in the high-latitude zones, due respectively to heavy cloud coverage over tropical rainforest areas and to snow/ice in polar areas in the source PRISM imagery. In 2020, we completed the filling process for all areas except Antarctica by using other open-access digital elevation models (DEMs) such as the Shuttle Radar Topography Mission (SRTM) DEM (Rodriguez et al., 2006), the Advanced Spaceborne Thermal Emission and Reflection Radiometer Global DEM (ASTER GDEM) (NASA et al., 2019), and ArcticDEM (Porter et al., 2018) (Takaku et al., 2020).
In Antarctica, more than half of the whole continent, including large ice shelves, is void due to the heavy cloud and snow/ice coverage in the PRISM source imagery. In this paper, we report on the filling of the remaining voids in Antarctica to complete the void-free semi-global AW3D30 dataset.

INPUT DATASETS
We mainly used the Reference Elevation Model of Antarctica (REMA) DSM version 1 (Howat et al., 2019) for the void-filling in Antarctica. The data were derived from optical stereo imagery of high-resolution (~ 0.5 m) commercial satellites, i.e., the WorldView series and GeoEye-1. The dataset covers 98% of the landmass up to 88°S. We used the mosaic tiles of 8 m grid spacing after a preliminary inter-comparison with the AW3D30, which is explained in the next section. The TerraSAR-X add-on for Digital Elevation Measurement (TanDEM-X, TDX) 90m DEM (Wessel et al., 2018) was used as the secondary dataset. The data were generated by interferometric processing of the X-band bistatic radar onboard the twin satellites. It has a grid spacing of 3 arc-secs as the low-resolution version of the original data at 0.4 arc-sec grid spacing, covering all global land areas from pole to pole. These data were also compared with the AW3D30 before the filling process. The ASTER GDEM ver.3 was used as the third priority. The data were generated from optical stereo imagery of 15 m resolution, covering land areas between N83° and S83° with 1 arc-sec grid spacing. In addition to the existing DEM datasets, we optionally generated additional DSMs for the void-filling from PRISM imagery with over 30 % cloud cover, whereas the original AW3D was generated only from imagery with less than 30 % cloud cover.

DEM INTER-COMPARISON
Inter-comparisons between the AW3D30 DSM and the REMA DSM, and between the AW3D30 DSM and the TDX 90m DEM, were performed before we proceeded to the void-filling. The periods of source data in the AW3D30 DSM, the REMA DSM, and the TDX 90m DEM are 2006 to 2011, 2009 to 2017, and 2010 to 2015, respectively. We selected two sample areas with different types of terrain where all three DSMs include sufficient valid data. One is located at 77°-78°S/160°-164°E and includes relatively steep mountainous terrain in the height range of approx. 0 to 3000 m (steep-mountain-area), while the other is located at 80°-81°S/146°-152°E and includes relatively flat terrain on an ice sheet in the height range of approx. 1400 to 2200 m (flat-ice-sheet-area). Figure 1 shows the AW3D30 DSMs of the two sample areas, which originally consist of four and six 1°x1° tiles respectively in geodetic latitude/longitude coordinates, projected on Polar Stereographic (PS) coordinates. In the calculation of the relative height differences, the REMA DSM in 8 m grids was down-sampled into the 1 arc-sec grids of the AW3D30 by averaging, while the TDX 90m DEM in 90 m grids was up-sampled into the same grids by bilinear interpolation. The WGS84 ellipsoidal heights of the REMA DSM and of the TDX 90m DEM were converted to the EGM96 orthometric heights used by the AW3D30 DSM. Figure 2 shows the spatial distributions of the relative height differences of the REMA DSM and of the TDX 90m DEM from the AW3D30 in the two sample areas. Figure 3 shows their histograms with a bin-width of 1 m as well as the results of Gaussian curve fitting on them. Table 1 shows their statistics, including the means and standard deviations estimated from the Gaussian fits.
In both areas, we identified negative mean differences for all compared DSMs, which means that the heights of the AW3D30 are higher than the others. They are approx. -3 m to -6 m in the TDX 90m DEM and no more negative than approx. -1 m in the REMA DSM. In Fig. 2 (d), we observed systematic waving patterns for the TDX 90m DEM that are not observed for the REMA DSM shown in Fig. 2 (b). The largest wave amplitude is approx. -50 m, and the patterns appear to be faintly correlated with the original terrain of the ice sheet. This results in the relatively large mean difference of -5.75 m in Table 1, as well as the left-skewed histogram in Fig. 3 (d). The cause is still unknown; however, the penetration of the X-band radar signal into the ice sheet is one possible cause (Abdullahi et al., 2019). In Fig. 2 (c), no such patterns are observed in the TDX 90m DEM on the mountainous terrain; however, the negative differences distributed in some flat areas are considered to derive from the same phenomenon. This results in the mean difference of -2.70 m in Table 1. The mean differences of approx. -1 m in the REMA DSM seemingly derive from temporal changes between the acquisitions of the source stereo imagery. The difference in the reference data used for the absolute height corrections is another possible cause: the Ice, Cloud, and land Elevation Satellite (ICESat) product GLA14 (Zwally et al., 2012) was used in the AW3D30, whereas the ICESat product GLA12 and the CryoSat-2 radar altimeter product were used in the REMA DSM (Howat et al., 2019). The standard deviations of the REMA DSM and of the TDX 90m DEM in the mountainous area are 19.09 m and 12.23 m respectively. These relatively large deviations are due to large outliers, with height differences of up to 1486 m, distributed in some local steep spots. We identified that they were mostly caused by blunders that were not masked as invalid in either the AW3D30 DSM or the REMA DSM and that were seemingly processed from cloudy imagery.
On the other hand, the standard deviations estimated from the Gaussian fits on the histograms, in which the effects of extreme outliers are suppressed, are 1.21 m and 2.98 m in the REMA DSM and in the TDX 90m DEM respectively, which implies that these DSMs are sufficiently compatible for the void-filling of AW3D30. In the flat ice sheet area, the standard deviations are 1.34 m and 3.95 m in the REMA DSM and in the TDX 90m DEM respectively; they differ little from the values estimated from the Gaussian fits because there are no such large outliers. The relatively large deviations in the TDX 90m DEM are due to the difference in original grid spacing as well as the systematic waving errors on the ice sheet mentioned above. In addition to the relative comparisons of the DSMs, we compared their absolute accuracies by using the ICESat-2 land-ice product ATL06 (Smith et al., 2020a) as the reference. The ATL06 provides a much denser distribution of height data on the ground surface, with samples processed at 20 m intervals in each of six ground tracks per satellite orbit, compared with the ICESat GLA14, which has 170 m intervals in only one ground track per satellite orbit. We used the ATL06 data version 3 in the second 91-day repeat cycle from 28 Dec. 2018 to 29 Mar. 2019. Figure 4 shows the distribution of ATL06 height samples on the AW3D30 DSM in the two sample areas. The numbers of valid ATL06 height samples are approx. 0.4 to 0.5 million and approx. 1.3 million in the steep mountainous area and in the flat ice sheet area, respectively. Since the height of the ATL06 is defined as the estimated surface height of the segment center for each reference point (Smith et al., 2020b), the height in each DSM to be compared is resampled at that point with bilinear interpolation. The EGM96 orthometric heights of the AW3D30 DSM are converted to WGS84 ellipsoidal heights, as used by the other datasets, before the comparison.
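The point-wise resampling and the vertical datum conversion described above can be illustrated with a minimal sketch. The function names, the grid layout (a list of rows indexed by fractional row/column), and the sample geoid undulation are assumptions made for illustration only; they are not the processing code used for the AW3D30. The conversion relies on the standard relation h = H + N between ellipsoidal height h, orthometric height H, and geoid undulation N.

```python
import math


def sample_bilinear(grid, row, col):
    """Bilinearly interpolate a height grid at a fractional (row, col).

    grid is a list of rows with at least 2x2 cells; row and col must lie
    inside the grid (0 <= row <= n_rows - 1, 0 <= col <= n_cols - 1).
    """
    # Clamp the upper-left cell so that points on the last row/column
    # still have a full 2x2 neighbourhood.
    r0 = min(int(math.floor(row)), len(grid) - 2)
    c0 = min(int(math.floor(col)), len(grid[0]) - 2)
    fr, fc = row - r0, col - c0
    top = grid[r0][c0] * (1 - fc) + grid[r0][c0 + 1] * fc
    bottom = grid[r0 + 1][c0] * (1 - fc) + grid[r0 + 1][c0 + 1] * fc
    return top * (1 - fr) + bottom * fr


def orthometric_to_ellipsoidal(orthometric_height, geoid_undulation):
    """h = H + N: add the geoid undulation (e.g. from EGM96) to H."""
    return orthometric_height + geoid_undulation


# Hypothetical 2x2 DSM patch (orthometric heights in metres).
patch = [[0.0, 10.0],
         [20.0, 30.0]]
h_orthometric = sample_bilinear(patch, 0.5, 0.5)
h_ellipsoidal = orthometric_to_ellipsoidal(h_orthometric, -55.0)
```

In practice the fractional row and column would come from projecting each reference point into the DSM grid, and the undulation from a geoid model evaluated at the same location; both steps are outside this sketch.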
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B4-2021 XXIV ISPRS Congress (2021 edition)
According to the quality information included in the ATL06 product, only the best-quality subset of the data, for which "atl06_quality_summary" is zero, was used after discarding the data whose total vertical geolocation error indicated by "sigma_geo_h" exceeds 5 m. Figure 5 shows histograms of the height errors of the three different DSMs from the ATL06 data with a bin-width of 1 m in the two sample areas, while Table 2 shows their statistics, including the means and standard deviations estimated from the Gaussian fits on the histograms. Figure 6 shows a comparison of the height profiles of the ATL06 and the three DSMs along the ATL06 ground track depicted in green in Fig. 4 (b). The mean errors are approx. 1.0 m to 1.5 m and -0.1 m to 0.5 m in the AW3D30 DSM and in the REMA DSM respectively in the two sample areas. These differences are also considered to derive from the difference in the absolute height references used in their data processing. The large standard deviations of 8.51 m and 11.90 m in these two DSMs in the steep area likewise derive from the large outliers. As statistics that suppress those outliers, their 90th percentile linear errors (LE90) are 2.75 m and 1.57 m, and their standard deviations from the Gaussian fits on the histograms are 0.96 m and 0.82 m. In the flat ice sheet area, the standard deviations are 1.41 m and 1.23 m in these two DSMs, which indicates that both have sufficient relative accuracy to be merged. In the TDX 90m DEM, negative mean errors of -1.48 m and -4.40 m are also observed in the two sample areas. 
The cause is seemingly the same as the one mentioned in the comparison with the AW3D30 DSM: the ATL06 and the AW3D30 are both derived from the reflection of optical signals in visible bands, whereas the TDX 90m DEM is derived from the reflection of radar signals in the X-band. Fig. 6 confirms that only the profile of the TDX 90m DEM departs negatively from the others in places, which correspond to the waving patterns in Fig. 2 (d). This results in the relatively large LE90 of 8.87 m in the ice-sheet area.
As a result, it was confirmed that the REMA DSM and the TDX 90m DEM are compatible with the AW3D30 DSM as the first and second priority in the void-filling respectively, except for some blunders in the AW3D30 DSM or in the REMA DSM in local steep areas and the waving errors in the TDX 90m DEM on ice sheets.
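The ATL06 sample selection and the LE90 statistic used in this comparison can be sketched as follows. The field names "atl06_quality_summary" and "sigma_geo_h" are those quoted above; the dictionary layout of a sample, the function names, and the nearest-rank percentile on absolute errors are assumptions for illustration, since the paper does not state how the percentile was computed.

```python
def select_best_quality(samples, max_sigma_geo_h=5.0):
    """Keep samples flagged as best quality with geolocation error <= 5 m."""
    return [
        s for s in samples
        if s["atl06_quality_summary"] == 0
        and s["sigma_geo_h"] <= max_sigma_geo_h
    ]


def le90(errors):
    """90th percentile linear error, taken here on absolute height errors.

    Uses the nearest-rank definition: the smallest value such that at
    least 90 % of the absolute errors are less than or equal to it.
    """
    abs_sorted = sorted(abs(e) for e in errors)
    rank = (90 * len(abs_sorted) + 99) // 100  # integer ceil(0.9 * n)
    return abs_sorted[rank - 1]


# Hypothetical samples: only the first one passes both criteria.
samples = [
    {"atl06_quality_summary": 0, "sigma_geo_h": 0.3},
    {"atl06_quality_summary": 1, "sigma_geo_h": 0.3},
    {"atl06_quality_summary": 0, "sigma_geo_h": 7.5},
]
kept = select_best_quality(samples)
```

Because LE90 depends only on the ordering of the absolute errors, a few extreme blunders in the top 10 % of the samples do not affect it, which is why it is reported alongside the standard deviation.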
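The outlier-resistant means and standard deviations quoted in this section come from Gaussian fits on 1 m histograms. The paper does not state the fitting method; the sketch below is one possible approach, assumed here for illustration: a weighted least-squares parabola fitted to the logarithm of the bin counts, restricted to bins near the peak. The function name and the `min_fraction` threshold are hypothetical.

```python
import math


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_gaussian_to_histogram(centers, counts, min_fraction=0.05):
    """Return (mean, sigma) of a Gaussian fitted to histogram counts.

    ln(count) = a + b*u + c*u**2 is a parabola for a Gaussian, with
    sigma = sqrt(-1 / (2c)) and peak at u = -b / (2c). Bins below
    min_fraction of the peak are ignored so that sparse outlier bins
    do not influence the fit (an assumption of this sketch).
    """
    peak = max(counts)
    pts = [(x, math.log(n), n) for x, n in zip(centers, counts)
           if n > 0 and n >= min_fraction * peak]
    # Centre the abscissa on the weighted mean for better conditioning.
    x0 = sum(x * w for x, _, w in pts) / sum(w for _, _, w in pts)
    s = [0.0] * 5  # sums of w * u**k, k = 0..4
    t = [0.0] * 3  # sums of w * u**k * y, k = 0..2
    for x, y, w in pts:
        u = x - x0
        for k in range(5):
            s[k] += w * u ** k
        for k in range(3):
            t[k] += w * u ** k * y
    m = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    d = _det3(m)
    coeffs = []
    for i in range(3):  # Cramer's rule on the normal equations
        mi = [row[:] for row in m]
        for r in range(3):
            mi[r][i] = t[r]
        coeffs.append(_det3(mi) / d)
    _, b, c = coeffs
    if c >= 0:
        raise ValueError("histogram is not peaked; Gaussian fit failed")
    return x0 - b / (2 * c), math.sqrt(-1 / (2 * c))


# Synthetic 1 m histogram: Gaussian (mean -1.0 m, sigma 1.5 m) plus an
# isolated outlier bin at +15 m that the fit should ignore.
centers = [float(x) for x in range(-20, 21)]
counts = [1000.0 * math.exp(-(x + 1.0) ** 2 / (2 * 1.5 ** 2)) for x in centers]
counts[centers.index(15.0)] = 3.0
mean_fit, sigma_fit = fit_gaussian_to_histogram(centers, counts)
```

A nonlinear least-squares fit of the Gaussian curve itself would be an equally valid choice; the log-parabola form is used here only because it needs no external library.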

SEA MASK CORRECTION
The sea areas in the AW3D in Antarctica were originally masked by using an existing global water-body dataset in the public domain, i.e., the Global Self-consistent, Hierarchical, High-resolution Shoreline Database (GSHHS) (Wessel et al., 1996). However, we found some inconsistencies between the original sea masks and the source PRISM imagery. Therefore, we replaced the GSHHS with the OpenStreetMap (OSM) coastlines (OpenStreetMap contributors, 2019), following the preceding void-filling process in other global areas of AW3D30 (Takaku et al., 2020). Large ice shelves were added as valid areas in the replacement. The additional voids generated by the replacement were also filled in the void-filling process. Figure 7 shows a comparison between the original GSHHS and the OSM coastlines depicted on the Landsat image mosaic of Antarctica (LIMA) (Bindschadler et al., 2008).

VOID FILLING
The void-filling process is basically the same as the one used in the preceding filling process for other global areas of AW3D30 (Takaku et al., 2020), except for the input external open-access datasets. Figure 8 shows the flow of the void-filling process in Antarctica. It is first applied to the original AW3D version of 0.15 arc-sec (5 m on the equator) grid spacing in 1°x1° tiles. The grid spacing is reduced to the 1 arc-sec of the AW3D30 products after all filling processes are completed. We applied the "delta surface fill" (DSF) method (Grohman et al., 2006), which fills the voids while smoothing the height gaps at the boundaries between the original and the filling data, without any change to the original data. The adaptive filtering process, which eliminates blunders in the original AW3D DSM, is also applied after its filtering parameters, i.e., the thresholds of height difference from the reference DSMs, the number of stacks in AW3D, and the minimum distance from the nearest cloud masks, were calibrated for the new input dataset of the REMA DSM. The results of the DSF based on the priorities of the input DEMs, and of the subsequent adaptive filtering, are checked tile by tile to detect artifacts in the filled areas. If an obvious artifact is detected, the filling process is retried after excluding the corresponding input DEM. For the voids remaining after all existing DEMs have been applied, we optionally fill them by a simple interpolation with the inverse distance weight (IDW) method, depending on the condition of the void, e.g., its size, shape, and surrounding terrain. The result of the interpolation is checked manually for each void segment to decide whether to accept it.
Figure 7. Comparison of coastlines in Antarctica on Center-Filled LIMA (Bindschadler et al., 2008). Red: GSHHS, Green: OSM coastlines.
Figure 9 shows the void-filled AW3D30 DSM ver.3.1, which consists of 4814 1°x1° tiles in geodetic latitude/longitude coordinates, as well as the distribution of its source open-access datasets in Antarctica, projected on PS coordinates. The index flags that indicate the source datasets used in the void-filling are stored in the ancillary mask files of the AW3D30 data products. Table 3 shows the proportion of each source dataset in all valid data. The original coverage of AW3D30 was limited to 82°S due to the sun-synchronous orbit of the satellite, while the remaining range up to 84°S was mostly filled with the REMA DSM, matching the northern latitude limit of 84°N. As a result, a total of 98.7% of all valid areas in Antarctica were filled with either the AW3D original or the REMA DSM, with respective rates of approx. 42.7% and 56.0%. The TDX 90m DEM was mainly used in the remaining areas of approx. 0.9% where both the AW3D and the REMA DSM are void. Other DSMs and the optional manual interpolation with the IDW method were used at rates of less than 0.21%.
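The idea behind the delta surface fill can be illustrated with a heavily simplified sketch. This is not the implementation used for the AW3D30 and it omits details of the method of Grohman et al. (2006): here the delta (primary minus fill) is taken at every cell where both datasets are valid and is spread into the voids by inverse distance weighting, whereas the actual processing also involves priorities among several input DEMs, adaptive filtering, and manual checks. The function name, the use of `None` for void cells, and the `power` parameter are assumptions for illustration.

```python
def delta_surface_fill(primary, fill, power=2.0):
    """Fill None cells of primary with fill + interpolated delta.

    primary, fill: equally sized grids (lists of rows); None marks a void.
    Valid cells of primary are returned unchanged.
    """
    rows, cols = len(primary), len(primary[0])
    # Height offsets where both surfaces are valid.
    deltas = [
        (r, c, primary[r][c] - fill[r][c])
        for r in range(rows) for c in range(cols)
        if primary[r][c] is not None and fill[r][c] is not None
    ]
    if not deltas:
        raise ValueError("no overlap between primary and fill data")
    out = [row[:] for row in primary]  # copy; the input is not modified
    for r in range(rows):
        for c in range(cols):
            if primary[r][c] is None and fill[r][c] is not None:
                num = den = 0.0
                for rr, cc, d in deltas:
                    # Inverse distance weight; the distance is never zero
                    # because void cells are not part of deltas.
                    w = 1.0 / ((r - rr) ** 2 + (c - cc) ** 2) ** (power / 2)
                    num += w * d
                    den += w
                out[r][c] = fill[r][c] + num / den
    return out


# Hypothetical 3x3 tile: the fill DEM sits 3 m below the primary DSM.
primary = [[13.0, 14.0, 15.0],
           [14.0, None, 16.0],
           [15.0, 16.0, 17.0]]
fill = [[10.0, 11.0, 12.0],
        [11.0, 12.0, 13.0],
        [12.0, 13.0, 14.0]]
filled = delta_surface_fill(primary, fill)
```

In this example the void cell receives the fill height shifted by the local offset (12 m + 3 m), so no step appears at the void boundary, while the original cells keep their values.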

VALIDATION
The absolute height accuracies of the AW3D30 ver. 3.1 in Antarctica were validated with both the ICESat GLA14 and the ICESat-2 ATL06 for each source DEM dataset used in the filling process. Although the GLA14 is not strictly independent for the validation, because it was used to correct the absolute height in the processing of the original AW3D, we used it as a reference. We used all available GLA14 datasets in the period from 20 Feb. 2003 to 11 Oct. 2009, whereas for the ATL06 we used the same datasets as mentioned in section 3, which were acquired in the period from 28 Dec. 2018 to 29 Mar. 2019 and include all 1387 reference ground tracks in the 91-day repeat cycle. In the calculation of the height difference between the GLA14 and the AW3D30 DSM, the heights in the 1 arc-sec (30 m on the equator) grids of the AW3D30 under the ICESat footprint of 70 m in diameter were averaged. The samples for which the standard deviation of the averaged heights exceeds 5 m were omitted because those in steep/rough terrain may be less reliable (Huber et al., 2009). In the comparison, the samples whose height errors exceed +/-100 m were regarded as outliers due to cloud reflections, saturated waveforms, or other anomalies in the GLA14 and were excluded from the results (Carabajal et al., 2006). The calculation of the height difference between the ATL06 and the AW3D30 DSM is the same as mentioned in section 3. Table 4 shows the statistics of the height errors from the GLA14 and from the ATL06 for each source dataset. Figure 10 shows their histograms with a bin-width of 1 m for their respective total samples. Figure 11 shows the distributions of the mean error from the GLA14 and from the ATL06 in each 1°x1° tile of the AW3D30 DSM. 
Note that these statistics do not represent the relative difference of the absolute accuracies among the different datasets, because the numbers as well as the areas of the samples differ among them. The total numbers of samples are approx. 122 million and approx. 838 million for the GLA14 and for the ATL06 respectively, where the latter is approx. seven times larger than the former thanks to its high sampling density on the ground. In the results with the GLA14, the values of the means, standard deviations, and LE90s for both the AW3D30 original and the REMA DSM, which account for 98.7% of all areas, are all less than 2.5 m. They are sufficiently consistent with the specification accuracy of the AW3D (5 m, rms). The mean of the AW3D30 original is approx. 1 m higher than that of the REMA DSM. This corresponds to the spatial distribution of mean errors shown in Fig. 11 (a) and to the result of their inter-comparison in the sample areas mentioned in section 3. The mean and standard deviation of the TDX 90m DEM are -3.54 m and 8.41 m, respectively, seemingly due to its limited distribution on steep terrain as well as the penetration of the radar signal into ice sheets. Other DSMs have relatively large errors, i.e., 23 m to 24 m in standard deviation, due to the small number of samples in difficult terrain as well as the lack of source imagery for generating DSMs of sufficient quality. As the result, the values of mean, standard deviation, and LE90 of total samples in the
In the results with the ATL06, the trends are almost the same as those with the GLA14. However, the means for both the AW3D30 original and the REMA DSM are approx. 1 m higher than those from the GLA14. They are reflected in their spatial distributions shown in Fig. 11 (a) and (b). One possible cause is the temporal change of the absolute heights on the ice sheets during the time gap of at least nine years between the GLA14 and the ATL06. 
Apart from the mean errors, the standard deviations as well as the LE90s are larger, except for the REMA DSM and the TDX 90m DEM. They derive from large outliers in some local steep terrain, mainly located near the coast of West Antarctica and depicted in dark red in Fig. 11 (b), as mentioned in section 3. They result in the maximum and minimum errors of 232.74 m and -481.80 m respectively for the AW3D30 original in Table 4. It seems that those outliers were excluded in the comparison with the GLA14 by the filtering process applied to exclude the samples affected by data anomalies in the GLA14. Therefore, the results with the ATL06 are relatively more reliable than those with the GLA14. Despite including these large outliers, the values of the mean, standard deviation, and LE90 are 0.55 m, 4.51 m, and 2.45 m respectively for the total samples of the ATL06, and are likewise sufficiently consistent with the specification accuracy of 5 m in rms for the AW3D30.
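The GLA14 comparison rules described in this section (footprint averaging, rejection of rough footprints, and rejection of gross errors) can be summarized in a short sketch. The thresholds of 5 m and +/-100 m are those stated above; the function name, the use of the population standard deviation, and the sign convention (DSM minus reference) are assumptions for illustration.

```python
import statistics


def footprint_error(dsm_heights, reference_height,
                    max_std=5.0, max_abs_error=100.0):
    """Height error of a DSM against one altimeter footprint, or None.

    dsm_heights: DSM cell heights falling inside the footprint
    (70 m in diameter for ICESat GLA14).
    Returns None when the sample is rejected.
    """
    # Rough or steep terrain inside the footprint: omit the sample.
    if statistics.pstdev(dsm_heights) > max_std:
        return None
    error = statistics.mean(dsm_heights) - reference_height
    # Gross differences are attributed to anomalies in the reference.
    if abs(error) > max_abs_error:
        return None
    return error


# Hypothetical footprints.
flat = footprint_error([100.0, 101.0, 99.0, 100.0], 98.5)
steep = footprint_error([100.0, 120.0, 90.0, 110.0], 105.0)
cloud = footprint_error([100.0, 100.0, 100.0, 100.0], 250.0)
```

Note that the +/-100 m rule cannot distinguish a reference anomaly from a genuine DSM blunder, which is consistent with the observation above that large DSM outliers visible against the ATL06 are absent from the GLA14 results.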

CONCLUSION
The updates of the AW3D30 global DSM dataset with other open-access datasets in Antarctica were presented. The voids in the original dataset, which correspond to approx. 57 % of the whole continent including large ice shelves in Antarctica, were filled with existing open-access DEM datasets that were prioritized through inter-comparisons among them. The absolute accuracies of the void-filled dataset were validated for each source dataset with both the ICESat and the ICESat-2 global point cloud references. The results showed that the accuracies of the void-filled areas are sufficiently consistent with the original AW3D30 dataset except for some limited areas in extreme terrain. In future work, we will continue to update the dataset for better quality and accuracy of the AW3D DSM by detecting and rectifying the blunders remaining in some limited areas. We also plan to apply new DSM datasets, including one generated from the cross-track stereo imagery of the ALOS-3, a follow-on satellite to the optical sensors onboard the ALOS (Takaku et al., 2019).