SPATIAL DISAGGREGATION OF LANDSAT-DERIVED LAND SURFACE TEMPERATURE OVER A HETEROGENEOUS URBAN LANDSCAPE USING PLANETSCOPE IMAGE DERIVATIVES

Satellite-derived land surface temperature (LST) is frequently utilized to characterize the intensity of urban heat island (UHI) effect in highly urbanized and rapidly urbanizing cities. However, current spaceborne thermal sensors cannot capture temperature variations within heterogeneous urban landscapes at finer scales due to its coarse spatial resolution. This study aims to apply RegressionKriging (RK) method to downscale a 30-meter Landsat-derived LST to 3 meters using different PlanetScope image derivatives. To avoid multicollinearity, exploratory regression was performed to reduce the initial set of 16 indices to 7 explanatory variables, namely, Enhanced Vegetation Index (EVI), Modified Soil-Adjusted Vegetation Index (MSAVI), Normalized Pigment Chlorophyll Ratio Index (NPCRI), Visible Green-based Built-up Index (VgNIR-BI), Mean, Entropy, and Homogeneity. Ordinary Least Squares (OLS) regression was applied to fit the models and the residuals of the best performing models were interpolated using Ordinary Kriging technique and added back to the downscaled LST. The model with the highest accuracy was obtained using the combination of MSAVI, EVI, and Mean, with an R of 0.75 and RMSE of 1.12°C, 0.58 °C, 0.80 °C, and 1.45 °C in estimating the LST of built-up, bare soil, vegetation, and water classes, respectively. The results indicate that the inclusion of textural features in the regression could improve model accuracy without increasing the variance of coefficient estimates. Moreover, RK method (RMSE = 1.10 – 1.16 °C) was proven to be a reliable downscaling technique because it redistributes the spatial variability of LST that were not preserved in the OLS regression (RMSE = 1.60 – 1.75 °C).


INTRODUCTION
The Philippines is considered as one of the fastest urbanizing countries in Asia, and it is projected to increase in the succeeding years due to rapid population growth and internal migration (World Bank, 2017). Although a high degree of urbanization could signify improved living standards, it could also have negative impacts on the thermal environment, one of which is the occurrence of the urban heat island (UHI) phenomenon. Intensified demand for urban expansion leads to the conversion of green spaces into dense built-up areas made of low-albedo materials. This has resulted to significantly higher land surface temperatures (LST) in developed regions compared to its surrounding rural areas (Yang et al., 2016). UHI has adverse impacts on the general situation in highly urbanized and rapidly urbanizing cities such as increased energy consumption and thermal discomfort (O'Malley et al., 2014).
To assist in mitigating the UHI effect, many studies have utilized LST images derived from satellite imageries with coarse spatial resolution such as Landsat OLI/TIRS (30 m, 100 m), Sentinel-3 SLSTR (1 km), and MODIS (1 km) to characterize its spatiotemporal variation. Zhou et al. (2018) analysed previous researches and observed that more than 70% of UHI studies have utilized Landsat and MODIS images, primarily because of its temporal resolution and availability. However, these satellite images are only capable of capturing generalized spatial information, whereas high-resolution data are needed to be able to observe detailed land cover changes and LST variations in urban landscapes with high heterogeneity. In order to obtain more accurate thermal information, spatial disaggregation or downscaling of LST layers can be applied to improve the spatial resolution of thermal images.
Several downscaling methods have been developed by previous studies, such as the Disaggregation Procedure for Radiometric Surface Temperature (DisTrad) method (Kustas et al., 2003) and the Temperature Sharpening (TsHARP) model (Agam et al., 2007), which both assume that there is an inverse linear relationship between LST and vegetation cover. The original methods involve the regression between low-resolution LST and vegetation metrics, i.e. Normalized Difference Vegetation Index (NDVI) for DisTrad and Fractional Vegetation Cover (FVC) for TsHARP, and the addition of the residual image to the downscaled LST. Although satisfactory results were obtained by these techniques in disaggregating LST over agricultural regions, it may not be applicable to complex urban landscapes which consist of various land cover types (Bala et al., 2018). Hence, the relationship of LST with different spectral and textural indices which can accurately characterize the spatial heterogeneity of urban areas should be considered in the downscaling process. Moreover, adding back the residual image derived from the difference between the original LST layer and the upscaled fine-resolution LST layer often results to an artificial box-like effect. To address this, a study by Mukherjee, Joshi, and Garg (2015) recommended to apply Regression-Kriging (RK) technique in spatial downscaling.
The main objective of this study is to spatially disaggregate 30meter Landsat-derived land surface temperature to finer spatial resolution using different biophysical indices. Specifically, it aims to: (1) assess the LST prediction capability of different combinations of vegetation indices, built-up indices, and graylevel co-occurrence matrix (GLCM) textural images derived from 3-meter PlanetScope images, and (2) evaluate the performance of RK technique in downscaling moderate-resolution Landsat LST. The results of this study could aid in further improving the spatiotemporal analysis of LST, UHI, and various urban land use and land cover change scenarios.

Study Site
Located at the southeastern edge of Panay Island, Iloilo City (10°45' N, 122°33' E) is a highly urbanized city which is considered as the regional center of the Western Visayas region. Based on the Köppen climate classification system, the city has a tropical wet and dry season with hot dry months from March to May. Its average monthly air temperature ranges from 30 °C (January) to 33.1 °C (April and May). The city was selected as the area of study because of its heterogenous urban composition as shown in Figure 1. Its city proper has a very high built-up density, consisting of both Spanish-era type heritage buildings and contemporary ones. Moreover, other parts of the city are composed of agricultural lands, rivers, fishponds, mangroves, and other types of vegetation, which contribute to its complex land cover characteristics.

Landsat Imagery
Landsat 8 satellite is equipped with Operational Land Imager (OLI) sensor that captures visible up to short-wave infrared images with a spatial resolution of 30 meters, and Thermal Infrared Sensors (TIRS) that can provide two bands of thermal images with 100-meter resolution. In this study, a pre-processed Landsat OLI/TIRS satellite image acquired on May 14, 2019, covering Iloilo City was obtained from USGS Earth Explorer and Google Cloud Storage (Figure 2a).

PlanetScope Imagery
PlanetScope is a commercial Earth observation satellite constellation that acquires daily satellite images with resolution of 3 meters. Although superior in terms of spatial resolution, it consists of only four spectral bands, namely, Blue (455 -515 nm), Green (500 -590 nm), Red (590 -670 nm), and Near-Infrared (780 -860 nm) bands. Three pre-processed Planetscope Analytic Ortho Tiles acquired on May 18, 2019 covering Iloilo City were downloaded from PlanetLabs (www.planet.com), taking into account the cloud cover and the Landsat data acquisition date. These images were mosaicked prior to the downscaling procedure and clipped to the study area ( Figure 2b).

Figure 2. (a) Landsat OLI and (b) PlanetScope True Color
Image of the study site

Methodology
The methodology of the study consists of three general steps: (1) Landsat LST Retrieval, (2) Generation of PlanetScope Derivatives, and (3) Regression-Kriging Method.

Landsat LST Retrieval
To derive land surface temperature from the Landsat image, the study adopted the LST retrieval method developed by Jeevalakshmi et al. (2017) and implemented it using Google Earth Engine (GEE), a cloud-computing platform for geospatial applications. The method is divided into several steps. Digital

(a) (b)
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B5-2020, 2020 XXIV ISPRS Congress (2020 edition) numbers (DN) from the raw satellite data were first converted to at-sensor spectral radiance (Lλ) using the equation: where Lmax is the maximum radiance, Lmin is the minimum radiance, Qcal is the DN value of pixel, Qcalmax is the maximum DN value, Qcalmin is the minimum DN value, and Oi is the correction value for band 10. Next, the thermal band (Band 10) is converted to at-sensor brightness temperature (BT) or the temperature of an equivalent blackbody of a target object that is detected by the thermal sensor, using the formula: where K1 and K2 are thermal constants found in the metadata of the image. Normalized Difference Vegetation Index (NDVI) was then calculated using the red and near-infrared bands of the image using the formula in Table 1. From this, Proportional Vegetation (Pv) values were calculated to approximate the proportion of vegetation and bare soil per image pixel using the equation: This study used an NDVIs value of 0.2 and an NDVIv of 0.5, similar to the values applied by Jeevalakshmi et al. (2017) for global conditions using Landsat OLI images. Land Surface Emissivity (LSE), which serves as a scaling factor for the blackbody radiance, was computed for each pixel. LSE is dependent on the surface roughness and vegetation cover, and its formula varies depending on its NDVI value. The formulas for the NDVI Threshold Method (NTM) are shown below: where ελ is land surface emissivity, εsλ is soil emissivity, εvλ is vegetation emissivity, and Cλ is the surface roughness (0.005). Land surface temperature is then estimated using the equation: where Ts is the land surface temperature (°C), λ is the average wavelength of thermal band, and ρ = 1.438 x 10 -2 mK.

PlanetScope Image Derivatives
Since the sensitivity of different indices in predicting LST varies depending on the land cover composition and the heterogeneity of the urban landscape, the study tested a total of 16 spectral and textural derivatives extracted from the PlanetScope image. Spectral derivatives, which include vegetation and built-up indices, are proven to be effective relative measures of the amount of green vegetation and impervious surfaces, respectively. Meanwhile, textural derivatives are capable of distinguishing between urban cover types which exhibit similar spectral characteristics such as bare soil and built-up areas. The combination of spectral and textural information in characterizing complex urban areas could possibly improve the spatial disaggregation of the coarseresolution LST.

Spectral Indices
Eight spectral indices were calculated from the four spectral bands of the pre-processed PlanetScope image using ArcGIS 10.

Textural Indices
Gray-level Co-occurrence Matrix (GLCM) was applied to extract textural information from the high-resolution image. This method creates a matrix which describe how often the individual pairs of values in a specified spatial relationship appear within the image (Haralick et al., 1973). From this, various statistical calculations are performed to generate different textural derivatives. In this study, eight GLCM measures were calculated from each spectral band using ENVI 5.1 image analysis software: Contrast, Correlation, Dissimilarity, Entropy, Homogeneity, Mean, Second Moment, and Variance using a kernel size of 3 x 3 meters. The formulas used to calculate each feature are shown in Table 2.
To reduce the number of textural layers that were generated, Principal Components Analysis (PCA) was performed on the four bands of each GLCM index. PCA is a dimensionality reduction technique which orthogonally transforms a large (1) (3) The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B5-2020, 2020 XXIV ISPRS Congress (2020 edition) dataset into a smaller set of linearly uncorrelated variables and selects the leading principal components which account for most of the variations in the original dataset (Liu et al., 2017). Based on the plot of its eigenvalues, the first PCA band of each set of layers were selected to represent the eight GLCM textural features. These layers were also resampled to the spatial resolution of the Landsat-derived LST.

Exploratory Regression
10% of the aggregated pixels (19,223 points) were randomly selected and divided into training (80%) and validation set (20%). It was ensured that both sets of points were evenly distributed throughout the image and each general land cover type (built-up, vegetation, bare soil, water) was wellrepresented. Then, stepwise multilinear regression via backward elimination was performed on the 16 explanatory variables to remove redundant and statistically insignificant predictors. Exclusion of candidate predictors was based on its Percentage of Variable Significance (%Significance) and Maximum Variance Inflation Factor (VIF). The combinations of significant predictors with the highest Adjusted Coefficient of Determination (R 2 ), Akaike Information Criterion (AIC), and lowest VIF were selected for the next step of the spatial downscaling process.

Regression-Kriging Method
Regression-Kriging (RK) is a hybrid spatial interpolation technique which combines linear regression and kriging interpolation. This study adopted the method implemented by Mukherjee, Joshi, and Garg (2015) which applied Ordinary Least Squares (OLS) Regression to estimate the downscaled LST based on the predictive variables, and Ordinary Kriging (OK) to interpolate the OLS residuals before resampling and adding it back to the generated high-resolution LST layer.

Ordinary Least Squares (OLS) Regression
OLS Regression is a least squares regression method which estimates the linear relationship between the independent and dependent variables by minimizing the sum of the squares of the residuals (Butler, 1999). The method was implemented in the study using ArcGIS 10.0; each of the selected combinations of spectral and textural indices were set as the explanatory variables while the 30-meter LST was set as the dependent variable.

Ordinary Kriging (OK) Interpolation
OK is a geostatistical technique which predicts the semivariance of the dependent variable based on the assumption that the distance between sampling points are spatially correlated and the unknown mean is constant over the search neighborhood of each estimation point (Farmer, 2016). The LST residuals resulting from OLS Regression were used as input in the Ordinary Kriging tool in ArcGIS 10.0. The semivariograms of each set of residuals were first inspected before selecting the appropriate model parameters for the interpolation. The output residual surfaces were then added to the OLS-predicted LST rasters to obtain the final downscaled LST layers.

Accuracy Assessment
Since there is no high-resolution reference LST layer to be compared with, the PlanetScope-derived LST images were upscaled to 30 meters and the root-mean-square error (RMSE) and mean error (ME) of each model per land cover type were calculated relative to the Landsat LST values at each validation point. The best Regression-Kriging model was selected based on its Adjusted R 2 , AIC, RMSE, and ME.

Selection of Significant Predictive Variables
9 spectral and textural indices (ARVI, IPVI, NDVI, OSAVI, BAI, Contrast, Correlation, Dissimilarity, Second Moment, and Variance) were removed based on the results of the stepwise linear regression due to inconsistent significance and high multicollinearity with other variables.
The table below shows the summary of the %Significance and VIF of the remaining explanatory variables. A high positive %Significance indicates a consistently strong positive relationship while a high negative %Significance means that a stable negative relationship exists among variables. Meanwhile, a higher VIF value implies that there are two or more redundant predictors. All of the remaining variables have %Significance of above 90% and relatively lower VIF compared to the excluded indices and were included in the regression. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII- B5-2020, 2020XXIV ISPRS Congress (2020 This contribution has been peer-reviewed. https://doi.org/10.5194/isprs-archives-XLIII-B5-2020-115-2020 | © Authors 2020. CC BY 4.0 License.

Selection of OLS Models
The following tables display the five best OLS models which used only spectral indices (Table 4), and combination of spectral and textural indices (Table 5), as exploratory variables. Only two-variable and three-variable models were considered in the analysis since the addition of another predictor did not significantly improve the R 2 and considerably increased the VIF of the resulting models.

Model
Adj Adjusted R 2 represents the amount of variability in the dependent variable that can be explained by the regression model while AIC is a relative measure of the balance between fitting the dataset and model complexity. Based on the tables above, all of the models obtained satisfactory results despite using only 10% of the pixels as training and validation samples. Higher Adjusted R 2 and lower AIC were calculated for the models which combined both spectral and textural derivatives compared to the models which only used spectral indices, indicating better prediction performance. In addition, inclusion of GLCM textural features addressed multicollinearity issues that can be observed in the models having three spectral indices as independent variables (VIF > 60). Based on the three criteria, 6 out of 10 regression models were selected for the next phase of the Regression-Kriging method.

Comparison Between OLS and RK Models
Comparing the calculated RMSE for each model using two different methods as shown in Figure 4, it can be observed that all of the RK models (RMSE = 1.10 -1.16 °C) performed considerably better than the OLS models (RMSE = 1.60 -1.75 °C). This implies that redistributing the LST residuals using Ordinary Kriging and combining it with the OLS surface significantly improves the accuracy of the models, since the interpolated residuals preserve the spatial variability of LST that were not captured by the spectral and textural indices.  Generally, a positive bias (ME = 0.02 -0.24) in predicting the LST of soil pixels can be observed based on the mean error, which means that the models tend to overestimate the downscaled LST values of areas with bare soil. On the contrary, negative values (ME = -0.05 --0.11) were calculated for the water pixels, indicating underestimation. In predicting the surface temperature of built-up areas, the four regressionkriging models with EVI and VgNIR-BI as its explanatory variables generally overestimated the actual values (ME = 0.01 -0.09), while the other two models have underestimated it (ME = -0.01 --0.06). Meanwhile, all of the models (ME = -0.06 -- Despite this, the MSAVI/EVI/Mean RK model was selected as the best model to improve the resolution of the Landsat-derived LST to 3 meters since the need for downscaling the surface temperature of water is not as critical as in the case of built environments and its RMSE in water is acceptable considering the limitation on the available spectral bands of the PlanetScope image and the unique thermal properties of water bodies compared to the other three land cover classes. Moreover, it is the most visually accurate model relative to the reference Landsat LST layer. For example, in other models, blue-painted building roofs appear to be a cold spot (Figures 7c, 7d, 7e, 7g, and 7h), possibly because its RGB properties are quite similar with water bodies. This is not the case with the MSAVI/EVI/Mean model (Figure 7f), which estimated that the LST of roofing materials are higher than vegetation and water during daytime, regardless of its color. The sensitivity of MSAVI and EVI in detecting variations in vegetation and soil abundance, was complemented by the textural information provided by GLCM mean, resulting to better accuracy. Comparing Figures 8 and 9, it can be observed that the spatial distribution and patterns of LST within the study site are visually similar. Different features such as the boundaries between land covers, building edges, major and minor roads, river channels, fishponds, and croplands are better discriminated in the downscaled LST map. Based on visual and statistical analysis, it can be concluded that the methodology can be applied in complex urban landscapes and could yield accurate results. One issue regarding the result of the study is the large deviation in its maximum and minimum values, relative to the original layer. Most of the building roofs were overestimated by the MSAVI/EVI/Mean model, its maximum value being 46.82 °C, while the highest Landsat LST value is only 37.79 °C. Meanwhile, some water pixels were underestimated probably because the indices used in the model are less sensitive to the water surface temperature variations. However, it cannot be concluded that these differences indicate that the predicted LST is erroneous because the study was not able to perform model validation using in-situ surface temperature data or a 3-meter reference LST. This limitation can be addressed in future researches through ground validation or by obtaining a highresolution reference thermal image.

CONCLUSION AND RECOMMENDATIONS
This study aimed to downscale a 30-meter Landsat-derived land surface temperature to higher spatial resolution using different indices derived from a 3-meter PlanetScope image. Multicollinearity of variables was addressed by the study by including textural indices derived using gray-level-cooccurrence matrix (GLCM), which improved the model fit without increasing the variance of the coefficient estimates. Regression-Kriging (RK) method was applied to the significant variables to determine the best set of predictors that will result to the most accurate fine-resolution LST.
Based on the results of the study, the MSAVI/EVI/Mean model with an R 2 of 0.75 was selected as the final model to be used in downscaling the Landsat LST image. It performs best in predicting the surface temperature of bare soil and vegetation pixels, with RMSE of 0.58 °C and 0.80 °C, respectively, while its accuracy in estimating LST of impervious surfaces and water bodies are 1.12 °C and 1.45 °C, respectively. This suggests that the application of RK method based on the relationship between LST and the image derivatives used in the study may be applied in order to spatially disaggregate the LST of heterogeneous urban regions. However, it cannot be guaranteed that the MSAVI/EVI/Mean model will produce accurate results when applied to a different area. It is recommended to redo the methodology to come up with a specific model for a different study site since the results are highly dependent on land cover composition and the satellite image used.
For future work, it is recommended to collect in-situ LST data at different locations within the study site or acquire a highresolution thermal image of a sub-site on the day and time of Landsat satellite passage using an unmanned aerial vehicle (UAV), to serve as a reference LST layer. This could improve the assessment of model accuracy which was not conducted by the study because of logistical and time constraints.