MODELLING OF LAND SURFACE TEMPERATURE USING GRAY LEVEL CO-OCCURRENCE MATRIX AND RANDOM FOREST REGRESSION

Modelling of land surface temperature (LST) is conducted to be able to explain the spatial and temporal variations of LST using a set of explanatory variables. LST in a previous study was modelled as a linear function of vegetation cover and built up cover as quantified by the normalized difference vegetation index (NDVI) and the normalized difference built-up index (NDBI), respectively, and other variables, namely, albedo, solar radiation (SR), surface area-volume ratio (SVR), and skyview factor (SVF). SVF requires a digital surface model of sufficient resolution while SVR computation needs 3D volumetric features representing buildings as input. These inputs are typically not readily available. In addition, NDVI and NDBI do not fully describe the spatial variability of vegetation and built-up cover within an LST pixel. In this study, PlanetScope images (3m resolution) were processed to provide soil-adjusted vegetation index (SAVI) and VgNIR Built-up Index (VgNIR-BI) layers. The following gray level co-occurrence matrices (GLCM) were generated from SAVI and VgNIR-BI: Mean, Variance, Homogeneity, Contrast, Dissimilarity, Entropy, Second Moment, and Correlation. Random Forest regression was run for several cases with different combinations of GLCM features and non-GLCM variables. Using GLCM features alone yielded less satisfactory models. However, the use of additional GLCM features in combination with other variables resulted in lower MSE and a slight increase in R. Considering NDBI, NDVI, SAVI_GLCM_contrast, VgNIR-BI_GLCM_contrast, VgNIR-BI_GLCM_dissimilarity, and SAVI_GLCM_contrast only, the RF model yielded an MSE=1.657 and validation R=0.822. While this 6-variable model’s performance is slightly less, the need for DSM and 3D building models which are necessary for the generation of SVF and SVR layers is eliminated. Exploratory regression (ER) was also conducted. The best 6-variable ER model (Adj. R=0.79) consists of SVR, NDBI, NDVI, SAVI_GLCM_second_moment, VgNIR-BI_GLCM_mean, and VgNIR-BI_GLCM_entropy. In comparison, OLS regression using the 6 non-GLCM variables yielded an Adj. R=0.691. The results of RFR and ER both indicate the value of GLCM features in providing valuable information to the models of LST. LST is best described through a combination of GLCM features describing relatively homogenous areas (i.e., dominant land cover or low-frequency areas) and the more heterogenous areas (i.e., edges or high-frequency areas) and non-GLCM variables.


INTRODUCTION
As cities continue to densify and expand, urban heat islands (UHI) are increasingly becoming larger and more persistent. UHIs have adverse impacts in urban environments such as increases energy consumption and thermal discomfort (O'Malley et al., 2014, Yang et al., 2016. Surface urban heat islands are examined using land surface temperatures (LST) estimated from thermal bands of satellite systems such as Landsat, MODIS, and Sentinel-3. According to Zhou et al. (2018), more than 70% of UHI studies have utilized Landsat and MODIS images, largely due to their temporal resolution and availability.
Beyond examining the spatio-temporal variation of LST, efforts have been made in modelling or estimating LST. These are attempts to understand how LST are affected by the mix of environmental variables and to possibly estimate LST for different future scenarios (e.g., urban expansion, decrease of vegetation cover). Variables describing the distribution of land use land cover types, materials, urban morphology, and solar irradiance, among others, have been used in previous studies to model LST (see Alcantara et al., 2019, Cañete et al., 2019. The main objective of this study is to evaluate the usefulness of texture measures derived from high-resolution satellite image in estimating Landsat-derived land surface temperature (30-m resolution). Gray-level co-occurrence matrix (GLCM) textural images generated from 3-meter PlanetScope-derived vegetation and built-up layers were subjected to regression modelling together with commonly used layers (e.g., NDBI, NDVI). The results of this study could be used in understanding how spatial variation of features within a pixel affects the LST at that location.

Study Area
Situated in the northeast portion of Metro Manila, Quezon City is close to the region's major activity centers, including the Ninoy Aquino International Airport (NAIA). It is a highly urbanized city with an area of 16,112.58 hectares. Among the sixteen (16) cities and one (1) municipality in the region, It is the largest and almost one-fourth the size of Metro Manila. It also has the largest total population of over 2.9 million according to the 2015 Census of Population. Composed of 142 barangays (villages), the city is mostly residential and commercial. As per World Population review 2019, its average population density is approximately 18,000 residents per sq. km with a recent annual population growth rate of more than 2%. LST hotspots are frequently and commonly found in Quezon City (Landicho and Blanco, 2019)

Methodology
The general methodology is shown in Figure 2. Layers representing different variables related to or known to affect LST are prepared. In this study, the following data layers were from Alcantara et al. (2019): LST, NDVI, NDBI, Albedo, Solar Radiation (SR), Surface Area-Volume Ratio (SVR), and Skyview Factor. The generation of these layers were described in detail in Alcantara et al. (2019) and will not be presented in in this paper. PlanetScope image (acquired on 23 March 2019) is processed to generate soil-adjusted vegetation index (SAVI) and visible green NIR built-up index layers at 3-m resolution. GLCM textures layers are then derived from SAVI and VgNIR-BI to characterize the spatial pattern of vegetation and built-up within an LST pixel. The layers, in various combinations, are subjected to Random Forest Regression (RFR) and Exploratory Regression to evaluate the importance and utility of GLCM textures in estimating LST.

Landsat Image and Derivative Layers
The derivation of LST from Landsat image was described in several papers (see Alcantara et al., 2019. Previous studies have used the LST retrieval method developed by Jeevalakshmi et al. (2017) and implemented it using Google Earth Engine (GEE). The LST layer used in this study is shown in Figure 3. in Quezon City.

SAVI and VgNIR-BI from PlanetScope Image
Built-up index (VgNIR -BI) and SAVI (soil-adjusted vegetation index) were derived from PlanetScope to show the spatial variability of the vegetation and built-up cover at a higher resolution. For this study, the interest is the description of these spatial variability within each of the Landsat-based LST pixels.

SAVI
SAVI was established to improve the sensitivity of NDVI to soil backgrounds. It minimizes the influence of soil brightness by introducing the soil conditioning index "L" in the formula below which was developed by Huete. The range of L is from 0 to 1. The value of L depends on the specific environmental condition. When the vegetation cover degree is high, L is close to 1, and can only be applied to large canopy density and coverage (Xue and Su, 2017). But when L values are close to zero, SAVI is equal to NDVI (Royimani et.al). In this study, L =0.5 was used which is a common practice for most environmental conditions (Xue and Su, 2017). Figure 5 shows the SAVI layer of Quezon City. Water areas (with very low SAVI value) were excluded in subsequent analysis.

VgNIR Built-up Index
Visible (Vis) -based built up indices have a better potential in separating built-up lands from dry vegetation, which has been an important challenge in the application of spectral indices for classifying built-up lands from satellite imageries (Estoque and Muruyama, 2015). In this study, a layer was generated to show the spatial variability of the built-up areas in the region. The The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2020, 2020 XXIV ISPRS Congress (2020 edition) visible green built up index (VgNIR-BI) formula used the visible green and NIR channel as shown below. − = ρ − ρ ρ + ρ Figure 6 shows the VgNIR-BI layer of Quezon City. Water areas (with very high VgNIR-BI value) were excluded in subsequent analysis.

Textural Indices
Gray-level co-occurrence matrix (GLCM) is a statistical method of examining texture that considers the spatial relationship of pixels (Suresh, 2012). Developed from a gray-level image, GLCM indicates the joint probability of distribution of a pair of gray levels at specific distance and orientation (Mhangara and Odindi, 2013). Haralick et al. (1973) described 14 statistics that can be calculated from the co-occurrence matrix with the intent of describing the texture of the image. In this study, eight GLCM textures were were generated from the SAVI and VgNIR-BI layers using a neighbourhood of 11 x 11 pixels and 1-pixel shift in both x-and y-directions: Mean, Variance, Homogeneity, Contrast, Dissimilarity, Entropy, Second Moment, and Correlation. See the equations used to calculate each feature (Table 1) and the brief definitions/descriptions that follow. The value of GLCM at the center of the 30-m LST pixel was utilized as the GLCM value for that pixel.

Correlation
∑ ∑ ( , ) * ( , ) − the whole image. Its value ranges from 1 (perfectly positively) to -1 (perfectly negatively) correlated image, and infinity for a constant image (no variation image). Figure 7. Subset of the study area depicted as (from top-left, left to right) Satellite image, OSM map, and GLCM textures from VgNIR-BI: Mean, Dissimilarity, Variance, Entropy, Homogeneity, Second Moment, Contrast, and Correlation.

Random Forest Regression
Previous studies utilized ordinary least squares (OLS) regression to explain LST in terms of NDBI, NDVI, and other variables. OLS Regression minimizes the sum of squares of the residuals in estimating the linear relationship between the independent and dependent variables by minimizing the sum of the squares of the residuals (Butler, 1999).
This study utilized random forest regression (RFR) to address the inherent limitations of linear regression techniques. Model inputs are the Landsat-based NDVI and NDBI, albedo, SVR, SVF, SR, and the PlanetScope-based GLCM features. The value of GLCM at the center of the 30-m LST pixel was utilized as the GLCM value for that pixel. RF was run for the following input variable cases: Case 1 (6 original variables, namely, NDBI, NDVI, SR, Albedo, SVR, SVF), Case 2 (SAVI_GLCM features), Case 3 (VgNIR-BI_GLCM features), Case 4 (SAVI_GLCM features and VgNIR-BI_GLCM features) and Case 5 (A. all variables/features, B. selected variables/features from results of A).

Exploratory Regression
Exploratory regression was also conducted to create 6-variable regression models of LST and compare with the OLS model comprising of the non-GLCM variables. To remove redundant and statistically insignificant predictors and bring down the number of variables to 6, stepwise multilinear regression via backward elimination was performed. Percentage of Variable Significance (%Significance) and Maximum Variance Inflation Factor (VIF) were used as the exclusion criteria. SAVI_GLCM_mean, VgNIR-BI_GLCM_contrast, SAVI_GLCM_correlation, and VgNIR-BI_GLCM_dissimilarity are the top variables with variable importance of 29%, 21%, 13%, 7%, and 7% respectively. Contrast feature is a measure of the amount of local variations present in an image. This local variation is not adequately accounted for by the mean GLCM texture measures. High values of GLCM contrast can be found in heterogeneous areas or areas with considerable variation in cover. It can be inferred however that using GLCM features alone yielded less satisfactory models compare to Case 1 model. 1.246 0.866

Random Forest Regression Models
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2020, 2020 XXIV ISPRS Congress (2020 edition) additional GLCM features resulted in a decrease of MSE from 1.559 (Case 1) to 1.246 (Case 5) and a slight increase in R 2 . Analysis of the scree plot of variable importance for Case 5 indicates that NDBI, NDVI, SAVI_GLCM_contrast, VgNIR-BI_GLCM_contrast, VgNIR-BI_GLCM_dissimilarity, SAVI_GLCM_contrast can be retained for a 6-variable model (Case 6). RF model of these 6 variables yielded an MSE=1.657 and validation R 2 =0.822. While this 6-variable model's performance is slightly less compared to Case 1, the need for DSM and 3D building models which are necessary for the generation of SVF and SVR layers is eliminated. Figure 8. Random forest-derived variable importance expressed in percentage

Exploratory Regression Models
Based on the resulting best 6-variable model (Adj. R 2 =0.79), the variables selected are SVR, NDBI, NDVI, SAVI_GLCM_second_moment, VgNIR-BI_GLCM_mean, and VgNIR-BI_GLCM_entropy. In comparison, OLS regression using the 6 RF Case 1 variables yielded an Adj. R 2 =0.691. The second best ER model has the following explanatory variables: SVR, NDBI, NDVI, SAVI_GLCM_entropy, VgNIR-BI_GLCM_mean, and VgNIR-BI_GLCM_entropy. The third best ER model is composed of the following explanatory variables: SVR, Albedo, NDBI, NDVI, VgNIR-BI_GLCM_mean, and VgNIR-BI_GLCM_entropy. It should be noted that NDBI and NDVI are key variables as they directly estimate the abundance of built materials and vegetative cover, respectively. VgNIR-BI_GLCM_mean is included in these three models, indicating the significant additional information provided by the mean textural layer derived from the built-up index VgNIR-BI. VgNIR-BI_GLCM_entropy proved to be significant since entropy feature measures the degree of disorder in the image. This degree of disorder which can be related to the way the built-up structures are arranged and interspersed with vegetative cover is not accounted for by NDBI and NDVI. Figure 8 show the variable importance of all variables (GLCM and non-GLCM). NDBI and NDVI are the two most important variables regardless of the models used. The GLCM mean features follow next and are found to be more important than SVR which describes in an integrated manner the form and size of the buildings and houses. VgNIR-BI_GLCM_mean is also included in the top 3 ER models. In this study, as explained earlier, the GLCM textures were generated from PlanetScopederived vegetation and built-up index layers. Hence, they describe spatial variations within 30m x 30m LST pixels. Spatial (textural) relationships are not necessarily correlated with spectral data (Hall-Beyer, 2013) or derivatives spectral data such as vegetation and built-up indices. GLCM Mean, Homogeneity, Correlation and Second Moment are associated with patch interiors (Hall-Beyer, 2013; see Figure 7).

On the importance of GLCM features
It can be seen that after SVR, three texture measures, namely, VgNIR-BI_GLCM_contrast, VgNIR-BI_GLCM_dissimilarity, and SAVI_GLCM_correlation have higher variable importance compared to Albedo and SVF. Contrast, Dissimilarity, Entropy and Variance are commonly related to visual edges (Hall-Beyer, 2013; see Figure 7). Edges in the spectral image are areas where abrupt transitions from one cover to another occur. LST at these transitional areas generally fall between higher LST ranges and lower LST ranges. Inclusion of these edge-related GLCM features allows the models to describe LST in less homogenous areas.
In the RF and ER models with input GLCM features, patch interior-related and edge-related GLCM features are typically present in the same model. This implies that LST is best described through a combination of GLCM features describing relatively homogenous areas (i.e., dominant land cover or lowfrequency areas) and the more heterogenous areas (i.e., edges or high-frequency areas).

CONCLUSION AND RECOMMENDATIONS
The use of GLCM features alone yielded less satisfactory models compared to the model based on non-GLCM features. The addition of GLCM features to non-GLCM features in a random forest regression model led to a significant decrease in MSE. However, improvement in the validation R 2 is very minimal. While the performance of a 6-variable model comprising of GLCM and non-GLCM features is slightly less compared to that of an all-non-GLCM model, the need for DSM and 3D building models which are necessary for the generation of SVF and SVR layers is eliminated. GLCM contrast provided significant information to the RF-based models. Based on exploratory regression (ER), the best models are a mix of non-GLCM and GLCM features, though the non-GLCM features, specifically NDB and NDVI, are commonly more important than the GLCM features. In ER model, GLCM entropy proved to be significant. The results of RFR and ER both indicate the value of GLCM features in providing valuable information to the models of LST. Furthermore, it was observed that each model includes GLCM features describing patch interior and edges.
It is recommended to further examine GLCM features in terms of kernel size and shift, and analyse how their explanatory power changes. Different resolutions of vegetation and built-up index layers should also be evaluated in terms of describing the spatial patterns within LST pixels. GLCM describes textures of single variable. It would be interesting to examine metrics jointly characterizing patterns of vegetation and built-up indices.