HISTORICAL GIS DATA AND CHANGES IN URBAN MORPHOLOGICAL PARAMETERS FOR THE ANALYSIS OF URBAN HEAT ISLANDS IN HONG KONG

Rapid urban development between the 1960 and 2010 decades have changed the urban landscape and pattern in the Kowloon Peninsula of Hong Kong. This paper aims to study the changes of urban morphological parameters between the 1985 and 2010 and explore their influences on the urban heat island (UHI) effect. This study applied a mono-window algorithm to retrieve the land surface temperature (LST) using Landsat Thematic Mapper (TM) images from 1987 to 2009. In order to estimate the effects of local urban morphological parameters to LST, the global surface temperature anomaly was analysed. Historical 3D building model was developed based on aerial photogrammetry technique using aerial photographs from 1964 to 2010, in which the urban digital surface models (DSMs) including elevations of infrastructures and buildings have been generated. Then, urban morphological parameters (i.e. frontal area index (FAI), sky view factor (SVF)), vegetation fractional cover (VFC), global solar radiation (GSR), Normalized Difference Built-Up Index (NDBI), wind speed were derived. Finally, a linear regression method in Waikato Environment for Knowledge Analysis (WEKA) was used to build prediction model for revealing LST spatial patterns. Results show that the final apparent surface temperature have uncertainties less than 1 degree Celsius. The comparison between the simulated and actual spatial pattern of LST in 2009 showed that the correlation coefficient is 0.65, mean absolute error (MAE) is 1.24 degree Celsius, and root mean square error (RMSE) is 1.51 degree Celsius of 22,429 pixels.


INTRODUCTION
The surface temperature is one of the main urban climatic parameters (Voogt and Oke, 2003).Some studies have investigated the effect of urbanization to regional warming (Christy and Goodridge, 1995;Dai et al., 1999;Diem et al., 2005;Lam, 2006;Lau and Ng, 2013).Hong Kong has the highest population density district in the world (Zhang, 2000).Understanding how the urban climatic changes caused by the past urban landscapes can permit planners and authorities to have the perspective for urban planning and design.
Human activities changes the natural environment, which lead to the reductions of natural surface and long-wave emission of surface, release of atmosphere pollutants and waste heats.These artificial factors determine the urban climate (Oke, 1978;Landsberg, 1987;Goldreich, 1995;Gomez et al., 1998).Thermal remote sensing of urban surface temperatures observes land surface temperature (LST) effectively in related to the surface energy balance.Taha (1997) found the variations of surface albedo and vegetation cover are the controlling factors of climate.It has been proved that a negative correlation existed between Normalized Difference Vegetation Index (NDVI) and temperature (Chen et al. 2006, Zhang andWang 2008), but a positive correlation is shown between Normalized Difference Build-up Index (NDBI) and temperature (Chen et al. 2006).Higher wind speed reduces the temperature differences inside the canopy layer due to inner atmospheric turbulent mixing and advection activity (Oke, 1982).Kolokotroni et al. (2006) investigated the upper limit of heat island intensity, which increased until a certain wind speed and then gradually declined as the wind speed increased.
Currently, most researchers have focused on studying the spatial and temporal distribution, cause and effect, as well as the prediction model of the urban climate.This study aims to investigate the relationship between the urban morphology of a high density city and its urban climatic characteristics from a historical perspective.And, it will help predict the spatial distribution of urban surface temperature in the future based on the predictable morphologic factors.Our research team has developed historical 3D building morphological models between the 1964 and 2010.Following urban morphological parameters (i.e.FAI, SVF), VFC, NDBI, and GSR were calculated between the 1985 and 2010, LST data between the 1987 and 2009 were retrieved from historical Landsat images data.Finally, a linear regression method in WEKA was used to predict spatial patterns of LST.

STUDY AREA AND DATA USED
The Kowloon Peninsula in Hong Kong (Figure 1), a highly urbanized with an area of 160 km 2 and an average population density of 44,917 persons/km 2 in 2011 (Population Census, 2011), is selected as the study area.The Mongkok district in the Kowloon Peninsula has the largest population density in the world (130,000 persons/km 2 ) (Boland, 2013).Kowloon has one large park (Kowloon Park) and a few small diverse urban parks, which separate built-up landscape in general.The topography is mainly flat, but goes up to 300 m in the northern of the Peninsula.Since the core areas in Kowloon is over 3 km from the coast, it has been strongly believed that the "wall effect" prevents cool sea breezes from penetrating to the inner city.

LST RETRIEVAL
In the study, the LST data were converted from the raw digital numbers (DN) in thermal infrared band 6 using the monowindow algorithm.The calibration parameters on the satellite sensor and the Planck equation developed were used to calculate the brightness temperature.The parameters include upwelling and downwelling atmospheric radiances, atmospheric transmittance and surface emissivity.After atmospheric correction and emissivity calculation, surface temperature was calculated from brightness temperature.Specifically, LST at 30 m spatial resolution from Landsat image data was retrieved through the following four steps: (1) Conversion from digital number to radiance The DN were transformed into radiances using the TM postcalibration dynamic ranges provided by Markham and Barker (1986), which is described as: where L  is the spectral radiance at the sensor's aperture, the Gain  and Bias  are band-specific rescaling factors, which are obtained from the metafile of the image.
(2) Land surface emissivity estimation The thermal infrared band emissivity were simulated as a weighted mean according to tabulated numbers published by Masuda et al. (1988) using: where    is the narrowband emissivity based on land use type, such as impervious surface area, vegetation, buildings areas.  f  is the spectral response function (Markham and Barker, 1985).
(3) Atmospheric correction The Top of Atmospheric (TOA) radiances were then converted to surface-leaving radiance by removing the effects of the atmosphere in the thermal region.Specifically, the radiosonde launched near the study area at 0:00 GMT including geopotential, pressure, temperature, relative humidity and ozone profiles allow us to estimate the atmospheric transmissivity and atmospheric upwelling and downwelling radiances by MODTRAN 4.0 (Abreu and Anderson, 1996).The atmospheric profiles data with 27 pressure levels are provided by European centre for Medium-Range Weather Forecasts (ECMWF).Atmospheric data are available globally every 6 h and every 0.125 degree in latitude and longitude.Then, surface-leaving radiance will be calculated using the following equation (Barsi et al., 2005).
where T L is the blackbody radiance of kinetic temperature T, L  is the space-reaching or top-of-atmospheric radiance measured by the instrument, L  is the upwelling atmospheric radiance,  is the total atmospheric transmissivity between the surface and the sensor, L  is the downwelling atmospheric radiance, L  and L  were calculated using a radiative transfer code e.g.MODTRAN (Berk et al., 1989) Lastly, the radiances calculated were converted to surface temperature using the Landsat specific estimate of the Planck curve (Chander and Markham, 2003): where B T is the temperature in degree Celsius (°C), 1 K and 2 K are pre-launch calibration constants.For Landsat 5 TM, 1 K =607.76Wm -2 sr -1 μm -1 and 2 K =1260.56K; For Landsat 4 TM, 1 K =671.62 Wm -2 sr -1 μm -1 and 2 K =1284.30K; For Landsat 7 TM, 1 K =666.09Wm -2 sr -1 μm -1 and 2 K =1282.71K; L  is the spectral radiance.The apparent surface temperatures have uncertainties less than 1 °C when the atmosphere is relatively clear.Figure 2 shows the spatial distribution of the retrieved LST between the 1987 and 2009, and densely high rise areas appear cool on daytime images, which is consistent with the previous study in Hong Kong (Nichol, 2005).The RCP database aims at documenting the emissions, concentrations, and land-cover change projections.In the study, the global mean temperature chosen is modelled using RCP 6.0, which is developed by the Asia-Pacific Integrated Model (AIM) modelling team at the National Institute for Environmental Studies (NIES) in Japan (Fujino et al., 2006;Hijioka et al., 2008).Table 3

VFC, NDBI
The VFC and NDBI indices were used to characterize the land use and land cover (LULC) types and used to evaluate the quantitative relationships between LULC types and LST.NDVI was calculated using the following equation.are not fixed value (Kaufman and tanre, 1992;Qi et al., 1994), even in the same image.According to the size of VFC, the medium-high (0.5< VFC ≤ 0.7) and high (0.7 < VFC) vegetation cover were chosen as vegetation areas in the study.NDBI (Equation ( 7)) as an index is sensitive to the built-up area (Zha et al., 2003).

SVF calculating using a raster DSM
The SVF is defined by the ratio between radiation received from a flat surface and the entire hemispheric radiation environment (Watson and Johnson 1987).The range of the SVF is between 0 and 1, and closes to 1 in perfectly flat terrain, and decreases proportionally in these locations with obstructions such as buildings and trees, reaches to 0 in a completely obstructed location means that all outgoing radiation and received diffuse solar radiation will be intercepted by obstacles (Oke 1993;Brown and Grimmond 2001).
The software of calculating the SVF can be downloaded from (IAPS ZRC SAZU, 2010), which is implemented using the remote sensing software ENVI+IDL (ITT Visual Information Solutions, 2010).In SVF calculation, the DSM data were resampled to 30 meter resolution for maintaining the consistent spatial scale with LST data and input into the software, searching radius is 90 m (3 pixels) in 16 directions.Figure 3 shows the spatial distribution of SVF values in the study area.

FAI derived from a GIS-based method
For urban renewal study, it is essential to facilitate a measure to estimate wind condition in respects to the geometry and orientation of buildings.
In the study, the frontal area (FA) was calculated by inputting raster data with the height information, wind direction, and backward flow coefficient in a GIS-based program.From the perspective of scale assimilation, buildings vector data was first converted to raster data at 30 m spatial resolution.The FAI is the total area of building facets projected to plane normal facing the particular wind direction.The FAI is the FA divided by the plane area (Burian et al., 2002;Grimmond and Oke, 1999;Wong et al., 2010) (Equation ( 8)) Where FAI is the frontal area index, FA is the frontal area, Area_grid is the grid area (900 m 2 in the study).Figure 5 shows the FAI values of the Kowloon Peninsula in east-west direction at 30 m spatial resolution from 1985 to 2010, in which nodata indicates that wind flow of the ground objects was blocked and sea surface did not take part in FA calculation.Due to east wind of high frequency in Hong Kong, FAI values were calculated easterly wind movement towards the west.

Simulation of GSR
The GSR received by topography and surface features is a function including the geometry of the earth, atmospheric transmittance, geographical location, sun elevation angle, orientation (slope and aspect), and elevation (Allen et al., 2006).
The solar radiation derives from the sun, which will be intercepted as direct, diffuse, and reflected components, the three radiation forms the GSR.The solar radiation analysis tool in the ArcGIS Spatial Analyst extension maps and analyses the global solar radiation over a geographic area at specific time periods (ESRI Inc., 2015).The GSR does not include reflected radiation, as the direct radiation is the principal component, secondly is the diffuse radiation.The DSM data were resampled to 30 meter resolution and input into ArcGIS software for calculating GSR at a specific time period.The fraction of radiation passing through the atmosphere as an important factor is calculated by the average of transmissivity values of all wavelengths.And, the average values of transmissivity from all wavelengths were estimated by the MODTRAN 4.0.Figure 6 shows that GSR including direct radiation and diffuse radiation which was modelled based on the transmissivity values in Table 4.Then, the simulated GSR values in the King's park extracted, according to the geographical location, were used for evaluating the accuracy.

PREDICTION MODEL
The WEKA is a significant and systemic workbench in data mining and machine learning.It not only provides a toolbox about learning algorithms, but also a framework in which new algorithms can be implemented without the support of infrastructure data manipulation and program evaluation (Hall et al., 2009).
The WEKA software package was downloaded from Machine Learning Group (2013).Classifiers as main learning methods in WEKA include a series of rule sets and decision trees for data modelling.Moreover, the command interface in WEKA also includes visualization tools and algorithms for data analysis and predictive modelling (Frank et al., 2004).In the study, VFC, NDBI, FAI, GSR, building height, SVF, wind speed, altitude, longitude, and time after multicollinearity analysis were selected as attributes to predict the spatial distribution of LST.
The correlation coefficient was used to measure the accuracy on the predictions.And the correlation coefficients from the method of linear regression, Reduced Error Pruning Tree (REPTree) (a fast decision tree learner), instance-based learners (IBk) (k-nearest neighbour learner), model trees and rules (M5Rules) are 0.65, 0.59, 0.33, and 0.61, respectively.As a result, it is found that linear regression will be the best in predicting the spatial pattern of LST. Figure 8 shows the predicted spatial pattern of LST in 2009, which was used to compare with the spatial distribution of LST retrieved from Landsat image.It can be found that the differences of maximum LST and minimum LST are 2 °C, 1 °C, respectively.Furthermore, it can also be found by analysis of pixel-to-pixel that the correlation coefficient is 0.65, MAE is 1.24 degree Celsius, and RMSE is 1.51 degree Celsius of 22,429 pixels.Ultimately, it is found that a linear regression method outperformed in predicting the spatial pattern of LST, which can be verified by the comparison between the simulated and actual spatial pattern of LST in 2009.

Figure 1 .
Figure 1.The study area: Kowloon Peninsula in Hong Kong (The right: DOM in 2014) To validate the atmospheric correction, sea surface temperature (SST) was calculated with emissivity of 1.0 in the years of 2005 and 2009.SST retrieved and observed at the Waglan Island station are 23.7 °C and 23.1 °C in 2005, 19.1 °C and 19.9 °C in 2009.

Figure 2 .
Figure 2. Spatial pattern of LST in the Kowloon Peninsula at 30 m spatial resolution (a): 1987; (b): 1990; (c): 1995; (d): 2000; (e): 2005; (f): 2009 is the reflectance value of red band (band 3) and 4 b is the reflectance value of near-infrared band (band 4) from the Landsat images.The NDVI values range from −1 to 1, and positive values indicating vegetated areas, negative values signifying non-vegetated surface features.Gutman and Ignatov (1998) used a dense vegetation mosaic-pixel model to derive the VFC by scaled NDVI, which is defined as, values of NDVI for bare soil and a surface with a VFC of 100%, respectively.In theory, min NDVIshould be zero for most soil types, but changes from -0.1 to 0.2(Carlson and Ripley, 1997;Rundquist, 2002), due to the influences of many factors.max NDVI should be the max of NDVI, but subject to the spatial or temporal change, due to the influences of vegetation type.Thus, max NDVI and min NDVI b are the respective DN of mid-infrared band (band 5) and near-infrared band (band 4) of the Landsat images.

Figure 4 .
Figure 4.The parameter setting window of the FAI calculation

Figure 8 .
Figure 8.Comparison between simulated and actual spatial pattern of LST in 2009

Table 1
,  is the land surface emissivity.

Table 3 .
shows global temperature anomaly between the 1987 and 2009 which can help to reduce the effect of global warming in the study area.The mean LST and global temperature anomaly

Table 4
Table4shows that the maximum difference and minimum difference between observed GSR and simulated GSR at King's park station from 1987s onwards is 0.09 MJ/m 2 and 0.00 MJ/m 2 , respectively.
. A comparison between observed and simulated GSR