FORECASTING URBAN EXPANSION BASED ON NIGHT LIGHTS

Urban expansion forecasting models are a powerful tool that helps urban planners anticipate and mitigate future urbanization pressures. In this paper, a linear regression model for forecasting urban expansion is implemented, based on the annual composite night lights time series available from the National Oceanic and Atmospheric Administration (NOAA). The product known as 'stable lights' is used, after correction with a standard intercalibration process to reduce artificial year-to-year fluctuations as much as possible. Forecasting is done for ten years after the end of the time series. Because the method is spatially explicit, the predicted expansion trends are mapped relatively accurately. Two metrics are used to validate the process: the year-to-year Sum of Lights (SoL) variation and the year-to-year image correlation coefficient. Overall, the method provides insight into future urbanization pressures that can be taken into account in planning, and the trends are quantified in a clear spatial manner.


INTRODUCTION
Urbanization proceeds at a rapid pace in many parts of the globe and significantly affects the environment as well as the quality of life (Grimm et al., 2008; Seto et al., 2012). Therefore there is a need to monitor urbanization and forecast future urbanization pressures in order to reduce their side effects via planning. There are two main types of urban expansion forecasting models, depending on the degree of spatial explicitness. The first and less exact type uses aggregated units, such as administrative divisions, to exploit the time series and forecast the result. For example, the percentage of urban extent increase can be forecast per municipality in a city. The second type of model is more exact. Raster data are used as a time series and forecasting is done per pixel. A typical example in this category is the SLEUTH model (Clarke, 2008; Pramanik and Stathakis, 2015), where several urbanization parameters (slopes, evolution of past urban areas, land use etc.) are input and the model predicts the probability of each pixel being urbanized in the future.
The latter type of model has an additional merit. Rather than merely revealing the magnitude of the phenomenon, it also reveals the spatial patterns of urbanization trends (e.g. monocentric vs. polycentric development, intra-urban, peri-urban or exurban growth, road strip or coastal sprawl etc.). Revealing urbanization patterns is fundamental to understanding the driving factors of the phenomenon and subsequently to designing better planning policies (Triantakonstantis and Stathakis, 2015). Because the latter type (exact models) is based on the raster structure, it is particularly suitable for use with remote sensing data, which are also, by design, structured as rasters. Given the recent plethora of remote sensing datasets, this is an exceptionally convenient aspect.
Past urban extent studies have been mainly based on optical data, specifically on LANDSAT and MODIS data, due to the relatively extensive time series they offer (since 1972 and 1999 respectively). The problem in using these two data sources is that extracting urban areas is not a trivial task. Urban areas are spectrally mixed with barren land in many parts of the globe due to the similarity of building materials to natural elements (soil, sand, rocks etc.). Therefore the process is not straightforward; it is laborious and typically needs to be combined with ancillary data that are not always available or cheap (Stathakis and Faraslis, 2014; Stathakis et al., 2012). In this framework, it has been recently suggested that it is more efficient to use night time instead of day time optical data.
The Defense Meteorological Satellite Program - Operational Linescan System (DMSP/OLS) data in particular have been used in several urbanization studies. The main advantage stems from the fact that night lights observed from space are a quite straightforward indication of human presence, as inhabited areas are clearly outlined (Imhoff et al., 1997a). Night lights have been used to study urbanization at global, continental and national scales (Elvidge et al., 2007; Elvidge et al., 2014; Imhoff et al., 1997a; Imhoff et al., 1997b; Gao et al., 2015; Fan et al., 2014; Ma et al., 2012; Liu et al., 2012; Small and Elvidge, 2013). It has been found that stable lights significantly correlate with population and Gross Domestic Product (GDP) data (Wu et al., 2013; Mellander et al., 2013), both pillars of urbanization.
The main objective and novelty of this paper is the use of DMSP/OLS time series to establish a regression model in order to forecast urban pressures in a spatially explicit manner.

Study areas
Two case studies are selected, as shown in Figure 1. The first is Sicily, in the south of Italy, and the second is Athens, the capital of Greece. Sicily is an area where the full range of possible digital number (DN) values is present. Its population is approximately five million people and has been relatively stable in the past twenty years. Athens is a metropolitan area of slightly less than four million people. Its urban extent expanded rapidly due to newly built infrastructure (metro, highways, a new airport etc.) for the 2004 Olympic Games.
Currently, the most used form of night lights is the 'stable lights' annual composite product, in which daily data are processed and ephemeral lights are removed (Baugh et al., 2010). The spatial resolution of 'stable lights' is approximately 1 km at the equator. The time series currently includes 23 annual composites. A subset of them is presented in Figure 2.

Intercalibration
The 'stable lights' product of the Global DMSP-OLS Nighttime Lights Time Series 1992-2013 (Ver. 4) was downloaded from the NOAA website and used in this study. To reduce year-to-year fluctuations due to differences in sensor calibration, acquisition times etc., the standard second-order regression intercalibration process has been applied (Elvidge et al., 2009a). The coefficients of the intercalibration function are derived by regression. Each year is compared to a fixed area, termed the invariant region, for a specific year, termed the base year. In addition, the presumed noise has been removed from all images in the time series by applying the threshold in Equation (2), where DN' is the new value and DN is the original value.
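The regression step described above can be sketched as follows. The second-order intercalibration is commonly written as DN_adj = C0 + C1·DN + C2·DN²; the sketch fits these coefficients over the invariant region against the base-year composite and applies them to the whole image. The function name, the boolean mask and the synthetic arrays are assumptions made for illustration only, and the noise threshold of Equation (2) is not reproduced here.

```python
import numpy as np

def intercalibrate(image, base_image, invariant_mask):
    """Fit DN_base ~ c0 + c1*DN + c2*DN^2 over the invariant region, then apply it."""
    x = image[invariant_mask].astype(float)
    y = base_image[invariant_mask].astype(float)
    # np.polyfit returns coefficients from the highest degree down.
    c2, c1, c0 = np.polyfit(x, y, deg=2)
    img = image.astype(float)
    adjusted = c0 + c1 * img + c2 * img ** 2
    return adjusted, (c0, c1, c2)

# Synthetic example: all values below are made up for illustration.
rng = np.random.default_rng(0)
image = rng.integers(0, 64, size=(20, 20))            # composite to be adjusted
base_image = 1.5 + 0.9 * image + 0.001 * image ** 2   # pretend base-year composite
mask = np.ones(image.shape, dtype=bool)               # hypothetical invariant region
adjusted, coeffs = intercalibrate(image, base_image, mask)
```

In practice the mask would cover only the invariant region, and the same fitted coefficients would then be applied to every pixel of that satellite year.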

Selection of intercalibrated satellite year.
In several instances, two satellites operate concurrently (Hsu et al., 2015). Therefore, two stable light products are produced for those years, as shown in Table 1. A reasonable approach to selecting one of the two annual composites is to keep the one with the highest average number of cloud-free observations per pixel, which NOAA provides as metadata for each annual stable lights product (Li et al., 2013). This information is also shown in Table 1.
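The selection rule above amounts to keeping, for each year, the composite with the largest average cloud-free count. A minimal sketch, assuming the metadata have already been read into (year, satellite, average) tuples; the function name and the numeric values are hypothetical and are not the figures of Table 1.

```python
def select_composites(candidates):
    """Keep, per year, the satellite with the highest average cloud-free count.

    candidates: iterable of (year, satellite, avg_cloud_free_obs) tuples.
    """
    best = {}
    for year, satellite, avg_cf in candidates:
        if year not in best or avg_cf > best[year][1]:
            best[year] = (satellite, avg_cf)
    return {year: sat for year, (sat, _) in sorted(best.items())}

# Made-up averages for illustration; the real ones are listed in Table 1.
candidates = [
    (1994, "F10", 30.1),
    (1994, "F12", 41.7),
    (1995, "F12", 45.0),
    (1997, "F12", 50.2),
    (1997, "F14", 38.9),
]
selected = select_composites(candidates)
```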

Linear regression forecasting
A linear regression model is fitted to the raster time series using Equation 3. The linear model is fitted per pixel (not a single one for the total raster).

$$y = a + bx \qquad (3)$$
where a is the line intercept, b is its slope, x is the raster of the previous year and y is the raster of the next year (forecast).
The output is also corrected to the original range by applying the threshold where DN' is the new value and DN is the original value.
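One possible reading of Equation 3 is sketched below: for every pixel, each annual value (x) is paired with the value of the following year (y), a line is fitted by least squares, and the fitted line is applied repeatedly to step forward in time. This pairing, the function names and the toy data are assumptions made for illustration. The clipping assumes that the original range is the 0-63 DN range of the DMSP/OLS stable lights product; it stands in for the threshold mentioned above, whose exact form is not reproduced.

```python
import numpy as np

DN_MIN, DN_MAX = 0.0, 63.0  # assumed DN range of the stable lights product

def fit_per_pixel(stack):
    """Fit y = a + b*x per pixel, pairing each year (x) with the next year (y).

    stack has shape (years, rows, cols).
    """
    stack = np.asarray(stack, dtype=float)
    x, y = stack[:-1], stack[1:]
    x_mean, y_mean = x.mean(axis=0), y.mean(axis=0)
    sxx = ((x - x_mean) ** 2).sum(axis=0)
    sxy = ((x - x_mean) * (y - y_mean)).sum(axis=0)
    # Pixels that never change have sxx == 0; they get a zero slope.
    b = np.divide(sxy, sxx, out=np.zeros_like(sxx), where=sxx > 0)
    a = y_mean - b * x_mean
    return a, b

def forecast(last_raster, a, b, steps):
    """Apply the fitted line `steps` times, clipping to the DN range each time."""
    current = np.asarray(last_raster, dtype=float)
    for _ in range(steps):
        current = np.clip(a + b * current, DN_MIN, DN_MAX)
    return current

# Toy stack of six years and three pixels: growing, constant, near saturation.
stack = np.zeros((6, 1, 3))
stack[:, 0, 0] = [10, 12, 14, 16, 18, 20]
stack[:, 0, 1] = 5
stack[:, 0, 2] = [53, 55, 57, 59, 61, 63]
a, b = fit_per_pixel(stack)
predicted = forecast(stack[-1], a, b, steps=5)
```

Because one line is fitted per pixel, the slope and intercept are themselves rasters and can be mapped directly, which is what makes the coefficients easy to interpret.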

Validation metrics
To observe and validate the forecasting quality, two metrics are used. The first is the evolution of the Sum of Lights (SoL). SoL is the sum of all DN values in an area (Li et al., 2013). SoL is calculated per region, by a process frequently termed zonal statistics, as shown in Equation (2).
$$SoL = \sum_{i} DN_i$$
where $DN_i$ is the digital number value of pixel i in the region.
The second metric is the year-to-year Pearson correlation coefficient between the current and next date rasters in pairs.
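Both metrics are straightforward to compute once the rasters are held as arrays. The sketch below is illustrative only: the zone raster (one non-negative integer id per pixel), the function names and the sample values are assumptions, not data from the study.

```python
import numpy as np

def sum_of_lights(raster, zones):
    """Sum DN values per zone; zones holds a non-negative integer id per pixel."""
    return np.bincount(zones.ravel(), weights=raster.ravel().astype(float))

def image_correlation(raster_a, raster_b):
    """Pearson correlation coefficient between two rasters of equal shape."""
    return np.corrcoef(raster_a.ravel(), raster_b.ravel())[0, 1]

# Made-up 2 x 3 rasters for two consecutive years and a two-zone layout.
current = np.array([[0, 10, 20], [30, 40, 63]])
following = np.array([[0, 12, 22], [33, 41, 63]])
zones = np.array([[0, 0, 1], [0, 1, 1]])

sol = sum_of_lights(current, zones)        # one SoL value per zone id
r = image_correlation(current, following)  # year-to-year correlation
```

Repeating both calculations for every consecutive pair of years gives the two year-to-year series used for validation.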

Limitations
Clearly the linear regression model is a simplification of the urban expansion trends. Other, more complex models can be used. The linear model is used as a first step to understand the capacity of the method and has the advantage of straightforward interpretation of its coefficients. Also, any regression model fails to take into account contextual information (neighbouring relationships) and cannot predict urbanization in new locations (where lights do not currently exhibit an increasing trend). In addition, it is assumed here that night lights are a perfect indicator of urban expansion. However, this is not true for several reasons, ranging from energy saving policies to development stages and types (Stathakis et al., 2015). Nevertheless, night lights strongly correlate with urban expansion, which is enough to construct a valid time series and forecast.

RESULTS
The expected change in DN values in five years' time is shown in Figure 3(a). The rightmost part of the diagrams in Figure 3(c) and Figure 3(d) is relatively stable after 2013 because it refers to the forecast data.
For Sicily, future urban expansion pressures are primarily focused on the coastal and peri-urban areas. For Athens, there is clear evidence of urban pressures along the new highway as well as towards the east coastline, as a result of the new infrastructure built for the 2004 Olympic Games (airport and highways).

CONCLUSION
A linear regression model has been deployed to quantify future urban extent trends based on night lights time series. The practical value of the method lies mainly in the relative ease with which the time series is constructed, compared to day time optical data. The method can be applied near globally to obtain a quick estimate of which areas will be under urbanization pressures in the future, provided that the assumptions made hold (i.e. that night time lights strongly correlate with urban expansion and that past driving factors persist in the future).
Because the prediction is based on the regression line, the effect of the year-to-year variation that remains in the time series even after intercalibration is further generalized and reduced.
The method can be improved in the future by fitting other models to project the data, including non-linear regression functions and neural networks. The use of non-linear functions will also make it possible to obtain more reliable quantitative results and to make specific estimates of urbanization increase per region in time.

REFERENCES
Stathakis D., Tselios V., and I. Faraslis, 2015, Urbanization in European regions based on night lights, Remote Sensing Applications: Society and Environment, 2, pp. 26-34.
Triantakonstantis D. and D. Stathakis, 2015, Urban Growth Prediction in Athens, Greece, Using Artificial Neural Networks, World Academy of Science, Engineering and Technology International Journal of Civil, Structural, Construction and Architectural Engineering, 9(3), p. 5.
Wu J., Wang Z., Li W., Peng J., 2013, Exploring factors affecting the relationship between light consumption and GDP based on DMSP/OLS nighttime satellite imagery, Remote Sensing of Environment, 134, pp. 111-119.

Revised March 2016

Figure 1. The study areas: Sicily and Athens.

Figure 2. Time series of night lights for Sicily (left) and Athens (right).
Expected change is calculated as
$$DN_d = DN_e - DN_c \qquad (4)$$
where $DN_d$ is the difference, $DN_e$ is the estimated DN values in 5 years and $DN_c$ is the current (2012) DN values. The predicted maps for the next five years are shown in Figure 3(b). The year-to-year SoL and image correlation coefficients are shown in Figure 3(c) and Figure 3(d) respectively.

Figure 3. Forecast results for Sicily (left) and Athens (right): (a) forecast change in the next five years (2013-2018); (b) forecast 2018 (+5 years); (c) SoL year-to-year variation; (d) $R^2$ year-to-year variation.

Equation 1 is the intercalibration formula.
Table 1. Average cloud-free observations per pixel for each satellite year of stable lights. Discarded data in parentheses.