RANDOM FOREST FOR CLASSIFYING AND MONITORING 50 YEARS OF VEGETATION DYNAMICS IN THREE DESERT CITIES OF THE UAE

The United Arab Emirates (UAE), a dryland country, has emphasized large-scale greening projects since its independence. Monitoring the progress of greening in the UAE has gained importance for environmental management and carbon footprint monitoring. Hence, this study created and analysed time-series (TS) vegetation maps to track vegetation dynamics over an extended period of fifty years. The study area included three selected desert cities of the UAE: Abu Dhabi (AD), the capital city; Dubai city; and Al Ain city. The Random Forest algorithm was applied to multi-temporal Landsat images from 1972 to 2021 to classify and monitor the vegetation dynamics and change trajectories. Four vegetation subclasses (coastal/wetland vegetation, urban vegetation, farms/crop fields, and natural/artificial forests) were assessed, then grouped and mapped as one vegetation class. With the adopted approach, we achieved overall classification accuracies ranging from 86% to 94%, with kappa coefficients ranging from 0.7200 to 0.8800. The study showed that the vegetation cover extent in the UAE grew steadily over the past five decades, from only 1,231.1 ha in 1972 to 23,176.46 ha in 2021, a roughly 19-fold increase. Furthermore, it showed that desert cities tend to increase their vegetation cover while continuing their steady urban growth. The other drivers identified include demographic growth and governmental policies (granting farms to locals and environmental protection laws). Finally, the approach implemented in this research can be used effectively and reliably in other urban centres for future monitoring and management of the vegetation cover status in the country.


INTRODUCTION
The UAE has witnessed an accelerated rate of growth over the last five decades owing to the discovery of oil. Substantial land use and land cover (LULC) change has taken place and dramatically changed the landscape in key parts of the country (Abdi & Nandipati, 2010; Al Ahbabi, 2013; Aldogom et al., 2019; S. Issa & Al Shuwaihi, 2012; Liaqat & Chowdhury, 2017; Mohamed & Elmahdy, 2018; Nassar et al., 2014). Vegetation is a class of great importance in the arid ecosystem of the UAE. Understanding the change patterns of the vegetation class is necessary to study the impacts of such change on ecosystem services and related concerns such as invasive plants, biomass estimation, carbon sequestration, and climate regulation (S. Issa & Dohai, 2008; S. M. Issa et al., 2020; Liu et al., 2017). It is also important for identifying the drivers of such change and for developing prediction models. Moderate-resolution satellite imagery provides a source of data that can be used to study LULC change over long time periods. The free availability of global multispectral data sets such as Landsat and Sentinel further promoted the use of remote sensing (RS) in LULC mapping and led to the development of improved processing algorithms. Landsat data, for instance, have been available since 1972 with a revisit frequency of up to 16 days, enabling the creation of time series of LULC maps for wide areas at relatively short time steps. Such data sets are essential in studying LULC dynamics, their drivers, and their impacts. Traditional multispectral classification approaches such as pixel-based unsupervised and supervised algorithms have been used to create LULC maps from RS data. Many of these approaches rely on statistical assumptions about the input data, such as a normal distribution. Their accuracy depends on data availability and on the validity of those assumptions.
More recently, a new breed of Machine Learning (ML) algorithms that are independent of the input data characteristics has emerged and been applied to the classification of RS data and to detecting and studying vegetation cover (Feng et al., 2019). Various approaches have been developed in which ML is applied to multispectral data for mapping vegetation cover. They include Classification and Regression Tree (CART), k-nearest neighbors, Support Vector Machines (SVM), and Random Forest (RF) algorithms. Several studies have investigated the application of these algorithms to vegetation and concluded that SVM and RF are superior to traditional approaches for mapping vegetation using RS data (Ghayour et al., 2021; Hengl et al., 2018; Macintyre et al., 2020; Talukdar et al., 2020; Xia et al., 2018). The main goal of this study is to examine vegetation cover change over three representative areas in the UAE over the period 1972 to 2021 using Landsat imagery. A time series of Landsat data acquired in 1972, 1978, 1986, 1992, 1997, 2002, 2007, 2013, 2017, and 2021 is used to detect vegetation patterns in the three areas using an ML RF approach. The study produces vegetation maps for each date, which are subsequently used to assess change over successive periods and its potential drivers.

Study Area
The study area covers three distinct desert cities of the UAE: Abu Dhabi (AD), the capital city; Dubai city; and Al-Ain city (Figure 1). The three urban centers lie at 24°26′9″N, 54°24′42″E for AD; 25°5′26″N, 55°10′51″E for Dubai; and 24°13′14″N, 55°43′5″E for Al-Ain city. The discoveries of oil reserves during the 1960s and 1970s accelerated population growth in the country and attracted a large labor force. The government used oil revenue to develop the infrastructure of urban centers and industrial projects.
Study area-1 has a total area of 72,052.95 ha and covers the city of AD, the capital of the UAE. AD is located on a small desert island that juts into the Arabian Gulf; the study area includes AD island and areas outside the island (e.g. Maqta', Mussafah, Lulu, and Al Reem). The LULC can be categorized into four main classes: water, vegetation, urban, and undeveloped areas. Water covers around 45.5% of study area-1 (~37% deep water and ~8.5% shallow water). Around 30% of study area-1 is undeveloped (sand dunes, sand sheets, and sabkhas). Urbanized areas cover around 14.5%, and around 10% is covered by the vegetation class. Summer (April-September) temperatures vary between 35 °C and 45 °C, with a peak of 48 °C in July and August, accompanied by a high concentration of water vapor that raises the relative humidity to as much as 90%. Winter (October-March) temperatures, on the other hand, range from 10 to 24 °C. Humidity is relatively higher throughout all seasons in areas near the Arabian Gulf. Rainfall occurs during the winter season, mainly between November and February, with precipitation barely reaching 12 cm per year (National Center of Meteorology 1995). The mean annual rainfall can vary greatly from one year to another.
Study area-2 covers part of Dubai city, the economic hub of the UAE and the whole region. The total area is 20,455.87 ha and covers the Jumeirah area (including the massive engineering project of Palm Jumeirah Island), Al Barsha, and the location of Expo Dubai 2020. Dubai city was established as a small fishing village and grew rapidly at the beginning of the millennium into a cosmopolitan metropolis, becoming one of the world's most popular tourist destinations. The LULC classes of the area under study include shallow and deep water (~25%), sand dunes (~22%), sand sheets (~16%), urban (~14%), vegetation (~13%), and sabkhas (~10%). Summer in Dubai is extremely hot, with a very high humidity level. In August, the hottest month, the average high temperature is around 40 °C and the average low is around 30 °C. Winter is comparatively cool, with some rain. In January, the coolest month, the average high temperature is around 24 °C and the average low is around 14 °C. Precipitation reaches 11 cm per year.
Study area-3, Al-Ain city, lies in the eastern part of the UAE and belongs to the Emirate of AD. It consists of 35 districts with a total area of 185,372.7 ha. Al-Ain was established as a city around an old date palm oasis. The geographic setting is characterized by a sand dune barrier to the west and mountain chains to the east, which protect Al-Ain city from the effect of wind and sea breeze coming from both the east and the west. Around 82% of Al-Ain is composed of sand dunes (~50%) and sand sheets (~32%). These include Hafeet mountain, bare soils, exposed rocks, and undeveloped areas on the outskirts of the city. Around 9% of the total area is vegetation, and the urbanized area covers a similar share. The urban area includes built-up land, roads, and buildings. Finally, some water bodies (~0.1%) exist in places such as the Ain Al-Faydah and Zakher lakes. The climate of Al-Ain is characterized by an average minimum temperature of 14.7 °C and an average maximum of 42.9 °C. The long-term annual average rainfall is 5.9 cm and the humidity is 44.3% (National Center of Meteorology).

Data Sources
The three study areas are covered by the Landsat scenes listed in Table 1. TS images at intervals of approximately five years were used as data sources to identify, compare, and detect changes in the vegetation cover of the three study areas over the last 50 years (from 1972 to 2021). The availability of Landsat scenes for the TS datasets was first investigated on the USGS Earth Explorer website (http://earthexplorer.usgs.gov/). September, or a date close to September, was found to be the most suitable time for the change analysis because most of the Landsat scenes needed for change detection were available at that time (except for one scene from 1978, acquired in December). Furthermore, this time of the year is well suited to such a study because phenological variation in the vegetation is minimal in this period. Based on data availability, image quality, and cloud cover over land (less than 10%), ten Landsat images were selected for this study. The ten Landsat scenes acquired represent the years 2021, 2017, 2013, 2007, 2002, 1997, 1992, 1986, 1978, and 1972. The Collection 2 Level 2 (C2, L2) products of the Landsat-8 OLI, Landsat-7 ETM+, and Landsat-5 TM images were downloaded in GeoTIFF format from the USGS website, while the (C2, L1) product was downloaded for the Landsat-1 MSS images. Collection 2 marks the second major reprocessing of Landsat data by USGS and provides improvements in geometric and radiometric quality. Additionally, the panchromatic bands (Level 1) of the same Landsat-8 OLI and Landsat-7 ETM+ scenes were downloaded and later used in the pan-sharpening process (subsection 2.3.2). All image bands were georeferenced and co-registered to the Universal Transverse Mercator (UTM) projection (Zone 40, WGS 84). Further information about the specifications of the satellite data used in the study is provided in Table 1. A vector layer of the three study areas in shapefile format was used to subset the imagery to the study areas.
The ERDAS Imagine 2020 software package was used for data pre-processing, processing, and classification.

Data Pre-processing
Four pre-processing procedures were conducted on the acquired datasets: gap filling, pan-sharpening, resampling, and stacking, as explained below and summarized in Table 2. Spatial enhancement was conducted to remove the stripes present in the ETM+ 2007 and MSS 1978 images. The focal analysis tool in ERDAS Imagine was used with a mean function and a 3×3 window as the focal size. The focal analysis was run several times until the gaps were filled. Furthermore, the panchromatic band 8 (15 m resolution) and the stacked multispectral images were merged (pan-sharpened) to produce an enhanced image of 15 m spatial resolution using the hyperspherical color space (HCS) resolution merge technique and the nearest neighbour (NN) algorithm. Pan-sharpening was conducted only for the OLI and ETM+ scenes (2021, 2017, 2013, 2007, and 2002) because the panchromatic band (PAN) is available for them in the Level 1 product. All scenes were resampled using the NN algorithm to a final pixel size of 20 m × 20 m so that the vegetation cover analysis is comparable across dates, since the MSS pixel size differs from that of the other sensors (OLI, ETM+, and TM) and all pixel sizes need to be unified. The ten images were saved in the ERDAS (*.img) format. Four equivalent bands were selected and stacked for the OLI, ETM+, and TM images: green, red, near infrared (NIR), and shortwave infrared 2 (SWIR2). For the MSS images, only three bands were stacked: green, red, and NIR2.
Table 2. Details of spectral Landsat bands used, and the preprocessing procedures conducted in the current study.
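The repeated 3×3 focal-mean gap filling described above can be illustrated with a minimal sketch. It is not the ERDAS Imagine implementation: the function name `focal_mean_fill`, the use of `None` as the gap marker, and the choice to replace only gap pixels (leaving valid pixels untouched) are all assumptions made for illustration.

```python
def focal_mean_fill(band, max_passes=50):
    """Fill gap cells (None) with the mean of the valid cells in their
    3x3 neighbourhood, repeating passes until no gaps remain.

    Assumption: only gap cells are modified; valid cells keep their values.
    """
    rows, cols = len(band), len(band[0])
    grid = [list(row) for row in band]
    for _ in range(max_passes):
        gaps = [(r, c) for r in range(rows) for c in range(cols)
                if grid[r][c] is None]
        if not gaps:
            break
        updated = [list(row) for row in grid]
        filled_any = False
        for r, c in gaps:
            # Collect valid neighbours inside the 3x3 window (clipped at edges).
            neighbours = [grid[rr][cc]
                          for rr in range(max(0, r - 1), min(rows, r + 2))
                          for cc in range(max(0, c - 1), min(cols, c + 2))
                          if grid[rr][cc] is not None]
            if neighbours:
                updated[r][c] = sum(neighbours) / len(neighbours)
                filled_any = True
        grid = updated
        if not filled_any:  # nothing left that can be filled
            break
    return grid


# A hypothetical band with a three-row stripe of missing data.
striped = [[10, 10], [None, None], [None, None], [None, None], [20, 20]]
filled = focal_mean_fill(striped)
```

In this toy case the first pass fills the outer rows of the stripe from their valid neighbours, and the second pass fills the middle row from the values created in the first pass, which is why several passes are needed for wide gaps. Because later passes average already-interpolated values, the filled pixels are smoothed estimates rather than measurements.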

Deriving Time-Series NDVI Layers
Using the NIR and red bands, normalized difference vegetation index (NDVI) layers were generated for each date using formula (1):
NDVI = (NIR - Red) / (NIR + Red)    (1)
NDVI is widely used in vegetation studies, although it shows little to no increase once full canopy cover is reached because of saturation. This is not a concern in dryland regions, where vegetation is rare and sparse.
The derived NDVI layer for each year was then added to and stacked with its corresponding multispectral image bands, giving five raster layers for each year: green, red, NIR, SWIR2, and NDVI (except for MSS 1978 and MSS 1972, which have four raster layers). These TS images were used as input to build the training dataset and to generate its spectral and textural attributes, as outlined in the following steps.
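As an illustration of the NDVI derivation and stacking step, the sketch below computes NDVI per pixel and appends it to a band stack. The band names, the dictionary layout, and the reflectance values are hypothetical; returning 0.0 where NIR + Red is zero is also an assumption, made only to avoid division by zero.

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); 0.0 where the denominator is zero."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom


def ndvi_layer(nir_band, red_band):
    """Compute NDVI pixel by pixel for two bands of identical shape."""
    return [[ndvi(n, r) for n, r in zip(nir_row, red_row)]
            for nir_row, red_row in zip(nir_band, red_band)]


def stack_with_ndvi(bands, nir_name="NIR", red_name="Red"):
    """Return a copy of the band stack with an added 'NDVI' layer."""
    stacked = dict(bands)
    stacked["NDVI"] = ndvi_layer(bands[nir_name], bands[red_name])
    return stacked


# Hypothetical 1x2 scene: a vegetated pixel and a bare-sand pixel.
scene = {
    "Green": [[0.08, 0.30]],
    "Red": [[0.10, 0.35]],
    "NIR": [[0.50, 0.40]],
    "SWIR2": [[0.15, 0.45]],
}
stacked = stack_with_ndvi(scene)
```

The vegetated pixel yields an NDVI of about 0.67 and the sand pixel a value close to zero, which is the contrast the classifier can exploit in sparsely vegetated drylands.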

Classification Using Machine Learning
Each of the ten images was classified into two classes: vegetation (VEG) and non-vegetation (Non-VEG).
The VEG class includes oases, farms, artificial forests, parks, gardens, grasses, and urban vegetation. The classification followed the four steps of a typical ML project: collect training data, select the ML algorithm, train and test the selected algorithm, and generalize it to classify the whole image. In this study, the ML RF algorithm was used. Representative training data in sufficient quantity were carefully selected as polygons for both VEG and Non-VEG areas. Considerable time was spent cleaning the training data of errors and outliers. The attributes collected for both classes (VEG and Non-VEG) are the mean of the digital number values (spectral) and their variance (texture). Vector data in shapefile format, carrying these attributes, were created for use in the RF classification. Numerous ML algorithms exist in the literature, each with its own advantages and disadvantages. An algorithm suited to one environment or application is not necessarily suited to another. The CART, SVM, and RF algorithms were all run and tested. The RF algorithm was finally selected for this study because it proved more accurate and simpler to run. The RF algorithm was trained on the training data to create the "Machine Intellect" before generalizing and performing the actual classification, which is the final step. The accuracy of each TS thematic map was then assessed. The spatial model performing the classification was built using the ERDAS Imagine Spatial Modeler tool.

Assessing Accuracy
The accuracy of the TS classified thematic maps of 1972, 1978, 1986, 1992, 1997, 2002, 2007, 2013, 2017, and 2021 was assessed and analyzed. For each thematic map, a set of 100 points was drawn by stratified random sampling, with 50 points for each class (VEG/Non-VEG). The points, which were kept separate from the training points, were selected at random. In addition, Google Earth, reference maps, and expert knowledge of the study areas were used in this process. A confusion matrix was created, and accuracy and error metrics (overall accuracy, producer accuracy, user accuracy, and kappa index) were calculated (subsection 3.2).
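The metrics named above can be computed from a confusion matrix as in the following sketch. The 2×2 matrix is a hypothetical example with 50 sampled points per mapped class, not one of the study's error matrices, and the function and key names are illustrative.

```python
def accuracy_metrics(matrix):
    """Compute accuracy metrics from a square confusion matrix.

    matrix[i][j] = number of points mapped as class i whose reference
    class is j (rows: classified map, columns: reference data).
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(matrix[i]) for i in range(k)]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    correct = sum(matrix[i][i] for i in range(k))

    overall = correct / n
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    kappa = (overall - expected) / (1 - expected)
    return {
        "overall": overall,
        "kappa": kappa,
        "user": [matrix[i][i] / row_totals[i] for i in range(k)],
        "producer": [matrix[i][i] / col_totals[i] for i in range(k)],
    }


# Hypothetical error matrix, class order [VEG, Non-VEG], 50 points per row.
example = [[45, 5],
           [9, 41]]
metrics = accuracy_metrics(example)
```

With two classes sampled at 50 points each, the chance agreement is always 0.5, so kappa reduces to 2 × overall accuracy − 1. This is consistent with the reported ranges: an overall accuracy of 86% corresponds to a kappa of 0.72, and 94% to 0.88.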

Vegetation Cover Change Detection
The areas (in hectares) and percentages of both classes (VEG/Non-VEG) were computed for each year. The vegetation cover change detection analysis was performed using pixel-by-pixel cross tabulation for nine time intervals: 1. 1972-1978 (6 years); 2. 1978-1986 (8 years); 3. 1986-1992 (6 years) [...] 1972, 1978, 1986, 1992, 1997, 2002, 2013, 2017, and 2021 and using the ML Random Forest classification technique to produce binary maps (VEG/Non-VEG). It is worth mentioning that the destriped ETM+ image of 2007 was excluded from the TS database. The focal-analysis gap filling of the ETM+ 2007 image, explained in the Methodology section, slightly improved the scene visually, but spectrally it produced erroneous results. During the past five decades, the study areas have experienced steady VEG growth, as shown in Figure 5.
The 2021 status of all the study areas shows that most of the surface is Non-VEG, which accounts for 91.22% of the total area of the study areas, while only 8.78% is covered by vegetation.
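The pixel-by-pixel cross tabulation used for change detection can be sketched as follows. The two small binary maps are hypothetical, the class codes (1 = VEG, 0 = Non-VEG) are an assumption, and the pixel area follows from the 20 m × 20 m resampled pixel size (0.04 ha per pixel).

```python
from collections import Counter

PIXEL_AREA_HA = 20 * 20 / 10_000  # 20 m x 20 m pixel = 0.04 ha
VEG, NON_VEG = 1, 0               # assumed class codes


def cross_tabulate(map_t1, map_t2):
    """Count pixels for every (class at t1, class at t2) combination."""
    counts = Counter()
    for row_t1, row_t2 in zip(map_t1, map_t2):
        for before, after in zip(row_t1, row_t2):
            counts[(before, after)] += 1
    return counts


def veg_change_ha(counts):
    """Return VEG gain, loss, and net change in hectares."""
    gain = counts[(NON_VEG, VEG)] * PIXEL_AREA_HA
    loss = counts[(VEG, NON_VEG)] * PIXEL_AREA_HA
    return {"gain": gain, "loss": loss, "net": gain - loss}


# Hypothetical binary maps for two dates.
map_1972 = [[0, 0, 1],
            [0, 1, 1]]
map_1978 = [[0, 1, 1],
            [1, 1, 0]]
table = cross_tabulate(map_1972, map_1978)
change = veg_change_ha(table)
```

Separating gains from losses matters because a small net change can hide substantial conversion in both directions, which a comparison of class totals alone would not reveal.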

Classification Accuracy Assessment
Error matrices were used to assess the accuracy of the RF classification maps for all nine years, with ground reference points collected by stratified random sampling (total = 900 points). Tables 3-11 show that the overall accuracy of the classification maps for 1972, 1978, 1986, 1992, 1997, 2002, 2013, 2017, and 2021 ranges from 86% to 94%, with the kappa coefficient ranging from 0.7200 to 0.8800. The classification maps are considered reliable since their overall accuracy is over 85% (Anderson, 1976).

Vegetation Cover Change Analysis
Tables 12-13 and Figure 6 show the VEG growth in the three study areas during the last five decades. From 1972 to 2021, the VEG area across all three study areas grew from only 1,231.1 ha to 23,176.46 ha, reaching around 19 times its 1972 extent (1,882.6% of the 1972 area). This demonstrates the huge investment of the UAE government in "greening" projects during the study period.
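The arithmetic behind the growth figures can be checked with a short sketch. The helper names are illustrative; the two areas are the totals reported in the text.

```python
def fold_change(start, end):
    """Ratio of the final value to the initial value."""
    return end / start


def percent_increase(start, end):
    """Increase relative to the initial value, in percent."""
    return (end - start) / start * 100


veg_1972_ha = 1231.1
veg_2021_ha = 23176.46

ratio = fold_change(veg_1972_ha, veg_2021_ha)            # about 18.83
ratio_pct = ratio * 100                                  # about 1,882.6
increase_pct = percent_increase(veg_1972_ha, veg_2021_ha)  # about 1,782.6
```

The 1,882.6% figure therefore expresses the 2021 area as a percentage of the 1972 area; the net increase over the 1972 baseline is about 1,782.6%.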

Vegetation Dynamics in Abu Dhabi (1972-2021)
In 1972, the VEG area in study area 1, AD, was limited to coastal vegetation, e.g. mangroves. AD island, together with the nearby mainland and islands included in study area 1, was 99.8% Non-VEG, mainly sand dunes, sand sheets, and sabkhas. The enormous difference can be seen in the VEG area that emerged when the 2021 VEG map of AD was processed. The VEG area in AD increased about 60 times, from only 120.8 ha in 1972 to 7,411.2 ha in 2021, and the Non-VEG area decreased by 10% during the same period. During 1972-1978 (6 years), the VEG area more than tripled, adding 271.4 ha to the VEG areas on the main island of the city. Between 1978 and 1986 (8 years), the VEG area increased substantially, by 373.6%, adding 1,465.4 ha to the total VEG cover. The following six years (1986-1992) added a further 2,108.6 ha, and the total VEG area was close to four thousand hectares by September 1992. Therefore, the 14 years of steady increase (the sum of both time periods: 1978-1986 & 1986-1992) multiplied the VEG area of AD more than nine-fold. The high growth of VEG cover in AD during these periods could be attributed mainly to the planting of new parks, gardens, street trees, and other urban green spaces in the city. The increase of VEG area [...] percentages of the study areas during five decades.

Vegetation Dynamics in Dubai (1972-2021)
The Non-VEG class covered the whole of study area 2 in the 1972 and 1978 maps. The areas that have since been developed include important places of the international city of Dubai, such as Palm Jumeirah, Emirates Hills, and Expo Dubai 2020. Only 53.8 ha of VEG area emerged at the beach of Jumeirah. By 1992, the VEG area had grown around fourfold (by 381.3%), mainly due to the expansion of construction in the northern part of Jumeirah beach and the building of the Emirates golf courses. The increase of VEG area slowed down during 1992-1997 (5 years) and then reached 74.3% during 1997-2002 (5 years). The VEG area stood at 618.7 ha in 2002, which represents about 3.0% of the whole study area. It is worth mentioning that these 10 years (the sum of both time periods: 1992-1997 & 1997-2002) witnessed extensive construction activity in the study area, mainly infrastructure, both inland and offshore (Palm Jumeirah). As seen in study area 1, AD (subsection 3.3.1), urbanization of the desert is expected to be followed by an increase in urban vegetation (parks, gardens, street trees, and other green spaces). During the following 11 years (2002-2013), a strong wave of VEG growth spread throughout the study area, adding around 1,045.1 ha and raising the VEG area to around 8.1% of the whole study area by July 2013. This remarkable growth was driven mainly by the increase of urban vegetation in places such as Palm Jumeirah and new residential communities such as Jumeirah Islands, The Gardens, Discovery Gardens, and Arabian Ranches.
Period      VEG change (ha)   VEG change (%)   Non-VEG change (ha)   Non-VEG change (%)
1972-1978   180.6             16.3             -180.0                -0.1
1978-1986   3,071.2           237.9            -3,071.0              -1.7
1986-1992   2,737.5           62.8             -2,738.0              -1.5
1992-1997   782.9             11.0             -783.0                -0.4
1997-2002   2,266.7           28.8             -2,267.0              -1.3
2002-2013   3,195.4           31.5             -3,195
Table 13. VEG and Non-VEG changed areas in hectare (ha) and percentages during the study period.

Vegetation Dynamics in Al-Ain (1972-2021)
In 1972, the VEG area in study area 3, Al-Ain, was concentrated around the old date palm oases such as the Al-Sarooj, Hili, and Al-Mutaredh oases (see Figure 4.a) and expanded over the years. It covered only 0.6% of the total area in 1972 and reached 7.8% in 2021. The VEG area grew about 12 times during the study period, which was reflected in the reduction of the Non-VEG area. Around 13,263.0 ha was converted from Non-VEG to VEG during the last five decades. In detail, during 1972-1978 (6 years), the VEG area in Al-Ain increased slightly, by around 16.3%, while the Non-VEG area decreased slightly, by around 0.1%. For 1978-1986 (8 years), the VEG area increased by around 237.9%, the highest rate of VEG increase in Al-Ain city during the fifty-year study period. Due to this increase, the Non-VEG area was reduced by around 1.7% over the same period. Between 1986 and 1992 (6 years), the VEG area continued to increase, adding 2,737.5 ha, a gain of around 62.8%. Over the 14 years from 1978 to 1992 (the sum of both time periods: 1978-1986 & 1986-1992), the VEG area in Al-Ain increased by about 4.5 times, from 1,290.9 ha to 7,099.6 ha. This increase could be driven by the expansion of urban vegetation (street trees and green spaces), the extensive granting of agricultural farms to civilians as part of the government's Bedouin settlement policy, and the establishment of artificial forests throughout the whole emirate of AD, which are clearly visible in the eastern part of the city along the road heading to AD capital city. Unexpectedly, the increase of VEG area slowed during the next period, 1992-1997 (5 years) [...] (1997-2002 & 2002-2013) showed the second growth wave of VEG area in Al-Ain, from 7,882.5 ha to 13,344.6 ha, an increase of around 70%.
This increase could be driven mainly by the urban expansion of Al-Ain city, which expanded its urban vegetation in parallel. In other words, Al-Ain, as a desert city, tended to increase its VEG area as part of greening the city while continuing its steady urban growth. This "behavior" can also be seen in the other desert cities under study (study area 1, AD, and study area 2, Dubai). Another driver could be the increase in international migrants to the country in general, which raises the demand for agricultural goods and is reflected in an increase in farming areas and activities. The period 2013-2017 (4 years) saw the second slowdown in the growth of the VEG area. During the last 4 years, 2017-2021, the VEG area in Al-Ain decreased by around 1,129.8 ha (around 7.3%), the only period of VEG area contraction in Al-Ain city.

LIMITATIONS OF THE STUDY
The TS of vegetation cover maps created in this study provided accurate and reliable input to our change analysis model. A post-classification comparison method was applied to the TS vegetation cover maps to reduce misclassification errors; this was tedious, time-consuming, and prone to human error. In addition, the 2007 Landsat-7 ETM+ images were excluded from the analysis because the gap filling of the scan-line errors failed to deliver accurate results. This deprives us of valuable information on the detailed trajectory of vegetation cover dynamics over the period from 2002 until 2013. RF outperformed the other machine learning algorithms tested, CART and SVM. Other machine learning classifiers and deep learning techniques can be tested in future studies. In terms of validating the results with reference maps, it was difficult to find high-resolution images or valid maps covering the UAE for the 1972 and 1978 images. NASA declassified high-resolution stereo images were used as a reference only for specific regions (e.g., of AD).

CONCLUSION AND RECOMMENDATIONS
Studying vegetation cover dynamics is critical for decision-makers and planners who need to understand an area in order to protect the environment and maintain sustainable development. Based on the results obtained from three representative study areas, namely AD, Dubai, and Al-Ain, using RS and GIS technologies, it is evident that the vegetation cover extent in the UAE has grown steadily over the past five decades, to around 19 times its initial extent (1,882.6%), from only 1,231.1 ha in 1972 to 23,176.46 ha in 2021, as clearly shown in the produced trajectory maps. The vegetation area increased due to several drivers. The research findings show that the three desert cities tend to increase their vegetation area by expanding urban vegetation while continuing their steady urban growth. This includes parks, gardens, street trees, and other green spaces. This intrinsic behaviour of desert cities contradicts the common generalization, derived from studies of the growth of tropical and boreal cities, that urbanization causes loss of vegetation cover. The second driver of the increase in vegetation cover in the UAE is the increase in farming areas, promoted and accelerated by extensive governmental policies of granting agricultural farms to the local population as part of the Bedouin settlement policy. Demographic growth (fed by migrant workers in various governmental and private sectors) also plays an important role by increasing the demand for crops and agricultural products. In 2019, the UAE was ranked as the country with the highest proportion of international migrants, who accounted for 88% of the country's population (World Migration Report, 2020).
Another driver of vegetation cover change is the environmental protection laws that were introduced during the eighties and nineties to protect and conserve the environment, with the aims of increasing green areas and artificial forests, developing water resources, and improving and protecting the marine environment against pollution. The results of this study form the basis for further analysis to understand the nature of the vegetation expansion trend across the lands of the UAE over the past five decades and the expected future growth. Further studies will follow as part of the ongoing project to gain a deeper understanding of these changes and their implications.