NEAR-REALTIME FLOOD DETECTION FROM MULTI-TEMPORAL SENTINEL RADAR IMAGES USING ARTIFICIAL INTELLIGENCE

Flood extent delineation from RADAR images usually entails manual thresholding per scene, which is not feasible for large-scale floods that often cover multiple RADAR scenes. Traditional remote sensing techniques are also computationally intensive, which limits their use during emergency situations. To hasten the production of flood maps from RADAR images during flooding incidents, a deep learning model based on a fully convolutional neural network (FCNN) has been developed to delineate flooded areas with minimal human intervention. The model was formulated from data gathered during a flooding event captured by both the Sentinel-1A SAR satellite and Planet’s Dove optical satellites. Two pre-flood SAR scenes and one post-flood SAR scene were used to detect the occurrence of water by analysing drops in backscatter values. The potential flood extents were verified using optical images and then used to train the AI model. The model is currently being used operationally to map flood extent across the Philippines with no human intervention from data download to detection of flooded areas. The technique can detect floods across five Sentinel-1 scenes in less than four hours after new satellite data are downloaded.


Background of the Study
Flood maps in the Philippines are available in different types and accuracies. Various agencies are mandated to conduct flood-related mapping with distinct objectives. The Mines and Geosciences Bureau of the Department of Environment and Natural Resources (DENR-MGB) produces flood susceptibility maps for the entire country through its National Geohazard Assessment Program (Nieves, n.d.). These maps were released at 1:10,000 scale for critical areas and 1:50,000 scale for other areas. The maps classify areas that experience floods deeper than one meter during heavy or prolonged rainfall as highly susceptible, and areas with floods shallower than one meter as low to moderately susceptible.
A different approach was taken by a Department of Science and Technology (DOST)-funded research and development project called the Nationwide Operational Assessment of Hazards (Project NOAH). The project produces high-resolution flood hazard maps from one-meter-resolution LiDAR data (Lagmay et al., 2017) through hydrologic modelling. The project produced maps for 5-, 25-, and 100-year rainfall return periods. The flood hazard maps were initially processed for the Philippines' 18 major river basins and also cover around 200 principal river basins across the country that have LiDAR data.
Another agency under the DOST, the Philippine Atmospheric, Geophysical and Astronomical Services Administration (PAGASA), produces regular flood forecasts and advisories for different river basins and dams in the country ("PAGASA", n.d.) through its Flood Forecasting and Warning System (FFWS). The system uses a network of rain gauges, water level gauges, warning posts, and monitoring stations to produce flood bulletins and flood information ("Flood Forecasting and Warning System for Dam Operation", n.d.).
More specialized systems for area-based flood monitoring were developed by the Advanced Science and Technology Institute of DOST (DOST-ASTI) through the use of sensors with predictive capabilities (Garcia et al., 2016). This type of urban flood monitoring system can predict flood heights with high accuracy and send warnings to nearby areas with available sensors. The same applies to the forecasting system of PAGASA, but without the flood height estimates. This gap in PAGASA's system can be filled by the flood simulation maps of Project NOAH for different river basins if a rainfall event matches a simulated scenario. The MGB flood susceptibility maps, on the other hand, cover the entire country but are too generalized to use during actual events. Across all the available flood datasets, what is missing is data that delineates the actual extent of flooding during an event. To achieve this, an actual image of the flooded area is needed. Remote sensing techniques, specifically those using RADAR images, have been proven effective in filling this gap.

Objectives
This study aims to create an automated flood delineation technique with minimal human intervention that is operationally viable on a national scale. The method must produce flood maps in near-real time (NRT) conditions especially during disaster events. The said maps can be printed or used by appropriate authorities and government agencies that handle and administer emergency response operations.

REVIEW OF RELATED LITERATURE
Floods can be easily identified in RADAR images due to the low backscatter values caused by specular reflection of RADAR signals on the water surface (Westerhoff et al., 2013). This characteristic of the RADAR signal has been utilized by many studies in detecting water bodies. There are different methods used in processing RADAR data to detect water. A review of these methods was conducted by Shen et al. (2019), which covers: supervised and unsupervised methods; threshold determination, segmentation, and change detection; visual inspection and manual editing versus fully automated processes; and water detection beneath vegetation or in urban areas. They all operate on the principle that water areas return low backscatter values. Most of them employ the thresholding method (Pulvirenti et al., 2010) to isolate water bodies in RADAR images. These methodologies depend on the choice of threshold values, since different areas have different environmental parameters such as land cover, slope, elevation, aspect and other geomorphological characteristics, which also differ in every RADAR scene. Very large areas may also require different thresholds even if the water body being delineated is one and the same, as demonstrated by Tan et al. (2004). The thresholding method entails manual work per RADAR scene, making it infeasible for NRT applications. To overcome this limitation, a different approach is proposed here for faster production of flood maps using Artificial Intelligence (AI), specifically Artificial Neural Networks (ANN).
AI has been employed to analyze satellite images for disaster applications. Amit et al. (2016) developed a Convolutional Neural Network (CNN)-based detection model of natural disasters in satellite imagery. The model consists of three convolutional and max-pooling layers followed by two fully connected layers. The model was evaluated on two different types of natural disasters, namely landslides and floods.
Automatic detection of disaster-affected areas in satellite imagery can also be done using deep learning models along with wavelet transformation (Liu and Wu, 2016). Another interesting approach uses a Generative Adversarial Network (GAN), originally proposed for retinal vessel segmentation, to detect floods (Ahmad et al., 2017). Moumtzidou et al. (2018) used a ResNet-50, pre-trained on ImageNet and fine-tuned on 224x224 satellite image patches, to automatically identify passable roads in flood-affected areas.
Image classification using neural networks, once trained, can benefit from an end-to-end classification methodology and a semantic segmentation technique that incorporates spectra, feature, and form in its partitioning. The CNN is the ideal architecture for this application; it is designed to take advantage of the 3D structure of the input image. The architecture of CNN is limited to three dimensions: width, height, and depth. This constraint allows for a more efficient forward function and reduced number of parameters of the neural network (Karpathy, 2018).
The default CNN architecture, however, is insufficient for full semantic segmentation (Längkvist et al., 2016). To delineate features in an end-to-end manner, the Fully Convolutional Network (FCN) can be utilized. FCNs, proposed by Long et al. (2015), were one of the first architectures to perform pixel-level segmentation by replacing the fully connected layer of the neural network with a convolutional layer. This modification transforms the CNN into a feature extractor that outputs spatial maps instead of classification scores. FCNs nevertheless produce coarse segmentation maps because of the inherent loss of information during pooling operations (Yue et al., 2016). Thus, there is a need to further modify the FCN to produce pixel-level high-resolution segmentation results.
Our proposed solution is based on a modified FCN architecture called U-Net (Ronneberger et al., 2015), which was originally used for biomedical image segmentation. Its architecture has both contracting and expansive paths, and the feature maps from the contracting path are cropped and copied to the corresponding upsampling steps in the expansive path. This allows low-level feature maps to be combined with higher-level ones, enabling precise localization. Li et al. (2018) implemented a type of U-Net called DeepUNet to perform pixel-level sea-land segmentation. Their results show significant improvement in the segmentation of complex sea-land structures over other neural networks such as SegNet. Iglovikov et al. (2017) utilized the U-Net structure to perform object detection and feature segmentation of WorldView-3 satellite images by training separate models per feature.

METHODOLOGY
The proposed method in this study is to train an AI model to detect floods from manually processed multi-temporal SAR imagery. It operates on the concept that consistent backscatter values must be observed in areas with no physical changes. A significant decrease in backscatter can therefore be attributed to the presence of flood water.
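The change-detection concept above can be illustrated with a minimal NumPy sketch. It is not the procedure used in the study, which relies on segmentation and classification; the function name and both thresholds (`min_drop_db`, `max_pre_diff_db`) are hypothetical values chosen only for illustration.

```python
import numpy as np

def backscatter_drop_mask(pre1_db, pre2_db, post_db,
                          min_drop_db=3.0, max_pre_diff_db=1.5):
    """Flag pixels whose post-event backscatter fell below a stable baseline.

    Inputs are co-registered 2-D arrays in decibels. Both thresholds are
    illustrative assumptions, not values taken from the study.
    """
    pre1 = np.asarray(pre1_db, dtype=float)
    pre2 = np.asarray(pre2_db, dtype=float)
    post = np.asarray(post_db, dtype=float)

    # Areas with no physical change should look alike in both pre-event scenes.
    stable = np.abs(pre1 - pre2) <= max_pre_diff_db

    # Compare the post-event scene against the mean pre-event backscatter.
    baseline = (pre1 + pre2) / 2.0
    dropped = (baseline - post) >= min_drop_db

    return stable & dropped

# Toy 2x2 scene: only the top-left pixel is stable before the event and
# drops sharply afterwards. The bottom-right pixel is low in all three
# scenes (e.g. permanent water), so it is not flagged as a change.
pre1 = [[-8.0, -9.0], [-10.0, -20.0]]
pre2 = [[-8.5, -9.0], [-14.0, -20.0]]
post = [[-18.0, -9.5], [-20.0, -20.5]]
mask = backscatter_drop_mask(pre1, pre2, post)
```

Requiring agreement between the two pre-event scenes is one way to use the second pre-flood image: it separates a real drop from areas that were already changing before the event.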
Study Area: Typhoon Urduja (international name: Kai-Tak) struck the Philippines on December 12-19, 2017 (see Figure 1), causing widespread flooding across 484 barangays and more than 47 deaths ("NDRRMC Update: SitRep No. 28", 2018). One of the most affected areas was Biliran Island in Eastern Visayas, which reported 26 deaths and was isolated by damage to at least five bridges (Viray, 2017), one of which was captured by DOST-ASTI's water level sensor (see Figures 2 and 3).
Two (2) PlanetScope scenes taken on November 22 and December 29, 2017, were also gathered from DOST-ASTI's Philippine Earth Data Resource and Observation Center (PEDRO) (see Figure 8). The images were used to verify the flooded areas detected from the SAR images.
The proposed methodology consists of four major steps: data processing, training data generation, AI training, and flood prediction.

Data Processing
Data processing includes pre-processing the RADAR datasets to position them correctly on the ground and to correct geometric errors. It also includes preparing the optical images used to verify the flooded areas before the detected floods are converted into labelled data to train the AI model.

RADAR Data Pre-Processing:
RADAR data pre-processing generates a terrain-flattened Gamma Nought image for each of the three scenes. Precise orbit files were applied to the two pre-typhoon images, while the most recent orbit file was applied to the post-typhoon image. A radiometric terrain flattening algorithm was applied to the datasets to account for the geometry of the ground and to eliminate errors due to slope, aspect, and their orientation relative to the sensor. The three images were then co-registered and stacked in order of acquisition. Terrain correction was also performed to bring the data from the RADAR coordinate system to a common coordinate reference system. All processing was done using ESA's Sentinel-1 Toolbox.
Figure 5. Pre-processing workflow.
The stacked image is visualized in GIS software as an R-G-B color composite in which the post-typhoon image is assigned to the Red band and the two pre-typhoon images are assigned to the Green and Blue bands.
In the resulting RGB composite, areas with decreased backscatter appear in cyan because of the lower values in the Red band and higher values in the Green and Blue bands. Areas without change appear in grey tones of varying intensity, as shown in Figure 7.
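The color-mixing behaviour described above can be reproduced with a short NumPy sketch. The function names and the display range of -25 to 0 dB are assumptions made for illustration; the study performed this step interactively in GIS software.

```python
import numpy as np

def stretch(band_db, lo=-25.0, hi=0.0):
    """Linearly rescale dB values to [0, 1]; the [lo, hi] range is an assumption."""
    band = np.asarray(band_db, dtype=float)
    return np.clip((band - lo) / (hi - lo), 0.0, 1.0)

def change_composite(pre1_db, pre2_db, post_db):
    """Build an RGB array with the post-event scene in red and the two
    pre-event scenes in green and blue, using one common stretch."""
    return np.dstack([stretch(post_db), stretch(pre1_db), stretch(pre2_db)])

# Two pixels: the first is unchanged, the second drops by 10 dB after the event.
pre1 = [[-10.0, -10.0]]
pre2 = [[-10.0, -10.0]]
post = [[-10.0, -20.0]]
rgb = change_composite(pre1, pre2, post)
# rgb[0, 0] has equal R, G and B (grey); rgb[0, 1] has low R and
# higher G and B, which displays as cyan.
```

Applying the same stretch to all three bands matters here: if each band were stretched independently, unchanged areas would no longer map to neutral grey.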

Training Data Preparation
The goal of this process is to mark all the flooded areas in the RADAR images. These areas are converted into a flood mask that is used, together with the original stacked image, as input for training the model.

Object-Based Image Segmentation and Classification: To properly isolate the flooded areas from the stacked image, an Object-Based Image Segmentation approach was used. Image segmentation works faster than manually digitizing the extent of flooded areas. Segmentation also avoids the human-induced errors that arise from manual digitization of flood extents. Image segmentation breaks the image into objects (segments) with similar statistical characteristics. Mean-shift segmentation was used as implemented in Orfeo Toolbox (OTB), an open-source software for image analysis which implements many tools from OpenCV (an open-source computer vision library). The Mean-shift algorithm finds the centroid of a group of pixels with similar statistical values. The grouping depends on the spatial and spectral (radiometric) radius defined by the user. The algorithm iterates until it can no longer shift the centroid or until a convergence value specified by the user is reached. The algorithm produces a vector file which contains the statistical values of each object in each band: the mean backscatter values and the variances. Segments that fell on slopes greater than 20% were removed. This slope threshold was chosen because it is above the regulatory limit for habitable areas. A total of 12,398 objects were created with a minimum size of fifty (50) m² using the parameters in Table 2.

Parameter               Value
Spatial Radius          3
Range Radius            9
Minimum Segment Size    50
Table 2. Segmentation Parameters.

Figure 10. Objects created from Mean-shift segmentation.
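To make the mean-shift procedure described above concrete, the following is a minimal single-band sketch with a flat kernel. It is not OTB's implementation, which also merges converged modes into labelled regions, prunes small segments and writes a vector file. The function name, the box-shaped window and the units of the radii are assumptions; only the parameter names mirror Table 2.

```python
import numpy as np

def mean_shift_modes(image, spatial_radius=3, range_radius=9.0,
                     max_iter=50, tol=1e-3):
    """Return, for every pixel, the (row, col, value) mode it converges to.

    A flat (uniform) kernel is assumed: each step averages all pixels
    within the spatial radius and the range radius of the current centre.
    """
    img = np.asarray(image, dtype=float)
    rows, cols = np.indices(img.shape)
    pts = np.column_stack([rows.ravel(), cols.ravel(), img.ravel()])
    modes = pts.copy()

    for i in range(len(pts)):
        centre = pts[i]
        for _ in range(max_iter):
            near = (
                (np.abs(pts[:, 0] - centre[0]) <= spatial_radius)
                & (np.abs(pts[:, 1] - centre[1]) <= spatial_radius)
                & (np.abs(pts[:, 2] - centre[2]) <= range_radius)
            )
            if not near.any():
                break  # guard: no neighbours left inside the window
            new_centre = pts[near].mean(axis=0)
            shift = np.linalg.norm(new_centre - centre)
            centre = new_centre
            if shift < tol:
                break  # the centroid can no longer be shifted
        modes[i] = centre

    return modes.reshape(img.shape + (3,))

# Toy image with two flat regions (-10 dB on the left, -20 dB on the right).
toy = np.array([[-10.0] * 3 + [-20.0] * 3] * 4)
toy_modes = mean_shift_modes(toy, spatial_radius=3, range_radius=3.0)
```

Pixels that converge to the same mode belong to the same object, so the two flat regions of the toy image end up as two distinct groups.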

Boost Classification:
The segments must be classified into flooded and non-flooded data. To aid in the classification, an algorithm called boost classification was used. Boost classification (Boosting) is a classification method that combines outputs of many "weak" classifiers to produce a powerful "committee". The predictions from the weak classifiers are combined through a weighted majority vote to produce the final prediction. The purpose of boosting is to sequentially apply the weak classification algorithm to repeatedly modified versions of the data, thereby producing a sequence of weak classifiers (Hastie et al., 2009).
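The weighted-vote idea described above can be sketched with the discrete AdaBoost algorithm and one-level decision trees (stumps). This is a simplified illustration, not the Real AdaBoost variant provided by OTB and OpenCV; the function names and the toy one-feature dataset are hypothetical.

```python
import numpy as np

def stump_predict(X, feature, threshold, polarity):
    """Weak classifier: +1 on one side of a single-feature threshold, else -1."""
    return np.where(polarity * (X[:, feature] - threshold) <= 0, 1, -1)

def fit_stump(X, y, w):
    """Exhaustively search for the stump with the lowest weighted error."""
    best = (np.inf, 0, 0.0, 1)  # (error, feature, threshold, polarity)
    for feature in range(X.shape[1]):
        for threshold in np.unique(X[:, feature]):
            for polarity in (1, -1):
                pred = stump_predict(X, feature, threshold, polarity)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best

def adaboost_fit(X, y, n_rounds=10):
    """Train a sequence of stumps on repeatedly re-weighted data (labels are +1/-1)."""
    w = np.full(len(y), 1.0 / len(y))
    ensemble = []
    for _ in range(n_rounds):
        err, feature, threshold, polarity = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)    # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)    # vote weight of this stump
        pred = stump_predict(X, feature, threshold, polarity)
        w *= np.exp(-alpha * y * pred)           # up-weight the mistakes
        w /= w.sum()
        ensemble.append((alpha, feature, threshold, polarity))
    return ensemble

def adaboost_predict(X, ensemble):
    """Combine the weak classifiers through a weighted majority vote."""
    score = sum(a * stump_predict(X, f, t, p) for a, f, t, p in ensemble)
    return np.where(score >= 0, 1, -1)

# Toy data that no single stump can separate: the middle sample is the odd one out.
X_toy = np.array([[1.0], [2.0], [3.0]])
y_toy = np.array([1, -1, 1])
one_round = adaboost_predict(X_toy, adaboost_fit(X_toy, y_toy, n_rounds=1))
three_rounds = adaboost_predict(X_toy, adaboost_fit(X_toy, y_toy, n_rounds=3))
```

On this toy set a single stump always misclassifies one sample, while the weighted vote of three stumps classifies all three correctly, which is the "committee" effect the paragraph describes.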
Sample flooded and non-flooded segments were selected as training datasets for the boost classifier: 276 flooded and 692 non-flooded objects (see Figure 11). The differentiation of flooded and non-flooded objects was verified using Planet imagery. Different combinations were tested to highlight the flooded areas, including stacking pre- and post-typhoon NDWI values. The objects were classified into Class 1 for flooded areas and Class 2 for non-flooded areas.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLIII-B3-2020, 2020 XXIV ISPRS Congress (2020 edition)
A vector classifier was trained using the Real AdaBoost algorithm implemented in OTB (see Table 3 for parameters). Real AdaBoost is a version of boost classification that "utilizes confidence-rated prediction and works well with categorical data" ("Boosting - OpenCV 2.4.13.7 documentation", n.d.).

Parameter                Value
Weak Count               100
Trim Rate                0.95
Maximum Depth of Tree    1
Table 3. Boost Vector Classifier Training Parameters.

The classified vector file was converted into a classified raster with values of 0 and 1 corresponding to non-flooded and flooded, respectively.

AI Training and Prediction
The machine learning framework used for the U-Net architecture in this study is Keras. The user-friendly nature of Keras allows rapid prototyping by building and testing the neural network with minimal lines of code. The modular nature of Keras also makes it easy to build and modify the U-Net architecture and its hyperparameters.
Figure 14. The U-Net Architecture.
Batch normalization was used to accelerate convergence during training. The primary activation function used is the exponential linear unit (ELU). ELU helps to learn representations that are more robust to noise (Iglovikov et al., 2017). The number of feature channels is doubled at each downsampling step and halved at each corresponding upsampling step. The contracting path of the U-Net follows a typical convolutional neural network. The expansive path, on the other hand, consists of an upsampling operation on the feature maps followed by a convolution with half the number of feature channels, concatenation with the corresponding feature map from the contracting path, and then batch normalization and ELU.
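The channel bookkeeping described above can be traced with a small pure-Python sketch. The base filter count of 32, the depth of four levels, 'same' padding and a single-channel output are all assumptions made for illustration; the text does not state them, and the actual configuration is the one shown in Figure 14.

```python
def unet_feature_shapes(size=112, in_channels=3, base_filters=32, depth=4):
    """List (stage, height/width, channels) through a U-Net with 'same' padding.

    base_filters and depth are hypothetical values, not taken from the study.
    """
    if size % (2 ** depth) != 0:
        raise ValueError("size must be divisible by 2**depth")

    shapes = [("input", size, in_channels)]
    filters = base_filters
    skips = []

    # Contracting path: convolve, remember the map for the skip connection,
    # then halve the spatial size and double the channels.
    for level in range(depth):
        shapes.append((f"down{level}", size, filters))
        skips.append((size, filters))
        size //= 2
        filters *= 2

    shapes.append(("bottleneck", size, filters))

    # Expansive path: upsample, halve the channels, and concatenate with
    # the skip map of matching size before the next convolutions.
    for level in reversed(range(depth)):
        size *= 2
        filters //= 2
        skip_size, skip_filters = skips[level]
        if (skip_size, skip_filters) != (size, filters):
            raise RuntimeError("skip connection does not match")
        shapes.append((f"up{level}", size, filters))

    shapes.append(("output", size, 1))  # one flood-probability channel
    return shapes

for stage, side, channels in unet_feature_shapes():
    print(f"{stage:>10}: {side} x {side} x {channels}")
```

Under these assumptions a 112x112 patch shrinks to 7x7 at the bottleneck, which shows why the patch size needs to be divisible by two at every pooling level.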
The default input for the neural network is the concatenation of the stacked RADAR images into a single tensor. The loss function used for this classification task is binary cross-entropy. The Nadam optimizer (Adam with Nesterov momentum) (Dozat, 2016) was used, and the network was trained for 50 epochs with a learning rate of 1e-3.
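For reference, the binary cross-entropy loss named above can be written out in a few lines of plain Python. This is a sketch of the standard definition averaged over pixels, not the Keras implementation; the clipping constant `eps` is an assumption added to keep the logarithm finite.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

# One flooded pixel (1) and one non-flooded pixel (0).
confident_right = binary_cross_entropy([1, 0], [0.9, 0.1])
undecided = binary_cross_entropy([1, 0], [0.5, 0.5])
confident_wrong = binary_cross_entropy([1, 0], [0.1, 0.9])
```

The loss grows as the predicted flood probability moves away from the label, so confident mistakes are penalized more heavily than undecided predictions.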
Each epoch consisted of 400 batches, with each batch containing 128 image patches. Each batch was created by randomly cropping 112x112 patches from the stacked RADAR image. Each patch was also augmented by applying a random transformation from the Dih4 group (Dummit, 2013), the group of symmetries of the square.
Figure 15. Model Training workflow.
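The batch construction described above can be sketched with NumPy as follows. The function names are hypothetical and the study's own generator may differ; the sketch only shows random 112x112 cropping and the eight Dih4 transformations (four rotations, each with or without a mirror flip) applied identically to the image and its flood mask.

```python
import numpy as np

def dih4_transform(patch, k):
    """Apply the k-th of the 8 symmetries of the square (k in 0..7)."""
    out = np.rot90(patch, k % 4, axes=(0, 1))  # rotate by 0, 90, 180 or 270 degrees
    if k >= 4:
        out = np.flip(out, axis=1)             # mirror for the remaining four
    return out

def random_patch(image, mask, size=112, rng=None):
    """Crop the same random window from image (H, W, C) and mask (H, W),
    then apply the same random Dih4 transformation to both."""
    rng = np.random.default_rng() if rng is None else rng
    height, width = mask.shape
    row = rng.integers(0, height - size + 1)
    col = rng.integers(0, width - size + 1)
    k = rng.integers(0, 8)
    img_patch = image[row:row + size, col:col + size]
    msk_patch = mask[row:row + size, col:col + size]
    return dih4_transform(img_patch, k), dih4_transform(msk_patch, k)

def make_batch(image, mask, batch_size=128, size=112, rng=None):
    """Assemble one training batch of (patches, masks)."""
    rng = np.random.default_rng() if rng is None else rng
    pairs = [random_patch(image, mask, size, rng) for _ in range(batch_size)]
    x = np.stack([p[0] for p in pairs])
    y = np.stack([p[1] for p in pairs])[..., np.newaxis]
    return x, y
```

With the figures given in the text, one epoch draws 400 x 128 = 51,200 patches, or 2,560,000 patches over the 50 epochs. Applying the same transformation to the patch and its mask keeps every label aligned with its pixel.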

Flood Prediction
The AI Flood Model accepts as input three stacked SAR images (two before-event images and one after-event image). It outputs a binary raster mask with a value of 1 for flooded areas and 0 for non-flooded areas. The raster mask is then vectorized using GIS software to produce shapefiles of potentially flooded areas.

RESULTS AND DISCUSSION
To quantify the actual ground detection accuracy, the AI-predicted flood maps were compared to the flood waters captured by the Sentinel-2 optical satellite (see Figure 16) during a flooding event brought by Typhoon Mangkhut (local name: Ompong), which battered the Philippines with winds of 145 to 165 km/h in September 2018. First, flooded areas were predicted by the trained model (see Figure 18) and compared to the 10-meter-resolution image captured by Sentinel-2. A cloud-masking algorithm was applied to remove the cloudy portions of the image. A section with minimal cloud cover and an approximate area of 625 km² was selected as the validation site; it comprises parts of the City of San Carlos and the Municipalities of Lingayen, Binmaley, Bugallon, Aguilar, Urbiztondo, Malasiqui and Bayambang in Luzon Island, Philippines. The data were converted into Top of the Atmosphere Reflectance (ToAR), from which the Modified Normalized Difference Water Index (MNDWI) was computed, with values of 0.2887 and above treated as water (Du et al., 2016). Two hundred (200) random points were generated in the validation area and classified into flood and non-flood points (see Figures 17 and 18). A confusion matrix was generated to calculate the producer's, user's and overall accuracies. The method achieved a producer's accuracy of 89.11%, a user's accuracy of 90%, and an overall accuracy of 89.50% with a Kappa coefficient of 0.79 (see Table 4).
Figure 18. Flood prediction (in green).
The whole flood map production chain, from SAR image download to flood prediction, has been automated using Python scripts that run in DOST-ASTI's High Performance Computing (HPC) facility. The method was operationalized in 2018 and 2019, and its outputs were distributed to various agencies and Local Government Units (LGUs) across the country. Feedback from people in those areas attested to the accuracy of the mapped floods. Figure 19 shows one of the large-scale floods detected during a monsoon event in the Philippines. It covers two swaths of Sentinel-1 images. The data was released as printable map layouts (one layout per tile) for use in response and recovery operations.
Figure 19. Flood prediction (in red) during a monsoon event.
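The validation arithmetic above can be sketched in plain Python. The MNDWI function follows the standard definition from green and shortwave-infrared reflectance. The confusion-matrix counts used in the example are hypothetical: they are not copied from Table 4 and were chosen only because, for 200 points, they reproduce the accuracies reported in the text.

```python
WATER_THRESHOLD = 0.2887  # MNDWI value at or above which a pixel is treated as water

def mndwi(green, swir):
    """Modified Normalized Difference Water Index from two reflectance values."""
    return (green - swir) / (green + swir)

def accuracy_metrics(tp, fn, fp, tn):
    """Accuracy measures for the flood class from a 2x2 confusion matrix.

    tp: reference flood, mapped flood      fn: reference flood, mapped non-flood
    fp: reference non-flood, mapped flood  tn: reference non-flood, mapped non-flood
    """
    n = tp + fn + fp + tn
    overall = (tp + tn) / n
    producers = tp / (tp + fn)  # share of reference flood points that were detected
    users = tp / (tp + fp)      # share of mapped flood points that are truly flood
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n ** 2
    kappa = (overall - expected) / (1 - expected)
    return {"overall": overall, "producers": producers,
            "users": users, "kappa": kappa}

# Hypothetical counts (assumption): consistent with the reported figures.
metrics = accuracy_metrics(tp=90, fn=11, fp=10, tn=89)
is_water = mndwi(green=0.3, swir=0.1) >= WATER_THRESHOLD
```

Because Kappa discounts the agreement expected by chance, it is lower than the overall accuracy even when both classes are mapped almost equally well.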

CONCLUSION
The flood monitoring workflow using multi-temporal RADAR images is a newly developed technique that exploits the basic concept of color mixing in the RGB channels to analyse drops in backscatter values due to water saturation. It operates on the concept that when flooding occurs, the backscatter of an area decreases drastically compared to the values observed before flooding. Image segmentation and boost classification techniques were used to create the labelled data for training the AI model.