A COMPARATIVE ANALYSIS OF SPATIOTEMPORAL DATA FUSION MODELS FOR LANDSAT AND MODIS DATA

In this study, three documented spatiotemporal data fusion models were applied to Landsat-7 and MODIS surface reflectance and NDVI. The algorithms were the spatial and temporal adaptive reflectance fusion model (STARFM), the sparse-representation-based spatiotemporal reflectance fusion model (SPSTFM), and the spatiotemporal image-fusion model (STI-FM). The objectives of this study were to (i) compare the performance of these three fusion models using one Landsat-MODIS spectral reflectance image pair drawn from time-series datasets of the Coleambally irrigation area in Australia, and (ii) quantitatively evaluate the accuracy of the synthetic images generated by each fusion model using statistical measures. Results showed that all three fusion models predicted the synthetic Landsat-7 image in adequate agreement with the actual image. STI-FM produced more accurate reconstructions of both the Landsat-7 spectral bands and NDVI than the other two models. Furthermore, it produced surface reflectance images with the highest correlation with the actual Landsat-7 images. This study indicates that STI-FM would be more suitable for spatiotemporal data fusion applications such as vegetation monitoring, drought monitoring, and evapotranspiration estimation.


INTRODUCTION
To date, enormous improvements have been achieved in the spectral, spatial, temporal, and radiometric characteristics of remotely sensed satellite data. However, no operational satellite system fully meets the technical requirements of the different surface parameters and applications, such as vegetation indices, land surface temperature, soil moisture, agricultural drought, evapotranspiration, and human health (Zhang et al., 2015; Hazaymeh and Hassan, 2017). Given the tradeoff between the spatial and temporal resolutions of satellite systems, several spatiotemporal remote sensing data fusion methods have been developed (Cammalleri et al., 2013; Hilker et al., 2009; Gao et al., 2006; Zurita-Milla et al., 2011; Hazaymeh and Hassan, 2015a,b). These methods have been used as cost-effective approaches to generate continuous time series consisting of original and synthetic remote sensing data. The main idea is to generate satellite-based data with both high spatial and high temporal resolution by fusing the spatial and temporal characteristics of different satellite sensors. Chen et al. (2015) provided a survey of spatiotemporal data fusion methods, their applications, and relevant studies. Among these methods, three have received great interest within the remote sensing community: (i) the spatial and temporal adaptive reflectance fusion model [STARFM; (Gao et al., 2006)], (ii) the sparse-representation-based spatiotemporal reflectance fusion model [SPSTFM; (Huang and Song, 2012)], and (iii) the spatiotemporal image-fusion model [STI-FM; (Hazaymeh and Hassan, 2015a,b)].
It is worth mentioning that other researchers have performed similar comparative studies (e.g., Chen et al., 2015; Gevaert and García-Haro, 2015). Chen et al. (2015) compared STARFM, Enhanced-STARFM (Zhu et al., 2010), Improved-STARFM (Fu et al., 2013), and SPSTFM. Overall, they concluded that Improved-STARFM and Enhanced-STARFM performed more stably than the other methods. However, that case study mainly compared the original STARFM algorithm with efforts to improve it. Gevaert and García-Haro (2015) compared STARFM with an unmixing-based method and with the spatial and temporal reflectance unmixing model (STRUM) developed in their study. They concluded that the methods were able to generate surface reflectance data and NDVI images, with STRUM showing the highest performance. In this study, we compared the STARFM, SPSTFM, and STI-FM methods using the same dataset and evaluated their performance through statistical and visual comparisons.

Study site and data
In this study, we selected a study site located in the Coleambally irrigation area (CIA; see Figure 1) in Australia (145°04′E, 34°00′S) to compare the spatiotemporal methods. This site has been used for time-series remote sensing research (Emelyanova et al., 2013; Jarihani et al., 2014; Van Niel and McVicar, 2004). It was also used by Chen et al. (2015) to compare STARFM, Enhanced-STARFM, Improved-STARFM, and SPSTFM. The dataset included 17 pairs of daily MODIS images (i.e., MOD09GA) and their corresponding actual Landsat-7 ETM+ images acquired during the 2001-2002 growing season. The dataset was freely obtained from the United States Geological Survey (USGS). Here, we selected the red and near-infrared spectral bands and the normalized difference vegetation index (NDVI) images as the comparative dataset. Note that the study site lies entirely within the overlap between two adjacent Landsat paths (i.e., paths 92 and 93, row 84). This allows two Landsat images to be acquired at an 8-day interval when no cloud is present.
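As a brief illustration of how the NDVI layer of the comparative dataset can be derived from the red and near-infrared bands, the following minimal sketch applies the standard definition NDVI = (NIR − Red) / (NIR + Red). The function name `ndvi`, the `eps` threshold, and the use of NaN for undefined pixels are assumptions made for this example and are not taken from the study.

```python
import numpy as np

def ndvi(red, nir, eps=1e-12):
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red).

    `red` and `nir` are surface reflectance arrays of the same shape.
    Pixels whose reflectance sum is (near) zero are undefined and are
    returned as NaN; this handling is an assumption of the sketch.
    """
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    valid = np.abs(denom) > eps
    out[valid] = (nir[valid] - red[valid]) / denom[valid]
    return out

# Hypothetical reflectance values for two pixels.
print(ndvi([0.10, 0.00], [0.50, 0.00]))
```

The same function can be applied to both the actual and the synthetic red and near-infrared bands so that the NDVI images are compared on equal terms.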

2.2.1 STARFM: STARFM was developed by Gao et al. (2006) to generate time series of Landsat images using MODIS images. The method was then used in different applications and modified by several researchers (Zhu et al., 2010; Liu and Weng, 2012; Fu et al., 2013; Meng et al., 2013; Weng et al., 2014). The major steps of STARFM are as follows. First, pixels with similar spectral values are selected within a user-defined moving window on the Landsat image. Then, a weighting factor is determined as a function of the Landsat and MODIS images of interest. Finally, a synthetic Landsat image is generated at the prediction date at time two [synth-L(t2)] using Equation (1):

\[ \text{synth-}L(t_2) = \sum_{i} W_i \left[ M(t_2) + L(t_1) - M(t_1) \right] \qquad (1) \]

where M(t2) and M(t1) = the two MODIS images taken at two different times, L(t1) = the Landsat image taken at time one, and Wi = the weighting factor.

According to the STI-FM method, the type of change in spectral signatures between the two MODIS images is first observed. Three types of change may be identified: (i) positive change, (ii) negative change, and (iii) no change. A simple linear relationship is then developed between the consecutive MODIS images for each type of change. Finally, the coefficients (i.e., slope and intercept) of each linear relationship are calculated and used with L(t1) to generate synth-L(t2) using the equation:

\[ \text{synth-}L(t_2) = a \, L(t_1) + c \]

where a and c = the slope and intercept, respectively.
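The STI-FM steps described above (classify the change between the two MODIS images, fit one linear relationship per change type, then apply the slope and intercept to L(t1)) can be sketched as follows. This is a simplified illustration rather than the published implementation: it assumes the MODIS images have already been resampled to the Landsat grid, it regresses M(t2) on M(t1), and the function name `stifm_sketch`, the change threshold `tol`, and the fallback for classes with too few pixels are all hypothetical choices made for this example.

```python
import numpy as np

def stifm_sketch(m_t1, m_t2, l_t1, tol=0.01):
    """Simplified STI-FM-style prediction of synth-L(t2).

    m_t1, m_t2 : MODIS reflectance at time one and time two, assumed to be
                 resampled to the Landsat grid (assumption of this sketch).
    l_t1       : Landsat reflectance at time one.
    tol        : hypothetical threshold separating "no change" from
                 positive or negative change.
    """
    m_t1, m_t2, l_t1 = (np.asarray(a, dtype=float) for a in (m_t1, m_t2, l_t1))
    diff = m_t2 - m_t1
    masks = (
        diff > tol,            # positive change
        diff < -tol,           # negative change
        np.abs(diff) <= tol,   # no change
    )
    synth = np.full(l_t1.shape, np.nan)
    for mask in masks:
        if mask.sum() < 2:
            # Too few pixels to fit a line: carry L(t1) forward unchanged.
            synth[mask] = l_t1[mask]
            continue
        # Linear relationship M(t2) = a * M(t1) + c for this change type.
        a, c = np.polyfit(m_t1[mask], m_t2[mask], 1)
        # Apply the same slope and intercept to the Landsat image.
        synth[mask] = a * l_t1[mask] + c
    return synth

# Hypothetical example: four brightening and four darkening pixels.
m1 = np.array([0.1, 0.2, 0.3, 0.4, 0.1, 0.2, 0.3, 0.4])
m2 = np.concatenate([1.5 * m1[:4] + 0.02, 0.5 * m1[4:]])
l1 = np.array([0.12, 0.18, 0.33, 0.41, 0.09, 0.22, 0.28, 0.39])
print(stifm_sketch(m1, m2, l1))
```

Fitting a separate line per change type lets pixels that brightened and pixels that darkened between the two dates follow different slopes, which a single global regression would average out.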

Statistical and visual accuracy of synthetic images
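Here, we performed visual comparisons between the synthetic and actual images. In addition, we calculated two statistical metrics: (i) the coefficient of determination (r²), to measure the correlation between the synthetic and actual images; and (ii) the root mean square error (RMSE), to reflect the overall error between the synthetic and actual images. These statistical measures are formulated as follows:

\[ r^2 = \left[ \frac{\sum \left( A(t) - \bar{A}(t) \right) \left( S(t) - \bar{S}(t) \right)}{\sqrt{\sum \left( A(t) - \bar{A}(t) \right)^2} \, \sqrt{\sum \left( S(t) - \bar{S}(t) \right)^2}} \right]^2 \]

\[ \mathrm{RMSE} = \sqrt{\frac{\sum \left( A(t) - S(t) \right)^2}{n}} \]

where A(t) and S(t) = the actual and the synthetic Landsat-7 surface reflectance images; \( \bar{A}(t) \) and \( \bar{S}(t) \) = the mean values of the actual and the synthetic Landsat-7 images; n = the number of observations.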
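The two accuracy measures defined above can be computed directly from a pair of co-registered images. The sketch below is a minimal illustration; the function names `r_squared` and `rmse` and the sample values are assumptions made for this example, and both images are assumed to contain only valid (cloud-free, non-fill) pixels.

```python
import numpy as np

def r_squared(actual, synthetic):
    """Squared Pearson correlation between actual and synthetic images."""
    a = np.asarray(actual, dtype=float).ravel()
    s = np.asarray(synthetic, dtype=float).ravel()
    da = a - a.mean()   # deviations of A(t) from its mean
    ds = s - s.mean()   # deviations of S(t) from its mean
    r = np.sum(da * ds) / np.sqrt(np.sum(da ** 2) * np.sum(ds ** 2))
    return r ** 2

def rmse(actual, synthetic):
    """Root mean square error between actual and synthetic images."""
    a = np.asarray(actual, dtype=float).ravel()
    s = np.asarray(synthetic, dtype=float).ravel()
    return np.sqrt(np.mean((a - s) ** 2))

# Hypothetical red-band reflectance values for four pixels.
actual = np.array([0.10, 0.20, 0.30, 0.40])
synthetic = actual + 0.10   # perfectly correlated but offset by 0.10
print(r_squared(actual, synthetic), rmse(actual, synthetic))
```

The example shows why the two measures are reported together: a constant offset leaves r² at 1 while the RMSE equals the offset, so a high correlation alone does not guarantee a small reflectance error.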

Prediction performance
Figure 2 compares the actual Landsat-7 image with the synthetic Landsat-7 images for January 11, 2002 generated by the three spatiotemporal models over the CIA site. The synthetic image of January 11, 2002 was generated using one pair of MODIS and Landsat-7 images, i.e., M(t1) and L(t1) acquired on January 05, 2002, together with M(t2) acquired on January 11, 2002. A visual comparison of each synthetic image with its corresponding reference Landsat-7 image showed that all spatiotemporal models reconstructed the general landscape features observed in the actual image at the prediction date. This demonstrated the feasibility and applicability of the three spatiotemporal fusion algorithms. Figure 3 shows, for the red band, the correlation between the actual and synthetic images relative to the 1-to-1 fitting line. As observed by Chen et al. (2015) at the CIA site, the STARFM and SPSTFM algorithms fit the 1-to-1 line well. The STI-FM algorithm followed the 1-to-1 line more closely in the correlation between the actual and synthetic Landsat-7 images. A visual inspection of a randomly selected date (i.e., January 11, 2002) indicated that STI-FM performed more stably than the other two algorithms. Table 1 shows the quantitative comparison of the actual and synthetic Landsat-7 images on January 11, 2002 for the red band generated by STARFM, SPSTFM, and STI-FM, respectively. It shows that STI-FM had higher r² values and lower RMSE values.

[...] 11, 2002 [i.e., M(t2)]. This was reflected in the accuracy of the synthetic image at the prediction date. For instance, when the time lag between the base and prediction image dates was shorter, the synthetic images were more accurate. This reveals that the performance of each algorithm is consistent with the correlation between the two MODIS images of the base and prediction dates. An example for the red spectral band of the January 11, 2002 image is presented in Table 2. It shows that the accuracy of the synthetic image gradually increased from 0.499, when the November 9, 2001 image was selected as the base image, to 0.894, when the January 05, 2002 image was selected as the base image.

Table 2. Relationship between the MODIS images of the base and prediction dates at the CIA site, and the corresponding correlation between the actual and synthetic Landsat-7 images. The example is for the red spectral band on January 11, 2002 (i.e., the prediction date).

CONCLUDING REMARKS
We compared three spatiotemporal fusion algorithms, STARFM, SPSTFM, and STI-FM, using the one-pair mode of the Landsat-7 and MODIS dataset at the CIA site in Australia. Visual evaluation and quantitative measures, including r² and RMSE, were used to evaluate the algorithms' performance. The results showed that the three selected algorithms produced reasonable predictions, with r² values ranging from 0.842 to 0.894. In this study, we observed that STI-FM performed better, as it generated synthetic Landsat-7 surface reflectance and NDVI images with higher r² values and lower RMSE values. This indicates that STI-FM is better suited to data fusion applications requiring continuous data with high spatial and temporal resolution, especially in areas where few actual high-resolution images are available.

Figure 1: (a) A map of Australia with the Coleambally irrigation area marked by a red square; (b) the false color composite image of Landsat-7 spectral bands SWIR1, NIR, and Red acquired on 8 October 2001. After (Chen et al., 2015).
2.2.2 SPSTFM: Huang and Song (2012) proposed SPSTFM, which uses three MODIS images acquired at three different times [M(t1), M(t2), and M(t3)] and two Landsat images [L(t1) and L(t3)] to generate a synthetic Landsat image at the prediction date L(t2). The method consists of three major steps: (i) transforming the spatial resolution of the MODIS images to that of the Landsat images using sparse representation and a dictionary training procedure; (ii) generating two transition images by calculating the difference images between M(t1) and M(t3) and between L(t1) and L(t3); and (iii) using the two transition images with the Landsat image [L(t1)] to predict the synthetic Landsat image at time two [synth-L(t2)] by employing a high-pass modulation technique.
2.2.3 STI-FM: Hazaymeh and Hassan (2015a,b) developed STI-FM, which uses two MODIS images taken at time one and time two [M(t1) and M(t2)] and one Landsat image taken at time one [L(t1)] to generate a synthetic Landsat image at time two [synth-L(t2)].

Figure 2. Synthetic Landsat-7 images and the actual Landsat-7 image for the CIA site on 11 January 2002. (a-c) are the synthetic images of STARFM, SPSTFM, and STI-FM, respectively; (d) is the actual Landsat-7 image. The SWIR1, NIR, and Red bands are used to generate the figures.

Figure 3. Comparison of the actual and synthetic Landsat-7 images on January 11, 2002 for the red band from STARFM (a), SPSTFM (b), and STI-FM (c). The black line represents the 1-to-1 fitting line.
Table 2 shows the correlation between the MODIS images that represent the base [i.e., M(t1)] and the prediction [i.e., M(t2)] dates. Results showed that images acquired closer together in time had higher correlation values. For example, the correlation between the two MODIS images in November 9, 2001 [i.