CONCENTRATIONS FROM HIMAWARI-8 AOD OVER HUBEI PROVINCE

Satellite remote sensing can effectively estimate particulate matter on a large scale. Polar-orbiting satellites observe too infrequently to capture the evolution of PM2.5. Himawari-8, a geostationary meteorological satellite, observes at least once every 10 min and provides hourly aerosol optical depth (AOD). PM2.5 concentration is closely associated with changes in wind speed, and air quality changes with variations in wind direction and speed. In Hubei Province, the daily average wind speed varies greatly, and wind significantly affects PM2.5 diffusion. In the present study, a mixed-effect regression model is developed to predict ground-level hourly PM2.5 concentrations in Hubei Province, and the hourly variation and spatial distribution of near-ground PM2.5 concentrations are analyzed using the Himawari-8 Level 2 aerosol product for 2016. The estimated hourly PM2.5 concentrations agree well with the surface PM2.5 measurements, with a high R (0.74) and a low RMSE (20.5 μg·m⁻³). The average estimated PM2.5 concentration in Hubei Province during the study period is about 46.1 μg·m⁻³. PM2.5 concentrations show a clear regional pattern: concentrations in the central and eastern regions of Hubei Province are significantly higher than those in the western region. In terms of temporal variation, the pollution peak appears at 15:00 local time, when the average PM2.5 concentration reaches 50.1±21.8 μg·m⁻³; pollution is lightest at 9:00, when the average PM2.5 concentration is 41.7±17.5 μg·m⁻³. These results are conducive to assessing surface PM2.5 concentrations and monitoring regional air quality.


INTRODUCTION
Fine particulate matter (PM2.5) refers to airborne particles with an aerodynamic diameter of 2.5 μm or less. It is one of the vital components of air pollution and a major cause of haze. Recent research suggests that the global burden of air pollution has continued to rise since 1990 (Forouzanfar et al., 2015). The impact of PM2.5 is primarily reflected in harm to human respiratory health and in variations in surface temperature and atmospheric precipitation. Epidemiological studies have suggested that PM2.5 is associated with an increase in the incidence of, and mortality from, cardiovascular and respiratory diseases. Respiratory disease may be induced when submicron-sized particles are breathed into the lungs (Pope et al., 2002; Dominici et al., 2006; Wan Mahiyuddin et al., 2013). Reports show that 3.7 million people worldwide died of diseases induced by ambient air pollution in 2012, which has drawn worldwide attention (Pope et al., 2002; Dominici et al., 2006). Thus, long-term, large-area PM2.5 monitoring and accurate PM2.5 concentration prediction are critical for air quality and public health.
The traditional PM2.5 ground monitoring network provides important spatial and temporal information on PM2.5 concentration and composition in the atmosphere. It also has great potential for studying air-related climate and air quality issues (Yap et al., 2012). Yet it inevitably has some limitations. Ground monitoring cannot capture PM2.5 concentration over large areas because of the limited spatial coverage of the observation instruments and their high operating costs, especially in many developing countries such as China. Before 2013, only a few cities in China (including Nanjing and Guangzhou) had research monitoring sites (Chen et al., 2010; Wei et al., 2009). Because ground PM2.5 measurements lack spatial and temporal continuity, researchers face difficulties in accurately assessing the spatiotemporal variations of PM2.5 concentration, which substantially limits epidemiological studies in China.
Satellite remote sensing technology provides a more effective monitoring and estimation method for epidemiological studies (Hoff et al., 2009). The spatial and temporal limitations of the ground PM2.5 monitoring network can be overcome by using satellite-measured aerosol optical depth (AOD), especially where ground monitoring networks are not available (Engel-Cox et al., 2004; Liu et al., 2005; Schaap et al., 2009). In early studies, the relationship between PM2.5 and satellite AOD was analyzed using a simple linear regression model, and the local scale factor of a global atmospheric chemistry model was employed. It was found that the AOD obtained through satellite remote sensing is capable of effectively monitoring PM2.5 pollution (Chu et al., 2003; Koelemeijer et al., 2006). In recent years, many studies have established links between PM2.5 and satellite AOD using advanced statistical models (generalized linear regression, generalized additive, geographically weighted regression, and land use regression models). In these studies, meteorological parameters (boundary layer height, temperature, relative humidity) and land use information (elevation, population, vegetation cover) were also incorporated into the AOD-PM2.5 relationship as covariates to improve model performance (Koelemeijer et al., 2006; Kumar et al., 2007; Liu et al., 2007a; Gupta and Christopher, 2009; Tian and Chen, 2010; Li et al., 2011; Wu et al., 2012; Hu et al., 2013; Sorek-Hamer et al., 2013). Yet the estimation accuracy of these models can still be improved. In particular, the effect of wind speed and direction on the retrieval of PM2.5 concentration was rarely considered. Wind has a great influence on the diffusion of PM2.5. Spatially, the PM2.5 concentration is not constant but varies with wind speed.
In addition, most of the above studies use MODIS AOD data to estimate ground PM2.5 concentrations. Yet the data provided by such polar-orbiting satellites are limited by the frequency of observations (e.g., MODIS makes only two observations per day). Accordingly, they cannot capture the continuous evolution of PM2.5. Himawari-8 is a geostationary meteorological satellite in the Himawari ("sunflower") series operated by the Japan Meteorological Agency. It is a new-generation meteorological satellite launched by Japan on October 7, 2014, and the first geostationary meteorological satellite capable of taking color images. Whereas polar-orbiting satellites are restricted by their observation frequency, Himawari-8 observes at least once every 10 min and covers about a third of the Earth (Western Pacific, East Asia, Southeast Asia and Oceania). Furthermore, its ability to continuously observe clouds and other moving features is also improved (Shang et al., 2017; Tian et al., 2010).
In the present study, an improved mixed-effect regression model is built on AOD data provided by the Himawari-8 satellite, meteorological data and ground observation data. Various meteorological factors are included to estimate hourly ground-level PM2.5 concentration at the regional scale, and the ability of wind speed to explain the spatiotemporal differences of PM2.5 in Hubei Province is comprehensively evaluated.

Study area
Hubei Province is located in Central China. The region consists of 12 prefecture-level cities (Wuhan, Huangshi, Shiyan, Yichang, Xiangyang, Ezhou, Jingmen, Xiaogan, Jingzhou, Huanggang, Xianning and Suizhou) and 1 prefecture (Enshi Tujia and Miao Autonomous Prefecture). It covers an area of 185,900 square kilometers. Under the influence of the humid subtropical monsoon climate, Hubei Province is hot and humid in summer and dry and cold in winter. The average annual temperature in Hubei Province is 15-17 °C, and the average annual rainfall is about 800-1600 mm. The region is densely populated, local heavy industry is well developed, and northerly winds are strong, so pollutants from the north are transported southward. The local climatic conditions in Hubei are not conducive to the dispersion of pollutants, which makes this province one of the regions with the most severe PM2.5 pollution in China (Zheng et al., 2016). The location of the study area is shown in Fig. 1.

Satellite AOD retrievals:
The AOD data employed in this study are extracted from the Himawari-8 Level 2 aerosol product, and a data set covering the year 2016 is selected. Himawari-8 covers nearly one-third of the Earth (West Pacific, East and South-East Asia, and Oceania) (Shang et al., 2017; Tian et al., 2010) and provides AOD data and Ångström exponents with a temporal resolution of 10 min and a spatial resolution of 5 km.

Meteorological data:
The meteorological data applied in this study are extracted from the ERA-Interim reanalysis dataset of the European Centre for Medium-Range Weather Forecasts (ECMWF). ECMWF began operations on August 1st, 1979 to produce medium-range weather forecasts, and it has run two "re-analysis" programs. ERA-Interim is one of the ECMWF reanalysis data sets. It provides global climate reanalysis data from 1979 onward and is continuously updated.
In this study, the meteorological data used consist of surface temperature (K), surface pressure (Pa), wind speed (m/s), relative humidity (%), and boundary layer height (m).

Land cover data:
This study also analyzes the impact of land cover, using data downloaded from NASA (http://neo.sci.gsfc.nasa.gov/). The MODIS Level 3 monthly mean normalized difference vegetation index (NDVI) with a spatial resolution of 0.05° x 0.05° is used. Areas with an NDVI value greater than 0.4 are classified as vegetation-covered, while the others are treated as soil-dominated surfaces (Liu et al., 2014). The DEM data of the study area are provided by NASA with a spatial resolution of 90 meters. In the PM2.5 prediction model of this study, NDVI data and DEM data serve as covariates (Ma et al., 2014). The details of the data sets applied in this study are listed in Table 1.

PM2.5 estimation model
Studies have suggested that meteorological conditions, including temperature and relative humidity, can strongly affect the PM2.5-AOD relationship. Since the extinction characteristics of particles vary significantly as aerosols absorb more moisture (Liu et al., 2005; van Donkelaar et al., 2006), some researchers have proposed methods that use meteorological factors to improve the relationship between AOD and ground PM2.5 concentrations (Koelemeijer et al., 2006; Liu et al., 2007a; Tian and Chen, 2010; Wang et al., 2010).
In Hubei Province, the average daily wind speed varies to a large extent, and wind has a strong impact on the diffusion of PM2.5 (Chou Tianxiong et al., 2017). These differences in daily meteorological conditions lead to a special relationship between AOD and PM2.5. Thus, we develop a mixed-effect regression model that considers the magnitude and direction of the wind to predict PM2.5 concentrations. These variations are incorporated into the model, and land use information is introduced to calibrate the PM2.5 concentration prediction. The model consists of two parts, i.e., fixed effects and random effects. The complete model is as follows:

$$\mathrm{PM}_{2.5} = \beta_0 + \beta_1\,\mathrm{AOD} + \beta_2\,\mathrm{BLH} + \beta_3\,\mathrm{RH} + \beta_4\,\mathrm{TEM} + \beta_5\,\mathrm{PRES} + \beta_6\,\mathrm{WS} + \beta_7\,\mathrm{DEM} + \beta_8\,\mathrm{NDVI} + s + \varepsilon, \qquad s \sim N(0, \sigma^2)$$

where PM2.5 is the hourly mean mass concentration of near-ground PM2.5 (μg·m⁻³); AOD is the aerosol optical depth provided by Himawari-8; BLH is the boundary layer height obtained from ECMWF data (m); RH is the surface relative humidity from ground-based measurements (%); TEM is the near-ground temperature from ground-based measurements (K); PRES is the near-surface pressure (Pa); and WS is the surface wind term, which carries the magnitude and direction of the wind (m/s). The two orthogonal wind components on the two-dimensional plane (referred to as the vertical and horizontal wind speeds) are combined into a total wind speed, which represents the wind speed in any direction. DEM is the digital elevation model information of the study area (m), and NDVI is the monthly mean normalized difference vegetation index (dimensionless). β0 is the fixed intercept, and β1 to β8 are the regression coefficients associated with the corresponding predictor variables. ε represents the observation error, and s ~ N(0, σ²) is a site term that accounts for the spatial difference of the AOD-PM2.5 relationship due to differences in site-specific characteristics (i.e., surface reflectivity, topography, PM2.5 emissions, and pollution transported to the observation sites).
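The synthesis of the two wind components into a total wind speed can be sketched as follows. This is a minimal illustration, assuming the two components are the eastward (`u`) and northward (`v`) surface wind components in m/s, as commonly provided by reanalysis products; the function name and the direction convention (degrees clockwise from north, indicating where the wind blows from) are assumptions for illustration, not taken from the study.

```python
import math


def wind_speed_and_direction(u, v):
    """Combine eastward (u) and northward (v) wind components in m/s.

    Returns (speed, direction). Direction follows the meteorological
    convention assumed here: degrees clockwise from north, giving the
    direction the wind blows FROM.
    """
    speed = math.hypot(u, v)  # sqrt(u**2 + v**2)
    direction = math.degrees(math.atan2(-u, -v)) % 360.0
    return speed, direction


# Hypothetical example: 3 m/s eastward and 4 m/s northward components.
speed, direction = wind_speed_and_direction(3.0, 4.0)
print(f"speed = {speed:.2f} m/s, direction = {direction:.1f} deg")
```

Keeping the direction alongside the magnitude allows either the total speed or the two signed components to be tested as predictors.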
To reduce the noise generated by data errors and the influence of spatial differences, and to ensure the spatiotemporal consistency of the predictors, the spatial and temporal predictors are combined with the specific random effects to adjust the data in time and space. Nearest-neighbor matching is used to pair the meteorological data with the data from the PM2.5 ground monitoring sites.
The Himawari-8 AOD data are matched to each site by taking the mean of the nonzero pixels within a 5 km radius of the PM2.5 ground monitoring site.
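The site matching described above can be sketched as follows. This is a minimal illustration, assuming pixel centres and sites are given as latitude/longitude in degrees and that great-circle (haversine) distance is an acceptable measure of the 5 km radius; the function names and the sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def site_mean_aod(site_lat, site_lon, pixels, radius_km=5.0):
    """Mean of the nonzero AOD pixels within radius_km of a site.

    pixels is an iterable of (lat, lon, aod) tuples. Missing (None) and
    non-positive AOD values are skipped. Returns None when no valid
    pixel falls inside the radius.
    """
    values = [
        aod
        for lat, lon, aod in pixels
        if aod is not None
        and aod > 0
        and haversine_km(site_lat, site_lon, lat, lon) <= radius_km
    ]
    return sum(values) / len(values) if values else None


# Hypothetical pixels around a hypothetical site at (30.5 N, 114.3 E).
pixels = [
    (30.50, 114.30, 0.4),   # at the site
    (30.52, 114.30, 0.6),   # about 2 km away
    (30.50, 114.32, 0.0),   # zero value, skipped
    (31.00, 114.30, 0.9),   # about 55 km away, outside the radius
]
print(site_mean_aod(30.5, 114.3, pixels))
```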

Model validation
Ten-fold cross-validation (CV) is used in this study to evaluate the performance of the PM2.5 prediction model. Its basic idea is to partition the original data: all data are randomly divided into ten equal-sized, non-overlapping subsets. In each round, one subset is held out for validation and the PM2.5 prediction model is trained on the other nine. The procedure repeats 10 times so that each subset is validated exactly once, and a single estimate is finally obtained by averaging the 10 results. The PM2.5 concentrations predicted across all 10 folds are then compared with the PM2.5 concentrations measured at the ground stations. The coefficient of determination (R²) between the predicted and measured PM2.5 concentrations is calculated, and the mean absolute error (MAE) and root mean square error (RMSE) are used to evaluate the performance of the model.
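The cross-validation procedure above can be sketched as follows. This is a minimal illustration, not the study's implementation: a single-predictor least-squares line stands in for the mixed-effect model, R² is computed as 1 - SS_res/SS_tot (the study may instead use the squared correlation), and all function names and the synthetic data are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit y = b0 + b1 * x (stand-in model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def k_fold_predictions(xs, ys, k=10, seed=0):
    """Return out-of-fold predictions: each sample is predicted by a
    model trained on the other k - 1 folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # near-equal, non-overlapping
    preds = [None] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        b0, b1 = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            preds[i] = b0 + b1 * xs[i]
    return preds


def metrics(obs, pred):
    """R2, MAE and RMSE between observed and predicted values."""
    n = len(obs)
    mean_obs = sum(obs) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    mae = sum(abs(o - p) for o, p in zip(obs, pred)) / n
    return {"r2": 1 - ss_res / ss_tot, "mae": mae, "rmse": math.sqrt(ss_res / n)}


# Synthetic AOD-like predictor and noisy PM2.5-like response.
rng = random.Random(42)
aod = [rng.uniform(0.05, 1.0) for _ in range(200)]
pm25 = [20 + 60 * a + rng.gauss(0, 8) for a in aod]
print(metrics(pm25, k_fold_predictions(aod, pm25, k=10)))
```

Pooling the out-of-fold predictions before computing the metrics matches the comparison of all cross-validated predictions against the station measurements; averaging per-fold metrics is an alternative that usually gives similar values.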

Himawari-8 AOD spatial distribution
The spatial distribution of the Himawari-8 AOD data at different daytime hours in 2016 is shown in Fig. 3. The results in Fig. 3 suggest that AOD is higher in the central and eastern regions of Hubei Province and lower in the western regions. The mean AOD obtained from Himawari-8 is 0.28±0.24. The highest average AOD during the day appears at 15:00 and is 0.35±0.29; the lowest appears at 10:00 and is 0.25±0.22 (Table 4). To assess how the meteorological and land-use parameters in the final model improve the fitting accuracy, a model with AOD as the sole independent variable is fitted first, and different predictors are then added, as listed in Table 2. The final model performance is evaluated by calculating the coefficient of determination (R²), mean absolute error (MAE), and root mean squared error (RMSE) between measured and predicted PM2.5 concentrations. MAE is defined as (sum of absolute error values)/(number of observations). RMSE is defined as the square root of the mean squared error, i.e., the square root of the ratio of the sum of squared deviations between predicted and observed values to the number of observations. The cross-validation scatter plots of the PM2.5 concentration measured at the ground monitoring sites against the PM2.5 concentration estimated by the fitted model at different daytime hours are shown in Fig. 4.
In these scatter plots, the color represents the number of data points in the corresponding pixel. In the improved mixed-effect model, the coefficient of determination R² exceeds 0.68 at all hours, which verifies the feasibility of the proposed model in Hubei Province and shows that the model predicts reasonably well. Yet the cross-validation R² varies across hours (e.g., 0.76 at 15:00 and a minimum of 0.68 at 9:00), which means that the improved model performs better in the afternoon than at other daytime hours. One possible reason is that when the PM2.5 concentration exceeds 60 μg·m⁻³, we select all valid Himawari-8 AOD values within a 5 km radius to keep the spatiotemporal predictors consistent. In such cases our model often underestimates the concentration, so these high values are not well represented in the large grid cell. The measured values at the ground stations and the predicted PM2.5 concentrations at different times are listed in Table 3.
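The verbal definitions of MAE and RMSE given above can be written compactly as follows. The symbols are introduced here for illustration only: $y_i$ denotes the measured PM2.5 concentration, $\hat{y}_i$ the predicted concentration, and $n$ the number of matched observations.

```latex
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|,
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^{2}}
```

Both metrics carry the unit of the concentration (μg·m⁻³). Because RMSE squares each deviation before averaging, it weights large errors more heavily than MAE, so RMSE is never smaller than MAE for the same predictions.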

DISCUSSION
The data measured by ground monitoring sites can accurately reflect air quality only within a limited area.
Given the sparse spatial distribution of traditional ground monitoring stations, satellite remote sensing data with large-scale spatial coverage have become one of the most important sources for estimating PM2.5 concentrations in geographical space. Aerosol optical depth is defined as the integral of the extinction coefficient of the medium in the vertical direction, and it describes how strongly aerosols attenuate light. Since the atmosphere is not evenly distributed in the vertical direction, and AOD is further affected by cloud, ice and snow, aerosol properties, and relative humidity, the AOD value does not have a good linear relationship with the concentration of atmospheric suspended particulate matter (PM). The relationship between surface PM2.5 concentration and AOD depends on the vertical distribution and particle size distribution of aerosols (Li et al., 2016; Zhang et al., 2015).
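The definition of AOD as a vertical integral can be written as follows. The notation is an assumption introduced for illustration: $\tau(\lambda)$ is the AOD at wavelength $\lambda$, $\sigma_{\mathrm{ext}}(\lambda, z)$ is the aerosol extinction coefficient at height $z$, and $z_{\mathrm{TOA}}$ is the top of the atmosphere.

```latex
\tau(\lambda) = \int_{0}^{z_{\mathrm{TOA}}} \sigma_{\mathrm{ext}}(\lambda, z)\,\mathrm{d}z
```

Because $\tau$ integrates over the whole column, two columns with the same AOD can have very different extinction, and therefore different particle concentrations, at the surface.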
Considering the vertical distribution of atmospheric suspended particulates from a physical perspective, a vertical distribution correction of satellite AOD can improve the correlation between satellite remote sensing products and atmospheric suspended particulates (Chu et al., 2015; Barnaba et al., 2010). A correction for the influence of humidity on AOD is also necessary. The traditional gravimetric method measures the concentration of PM2.5 after heating the airborne particulate matter to 50 degrees Celsius, so this "dry" PM2.5 measurement is likely to reduce the measured mass of the aerosol particulate matter, whereas the moisture absorption and growth of aerosol particles make AOD sensitive to humidity (Song et al., 2014).
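The two corrections discussed above can be sketched as follows. This is a simplified illustration and not the method of this study, which enters BLH and RH as regression predictors: it assumes that aerosol is well mixed within the boundary layer (so dividing AOD by BLH approximates surface extinction) and uses the simple hygroscopic growth factor f(RH) = 1/(1 - RH/100), one of several forms found in the literature. The function name is hypothetical.

```python
def corrected_aod(aod, blh_m, rh_percent):
    """Apply a simplified vertical and humidity correction to AOD.

    Returns an approximate "dry" surface extinction in 1/m, assuming
    aerosol is uniformly mixed below the boundary layer height and that
    hygroscopic growth follows f(RH) = 1 / (1 - RH / 100).
    """
    if blh_m <= 0:
        raise ValueError("boundary layer height must be positive")
    if not 0 <= rh_percent < 100:
        raise ValueError("relative humidity must be in [0, 100)")
    f_rh = 1.0 / (1.0 - rh_percent / 100.0)  # assumed growth factor
    return aod / (blh_m * f_rh)


# Hypothetical values: AOD 0.5, boundary layer 1000 m, RH 50 %.
print(corrected_aod(0.5, 1000.0, 50.0))
```

The growth factor diverges as RH approaches 100 %, which is why such corrections are usually restricted to non-saturated conditions.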
In the present study, relevant factors including meteorological parameters and land use information are considered in the PM2.5 prediction model. Among these factors, BLH and RH serve as the correction factors for the vertical distribution and humidity, respectively. The effects of wind speed and direction on model performance are also considered.
The results suggest that wind speed has a significant inverse relation with PM2.5 concentration and a considerable influence on the accuracy of model fitting.
The meteorological and land use parameters are found to have a clear positive impact on the model, i.e., the mixed-effect model with hour-specific random effects performs better than the test model that uses AOD as the only independent variable. According to the statistical indexes of the fitting results, the model fits better after the meteorological parameters, land use information and wind speed predictors are introduced, and it is better overall than the single-variable model.
Furthermore, the PM2.5 concentration predicted from the AOD averaged within a 5 km radius of a PM2.5 monitoring site may not represent all the measurements.

CONCLUSIONS
This study is based on the Himawari-8 satellite remote sensing AOD product for 2016 combined with meteorological model data, ground observation data and land use information. A mixed-effect regression model is built to estimate PM2.5 concentration in Hubei Province, and the temporal and spatial distribution characteristics of PM2.5 concentration in the area are estimated. This is important for understanding the evolution of PM2.5 mass concentration. The results suggest that:
(1) Based on the physical relationship between AOD and PM2.5, an improved mixed-effect model is developed using Himawari-8 AOD data to estimate the PM2.5 concentration at the ground. According to the results of 10-fold cross-validation (the coefficient of determination R² is 0.77, and the root mean square error is 19.03 μg·m⁻³), the model is capable of accurately estimating the surface PM2.5 concentration.
(2) Analysis of the temporal and spatial changes of PM2.5 in Hubei Province in 2016 suggests that PM2.5 shows a clear regional distribution, with the high-value areas appearing as a contiguous whole, and the average PM2.5 concentrations in the central and eastern regions are significantly higher than those in the western region. The places with higher PM2.5 concentrations are primarily Xianning, Suizhou and Xiaogan. The heavily polluted areas are located in Wuhan and Jingmen, and PM2.5 pollution is comparatively light in Shiyan and the Enshi Tujia and Miao Autonomous Prefecture. In terms of time, the PM2.5 concentration in Hubei Province reaches its highest value at 15:00 and its lowest at 9:00.
(3) Wind greatly affects the diffusion of PM2.5. Test 3 suggests that the accuracy of the linear correlation analysis improves after the magnitude and direction of the wind are considered. According to the statistics of each fitting result, introducing meteorological parameters, land use information, wind speed and other predictors improves the model fit, which is better overall than that of a single-variable model.

Fig. 1. Location of the study area

Fig. 3. Spatial distribution of Himawari-8 AOD data at different times of 2016

The meteorological and land-use parameters have a significant positive impact on the model, i.e., compared with the test model using AOD as the only independent variable, the mixed-effect model with the hour-specific random effect shows better performance. Test 3 suggests that after the magnitude and direction of the wind are included in the model, the coefficient of the linear correlation analysis is 0.77, the mean absolute error is 13.12 μg·m⁻³, and the root mean square error is 19.03 μg·m⁻³. The accuracy of the model is slightly improved. According to the statistics of each fitting result, introducing meteorological parameters, land-use information, wind speed and other predictors improves the model fit, which is better overall than that of a single-variable model.

Fig. 4. Cross-validation plot of the estimated PM2.5 concentration by the model fit

Table 1. Data sets for this study

Table 4. AOD values and PM2.5 concentrations at different times in Hubei Province in 2016