Research on Short-term Ionospheric Prediction Combining with EOF and ARIMA Model Over Guangxi Area

The empirical orthogonal function (EOF) method decomposes non-stationary time series data into a time function and a space function. It thereby simplifies the series and eliminates redundant information, which makes it well suited to non-stationary time series analysis. The ionospheric Vertical Total Electron Content (VTEC) is a non-stationary time series with seasonal variation, and it is more active at low latitudes. Guangxi is located in the middle and low latitudes of the Northern Hemisphere, with abundant sunshine in summer and autumn. The energy released by solar radiation makes the ionospheric activity in this region more complex than that at high latitudes. However, the EOF analysis method has not yet been used for a comprehensive study of the low latitudes. In this paper, the high-precision Global Ionospheric Maps (GIM) provided by the International GNSS Service (IGS) center over Guangxi are used as the modeling data. The GIM data of the first 10 days in each season are decomposed by EOF, and the time function is then predicted by the ARIMA model. VTEC values for the next five days are obtained through reconstruction, and relative accuracy and standard deviation are used as the accuracy evaluation criteria. The results of the EOF-ARIMA model are compared with those of the ARIMA model, and the prediction accuracy of the EOF-ARIMA model at the equatorial anomaly is analyzed in order to explore the reliability of the model in regions with more complex ionospheric activity.
The results show that the average relative accuracy of the EOF-ARIMA model is 84.0% with an average standard deviation of 7.45 TECu, whereas the average relative accuracy of the ARIMA model is 81.5% with an average standard deviation of 8.29 TECu, so the EOF-ARIMA model is more accurate than the ARIMA model. There is no significant seasonal difference in the prediction accuracy of the EOF-ARIMA model, whereas the prediction accuracy of the ARIMA model in autumn is lower than in other seasons, which indicates that the prediction results of the EOF-ARIMA model are more reliable. The prediction accuracy of the EOF-ARIMA model is not degraded at the equatorial anomaly and is consistent with its accuracy in the higher-latitude part of Guangxi. These results show that the EOF-ARIMA model has high accuracy and stability in short-term ionospheric prediction in Guangxi, at the low latitudes of China, and provides a new and reliable method for ionospheric prediction at low latitudes.


INTRODUCTION
Ionospheric total electron content (TEC) is an important parameter for characterizing ionospheric delay. Improving the prediction accuracy of VTEC can improve the positioning accuracy of the Global Navigation Satellite System (GNSS). In addition, it is of great significance for research on pre-earthquake ionospheric disturbances, the Earth's magnetic field, and the influence of solar activity on the ionosphere [1][2][3]. The commonly used VTEC prediction models mainly include the grey model [4], the neural network model [5], the Holt-Winters model [6,7], the ARMA model [8] and the ARIMA model [9]. Among them, the neural network model can approximate complex non-linear relationships closely and can be applied well to VTEC prediction. However, its network optimization is complex, its parameter selection is difficult, and the errors of some predicted values are large, which limits its practical application to some extent. TEC prediction based on time series analysis has achieved good results [8,9]. However, applying a single time series model directly to TEC limits the prediction accuracy. To improve it, [10] combined wavelet decomposition with the ARIMA model (WARIMA); the results showed that the WARIMA model is feasible for predicting ionospheric TEC and that its prediction accuracy is better than that of the ARIMA model. This suggests that preprocessing TEC can improve its prediction accuracy. EOF is a mathematical analysis method that decomposes a matrix into a time function and a space function and uses the variance contribution rate to simplify the data and eliminate redundant information. It has been widely used to analyze spatio-temporal characteristics and non-stationary data, for example in precipitation analysis [11], forecasting of average temperature anomalies [12] and subsidence data analysis [13].
However, few studies have introduced EOF into the ARIMA model for short-term prediction of ionospheric VTEC in the Guangxi region. The ionosphere in this area is affected by the equatorial anomaly [14], and the area also experiences frequent typhoons, volcanic activity and earthquakes [15,16].
The ionospheric VTEC becomes abnormal before typhoons, volcanic eruptions and earthquakes occur. High-precision VTEC predictions can therefore provide an important data source for seismic prediction analysis [17] in the Guangxi region and for analysis of the impact of typhoons on the ionosphere [18]. It is thus of great significance to explore the use of the EOF-ARIMA model to predict ionospheric VTEC values in the Guangxi region (20°~27.5°N, 100°~115°E).

Introduction to EOF decomposition and reconstruction fundamentals
Previous studies [9][10][11] have shown that a linear transformation of the original data can simplify the data and eliminate redundant information. EOF decomposition is a common method of this kind: it decomposes the original matrix into a space function matrix V and a time function matrix Y. The space function reflects the main variation characteristics of the variable field and does not change with time. The time function is a linear combination of the variables at the spatial points and constitutes the principal components. The specific process is as follows:
(1) Let the VTEC values with space-time characteristics form the matrix X with elements $x_{ij}$. The spatial function $v_{ik}$ and the time function $y_{kj}$, $k = 1, 2, \ldots, m$, satisfy
$$x_{ij} = \sum_{k=1}^{m} v_{ik}\, y_{kj} \tag{1}$$
Its matrix representation is
$$X = VY \tag{2}$$
(2) The variance contribution rate is calculated, and the principal components whose cumulative contribution rate exceeds 95% are selected for reconstruction. This effectively eliminates the redundant information in the original sequence while ensuring that the reconstructed spatio-temporal sequence retains relatively high accuracy. The variance contribution rate of the k-th principal component can be obtained by formula (3), where $\lambda_k$ is the k-th eigenvalue of $XX^{T}$.
$$\rho_k = \frac{\lambda_k}{\sum_{i=1}^{m} \lambda_i} \tag{3}$$
(3) The principal components that meet the requirement constitute the matrix PC and are reconstructed through formula (4), where $V_{PC}$ consists of the columns of V that correspond to the selected principal components.
$$\hat{X} = V_{PC}\, PC \tag{4}$$

EOF-ARIMA modeling process
Suppose a stationary time series is $x_t$, $t = 1, 2, 3, \ldots, N$. The ARMA model has the following structure:
$$x_t = \varphi_1 x_{t-1} + \cdots + \varphi_p x_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q} \tag{5}$$
where p and q are the autoregressive and moving-average orders, which may be seasonal or non-seasonal, $\varphi_i$ and $\theta_j$ are the model coefficients, and $\varepsilon_t$ is white noise. The ARIMA model extends the ARMA model by differencing the series and is suitable for forecasting non-stationary time series [19]. The EOF-ARIMA modeling steps are as follows:
Step 1 (data preprocessing): In this study, the experimental data are the 2-hour-resolution VTEC values of the 16 grid points covering the Guangxi region. EOF decomposition is carried out on the original matrix $X_{308\times120}$ formed from 10 days of data in one season, yielding the time function $Y_{308\times120}$ and the spatial function $V_{121\times120}$.
Step 2 (determination of order): Carry out a seasonal analysis of the principal components to decide between seasonal and non-seasonal prediction in the ARIMA model. Because the selected samples of ionospheric total electron content vary periodically with a period of 1 day, seasonal prediction is more accurate. The periodicity of the principal components is examined in a sequence diagram to decide whether seasonal differencing (order D) or non-seasonal differencing (order d) is performed. The non-seasonal orders p and q and the seasonal orders P and Q are determined from the autocorrelation function (ACF) and the partial autocorrelation function (PACF).
Step 3 (prediction): Determine the number of forecast days according to the experimental needs, and use the corresponding orders to make a seasonal prediction for each principal component.
Step 4 (reconstruction): Assemble the predicted principal components into the matrix PC, and reconstruct the predicted values of the 5 days according to formula (4).
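The four steps above can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`eof_decompose`, `modes_needed`, `seasonal_naive`, `eof_forecast`) are hypothetical, the data are synthetic and noise-free, no mean is removed before the decomposition, and the matrix size follows the 16 grid points and 2-hour sampling quoted in the text (4 x 4 nodes assuming the usual 2.5° x 5° GIM spacing, 12 epochs per day). The forecaster is a stand-in: repeating the last daily cycle is the point forecast of a model with one seasonal difference and no ARMA terms, and a fitted seasonal ARIMA model would take its place in practice.

```python
import numpy as np


def eof_decompose(X):
    """Return spatial functions V, time functions Y and eigenvalues, with X = V @ Y."""
    # Columns of V are orthonormal eigenvectors of the symmetric matrix X X^T.
    eigvals, V = np.linalg.eigh(X @ X.T)
    order = np.argsort(eigvals)[::-1]             # strongest mode first
    eigvals = np.clip(eigvals[order], 0.0, None)  # remove tiny negative round-off
    V = V[:, order]
    Y = V.T @ X                                   # principal components (time functions)
    return V, Y, eigvals


def modes_needed(eigvals, threshold=0.95):
    """Smallest number of leading modes whose cumulative contribution reaches threshold."""
    cumulative = np.cumsum(eigvals) / np.sum(eigvals)
    return min(int(np.searchsorted(cumulative, threshold)) + 1, len(eigvals))


def seasonal_naive(series, horizon, period=12):
    """Stand-in forecaster (assumption, not ARIMA): repeat the last full period."""
    repeats = -(-horizon // period)               # ceiling division
    return np.tile(series[-period:], repeats)[:horizon]


def eof_forecast(X, horizon, period=12, threshold=0.95):
    """Decompose X, forecast the retained time functions, rebuild the forecast field."""
    V, Y, eigvals = eof_decompose(X)
    r = modes_needed(eigvals, threshold)
    pc_future = np.vstack([seasonal_naive(Y[k], horizon, period) for k in range(r)])
    return V[:, :r] @ pc_future                   # reconstruction from r modes


# Synthetic example (NOT real GIM data): 4 x 4 grid nodes, 12 epochs per day.
lats = np.arange(20.0, 27.6, 2.5)                 # 20, 22.5, 25, 27.5 deg N
lons = np.arange(100.0, 115.1, 5.0)               # 100, 105, 110, 115 deg E
n_points = lats.size * lons.size                  # 16 grid points
t = np.arange(15 * 12)                            # 10 modeling days + 5 forecast days
diurnal = 30.0 + 20.0 * np.cos(2 * np.pi * (t - 7) / 12)
semidiurnal = 3.0 * np.sin(4 * np.pi * t / 12)
field = (np.outer(np.linspace(1.0, 1.5, n_points), diurnal)
         + np.outer(np.linspace(-1.0, 1.0, n_points), semidiurnal))
X_train, X_true = field[:, :120], field[:, 120:]

X_pred = eof_forecast(X_train, horizon=60)
rel_err = np.linalg.norm(X_pred - X_true) / np.linalg.norm(X_true)
print(X_pred.shape, float(rel_err))
```

Because the synthetic field repeats exactly every 12 epochs, the only error left in this example is the truncation error of the discarded modes, which keeping at least 95% of the variance bounds at about 22% of the field norm. With real VTEC data the forecast error of the time functions would add to it.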

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-3/W10, 2020 International Conference on Geomatics in the Big Data Era (ICGBD), 15-17 November 2019, Guilin, Guangxi, China

Data processing
Previous work [9] has shown that the factors affecting prediction accuracy include the sample size, the number of forecast days and anomalies of the ionosphere itself. For a fixed sample size, the relative accuracy is highest for the first 10 forecast days and decreases significantly beyond 30 days. For a fixed number of forecast days, increasing the sample beyond 30 days does not significantly improve the prediction accuracy. Therefore, considering these factors and the actual situation of the Guangxi region, the global VTEC grid data of 2015 provided by the IGS center were extracted to obtain 2 h resolution VTEC data at the 16 grid points covering the Guangxi region (20°~27.5°N, 100°~115°E), as shown in figure 1. Considering the 11-year sunspot cycle [20], 10 days of VTEC values in each of the four seasons, 40 days in total, were selected as the modeling data (day of year 82~91 in spring, 184~193 in summer, 266~275 in autumn and 318~327 in winter). The EOF-ARIMA model proposed in this paper was used to forecast the VTEC data of the next 5 days, and the results were compared with those of the ARIMA model. Taking the VTEC values provided by the IGS center as the reference, the standard deviation (STDd) and the average relative accuracy (Ppd) were used to evaluate the prediction results. Here n denotes the n-th forecast day, STDd is the daily standard deviation of the difference between the predictions and the VTEC values provided by the IGS center, and Ppd is the average daily relative accuracy of the forecast results.
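The two evaluation criteria can be illustrated with a short Python sketch. The definitions used here are assumptions, since the exact formulas may differ from the authors': the standard deviation is taken as the population standard deviation of the differences between predicted and reference values, and the relative accuracy as 100 x (1 - mean(|predicted - reference| / reference)). The function names and the sample numbers are hypothetical.

```python
import math


def std_of_difference(predicted, reference):
    """Standard deviation (in TECu) of the differences predicted - reference."""
    diffs = [p - r for p, r in zip(predicted, reference)]
    mean = sum(diffs) / len(diffs)
    # Assumed definition: population form, dividing by the number of samples.
    return math.sqrt(sum((d - mean) ** 2 for d in diffs) / len(diffs))


def relative_accuracy(predicted, reference):
    """Average relative accuracy in percent: 100 * (1 - mean(|p - r| / r))."""
    ratios = [abs(p - r) / r for p, r in zip(predicted, reference)]
    return 100.0 * (1.0 - sum(ratios) / len(ratios))


def daily_metrics(predicted, reference, per_day=12):
    """Return one (STD, relative accuracy) pair per forecast day."""
    results = []
    for start in range(0, len(reference), per_day):
        p = predicted[start:start + per_day]
        r = reference[start:start + per_day]
        results.append((std_of_difference(p, r), relative_accuracy(p, r)))
    return results


# Made-up numbers for illustration only (not values from the paper).
reference = [20.0, 40.0, 50.0, 25.0]
predicted = [22.0, 36.0, 55.0, 25.0]
print(std_of_difference(predicted, reference))
print(relative_accuracy(predicted, reference))
print(daily_metrics(predicted, reference, per_day=2))
```

With 2-hour data, `per_day=12` splits a 5-day forecast of 60 values into five daily pairs, which can then be averaged over days, grid points or seasons.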

Prediction accuracy analysis of EOF-ARIMA model
As mentioned above, the EOF-ARIMA model and the ARIMA model were used to forecast and analyze the VTEC values of the 16 grid points covering the Guangxi region in the 4 seasons. Due to space limitations, only 4 typical grid points in the Guangxi region are selected to analyze the prediction accuracy, and the results are shown in figures 2 to 5. The x-coordinate in figures 2 to 5 is the day of year (d), with 12 VTEC values per day because IGS provides a time resolution of 2 hours. The ordinate is the VTEC value in TECu. The blue line with square markers shows the five-day VTEC values provided by the IGS center, which are taken as the reference values in this paper. The red dotted line shows the VTEC values predicted by the EOF-ARIMA model for the 5 days. Figures 2 to 5 show that, at the four selected grid points and in all four seasons, the EOF-ARIMA predictions agree well with the values provided by the IGS center, indicating that the EOF-ARIMA model has good prediction performance. However, when the VTEC values in the sample data fluctuate strongly, as at 22.5°N, 105°E in spring, 25°N, 110°E in spring and 27.5°N, 105°E in spring, the prediction accuracy decreases.
The main reason may be a limitation of the ARIMA model itself, which fits complex and irregular time series poorly. Nevertheless, the EOF-ARIMA model still shows superior forecasting performance, especially at the low-latitude grid points in Guangxi. To analyze the prediction performance of the EOF-ARIMA model in the Guangxi region in more depth, the prediction accuracy of the two models at 6 representative grid points in the Guangxi region was also analyzed statistically. The results are shown in table 1. As can be seen from table 1, within the same season the relative accuracy and standard deviation of the EOF-ARIMA model remain at a similar level and are more stable than those of the ARIMA model, which further indicates that the EOF-ARIMA model is more stable in short-term VTEC prediction in the Guangxi region. In conclusion, the average relative accuracy and mean standard deviation of the EOF-ARIMA model for VTEC prediction in the Guangxi region are 84.0% and 7.45 TECu respectively, and the model remains stable across seasons, indicating that introducing EOF decomposition into the ARIMA model can improve its short-term prediction accuracy for VTEC.

Analysis of prediction accuracy in different latitudes
The ionospheric equatorial anomaly is caused by the movement of ionospheric electrons near the equator along the magnetic field lines toward higher latitudes, forming humps on either side of the equator, one of them in the Northern Hemisphere. The northern hump mainly occurs at geomagnetic latitudes of 10°~12.5°N (geographic latitudes of about 20°~22.5°N) [21]. China's territory is vast, stretching from north to south across low, middle and high latitudes. In all seasons the maximum daily peak value occurs at low latitudes and is much higher than the daily mean value over China. Because of the equatorial anomaly in the low-latitude ionosphere, it is necessary to further analyze the applicability of the EOF-ARIMA model at different latitudes in China, especially at low latitudes.
Fig. 6 Measured VTEC of IGS stations at low, middle and high latitudes in China
As can be seen from figure 6, the VTEC value in the region of 15°~22.5°N is higher than at other latitudes and shows no correlation with latitude. In the region of 22.5°~50°N, the VTEC value decreases with increasing latitude because of weaker solar radiation. Together, these observations indicate that the ionospheric equatorial anomaly exists at China's low latitudes. As can be seen from table 3, the relative accuracy of the EOF-ARIMA model in VTEC prediction at different latitudes in China is above 80%, and the standard deviation does not fluctuate greatly. The relative accuracy in the region of 5°~22.5°N is above 84%, indicating that the short-term prediction accuracy of the EOF-ARIMA model is not affected by the ionospheric equatorial anomaly. On the whole, the average relative accuracy of the EOF-ARIMA model at different latitudes in China is higher than 80%, which is consistent with EOF preprocessing eliminating redundant information from the data. Introducing EOF decomposition improves the short-term VTEC prediction ability of the ARIMA model, so that the EOF-ARIMA model achieves higher short-term prediction accuracy for VTEC in China. Therefore, the EOF-ARIMA model has good applicability and stability in short-term prediction of VTEC in China.

CONCLUSION
Because EOF can simplify non-stationary time series and eliminate redundant information, this paper introduces EOF into the ARIMA model to obtain the combined EOF-ARIMA model. Based on the VTEC data of 308 grid points with 2-hour resolution covering Guangxi in 2015 provided by the IGS center, the application of the EOF-ARIMA model to short-term prediction of VTEC in China was analyzed. The results show that:
(1) In Guangxi, the overall prediction performance of the EOF-ARIMA model is better than that of the ARIMA model. The standard deviation of the EOF-ARIMA model in predicting 5-day VTEC values is 7.45 TECu, with an average relative accuracy of 84.0%; the standard deviation of the ARIMA model is 8.29 TECu, with an average relative accuracy of 81.5%.
(2) The prediction performance of EOF-ARIMA model has no obvious seasonal change, and the prediction accuracy of ARIMA model in autumn is lower than that of other seasons, indicating that the introduction of EOF decomposition into ARIMA model can improve the stability and accuracy of its VTEC short-term prediction.
(3) Analysis of the measured VTEC values at low-, middle- and high-latitude grid points in China shows that VTEC values in the low-latitude areas of China exhibit the equatorial anomaly. Nevertheless, the average relative prediction accuracy of the EOF-ARIMA model at different latitudes is still higher than 80%, and the relative accuracy and standard deviation of VTEC prediction in the region of 5°~22.5°N do not show abnormal fluctuations, indicating that the short-term prediction performance of the EOF-ARIMA model is not affected by the ionospheric equatorial anomaly. The EOF-ARIMA model maintains good accuracy and stability in short-term VTEC prediction in the equatorial anomaly region of Guangxi and China's low-latitude regions, and can be used in applications such as GNSS navigation and positioning in Guangxi and the sea area south of Guangxi.
Acknowledgements: The authors thank the IGS center for providing the VTEC grid data.