EXPLORING THE IMPACT OF REAL ESTATE POLICY ON REAL ESTATE TRADING USING THE TIME SERIES ANALYSIS

ABSTRACT: Housing prices are a major issue that affects people's lives and is closely tied to their personal interests. They are influenced by many factors, including economic conditions, population size, social factors, national policy, the internal characteristics of real estate, and the environment. With deepening urbanization and the agglomeration of urban population in China, the rise in housing prices has accelerated further. The Chinese government has introduced a series of policies to limit real estate transactions and influence property prices. This paper explores a time series analysis method for analysing the impact of real estate policies on real estate prices. First, policy documents related to real estate are collected through official government channels at the state, prefecture and city levels, and policy key words are analysed by means of natural language processing. Then, real estate registration volume, transaction volume and transaction price data, arranged into time series, are modelled using the ARIMA time series model, and the scatter plot, autocorrelation function and partial autocorrelation function graphs are used to assess stationarity. Finally, the LPPL (log-periodic power law) model and MPGA (multi-population genetic algorithm) are used to fit the real estate registration data and detect its turning points, a time series detection algorithm is used to obtain the inflection time nodes of the sequence, and the relationship between real estate policy and real estate transactions is then analysed. The method is validated on real estate registration data from Wuhan. The results show that some real estate policies (such as purchase restriction and public rental policies) have some impact on real estate transactions in the short term, while others (such as graduate security and settlement policies) do not have a significant impact. To sum up, the government's blunt, restrictive macro-control of the housing market cannot fundamentally solve the


INTRODUCTION
Housing prices are a major issue that affects people's lives and is closely tied to their personal interests. They are influenced by many factors, including economic conditions, population size, social factors, national policy, the internal characteristics of real estate, and the environment. With the development of urbanization and the agglomeration of urban population in China, the rise in housing prices has accelerated further. The Chinese government has introduced a series of policies to limit real estate transactions and regulate the real estate market so that it develops smoothly and in an orderly manner.
At present, research on real estate mainly studies the cyclical fluctuation of prices. Cheng et al. optimized the LPPL model parameters with a multi-population genetic algorithm and effectively extracted time series turning points to study the periodicity of oil prices and real estate prices [1]. Smirnov et al. established a reference chronology for the Russian economic cycle from the 1980s to 2015; to identify the peaks and troughs of the cycle, three methods (local minimum/maximum, a Bayesian model and a Markov switching model) were used to determine the periodic turning points of the time series data [2]. Giusto et al. proposed a simple machine learning algorithm called Learning Vector Quantization (LVQ) to quickly and accurately assess turning points in the US real estate cycle over the past five years [3]. Greble and Burns compared total real estate, public buildings, private buildings and residential buildings in the United States from 1950 to 1978. They found that, over those 28 years, housing went through about 6 cycles and non-residential real estate through 4 cycles, and that the real estate cycle lagged the national economic cycle by about 11 months [4]. Hendershott and Kane used data from 30 cities in the United States in the 1980s to derive the relationship between office vacancy rates and economic losses: at a 20% vacancy rate, office buildings cost the United States about 13 billion yuan in losses per year [5]. Baldi pointed out that it is very important to analyze changes in real estate prices from the perspective of credit funds, and that attention should be paid to the impact of changes in credit conditions, such as the down payment ratio, on the real estate market and real estate prices [6]. Adalid and Detken analyzed 18 member states of the Organization for Economic Co-operation and Development. Their study showed that credit funds have a significant impact on changes in asset prices, and that excessive credit support in the real estate market caused real estate prices to rise sharply [7]. Guido Baldi added real estate industry-related variables to a New Keynesian dynamic stochastic general equilibrium model to study the impact of bank credit changes on housing prices and the macroeconomy while housing prices are rising [8].
Compared with foreign research on real estate cycle theory, research on real estate in China started late and the related theory is still immature. Since the establishment of the market economic system in 1993, real estate in China has entered a period of recovery and development, and research on the Chinese real estate market has gradually begun. Chinese scholars have since made some progress on the theory of real estate cycle fluctuations, mainly in the following aspects.
Hu Kelin evaluated Beijing's real estate regulation and control policies, such as land-stopping policies, credit policies, financial policies and safeguard policies, using the PSR (pressure-state-response) model on monthly data from 2001 to 2010. The study tested six indicators: infrastructure investment amount, land price, population, urban residents' disposable income, CPI and M2. House prices were found to be significantly affected by real estate control policies, with a three-month lag [9]. Kuang studied the interaction between real estate investment, real estate credit and economic growth using data from 1996 to 2007 for 35 large and medium-sized cities in China. The results show that real estate prices have a greater impact on bank credit than interest rates and economic growth [10]. Zheng Zhonghua and Zhang Yu established a dynamic stochastic general equilibrium model with multiple economic entities, in which the banking system and the real estate market are the main transmission channels of economic fluctuations, and simulated the impact on other macroeconomic factors. The results show that the introduction of credit funds leads funds in the market to flow to high-yield industries, thus pulling up real estate prices [11]. Based on quarterly data for the national real estate market between 2003 and 2013, Wu established an ARDL model to study the long-term and short-term relationships between real estate prices and bank credit. Real estate prices and bank credit were found to be mutually causal in the long run, while house price fluctuations drive bank credit in the short term. Based on annual panel data for 35 cities from 2003 to 2013, the SYS-GMM method was used to estimate the relationship between house price fluctuations and bank credit in the eastern, central and western regions, and the relationship was found to have obvious regional characteristics: the credit elasticity is relatively large in the western region and smallest in the central region [12].
To further improve the efficiency of real estate registration, explore the value of real estate data and support decision-making in the real estate business, we carry out research on the spatial-temporal analysis and visualization of real estate data. The logarithmic periodic power law (LPPL) model is a method for timing and quantifying asset bubbles, based on log-periodic power law bubble theory. The model rests on two main ideas. First, noise traders influence each other, and this influence keeps strengthening until it reaches a peak, at which point investors collectively sell assets and the market collapses. Second, the market crash may occur before the asset bubble breaks. The LPPL model can be used to judge the state of a real estate market bubble, to predict the future trend of real estate, and to estimate the bursting point of the bubble.

MPGA algorithm
Genetic algorithms must be highly adaptable to changes in complex dynamic environments. In a dynamic environment, the algorithm must closely track the changing solution until the best solution is obtained, which is an important difference between dynamic optimization and traditional evolutionary algorithms. Traditional evolutionary algorithms aim to converge gradually to a satisfactory solution, which causes the population to lose diversity, a necessary condition for effectively exploring the entire feasible space. As a result, they lose the ability to adapt to environmental changes later in evolution, which is the main challenge in applying evolutionary algorithms to dynamic environments. In recent years, many scholars have proposed methods to solve this problem. The multi-population genetic algorithm is a good extension of the genetic algorithm: it divides the entire population into small populations, some of which track the current extreme point while the others continue to search for new extreme points. Compared with traditional algorithms, multi-population genetic algorithms pay more attention to preserving population diversity, which is a necessary condition for adapting to environmental changes.
Taking two populations as an example, the chromosome migration process between populations is shown in Figure 1.

LPPL model
The LPPL model is commonly used in the financial field to detect the bursting of economic bubbles. It has also been used to detect turning points in oil prices. LPPL can therefore be regarded as a method for detecting the turning points of a time series, and it is adopted as the turning point detection method in this study.
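As an illustration, the sketch below assumes the LPPL form most often used in the literature, y(t) = A + B(t_c - t)^α + C(t_c - t)^α cos(ω ln(t_c - t) + φ), which has the 3 linear and 4 nonlinear parameters used in this study. The function names (lppl_basis, fit_linear, fitness) are hypothetical and are not taken from the project code. For fixed nonlinear parameters, the linear parameters are obtained by ordinary least squares and the mean square error is returned as the fitness value.

```python
import math


def lppl_basis(t, tc, omega, phi, alpha):
    """Return the three basis terms [1, f(t), g(t)] of the assumed LPPL form."""
    dt = tc - t  # time remaining until the critical time; must be > 0
    f = dt ** alpha
    g = f * math.cos(omega * math.log(dt) + phi)
    return [1.0, f, g]


def solve3(m, v):
    """Solve the 3x3 system m x = v by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_linear(ts, ys, tc, omega, phi, alpha):
    """Least-squares estimate of (A, B, C) for fixed nonlinear parameters."""
    rows = [lppl_basis(t, tc, omega, phi, alpha) for t in ts]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve3(xtx, xty)


def fitness(ts, ys, tc, omega, phi, alpha):
    """Mean square error of the LPPL fit; smaller is better."""
    a, b, c = fit_linear(ts, ys, tc, omega, phi, alpha)
    err = 0.0
    for t, y in zip(ts, ys):
        one, f, g = lppl_basis(t, tc, omega, phi, alpha)
        err += (a * one + b * f + c * g - y) ** 2
    return err / len(ts)
```

Solving only for the three linear parameters inside the fitness function leaves the search with four nonlinear parameters, which is the reason for treating the two groups separately.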

Time series turning point test
To verify the turning points obtained by the LPPL model, a spectral analysis method is used to test the model results. The spectral analysis proceeds as follows:
(1) Set a series of frequencies in advance (of length M).
(2) Calculate the spectral density for each frequency value using Equation (2); the parameters in Equation (2) are calculated using Equations (3), (4), (5) and (6).
(3) Remove the highest frequency, because it may be generated by a random sequence. Calculate the frequency sequence statistic z using Equation (7), and remove every frequency whose P value is less than z.
(4) Select the frequency with the largest P value as the result of the spectral analysis test.
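The equations referenced in these steps are not reproduced here, so the sketch below substitutes a standard normalized Lomb periodogram as one plausible spectral density; this choice is an assumption, not necessarily the formula used in this study. The threshold z is passed in as a parameter rather than computed, the removal of the highest frequency in step (3) is omitted, and the function names are hypothetical.

```python
import math


def lomb_periodogram(ts, xs, freqs):
    """Normalized Lomb periodogram P(f) of samples xs taken at times ts."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    powers = []
    for f in freqs:
        w = 2.0 * math.pi * f
        # Time offset tau that makes the sine and cosine terms orthogonal.
        s2 = sum(math.sin(2.0 * w * t) for t in ts)
        c2 = sum(math.cos(2.0 * w * t) for t in ts)
        tau = math.atan2(s2, c2) / (2.0 * w)
        cos_terms = [math.cos(w * (t - tau)) for t in ts]
        sin_terms = [math.sin(w * (t - tau)) for t in ts]
        num_c = sum((x - mean) * c for x, c in zip(xs, cos_terms)) ** 2
        num_s = sum((x - mean) * s for x, s in zip(xs, sin_terms)) ** 2
        den_c = sum(c * c for c in cos_terms)
        den_s = sum(s * s for s in sin_terms)
        powers.append((num_c / den_c + num_s / den_s) / (2.0 * var))
    return powers


def dominant_frequency(ts, xs, freqs, z):
    """Drop frequencies whose power is below z; return the strongest one, or None."""
    powers = lomb_periodogram(ts, xs, freqs)
    kept = [(p, f) for p, f in zip(powers, freqs) if p >= z]
    return max(kept)[1] if kept else None
```

The frequency grid should exclude zero, because the offset tau is undefined there.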

Algorithms and processes
We mainly use the LPPL model to fit historical real estate data and derive potential turning points of the time series. The LPPL model has 7 parameters: 3 linear parameters (A, B, C) and 4 nonlinear parameters (t_c, ω, φ, α).
(2) The initial number of populations is 10 and each population contains 100 individuals. Each gene of an individual's chromosome is randomly set to 0 or 1.
(3) The fitness function is the criterion for survival of the fittest in a genetic algorithm. In this study, the fitness value is the mean square error obtained by substituting the parameters encoded by each individual into the LPPL model. For each individual, the nonlinear parameters are decoded first, the linear parameters are then calculated by least squares given those nonlinear parameters, and all parameters are substituted into the LPPL model to evaluate the fitness function. The smaller the value, the better the LPPL model parameters.
(4) Population selection is based on the classic roulette algorithm. The fitness of each individual in the population is calculated first. The smaller the fitness value, the better the individual and the greater its probability of being selected for the next generation of reproduction. The roulette algorithm then determines which individuals are finally selected.
(5) Crossover is the core of genetic algorithms. The crossover rate is set randomly in the range 0.7-0.9 and determines the number of individuals crossed in each population. A single-point crossover is performed on the selected individuals in each population: a gene position is chosen at random, the two chromosomes are cut at that position, and their segments are exchanged to obtain two new individuals.
(6) Mutation is an important means for genetic algorithms to escape local optima. The mutation rate is set randomly in the range 0.001 to 0.05. It determines the number of mutated gene positions, which are randomly selected from the gene positions of all individuals in the population. A mutation flips a gene from 0 to 1 or from 1 to 0.
(7) Inter-population migration is the biggest difference between multi-population genetic algorithms and standard genetic algorithms. Migration passes the best individual of one population to the next population, where it replaces the worst-performing individual, in order to speed up convergence and prevent some populations from falling into a local optimum prematurely.
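Steps (4) to (7) can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: chromosomes are lists of 0/1 genes, selection weights are taken as the inverse of the error (one possible way to favour small fitness values), migration runs around a ring of populations, and all function names are hypothetical.

```python
import random


def roulette_select(population, errors, rng):
    """Pick one individual; a smaller error gives a larger selection probability."""
    # Assumption: weights are inverse errors, since the fitness here is minimized.
    weights = [1.0 / (e + 1e-12) for e in errors]
    return rng.choices(population, weights=weights, k=1)[0]


def single_point_crossover(a, b, rng):
    """Cut both chromosomes at one random position and swap the tails."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate_population(pop, rate, rng):
    """Flip round(rate * total_genes) randomly chosen gene positions in place."""
    length = len(pop[0])
    total = len(pop) * length
    for pos in rng.sample(range(total), round(rate * total)):
        i, j = divmod(pos, length)
        pop[i][j] = 1 - pop[i][j]


def migrate(populations, errors):
    """Copy the best individual of each population over the worst of the next one."""
    bests = []
    for pop, errs in zip(populations, errors):
        i = min(range(len(pop)), key=errs.__getitem__)
        bests.append((pop[i][:], errs[i]))
    for k, (best, err) in enumerate(bests):
        nxt = (k + 1) % len(populations)
        worst = max(range(len(populations[nxt])), key=errors[nxt].__getitem__)
        populations[nxt][worst] = best
        errors[nxt][worst] = err
```

The best individuals are collected before any replacement takes place, so each population exports the individual that was its best before the migration round began.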

Real estate registration data
The research data of this project are the real estate registration data of Wuhan from December 2016 to October 2018. We analyze the Wuhan real estate registration transaction volume data using the time series analysis method.

Data related to real estate policy
Policies are released by several government departments and can be divided into national, provincial and municipal policies. Since policies are issued by different departments, we collect policy data from the official websites of 11 agencies. The agency information is shown in Figure 3. First, the Wuhan real estate registration data are preprocessed, including data regularization and abnormal data detection. Then, the policy text data are processed by word segmentation, feature selection and weighting methods to eliminate the impact of data irregularities on the results.

Policy data theme extraction
Taking into account the delay in the impact of policies on real estate, we assume that every month from December 2016 to October 2018 is a time turning point. We then conducted data mining on the policies issued in the three months before each turning point and obtained the results of policy interpretation.
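The three-month look-back can be sketched as a simple window selection. The sketch assumes that the window excludes the turning-point month itself and that each policy record carries hypothetical "year" and "month" fields; neither assumption is confirmed by the project data.

```python
def months_before(year, month, n=3):
    """Return the n calendar months preceding (year, month), oldest first."""
    out = []
    for k in range(n, 0, -1):
        idx = year * 12 + (month - 1) - k
        out.append((idx // 12, idx % 12 + 1))
    return out


def policies_for_turning_point(policies, year, month, n=3):
    """Select policies whose release (year, month) falls in the preceding window."""
    window = set(months_before(year, month, n))
    return [p for p in policies if (p["year"], p["month"]) in window]
```

For a turning point in February 2017, for example, the window covers November 2016 to January 2017.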
The following picture shows all the themes and key words from the corresponding policies in each month from December 2016 to October 2018.

Timeliness Analysis of Policy on Real Estate Transactions
The real estate policy data and transaction data of Wuhan show that a number of restrictions on purchases and loans were issued at the end of 2016, which led to a trough in February 2017. Such policies continue to affect real estate transactions for two to three months. There was also a trough in March 2018, which may be the result of a short-term one-month policy for urban security family

SUMMARY
This project conducts time series analysis of three aspects: real estate registration, transaction number and transaction price. It constructs time series of statistical characteristics at multiple spatial scales: Wuhan city, administrative district, cadastral district and cadastral sub-district. The ARIMA model is used to model the resulting time series and to decompose the original sequence into periodic, trend and residual components. On this basis, the log-periodic power law (LPPL) model optimized by the multi-population genetic algorithm (MPGA) is used to detect trend turning points of the time series. At the same time, the project collects real estate-related policy data published since the second half of 2016 from the web portals of major real estate related departments, and mines the topic information and key words in the policy texts with a natural language processing topic model. Finally, the mined policy information is combined with the real estate statistical time series, the policies are used to interpret the turning trends of the series, and the results are visualized.