Forecast of Tourists to Lanzhou Based on the Baolan High-Speed Railway by the ARIMA Model

Based on an analysis of the number of tourists who visited Lanzhou during 2009–2019, an ARIMA model of the number of tourists to Lanzhou is established. The results show that an AR(3) model can be used to predict the number of tourists who traveled to Lanzhou during 2009–2019. The average relative error between the predicted and actual values is 1.03%, so the model can be used to predict and analyze the number of tourists to Lanzhou in the future.


Introduction
With the expansion of high-speed rail tourism and the improvement of its benefits, tourism management and decision-making have become more difficult. To ensure the sustainable and stable development of tourism, scientific methods must be used to manage, analyze, and predict the patterns of the tourism market, which is of far-reaching significance to the sustainable development of China's tourism industry. Therefore, this paper takes the Baolan high-speed railway as an example and uses the ARIMA model to compare the passenger flow data with the forecast results from 2009 to 2019 in order to verify the accuracy of the model. The forecast of passenger flow to Lanzhou in 2020 can provide a reference for tourism planning and the layout of tourism resources in Lanzhou.
The opening of high-speed rail has promoted the economic growth of cities along the line, but it has brought challenges as well as development opportunities. The surge in visitors naturally leads to overcrowding at scenic spots and increased pressure on accommodation and transportation, which severely challenges the local tourist reception capacity. How to predict passenger traffic, and how to improve local hardware and software facilities and increase carrying capacity accordingly, has become an urgent problem.
According to the relevant literature, Li Wei [1] finds that, of the ARIMA model and the BP neural network model for predicting tourist numbers, the ARIMA model gives better simulated predictions overall, while the BP neural network model performs relatively worse. Kong Chaoli [2] finds that, compared with the deterministic factor decomposition model, the ARIMA model has a better prediction effect. Liu Ruoyu and Liu Libo [3] argue that, compared with regression analysis, which needs a basic function model for prediction and estimation, time series processing in the ARIMA model is less influenced by human factors; therefore, the ARIMA model has better precision than the traditional regression analysis method. However, many studies do not carry out the unit root test or the white noise test of the residuals in the prediction process, which leads to low reliability [4]. Based on the above research, this paper uses the ARIMA model and Eviews statistical software to process and analyze the number of tourists to Lanzhou during 2009–2019 and to make a short-term forecast.

The data source and description
Lanzhou is an important industrial base and comprehensive transportation hub in northwestern China, and passengers' decisions to travel there are affected by traffic, terrain, climate, culture, and other factors. Some of these factors are difficult to change, and the relationships among them are complicated, so regression analysis and factor analysis struggle to achieve the desired prediction effect. Time series analysis is a statistical method for processing dynamic data. By analyzing statistics of the data, it obtains the law of dynamic change in the data, and a model is then established on this basis for prediction [5]. Time series analysis includes a variety of predictive models, and the ARIMA model is one of them.
The ARIMA model is a commonly used stochastic time series model proposed by Box and Jenkins. It is a relatively accurate method for short-term prediction: it captures the structure and characteristics of a time series and gives the optimal prediction in the minimum-variance sense [6]. In practice, the ARIMA model is therefore often used to predict and study time series, and it has produced many satisfactory results [7].

The basic theory of ARIMA model
If the response Xt of a system at time t depends on its own values at previous times and also on the disturbances that entered the system at previous times, the system is an autoregressive moving average system, and the corresponding model is the ARMA model. For a non-stationary time series containing a short-term trend, the series is first differenced and the ARMA model is then applied, which gives the autoregressive integrated moving average model, denoted ARIMA(p, d, q). Here AR is the autoregressive part and p is the autoregressive order, MA is the moving average part and q is the moving average order, and d is the number of differences needed to make the time series stationary.
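For reference, the ARIMA(p, d, q) model is commonly written with the backshift operator B as follows. The symbols W_t, c, \phi_i, \theta_j, and \varepsilon_t are standard textbook notation and are assumptions here, not notation taken from this paper.

```latex
\begin{aligned}
W_t &= (1 - B)^d X_t, \qquad B X_t = X_{t-1}, \\
W_t &= c + \sum_{i=1}^{p} \phi_i W_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}.
\end{aligned}
```

With d = 1 and q = 0 this reduces to an AR(p) model on the first differences W_t = X_t - X_{t-1}.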

Data preprocessing
Before modeling, the stationarity and randomness of the sequence are tested, and a time series model is established only if the sequence is stationary and not purely random. First, the stationarity of the time series is tested with the ADF test. The test results are shown in Figure 3: at a significance level of 0.05 the null hypothesis cannot be rejected for the original series, that is, the presence of a unit root is accepted. With d = 1 the sequence is stationary, so it is a first-order difference stationary sequence.
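The logic of the unit root check can be sketched in plain Python. This is a minimal sketch, not the Eviews procedure: it assumes the simplified Dickey-Fuller regression with a constant and no lagged difference terms (the full ADF test adds such terms), it uses the asymptotic 5% critical value of -2.86, and the series and function names are made up for illustration.

```python
import math


def difference(series, d=1):
    """Apply d rounds of first-order differencing to a list of numbers."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def dickey_fuller_t(series):
    """t-statistic of gamma in: delta_y[t] = alpha + gamma * y[t-1] + e[t].

    Simplified Dickey-Fuller regression (constant, no trend, no lagged
    differences), estimated by ordinary least squares.
    """
    x = list(series[:-1])        # y[t-1]
    dy = difference(series, 1)   # y[t] - y[t-1]
    n = len(x)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    gamma = sxy / sxx
    alpha = mean_dy - gamma * mean_x
    ssr = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, dy))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return gamma / std_err


# Asymptotic 5% Dickey-Fuller critical value for the constant-only case;
# small samples need a somewhat more negative value.
CRITICAL_5PCT = -2.86

# Made-up illustrative series (NOT the Lanzhou tourist data).
trending = [10.0, 12.1, 13.9, 16.2, 18.0, 20.3, 21.8, 24.1, 26.0, 28.2, 29.9]
for label, data in [("level", trending), ("first difference", difference(trending))]:
    t_stat = dickey_fuller_t(data)
    verdict = "stationary" if t_stat < CRITICAL_5PCT else "unit root not rejected"
    print(f"{label}: t = {t_stat:.2f} -> {verdict}")
```

A t-statistic below the critical value rejects the unit root. When the level series fails the test but its first difference passes, the series is treated as first-order difference stationary (d = 1).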

Model ordering and testing
According to the autocorrelation and partial autocorrelation graphs of the first-order difference (Fig. 4), the model can be preliminarily identified as an AR model, and the order of the AR model is determined with Eviews software according to the AIC.
The ARCH test is a method for examining changes in the variance of a time series. Its idea is that heteroscedasticity in time series data can be regarded as an ARCH process, and whether the series is heteroscedastic is judged by testing whether this process holds. An ARCH test is performed on the model, with the lag order determined as shown in Table 2. According to the Akaike information criterion (AIC), the lag order is determined to be 1. In the ARCH(1) test, the P value corresponding to the Obs*R² statistic is greater than the 5% significance level, so the null hypothesis is not rejected; that is, the test indicates that there is no ARCH effect, i.e., no heteroscedasticity of the ARCH form. The statistics for the model mean and the autocorrelation coefficients all pass the significance test. The model also passes the residual autocorrelation test, which shows that it has fully extracted the effective information and can be used for prediction [8].
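The Obs*R² statistic of the ARCH test can be reproduced by hand. The sketch below assumes the ARCH(1) case: the squared residuals are regressed on a constant and their own first lag, and the number of observations times the R² of that regression is compared with the 5% critical value of the chi-square distribution with one degree of freedom (3.841). The function name and the residual values are hypothetical and are not taken from Table 2.

```python
def arch_lm_statistic(residuals):
    """Obs*R-squared statistic of the ARCH(1) LM test.

    Regresses e[t]**2 on a constant and e[t-1]**2 by ordinary least
    squares and returns (number of observations) * (R-squared).
    """
    squared = [e * e for e in residuals]
    x = squared[:-1]   # e[t-1]**2
    y = squared[1:]    # e[t]**2
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # For a regression with one regressor, R-squared equals the squared
    # sample correlation between x and y.
    r_squared = (sxy * sxy) / (sxx * syy)
    return n * r_squared


# 5% critical value of the chi-square distribution with 1 degree of freedom.
CHI2_1_5PCT = 3.841

# Hypothetical residuals for illustration only (NOT the paper's residuals).
residuals = [0.8, -1.1, 0.3, 1.2, -0.6, -0.9, 1.0, 0.2, -1.3, 0.7, 0.5, -0.4]
stat = arch_lm_statistic(residuals)
if stat < CHI2_1_5PCT:
    print(f"Obs*R^2 = {stat:.3f}: no ARCH effect detected at the 5% level")
else:
    print(f"Obs*R^2 = {stat:.3f}: ARCH effect detected at the 5% level")
```

A statistic below the critical value corresponds to a P value above 5%, which is the situation described for the model in this section.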

Model prediction
We use time series analysis to establish an autoregressive prediction model for the number of tourists to Lanzhou during 2009–2019. The predicted and actual values are shown in Table 3.
It can be seen from the table that the model is accurate in the short term, with an average relative error of 1.03%. However, because the model is built on smoothed data, the predictions are underestimates, and the model error gradually increases. The model also predicts that the number of tourists to Lanzhou in the autumn of 2020 will reach 81,045,529.
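The forecasting and error calculation in this section can be sketched in plain Python as follows. This is a minimal sketch, assuming an AR(p) model fitted by ordinary least squares to the first-differenced series. The paper's own estimates come from Eviews, and the function names and visitor figures below are hypothetical; they are not the data of Table 3.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ar(series, p):
    """Least-squares AR(p): y[t] = c + phi_1*y[t-1] + ... + phi_p*y[t-p]."""
    rows = [[1.0] + [series[t - k] for k in range(1, p + 1)]
            for t in range(p, len(series))]
    targets = [series[t] for t in range(p, len(series))]
    k = p + 1
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)  # [c, phi_1, ..., phi_p]


def predict_next(history, coeffs):
    """One-step-ahead prediction from the last p values of history."""
    p = len(coeffs) - 1
    return coeffs[0] + sum(coeffs[k] * history[-k] for k in range(1, p + 1))


def mean_relative_error(actual, predicted):
    """Average of |predicted - actual| / |actual|, as a percentage."""
    errors = [abs(f - a) / abs(a) for a, f in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Made-up visitor counts in millions (NOT the paper's data).
visitors = [20.1, 22.4, 25.3, 27.2, 30.8, 33.1, 37.0, 39.6, 43.9, 46.8, 51.5, 54.9]
diffs = [b - a for a, b in zip(visitors, visitors[1:])]  # d = 1
coeffs = fit_ar(diffs, p=3)                              # AR(3) on differences

# In-sample one-step-ahead predictions of the level, built by adding the
# predicted difference to the previous actual level.
fitted = [visitors[t] + predict_next(diffs[:t], coeffs)
          for t in range(3, len(diffs))]
actual = visitors[4:]
print(f"mean relative error: {mean_relative_error(actual, fitted):.2f}%")

# Forecast for the period after the end of the sample.
print(f"next-period forecast: {visitors[-1] + predict_next(diffs, coeffs):.1f}")
```

Comparing one-step-ahead predictions with the actual values in this way gives the kind of average relative error reported in the table, while the final line shows how a forecast beyond the sample is formed from the last observed level and the predicted difference.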

Conclusion
In this paper, based on time series theory, the AR(3) model is used to predict the number of tourists to Lanzhou during 2009–2019. The experimental results show that the AR(3) model predicts the short-term tourist population of Lanzhou well, which gives Lanzhou tourism managers a feasible way to predict the development of the tourism market. The study has two limitations. First, it considers only the impact of the opening of the Baolan high-speed railway on the tourist population of Lanzhou, which is one-sided. Second, the ARIMA model is used only for short-term prediction of the number of tourists; its applicability to long-term prediction has not been studied in depth.