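Time series forecasting analyzes a time series and extrapolates or extends it according to the development process, direction, and trend that the series reflects.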
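# AR model: the autocorrelation function (ACF) tails off and the partial autocorrelation function (PACF) cuts off.
# MA model: the ACF cuts off and the PACF tails off.
# ARMA model: both the ACF and the PACF tail off.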
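The ACF half of these rules can be checked on simulated data. The following is a minimal sketch using only the standard library; the helper names, the coefficient 0.7, the sample size, and the seed are all illustrative assumptions, and the PACF is not computed here.

```python
import random

def sample_acf(x, max_lag):
    """Sample autocorrelation of x at lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def simulate_ar1(phi, n, rng):
    """x_t = phi * x_{t-1} + e_t with standard normal noise e_t."""
    x, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0, 1)
        x.append(prev)
    return x

def simulate_ma1(theta, n, rng):
    """x_t = e_t + theta * e_{t-1} with standard normal noise e_t."""
    e = [rng.gauss(0, 1) for _ in range(n + 1)]
    return [e[t] + theta * e[t - 1] for t in range(1, n + 1)]

rng = random.Random(0)  # fixed seed (an arbitrary choice) so the run is repeatable
ar_acf = sample_acf(simulate_ar1(0.7, 5000, rng), 4)
ma_acf = sample_acf(simulate_ma1(0.7, 5000, rng), 4)
print("AR(1) ACF:", [round(r, 2) for r in ar_acf])
print("MA(1) ACF:", [round(r, 2) for r in ma_acf])
```

For the AR(1) series the printed values should shrink gradually with the lag (tailing off), while for the MA(1) series only lag 1 should be clearly different from zero (cutting off).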
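# ARIMA model: difference a non-stationary time series until it is stationary, then regress the dependent variable on its own lagged values and on the current and lagged values of the random error term. The ARMA step itself only applies to a stationary series.
# How to judge whether a time series is stationary: check whether its mean and variance change over time.
# Use differencing to build a stationary series: diff_ts = ts.diff(periods), where ts is the time series data and periods is the lag. For differencing of order d, apply ts.diff(1) d times.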
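# Estimating the (p, q) parameters of the ARMA model
# sm.tsa.arma_order_select_ic(diff_ts, max_ar, max_ma, ic): diff_ts is the time series data, max_ar the largest p to try, max_ma the largest q to try, and ic the criterion or criteria used to pick the best order, for example all of ['aic', 'bic', 'hqic']
# armaModel = sm.tsa.ARIMA(diff_ts, order=(p, 0, q)): build the model on the differenced series (d = 0 because the differencing is already done)
# results = armaModel.fit(): train the model; fit() returns the fitted results object
# results.predict(start, end): predict values from the start time through the end time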
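The stationarity rule in the notes above (mean and variance should not drift over time) can be approximated by comparing the two halves of a series. This is only a rough sketch and not a formal statistical test; the helper names and the synthetic series are assumptions made for illustration.

```python
from statistics import mean, pvariance

def half_stats(x):
    """(mean, variance) of the first half and of the second half of x."""
    mid = len(x) // 2
    first, second = x[:mid], x[mid:]
    return (mean(first), pvariance(first)), (mean(second), pvariance(second))

def difference(x, lag=1):
    """Lagged difference x[t] - x[t - lag], similar to ts.diff(lag).dropna()."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]

# Hypothetical series: a linear trend plus a repeating 4-step pattern
pattern = [0, 2, -1, 1]
levels = [3 * t + pattern[t % 4] for t in range(41)]

print("levels:     ", half_stats(levels))
print("differences:", half_stats(difference(levels)))
```

The levels have very different means in the two halves because of the trend, whereas the first difference has the same mean and variance in both halves, which is the behaviour differencing is meant to produce.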
import pandas
import statsmodels.api as sm
import matplotlib.pyplot as plt
data = [10930,10318,10595,10972,7706,6756,9092,10551,9722,10913,
11151,8186,6422,6337,11649,11652,10310,12043,7937,6476,9662,9570,9981,9331,9449,
6773,6304,9355,10477,10148,
10395,11261,8713,7299,10424,
10795,11069,11602,11427,9095,
7707,10767,12136,12812,12006,
12528,10329,7818,11719,11683,
12603,11495,13670,11337,10232,
13261,13230,15535,16837,19598,
14823,11622,19391,18177,19994,
14723,15694,13248,9543,12872,
13101,15053,12619,13749,10228,
9725,14729,12518,14564,15085,
14722,11999,9390,13481,14795,
15845,15271,14686,11054,10395
]
data = pandas.Series(data)
data.index = pandas.Index(
sm.tsa.datetools.dates_from_range('1926', '2015')
)
data.plot(figsize=(10,6))
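# Step 1: choose the differencing order d
diff_1 = data.diff(1)
diff_1.plot(figsize=(10,6))
# Second-order differencing means differencing twice; data.diff(2) would be the lag-2 difference instead
diff_2 = data.diff(1).diff(1)
diff_2.plot(figsize=(10,6))
# After comparing the two plots, use d = 1
d = 1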
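One detail is easy to miss when choosing d: the argument of `diff` is a lag, not an order of differencing. The sketch below shows the difference on a small made-up quadratic series; the `diff` helper is a hypothetical stand-in for `Series.diff(lag).dropna()`.

```python
def diff(x, lag=1):
    """Lagged difference x[t] - x[t - lag]."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]

x = [t * t for t in range(8)]  # hypothetical series: 0, 1, 4, 9, ...

lag_2 = diff(x, lag=2)    # x[t] - x[t-2], the analogue of ts.diff(2)
order_2 = diff(diff(x))   # difference of the first difference, i.e. d = 2

print(lag_2)    # [4, 8, 12, 16, 20, 24]
print(order_2)  # [2, 2, 2, 2, 2, 2]
```

Only the second-order difference removes the quadratic trend; the lag-2 difference still grows with time.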
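# Step 2: choose p and q
res = sm.tsa.arma_order_select_ic(
    diff_1.dropna(),
    max_ar=8,
    max_ma=8,
    ic=['aic', 'bic', 'hqic'],
    trend='n'  # no constant term; older statsmodels releases spelled this 'nc'
)
# Best (p, q) under each criterion
print(res.aic_min_order, res.bic_min_order, res.hqic_min_order)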
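The three criteria all trade goodness of fit against the number of parameters, but with different penalties, so they may pick different orders. Below is a minimal sketch using the standard textbook formulas; the log-likelihood values are invented for illustration, and the parameter count `p + q + 1` is an assumption that may not match how statsmodels counts parameters.

```python
import math

def information_criteria(log_likelihood, n_params, n_obs):
    """AIC, BIC and HQIC from a model's log-likelihood (lower is better)."""
    aic = 2 * n_params - 2 * log_likelihood
    bic = n_params * math.log(n_obs) - 2 * log_likelihood
    hqic = 2 * n_params * math.log(math.log(n_obs)) - 2 * log_likelihood
    return {"aic": aic, "bic": bic, "hqic": hqic}

# Hypothetical log-likelihoods for three candidate (p, q) orders
candidates = {(1, 0): -820.0, (1, 1): -815.0, (6, 6): -804.0}
n_obs = 89  # 90 yearly values minus one lost to first differencing

scores = {}
for (p, q), llf in candidates.items():
    k = p + q + 1  # AR and MA coefficients plus the noise variance
    scores[(p, q)] = information_criteria(llf, k, n_obs)

best_order = {}
for name in ("aic", "bic", "hqic"):
    best_order[name] = min(scores, key=lambda order: scores[order][name])
    print(name, "picks", best_order[name])
```

With these made-up numbers AIC prefers the large (6, 6) model, while BIC and HQIC, which penalize extra parameters more heavily, prefer (1, 1).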
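# Fit ARMA(6, 6) to the differenced series. sm.tsa.ARMA is no longer available in
# recent statsmodels releases; ARIMA with d = 0 is the equivalent model.
arma_mode = sm.tsa.ARIMA(diff_1.dropna(), order=(6, 0, 6)).fit()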
arma_predict = arma_mode.predict('2015', '2020', dynamic=True)
print(arma_predict)
fig, ax = plt.subplots(figsize=(10,6))
ax = diff_1.loc['1926':].plot(ax=ax)  # .ix was removed from pandas; .loc does label-based slicing
arma_predict.plot(ax=ax)
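# Step 3: convert the predicted differences back to the original scale
# The first prediction is the 2015 difference, so start from the 2014 value
lastValue = data.iloc[-2]
for p in arma_predict:
    lastValue = lastValue + p
    print(lastValue)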
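Undoing a first difference is a running sum that starts from the last known level. The sketch below shows the same idea with `itertools.accumulate` (the `initial` argument needs Python 3.8 or later); the function name and the numbers are hypothetical.

```python
from itertools import accumulate

def undifference(last_level, predicted_diffs):
    """Turn predicted first differences into levels by cumulative summation."""
    # accumulate yields last_level first, so drop it to keep only the forecasts
    return list(accumulate(predicted_diffs, initial=last_level))[1:]

# Hypothetical numbers: last known level 100, then three predicted differences
levels = undifference(100, [5, -2, 4])
print(levels)  # [105, 103, 107]
```

With pandas the same result can be obtained in one line, for example `data.iloc[-2] + arma_predict.cumsum()` for the series in this script.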
References
Author A: ken