


Using Python's statsmodels module to fit ARIMA models
Import necessary packages and modules
from scipy import stats
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.graphics.tsaplots import plot_predict

plt.rcParams['font.sans-serif'] = ['simhei']  # display Chinese labels correctly
plt.rcParams['axes.unicode_minus'] = False    # display minus signs correctly
1. Read the data and plot the series
data = pd.read_csv('数据/客运量.csv', index_col=0)
# convert the index to a proper datetime format for later time-based operations
data.index = pd.Index(sm.tsa.datetools.dates_from_range('1949', '2008'))
data.plot(figsize=(12, 8), marker='o', color='black', ylabel='客运量')  # plot the series
The passenger-volume time series data used in this article: https://download.csdn.net/download/weixin_45590329/14143811
The time series line chart is shown below. The data clearly has an increasing trend, so the preliminary judgment is that the series is not stationary.
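If you do not have access to that CSV, a synthetic trending series can stand in so the rest of the walkthrough still runs. This is a sketch, and the column name 'TLHYL' simply mirrors the one the article's dataset uses:

import numpy as np

# stand-in for the passenger-volume data: an upward-trending annual series, 1949-2008
rng = np.random.default_rng(0)
index = pd.Index(sm.tsa.datetools.dates_from_range('1949', '2008'))
values = 100 + 15 * np.arange(len(index)) + rng.normal(0, 20, len(index))
data = pd.DataFrame({'TLHYL': values}, index=index)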
2. Stationarity test
# adfuller expects a 1-d series, so pass the column ('TLHYL' in this dataset) rather than the DataFrame;
# 'c' = constant only, 'n' = no constant or trend (older statsmodels versions call this 'nc'), 'ct' = constant and trend
sm.tsa.adfuller(data['TLHYL'], regression='c')
sm.tsa.adfuller(data['TLHYL'], regression='n')
sm.tsa.adfuller(data['TLHYL'], regression='ct')
This carries out the ADF unit root test under three specifications. As the results show, the series is not stationary.
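adfuller returns a tuple whose first two elements are the test statistic and the p-value. A small helper (hypothetical, not part of the article) makes the verdict explicit:

def adf_report(series, regression='c'):
    """Run the ADF test and print a readable verdict."""
    stat, pvalue = sm.tsa.adfuller(series, regression=regression)[:2]
    verdict = 'stationary' if pvalue < 0.05 else 'not stationary'
    print(f"regression='{regression}': statistic={stat:.3f}, p-value={pvalue:.3f} -> {verdict}")

for reg in ('c', 'n', 'ct'):
    adf_report(data['TLHYL'], regression=reg)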
3. Apply a first-order difference to the data
diff = data.diff(1)        # first-order difference
diff.dropna(inplace=True)  # the first value is NaN after differencing
diff.plot(figsize=(12, 8), marker='o', color='black')  # plot the differenced series
The line chart of the first-order differenced data suggests that the series is now stationary.
4. Stationarity test on the first-order differenced data
sm.tsa.adfuller(diff['TLHYL'], regression='c')
sm.tsa.adfuller(diff['TLHYL'], regression='n')
sm.tsa.adfuller(diff['TLHYL'], regression='ct')
The test results show that the differenced series is stationary.
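The adf_report helper sketched in step 2 can be reused here to print the verdicts directly:

for reg in ('c', 'n', 'ct'):
    adf_report(diff['TLHYL'], regression=reg)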
5. Determine the ARIMA(p, d, q) order
fig = plt.figure(figsize=(12, 8))
ax1 = fig.add_subplot(211)
fig = sm.graphics.tsa.plot_acf(diff.values.squeeze(), lags=12, ax=ax1)  # the ACF cuts off after lag 1, suggesting MA(1)
ax2 = fig.add_subplot(212)
fig = sm.graphics.tsa.plot_pacf(diff, lags=12, ax=ax2)  # the PACF cuts off after lag 1, suggesting AR(1)
Based on the autocorrelation plot (ACF) and the partial autocorrelation plot (PACF), the original data is identified as an ARIMA(1, 1, 1) model.
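When the ACF/PACF plots are ambiguous, a small grid search over candidate orders by AIC is a common cross-check. This sketch is not part of the original article and assumes small search ranges for p and q:

import itertools
import warnings

best_order, best_aic = None, float('inf')
for p, q in itertools.product(range(3), range(3)):
    try:
        with warnings.catch_warnings():
            warnings.simplefilter('ignore')  # silence convergence warnings during the search
            aic = ARIMA(data, order=(p, 1, q)).fit().aic
    except Exception:
        continue  # skip orders that fail to estimate
    if aic < best_aic:
        best_order, best_aic = (p, 1, q), aic
print(f'best order by AIC: {best_order}, AIC = {best_aic:.2f}')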
6. Parameter estimation
model = ARIMA(data, order=(1, 1, 1)).fit()  # fit the model
model.summary()  # summary of the estimation results
# coefficient diagnostics
params = model.params    # coefficients
tvalues = model.tvalues  # t-statistics of the coefficients
bse = model.bse          # standard errors of the coefficients
pvalues = model.pvalues  # p-values of the coefficients
# plot the residual series
resid = model.resid      # residual series
fig = plt.figure(figsize=(12, 8))
ax = fig.add_subplot(111)
ax = model.resid.plot(ax=ax)
# in-sample fitted values; the original passed exog=data[['TLHYL']], but the model
# has no exogenous regressors, so a plain predict() is all that is needed
fit = model.predict()
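To see how closely the in-sample fitted values track the observations, the two series can be overlaid; a minimal sketch, not in the original article:

fig, ax = plt.subplots(figsize=(12, 8))
data['TLHYL'].plot(ax=ax, marker='o', color='black', label='observed')
fit.plot(ax=ax, color='red', label='fitted')  # 'fit' from the block above
ax.legend(loc='upper left')
plt.show()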
7. Model diagnostics
# 7.1 Test the residual series for autocorrelation
sm.stats.durbin_watson(model.resid.values)  # DW statistic: near 2 - no autocorrelation; near 0 - positive; near 4 - negative
# 7.2 AIC and BIC criteria
model.aic  # AIC of the model
model.bic  # BIC of the model
# 7.3 Normality test on the residual series
stats.normaltest(resid)  # test whether the residuals are normally distributed
# the test fails to reject the null hypothesis, so the residuals are consistent with
# a normal distribution and the model fits well
# 7.4 ACF and PACF plots of the residual series
fig = plt.figure(figsize=(12, 8))
ax1 = fig.add_subplot(211)
fig = sm.graphics.tsa.plot_acf(resid.values.squeeze(), lags=12, ax=ax1)
ax2 = fig.add_subplot(212)
fig = sm.graphics.tsa.plot_pacf(resid, lags=12, ax=ax2)
# if neither plot shows significant spikes beyond lag 0, the model fits well
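A Ljung-Box test is another standard check for leftover autocorrelation in the residuals. It is not used in the original article, but acorr_ljungbox is part of statsmodels:

from statsmodels.stats.diagnostic import acorr_ljungbox

# large p-values mean no evidence of remaining autocorrelation at the tested lags
lb = acorr_ljungbox(model.resid, lags=[6, 12])
print(lb)  # a DataFrame with lb_stat and lb_pvalue columns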
8. Prediction
# Forecast through 2016. Since the ARIMA model has two lag parameters, at least two
# initial observations are needed, so the prediction starts from 2006.
predict = model.predict('2006', '2016', dynamic=True)
print(predict)
# plot the forecast and its confidence interval
fig, ax = plt.subplots(figsize=(10, 8))
fig = plot_predict(model, start='2002', end='2016', ax=ax)  # end extended past the sample (2008) so the forecast is visible
legend = ax.legend(loc='upper left')
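For out-of-sample forecasts with explicit confidence bounds, get_forecast is an alternative to predict; this sketch is not from the original article:

# forecast 8 steps past the end of the sample (2009-2016) with 95% intervals
forecast = model.get_forecast(steps=8)
print(forecast.predicted_mean)        # point forecasts
print(forecast.conf_int(alpha=0.05))  # lower/upper confidence bounds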