9 Steps to create a time series forecasting model
For your time series project, you can use either the ARIMA model or the Prophet model - there is no need to include both. However, you might like to experiment with both, to see which one provides a more accurate forecast.
9.1 ARIMA model
Step 1: Import Necessary Libraries
You can’t do anything without these! You will need:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import mean_absolute_error, mean_squared_error
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.statespace.sarimax import SARIMAX
import pmdarima as pmd
Step 2: Load Your Data
Load the data and perform an initial examination: .head(), .describe() and .info() are just some of the methods you can use.
What data types do you have? How many columns and rows? Are there any missing values? If so, deal with them. Make sure your data is as clean as possible.
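As a rough illustration of this step, the sketch below loads a small CSV, inspects it and fills one missing value. The CSV text, the column names date and sales, and the choice of linear interpolation are all assumptions made for the example; with a real project you would pass your own file path to pd.read_csv and pick a missing-value strategy that suits your data.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = """date,sales
2023-01-01,120
2023-02-01,135
2023-03-01,
2023-04-01,150
2023-05-01,160
"""

# parse_dates converts the column to datetimes; index_col makes it the index.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], index_col="date")

print(df.head())      # first rows
df.info()             # data types and non-null counts
print(df.describe())  # summary statistics

# Count missing values, then fill the gap by linear interpolation
# (one possible strategy among several).
missing_before = int(df["sales"].isna().sum())
df["sales"] = df["sales"].interpolate(method="linear")
missing_after = int(df["sales"].isna().sum())
print(missing_before, missing_after)
```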
Step 3: Visualise the Data
Plot the data to understand its trend and seasonality. This helps in setting the ARIMA parameters later.
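One way to make trend and seasonality easier to see is to plot the series together with a rolling mean. The sketch below uses a synthetic monthly series (an assumption, not real data) and a 12-month window, which is a reasonable choice only if the seasonal cycle is yearly.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic monthly series: upward trend + yearly cycle + noise (illustrative only).
rng = np.random.default_rng(0)
steps = np.arange(72)
idx = pd.date_range("2018-01-01", periods=72, freq="MS")
values = 100 + 0.8 * steps + 10 * np.sin(2 * np.pi * steps / 12) + rng.normal(0, 2, 72)
series = pd.Series(values, index=idx, name="value")

# A 12-month rolling mean averages out the yearly cycle and exposes the trend.
rolling = series.rolling(window=12).mean()

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(series.index, series, label="observed")
ax.plot(rolling.index, rolling, label="12-month rolling mean")
ax.set_xlabel("date")
ax.set_ylabel("value")
ax.legend()
fig.tight_layout()
```

In a notebook, finish with plt.show() to display the figure.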
Step 4: Check for Stationarity
ARIMA requires the data to be stationary. Use the Augmented Dickey-Fuller test to check stationarity. (Don’t forget to import adfuller first, using from statsmodels.tsa.stattools import adfuller. The null hypothesis of the test is that the series is non-stationary, so a p-value less than 0.05 suggests that the data is stationary.)
Step 5: Make the Data Stationary
If the data is not stationary, difference the data until it becomes stationary.
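Differencing replaces each value with its change from the previous value. The sketch below applies it to two made-up series: a linear trend, which needs one round of differencing, and a quadratic trend, which needs two.

```python
import numpy as np
import pandas as pd

# Illustrative series with a linear trend: 10, 13, 16, ...
linear = pd.Series(10 + 3 * np.arange(8), dtype=float)

# First-order differencing: y[t] - y[t-1].
# The first value has no predecessor, so the resulting NaN is dropped.
diff1 = linear.diff().dropna()

# A quadratic trend (0, 1, 4, 9, ...) is still trending after one difference,
# so it is differenced a second time.
quadratic = pd.Series(np.arange(8, dtype=float) ** 2)
diff2 = quadratic.diff().diff().dropna()

print(diff1.tolist())  # constant series: the trend has been removed
print(diff2.tolist())
```

The number of differencing rounds needed is the d parameter used in the next step. If you pass d to the ARIMA model, it differences the data internally, so you fit the model on the original series rather than the differenced one.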
Step 6: Determine ARIMA Parameters
Identify the Autoregression (p), Differencing (d), and Moving Average (q) parameters. Use plots like the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) or automatic model selection techniques, like Auto ARIMA.
Step 7: Fit the ARIMA Model
Create and fit the ARIMA model with the parameters determined in the previous step.
Step 8: Review Model Summary
Examine the model summary to understand the fit and check if further tuning is needed.
Step 9: Make Forecasts
Use the model to make predictions. You can predict both within the range of your dataset (in-sample) and future dates (out-of-sample).
Step 10: Evaluate Model Performance
Assess the accuracy of the forecasts using metrics like Mean Squared Error (MSE) or others suitable for your analysis.
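The metrics can be computed with scikit-learn, as in the sketch below. The actual and predicted values are made up so that the arithmetic is easy to check by hand.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Made-up values: the errors are 2, -3, 0 and 4.
actual = np.array([100.0, 110.0, 120.0, 130.0])
predicted = np.array([98.0, 113.0, 120.0, 126.0])

mae = mean_absolute_error(actual, predicted)  # mean of |error|  = 9 / 4
mse = mean_squared_error(actual, predicted)   # mean of error^2  = 29 / 4
rmse = np.sqrt(mse)                           # same units as the data

print(f"MAE:  {mae:.2f}")
print(f"MSE:  {mse:.2f}")
print(f"RMSE: {rmse:.2f}")
```

MAE and RMSE are in the same units as the data, which makes them easier to explain than MSE. RMSE penalises large errors more heavily than MAE does.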
Step 11: Visualise the Forecast
Plot the original data along with the forecasted values to visually assess how well the model is performing.
9.2 Prophet Model
Step 1: Install and Import Necessary Libraries
You can’t do anything without these! You will need:
import holidays
import pandas as pd
from sklearn.model_selection import ParameterGrid
import matplotlib.pyplot as plt
import seaborn as sns
import plotly
from sklearn.metrics import mean_absolute_error, mean_squared_error
from prophet import Prophet
Step 2: Load and Prepare Your Data
Load the data and perform an initial examination: .head(), .describe() and .info() are just some of the methods you can use.
What data types do you have? How many columns and rows? Are there any missing values? If so, deal with them. Make sure your data is as clean as possible.
Prophet requires the DataFrame to have two columns: ds and y. ds must be the date (timestamp) and y must be the value you want to forecast. This naming convention is mandatory as Prophet’s model is designed to understand these as default column names.
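A short sketch of the renaming, assuming a raw DataFrame whose columns happen to be called Date and Sales (both names are hypothetical):

```python
import pandas as pd

# Hypothetical raw data with arbitrary column names and dates stored as text.
raw = pd.DataFrame({
    "Date": ["2024-01-01", "2024-01-02", "2024-01-03"],
    "Sales": [10, 12, 11],
})

prophet_df = (
    raw.rename(columns={"Date": "ds", "Sales": "y"})  # names Prophet expects
       .assign(ds=lambda d: pd.to_datetime(d["ds"]))  # ds must hold datetimes
       .loc[:, ["ds", "y"]]                           # keep only these two columns
)

print(prophet_df.dtypes)
```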
Step 3: Visualise the Data
Visualising your data can give insights into trends, seasonality, and any outliers present. It’s an important step for understanding the underlying patterns in your data and for explaining these patterns to others. It also helps in assessing the adequacy of Prophet’s default model settings for your data.
Step 4: Create and Fit the Prophet Model
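Split your data into training and testing sets to evaluate your model’s performance on unseen data. Keep the split chronological, so that the test set covers the most recent dates: if you use train_test_split (from sklearn.model_selection), pass shuffle=False, because its default random shuffling would mix future observations into the training set.
Instantiate a Prophet object and fit it to your training data, using model = Prophet() and model.fit(data).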
This step involves the model learning from your historical data, identifying underlying patterns such as trends and seasonality. Prophet will automatically handle seasonality detection.
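Because the rows of a time series are ordered, the test set should come strictly after the training set. One simple way to do this, sketched below on made-up daily data with an assumed 80/20 ratio, is to sort by date and slice by position.

```python
import pandas as pd

# Made-up daily data in Prophet's ds / y format.
df = pd.DataFrame({
    "ds": pd.date_range("2024-01-01", periods=100, freq="D"),
    "y": range(100),
})

# Sort by date first, then cut at the 80% mark (the ratio is an assumption).
df = df.sort_values("ds").reset_index(drop=True)
split = int(len(df) * 0.8)
train, test = df.iloc[:split], df.iloc[split:]

print(train["ds"].max(), test["ds"].min())
```

Calling train_test_split(df, test_size=0.2, shuffle=False) on the sorted DataFrame gives the same split.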
Step 5: Create Future Dataframe for Predictions
Generate a future DataFrame containing the dates you want to predict, using model.make_future_dataframe(periods = ). The periods argument specifies how many time steps into the future the model should forecast (daily steps by default; use the freq argument for other intervals).
Step 6: Make Forecasts
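Use the model to predict future values, by passing the future DataFrame to model.predict(). This will return a DataFrame containing the forecast (yhat), its uncertainty interval (yhat_lower and yhat_upper) and the forecast components.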
Step 7: Review Forecast Components
Prophet breaks the forecast down into a trend and whichever seasonal components were fitted (yearly, weekly and daily). Viewing these can help you understand how different patterns contribute to the forecast. You can plot them with model.plot_components(), passing in the forecast DataFrame.
Step 8: Plot the Forecast
Visualise the original data along with the forecast. This helps to see how well the model fits the data and what the future might hold.
Step 9: Evaluate Model Performance
If you have actual observed values for the forecast period, you can evaluate the model’s performance using metrics such as Mean Absolute Error (MAE).
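To compare the forecast with observed values, the two need to be lined up by date first. The sketch below does this with a merge on ds. Both DataFrames are hypothetical stand-ins: test for your held-out observations and forecast for the output of model.predict().

```python
import pandas as pd
from sklearn.metrics import mean_absolute_error

# Hypothetical held-out observations.
test = pd.DataFrame({
    "ds": pd.to_datetime(["2024-03-01", "2024-03-02", "2024-03-03"]),
    "y": [50.0, 52.0, 51.0],
})

# Hypothetical slice of a forecast DataFrame (it also covers an earlier date).
forecast = pd.DataFrame({
    "ds": pd.to_datetime(["2024-02-29", "2024-03-01", "2024-03-02", "2024-03-03"]),
    "yhat": [49.0, 51.0, 51.0, 53.0],
})

# An inner merge keeps only the dates present in both DataFrames.
merged = test.merge(forecast[["ds", "yhat"]], on="ds", how="inner")
mae = mean_absolute_error(merged["y"], merged["yhat"])

print(f"MAE over {len(merged)} dates: {mae:.3f}")
```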
Step 10: Adjust Model Parameters (Optional)
Depending on the initial results, you might want to adjust the model’s parameters or include additional regressors to improve accuracy. You could use a hyperparameter grid to find the best combination of parameters.
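A hyperparameter grid can be built with ParameterGrid, as in the sketch below. changepoint_prior_scale and seasonality_prior_scale are real Prophet parameters, but the candidate values are arbitrary, and the helper functions prophet_mae and best_params are made up for this example.

```python
from sklearn.model_selection import ParameterGrid

# Candidate values are illustrative, not recommendations.
param_grid = {
    "changepoint_prior_scale": [0.01, 0.1, 0.5],
    "seasonality_prior_scale": [1.0, 10.0],
}
grid = list(ParameterGrid(param_grid))  # every combination: 3 x 2 = 6 dicts

def prophet_mae(params, train, test):
    """Fit Prophet with params on train and return the MAE on test.

    train and test are DataFrames with ds and y columns. The imports sit
    inside the function so the grid itself can be built without Prophet.
    """
    from prophet import Prophet
    from sklearn.metrics import mean_absolute_error

    model = Prophet(**params)
    model.fit(train)
    yhat = model.predict(test[["ds"]])["yhat"]
    return mean_absolute_error(test["y"], yhat)

def best_params(grid, score_fn):
    """Return the (params, score) pair with the lowest score."""
    scored = [(params, score_fn(params)) for params in grid]
    return min(scored, key=lambda pair: pair[1])

print(len(grid), "combinations to try")
```

With your own train and test DataFrames, best, score = best_params(grid, lambda p: prophet_mae(p, train, test)) would run the search. Choosing parameters on the test set makes the final score look better than it really is, so a separate validation split is preferable if you have enough data.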