How Can Time Series Forecasting Predict Sales?

ankit aggarwal
Feb 24, 2021

1. Introduction:

In today’s world, an organization’s success depends on how well it can reduce and control the uncertainty in its operations. The first and most important step in reducing that uncertainty is to predict product sales more accurately.

In this project, we develop sales forecasting models to predict the monthly sales of a piece of agricultural equipment. Along with the available past monthly sales data, we also consider other relevant data, including economic and commodity indices, that might help our forecasting.

1.1 Data Description:

The following two types of information are available:

1. Past sales data: January 1990 to December 2016 (27 years)

2. Additional predictive information (35 indices in total):

  • Monthly macroeconomic indices (GDP, unemployment rate, CPI, 30YearMtggRate, etc.)
  • Monthly commodity indices (hay, corn, wheat, dairy, livestock, etc.)

1.2 Problem Statement:

Our goal is to develop a monthly sales forecasting model that can accurately predict sales two months ahead, i.e. with a forecast horizon of two months. We chose a two-month horizon because, for automobile manufacturers, most supply chain decisions (including ordering the components needed to manufacture the final product) are confirmed two months ahead of the target sales month.

1.3 Performance Metrics:

  • RMSE (Root Mean Squared Error) is the primary evaluation metric because it expresses the error in the same units as our sales data.

Other tracked metrics:

  • MAPE (Mean Absolute Percentage Error): the error in percentage terms.
  • Average Prediction Interval: the average width of the range in which the actual value is expected to fall, which indicates the uncertainty in our forecasts.
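The three metrics above can be sketched in a few lines of plain Python. This is only an illustration: the function names and the numbers are made up, and the average prediction interval is assumed here to mean the mean width (upper bound minus lower bound) of the intervals.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Assumes no actual value is zero."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(actual)

def average_prediction_interval(lower, upper):
    """Mean width of the prediction intervals (assumed definition)."""
    return sum(u - l for l, u in zip(lower, upper)) / len(lower)

# Made-up numbers, for illustration only (not the project's data).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
lower = [60.0, 130.0, 340.0]
upper = [160.0, 230.0, 460.0]

print(rmse(actual, predicted))
print(mape(actual, predicted))
print(average_prediction_interval(lower, upper))
```

RMSE penalizes large errors more heavily than MAPE does, which is one reason to track both.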

2. Project Insights:

In the rest of this blog, we explore the project by answering the following questions:

  1. What are the patterns present in our sales history data that can be used for forecasting future sales?
  2. How can we use statistical time series methods to forecast future sales based on past sales data?
  3. How can we include additional predictive information, such as economic indices and commodity prices, in sales forecasting?
  4. How do our results compare when we forecast sales with and without the additional predictive information?
  5. What are the ways we can further improve our forecasting?

2.1 What are the patterns present in our sales history data that can be used for forecasting future sales?

Based on our data analysis, and as seen in the monthly sales plot below, the data shows clear cyclical and yearly seasonal patterns that can be exploited for forecasting future sales.

Monthly Sales data line plot
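Besides reading the plot, one simple way to check for yearly seasonality is to look at the autocorrelation at lag 12. The sketch below uses a synthetic monthly series (not the project's data), and the names `autocorrelation`, `sales` and `diffed` are chosen only for illustration.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance / variance

# Synthetic monthly series: a yearly sine wave on top of a mild upward trend.
sales = [500 + 0.5 * t + 100 * math.sin(2 * math.pi * t / 12) for t in range(120)]

# Difference once to remove the trend before comparing lags.
diffed = [sales[t] - sales[t - 1] for t in range(1, len(sales))]

acf_12 = autocorrelation(diffed, 12)  # same month, one year apart
acf_6 = autocorrelation(diffed, 6)    # opposite point in the yearly cycle
print(round(acf_12, 2), round(acf_6, 2))
```

A strongly positive value at lag 12 together with a negative value at lag 6 is the signature of a yearly cycle. Real sales data will be noisier than this toy series.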

2.2 How can we use statistical time series methods to forecast future sales based on past sales data?

Given the simplicity and proven effectiveness of statistical ARIMA modelling, we decided to use it for our models. ARIMA models a series as a linear function of its own past values and past forecast errors, after differencing to remove trend, so it learns the linear relationships and autocorrelations present in the data.

For our modelling, we split the 27 years of available sales data into training (18 years), validation (6 years), and test (3 years) sets. We first find the best ARIMA model for our problem using walk-forward validation, and then evaluate that model on the test set.
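The split and the walk-forward loop can be sketched as follows. This is a minimal illustration, not the project's code: the series is synthetic, and `seasonal_naive` is a hypothetical stand-in forecaster so that the example runs without any library. In practice, a function that fits an ARIMA model on `history` and forecasts two steps ahead would be passed in instead.

```python
MONTHS_PER_YEAR = 12
train_end = 18 * MONTHS_PER_YEAR            # months 0..215: training
val_end = train_end + 6 * MONTHS_PER_YEAR   # months 216..287: validation
test_end = val_end + 3 * MONTHS_PER_YEAR    # months 288..323: test

def seasonal_naive(history, horizon, season=12):
    """Stand-in forecaster: repeat the value one season before the target month.

    Valid while horizon <= season, which holds for a two-month horizon.
    """
    return history[-season + horizon - 1]

def walk_forward(series, start, end, forecaster, horizon=2):
    """Forecast every month in [start, end) from data ending `horizon` months earlier."""
    actuals, forecasts = [], []
    for target in range(start, end):
        # The last observed month is target - horizon; nothing later is visible.
        history = series[: target - horizon + 1]
        forecasts.append(forecaster(history, horizon))
        actuals.append(series[target])
    return actuals, forecasts

# Synthetic, perfectly periodic monthly series (27 years), for illustration only.
pattern = [80, 90, 140, 200, 260, 300, 280, 240, 180, 130, 100, 85]
series = [pattern[t % 12] for t in range(test_end)]

val_actuals, val_forecasts = walk_forward(series, train_end, val_end, seasonal_naive)
test_actuals, test_forecasts = walk_forward(series, val_end, test_end, seasonal_naive)
print(len(val_actuals), len(test_actuals))
```

Candidate models are compared on the validation forecasts, and only the chosen model is run once over the test window. Because `history` always stops two months before the target, each forecast respects the two-month horizon.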

Based on our ARIMA modelling, we made the following observations:

  • As shown in the plots below, our selected model consistently under-forecasts during the validation period and over-forecasts during the test period. Possible reasons are that the model lacks additional required information, or that it is unable to learn some patterns in the data, such as cyclicity.
  • ARIMA Test Results : {‘Test RMSE’: 125.36, ‘Test MAPE’: 25.22, ‘Test Average Prediction Interval’: 433.39}

2.3 How can we include additional predictive information, such as economic indices and commodity prices, in sales forecasting?

To include the additional predictive information in our forecasting model, we first selected features using the Boruta method. Boruta keeps a feature only if it proves more important than randomly shuffled ("shadow") copies of the features, which carry no real signal. With Boruta, we selected 5 features out of all the available macroeconomic and commodity indices.
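The shadow-feature idea behind Boruta can be illustrated with a simplified sketch. Note the substitution: real Boruta uses random forest importances and a statistical test over many iterations, whereas this sketch uses absolute Pearson correlation as a stand-in importance score and a plain hit-rate threshold. The feature names, data, and parameters are all hypothetical.

```python
import math
import random

def abs_correlation(xs, ys):
    """Absolute Pearson correlation, used here as a simple importance score."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return abs(cov) / math.sqrt(var_x * var_y)

def shadow_feature_selection(features, target, n_trials=50, min_hit_rate=0.9, seed=0):
    """Keep features whose score beats the best shuffled (shadow) feature in most trials."""
    rng = random.Random(seed)
    scores = {name: abs_correlation(values, target) for name, values in features.items()}
    hits = {name: 0 for name in features}
    for _ in range(n_trials):
        best_shadow = 0.0
        for values in features.values():
            shadow = values[:]       # copy the feature...
            rng.shuffle(shadow)      # ...and shuffle it to break any link to the target
            best_shadow = max(best_shadow, abs_correlation(shadow, target))
        for name in features:
            if scores[name] > best_shadow:
                hits[name] += 1
    return [name for name in features if hits[name] / n_trials >= min_hit_rate]

# Synthetic data, for illustration only.
n = 120
corn_index = [float(t) for t in range(n)]                       # drives the target
noise_index = [1.0 if t % 2 == 0 else -1.0 for t in range(n)]   # unrelated to the target
sales = [2.0 * x + 5.0 for x in corn_index]

selected = shadow_feature_selection(
    {"corn_index": corn_index, "noise_index": noise_index}, sales
)
print(selected)
```

A feature that cannot beat shuffled noise is unlikely to help the forecast, which is the intuition behind the selection step.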

To incorporate the selected features in our forecasting model, we used a SARIMAX model, which is essentially a regression model with seasonal ARIMA errors. This way, the model can learn from the predictor variables as well as from the autocorrelation present in the response time series.
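Written out, a regression with seasonal ARIMA errors takes the following general form. The symbols are introduced here for illustration and do not come from the project: $y_t$ is monthly sales, $x_t$ the vector of selected indices, $B$ the backshift operator ($B y_t = y_{t-1}$), and $s = 12$ for yearly seasonality in monthly data.

```latex
\begin{aligned}
y_t &= \beta^\top x_t + \eta_t, \\
\phi_p(B)\,\Phi_P(B^s)\,(1 - B)^d\,(1 - B^s)^D\,\eta_t &= \theta_q(B)\,\Theta_Q(B^s)\,\varepsilon_t,
\qquad \varepsilon_t \sim \text{white noise}.
\end{aligned}
```

The first line is the regression on the predictors. The second says that whatever the regression leaves unexplained, $\eta_t$, follows a seasonal ARIMA process with non-seasonal orders $(p, d, q)$ and seasonal orders $(P, D, Q)$.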

In this case, we made the following observations:

  • As shown in the plots below, the model has a mix of under-forecasting and over-forecasting periods during both validation and testing, which is good from a cumulative forecast bias perspective.
  • SARIMAX_Test_Results : {‘Test RMSE’: 145.56, ‘Test MAPE’: 28.69, ‘Testing Average Prediction Interval’: 506.47}
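Cumulative forecast bias, mentioned in the observation above, is simply the running total of forecast minus actual. The sketch below uses made-up numbers and a hypothetical function name to show why mixed errors give a lower bias than consistently one-sided errors.

```python
def cumulative_forecast_bias(actual, predicted):
    """Sum of (forecast - actual); positive means net over-forecasting."""
    return sum(p - a for a, p in zip(actual, predicted))

# Made-up numbers, for illustration only (not the project's data).
actual = [100, 120, 90, 110]
consistently_low = [90, 105, 80, 100]   # every forecast is too low
mixed = [90, 130, 80, 125]              # errors in both directions partly cancel

print(cumulative_forecast_bias(actual, consistently_low))
print(cumulative_forecast_bias(actual, mixed))
```

A bias close to zero only means the errors cancel out over time. It says nothing about their size, which is why the SARIMAX model can have a better bias profile and still show a higher test RMSE than ARIMA.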

2.4 How do our results compare when we forecast sales with and without the additional predictive information?

Comparing the univariate ARIMA model with the multivariate SARIMAX model, as shown in the table below, we can clearly see that ARIMA performs better on all metrics: test RMSE of 125.36 vs. 145.56, test MAPE of 25.22 vs. 28.69, and average prediction interval of 433.39 vs. 506.47. This means that either the predictive information we tried to include was not relevant, or our model was not able to learn the relationship between sales and the predictor variables.

2.5 What are the ways we can further improve our forecasting?

  • So far we have experimented only with ARIMA-based statistical models, which are linear and cannot learn non-linear patterns. Experimenting further with machine learning and deep learning models such as LSTM, which can learn more complex patterns, may improve performance.
  • Other well-known methods, such as Prophet, are also worth trying to look for possible improvements.

References:

For a more detailed explanation and the technical details of this project, please see my GitHub repository.
