Well, to put it simply: it does not work.

Despite the amount of specialized software available on the market, most companies still do a lot of manual work to refine their forecasts, spending a lot of time for rather mixed results. Bad news.

Obviously, software that specializes in forecasting does not perform as well as it should! Why?

Common forecasting software fails to produce good forecasts

Most software used for supply chain demand forecasting, or simply sales forecasting, relies on traditional time series analysis. Basically, it extrapolates the future from past sales using a few simple principles:

  1. Split time into buckets so as to obtain a series of values. Buckets should be of comparable size.
  2. Use methods based on moving averages (to put it simply): averaging sales volume, trend, seasonality.
  3. Or use methods based on least squares regression: linear regression, damped curves, or more complex ones like Bass curves.
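To make the two families of methods concrete, here is a minimal sketch in plain Python. The function names and the monthly figures are made up for illustration; commercial packages use more elaborate variants (exponential smoothing, damped trends and so on), so take this only as the basic idea.

```python
from statistics import mean


def moving_average_forecast(sales, window=3):
    """Forecast the next bucket as the mean of the last `window` buckets."""
    return mean(sales[-window:])


def linear_trend_forecast(sales, horizon=1):
    """Fit sales = intercept + slope * t by least squares, then extrapolate."""
    n = len(sales)
    t_mean = (n - 1) / 2
    y_mean = mean(sales)
    slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(sales)) / sum(
        (t - t_mean) ** 2 for t in range(n)
    )
    intercept = y_mean - slope * t_mean
    return intercept + slope * (n - 1 + horizon)


# Hypothetical monthly sales with a steady upward trend.
monthly_sales = [100, 110, 120, 130, 140, 150]

print(moving_average_forecast(monthly_sales))  # 140: lags behind the trend
print(linear_trend_forecast(monthly_sales))    # 160.0: follows the trend
```

On a trending series the moving average lags behind, while the regression line follows the trend but knows nothing about seasonality or events.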

But sales in the real world are all about short life cycles, high seasonality, lots of promotions and other events. Time series fail to produce good forecasts in this situation.

Time series are not appropriate

Too short history

Seasonal patterns require at least 3 years of history to estimate seasonal coefficients. And even with three years, each seasonal coefficient is an average of only three values, which is very few. On the other hand, many companies, in Consumer Packaged Goods for example, have a product renewal rate of more than 20% per year. A simple computation shows that roughly half of the products at best have enough history for a reliable estimation of seasonality.
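The "simple computation" can be sketched as follows, under the assumption (mine, for illustration) that a constant fraction of the range is replaced every year, so that the share of products already on sale three years ago is (1 - renewal rate) cubed. The function name is hypothetical.

```python
def share_with_full_history(renewal_rate, years_needed=3):
    """Share of today's products that already existed `years_needed` years ago,
    assuming a constant fraction `renewal_rate` of products is replaced each year."""
    return (1 - renewal_rate) ** years_needed


for rate in (0.20, 0.25, 0.30):
    share = share_with_full_history(rate)
    print(f"renewal {rate:.0%} per year: {share:.0%} of products have 3 years of history")
```

At exactly 20% the share is about 51%; any higher renewal rate pushes it below half.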

Complex cyclic patterns not captured

Classical Holt-Winters relies on seasonal coefficients: this means that each year must be subdivided into a fixed number of buckets:

  • Months are convenient as there are 12 of them each year. But a month is too large a bucket, as most of the time we need daily or weekly demand to plan supply. Also, the number of days varies from one month to the next. Weekly patterns are not captured, although they can change a month's volume completely. Fast-varying sales within a month change the volumes too.
  • Weeks could be good, but years are not well aligned with weeks, and some years have 53 weeks. Having 52 seasonal coefficients also leads to over-fitting (see later).
  • Days are even worse from the point of view of the number of variables, and as a given day index in the year falls on different days of the week from one year to another, they do not give any good result.
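The calendar misalignment described in the last two bullets is easy to check with the Python standard library. This is only an illustrative sketch: the helper names are mine, and ISO 8601 week numbering is assumed, which is one convention among several.

```python
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def iso_weeks_in_year(year):
    """Number of ISO 8601 weeks in `year` (28 December is always in the last one)."""
    return date(year, 12, 28).isocalendar()[1]


def weekday_of_day_in_year(year, day_index):
    """Weekday name of the `day_index`-th day of `year` (1 = 1 January)."""
    day = date(year, 1, 1) + timedelta(days=day_index - 1)
    return WEEKDAYS[day.weekday()]


for year in range(2019, 2023):
    print(year, iso_weeks_in_year(year), weekday_of_day_in_year(year, 60))
```

Over these four years, 2020 has 53 ISO weeks while the others have 52, and day 60 of the year falls on a different weekday each time, so a seasonal coefficient attached to "day 60" mixes weekdays and weekends.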

In all cases, I have never seen in any commercial software a single model combining several bucket sizes: no statistical method estimates, in a single pass, daily weights, an intra-month profile and yearly seasonality. An ARIMA model could do a rather good job of that, but the model is a complete black box.

Complex life cycle not modeled

The same goes for life cycles: only regression allows for complex life cycle curves, but it lacks any seasonality feature. Time series methods that handle seasonality support only limited types of life cycle curves.

Most of the time, these methods estimate linear trends, which do not fit medium or short life cycles.
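Here is a small sketch of what goes wrong, using a made-up bell-shaped life cycle of nine periods. A straight line is fitted by least squares on the ramp-up and extrapolated over the decline; the numbers and the helper name are purely illustrative.

```python
def fit_line(values):
    """Least squares fit of values = intercept + slope * t, with t = 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values)) / sum(
        (t - t_mean) ** 2 for t in range(n)
    )
    return y_mean - slope * t_mean, slope


# Hypothetical short life cycle: launch, peak, decline.
life_cycle = [10, 30, 60, 80, 90, 80, 60, 30, 10]
history, future = life_cycle[:5], life_cycle[5:]

intercept, slope = fit_line(history)
forecast = [intercept + slope * t for t in range(len(history), len(life_cycle))]

for actual, predicted in zip(future, forecast):
    print(f"actual {actual:3d}   linear trend forecast {predicted:6.1f}")
```

The fitted trend keeps climbing while actual sales fall back toward zero, so the error grows with every period.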

New product introduction impossible

Time series simply cannot forecast products that have not been sold yet.

Promotions not considered

Some methods based on exponential smoothing add event coefficients to the Holt-Winters trend and seasonal coefficients. But making a good promotion forecast in this case requires:

  1. Having several identical promotions in the past, in different buckets, so that their effect can be estimated and distinguished from seasonality.
  2. Using buckets small enough to properly handle promotions, which often cross month boundaries and last 2 or 3 weeks.

The first point limits forecasting to a minority of promotions. The second leads once again to over-fitting.

Erratic sales not well analyzed

This happens when trying to forecast at store level, for example. It is a good idea to do so to remove any bullwhip effect or stock effect, but it leads to series that are too erratic and too noisy for the model.

Over-fitting

Talking about over-fitting: this occurs when modeling a process with a model that uses too many variables. The result is a model that mimics the past perfectly but nevertheless gives poor forecasts. It fails to distinguish the signal from the noise, so to speak.

This phenomenon, which was discovered some 60 years ago, limits the usage of these methods. The choice is between:

  • Using only simple models, typically by removing seasonality, which leads to inadequate models and poor forecasts
  • Using complex models only on very long histories, like 10 years, which limits their usage to a few percent of products
  • Or over-fitting past sales and producing spectacular, fast-moving and irregular forecasts. Spectacular but wrong.
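The effect can be reproduced with a small simulation. The sketch below assumes, for illustration only, weekly sales that are pure noise around a constant level (no real seasonality), one year of history, and two hypothetical models: a flat average and one coefficient per week.

```python
import random

WEEKS = 52


def fit_flat(history):
    """Simple model: one parameter, the average level."""
    level = sum(history) / len(history)
    return [level] * WEEKS


def fit_weekly_coefficients(history):
    """Complex model: 52 parameters, one average per week of the year."""
    years = len(history) // WEEKS
    return [
        sum(history[year * WEEKS + week] for year in range(years)) / years
        for week in range(WEEKS)
    ]


def mean_squared_error(profile, series):
    """Average squared gap between the series and the yearly profile."""
    return sum(
        (value - profile[i % WEEKS]) ** 2 for i, value in enumerate(series)
    ) / len(series)


rng = random.Random(0)  # fixed seed so the run is reproducible


def simulate(n_weeks):
    """Made-up sales: constant level of 100 plus Gaussian noise, no seasonality."""
    return [100 + rng.gauss(0, 10) for _ in range(n_weeks)]


history = simulate(WEEKS)        # one year used to fit the models
holdout = simulate(10 * WEEKS)   # later years used to judge the forecasts

flat = fit_flat(history)
weekly = fit_weekly_coefficients(history)

print("past fit   flat:", mean_squared_error(flat, history))
print("past fit   52 coefficients:", mean_squared_error(weekly, history))
print("forecast   flat:", mean_squared_error(flat, holdout))
print("forecast   52 coefficients:", mean_squared_error(weekly, holdout))
```

With a single year of history, the 52-coefficient model reproduces the past exactly, yet its forecast error on later years is clearly worse than that of the flat model, because it has memorized the noise.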

Black boxes are not trusted

Some methods like ARIMA are often able to capture fast-changing trends and rather complex cyclic patterns. But forecasting is not about computing values to use them as is: calculated forecasts serve as a basis for discussion, revision, changes and so on.

Therefore, those figures must be understood by planners, and even better, the underlying mathematical model should be simple and clear enough that it can be explained. Being able to change or force some components of that model, like trend or seasonality, is of huge value to the end user.

Some recent software packages come with sophisticated calculations but lack this clarity and simplicity. They force the user to look at the figures and then change them manually, with no control over the calculation.

Most often in these cases, the user rejects the software. I have experienced that several times at work.

Usual adaptations to get time series to finally produce forecasts

To circumvent these limitations, commercial software vendors have built a lot of value-added features around the time series engine:

  • Pre-treatment to remove the effect of having a varying number of days in each bucket
  • Pre-treatment to remove the biggest outliers
  • Product supersession consolidation, so as to have a single history series
  • The most important trick: aggregate data

Aggregating data is the natural answer to many of these limitations: doing so, we get a longer history, less noise (that is the law of large numbers) and better-behaved time series. Yes yes yes, for sure.

But in the end, what is required is a detailed daily or weekly forecast, and to spread the aggregated forecast down to that level of detail, these packages simply use average weights: the weight of each week in the month (mmm, there would be much to say here), the weight of items within families and so on.

In the end, the result is not so good:

  1. Forecast errors accumulate: there is an error in the computation of the weights plus an error in the aggregated forecast.
  2. Choosing an aggregation level is tricky and sometimes impossible, as all references within one aggregate should share the same behavior (trend, seasonality and so on). It causes headaches for the planners.
  3. Promotion forecasting is impossible, as it is rare that a whole family is promoted.
  4. New product introduction is not handled.
  5. The risk factor or uncertainty of the forecast is impossible to define, as there is a sequence of several statistical operations. Yet this figure is important, as it is often used to define safety stocks or lead times, or lower and upper values for budgets.
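Points 1 and 2 can be illustrated with a tiny sketch of the average-weight spreading. The family, item names and figures are made up, and the two helper functions are hypothetical; real packages differ in how they compute the weights.

```python
def average_weights(item_history):
    """Weight of each item = its share of the family's total historical sales."""
    totals = {item: sum(sales) for item, sales in item_history.items()}
    family_total = sum(totals.values())
    return {item: total / family_total for item, total in totals.items()}


def spread_forecast(family_forecast, weights):
    """Split an aggregated forecast down to items using fixed weights."""
    return {item: family_forecast * weight for item, weight in weights.items()}


# Hypothetical family: flat at 100 per period, but item_a declines while item_b grows.
item_history = {
    "item_a": [80, 70, 60, 50],
    "item_b": [20, 30, 40, 50],
}

weights = average_weights(item_history)
item_forecast = spread_forecast(100, weights)

print(weights)        # item_a about 0.65, item_b about 0.35
print(item_forecast)  # item_a about 65, item_b about 35
```

Here the family forecast of 100 is perfect, yet the split gives about 65 and 35 while the latest sales were 50 and 50 and the trends point to roughly 40 and 60: the weights carry their own error because the two items do not share the same behavior.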

This gives better results, sure, but not good enough, and it adds a lot of complexity that, most of the time, is exposed to the user.

Does not work

We can say that in most cases, traditional forecasting software provides acceptable forecasts for only a minority of products, is not accurate enough in the short term, and requires a lot of additional manual work.

This is why I spent some time trying to define another way of doing forecasting; see my previous post.

Next time: the few basic principles that my perfect forecasting software should adopt. Stay tuned.