The management of retail stores is subject to a number of factors such as weather, product availability and market trends, making it difficult for retailers to make accurate predictions about product demands and prices.
However, accurate predictions are very important for retailer to prevent overstocks or stockouts and false pricing models, because retailer lose margin with every unsold product. As margins are relatively low, especially in grocery retail stores, predicting product demand more accurately than its competition can be a major differentiator of retailers.
Product demand and prices depend on various factors and a forecasting model has to be able to capture all relevant factors for a precise prediction. Such factors are for example season, buying habit trends, weather conditions, holidays, events and competitor campaigns, which have to be collected in a time series data set, where the factors are the features for each product demand and price data point.
However, such data set is difficult to acquire, since it incorporates many factors and features from different sources, which may not be easily accessible and may not have the same format and terminology. Moreover, there are interdependencies between these factors and the influence of these factors on different product categories might be different, e.g. winter jackets sales might increase when temperature drops whereas demand for low sneakers might decrease.
A machine learning model can be trained to recognize patterns and regularities in the data and to predict future demand by product category. The machine learning model needs to be able to analyze time series data capturing long-term dependencies and correlations based on the features.
Current used techniques are seasonal ARIMA (autoregressive integrated moving average) and LSTM (long short-term memory) algorithms. These algorithms use past weighted state variables and past outputs in order to compute their influence on the current states and outputs. Training the model with training data allows the model to make predictions on unseen data.
The use of a support vector regression (SVR) model might also be an option for the analysis of time series data.