Published July 24, 2018
From: National Oceanic and Atmospheric Administration
When looking through data science case studies, posts, and articles, you will likely notice that the majority focus on prediction, not inference. This is not surprising, but it is important not to overlook the value of inference. The difference between the two is significant, so it is critical to understand when to pursue one versus the other.
Prediction problems are extremely popular these days and for good reason. Solving a prediction problem can produce significant value for companies. Some common examples include:
- Predicting back orders/stock-outs
- Predicting propensity to purchase
- Predicting sales volume
- Predicting response to promotions
Inference, on the other hand, is focused on quantifying the relationship between variables. A common inference problem companies face is measuring the impact certain variables have on performance metrics. When companies do this, they create insights.
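To make the distinction concrete, here is a minimal sketch using a simple least-squares line fit by hand. The data, the variable names (promo_spend, units_sold), and the helper fit_simple_ols are all hypothetical and exist only for illustration. The same fitted line answers both kinds of question: a prediction (what will sales be at a given spend?) and an inference (how much does spend move sales, and how uncertain is that estimate?).

```python
import math

# Hypothetical data for illustration only:
# weekly promotion spend (in $1,000s) and units sold that week.
promo_spend = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
units_sold = [102.0, 110.0, 123.0, 128.0, 141.0, 152.0]


def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (intercept, slope, slope_standard_error).
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    slope_se = math.sqrt(sse / (n - 2) / sxx)
    return intercept, slope, slope_se


intercept, slope, slope_se = fit_simple_ols(promo_spend, units_sold)

# Prediction question: how many units should we expect at a spend of 6?
predicted_units = intercept + slope * 6.0

# Inference question: how much does one more unit of spend change sales,
# and how precisely is that effect estimated?
print(f"predicted units at spend=6: {predicted_units:.1f}")
print(f"estimated effect per unit of spend: {slope:.2f} (std. error {slope_se:.2f})")
```

A prediction project would stop at the first number. An inference project cares about the second line: the size of the effect and its standard error.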
Although the two are not always mutually exclusive, there is typically a tradeoff between solving a prediction problem and developing inference or insights while solving it. For example, a neural network or gradient boosted tree model built to predict certain events may prove to be 98% accurate, but it will likely offer little to no insight into why it is so accurate.
One way around this is to embed monotonic constraints on independent variables within a model (forcing the prediction to only rise, or only fall, as a given variable increases) and then graph their influence on the dependent variable. This approach will produce some clear insight into the relative importance of key variables. The drawback is that forcing monotonic constraints may actually hurt predictive performance. Foiled again!
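The article does not name a specific model or library, so the sketch below swaps in the simplest monotone-constrained fit there is: isotonic regression via the pool-adjacent-violators algorithm, written in plain Python. The data (discount levels and responses per 100 customers) and the function name fit_monotone_increasing are hypothetical. It shows both sides of the tradeoff: the constrained curve is easy to read, but it no longer matches every observation.

```python
def fit_monotone_increasing(y):
    """Least-squares non-decreasing fit to y (pool-adjacent-violators).

    Assumes y is already ordered by the independent variable.
    """
    # Each block stores [sum of values, count]; its fitted value is the mean.
    blocks = []
    for value in y:
        blocks.append([value, 1])
        # Merge backwards while the newest block's mean is below the previous one's.
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


# Hypothetical data for illustration only:
# discount depth (%) and responses per 100 customers, sorted by discount.
discount_pct = [0, 5, 10, 15, 20, 25]
responses = [10, 14, 12, 18, 17, 25]

fitted = fit_monotone_increasing(responses)

# In-sample squared error introduced by the constraint. A fully flexible
# model could match these six points exactly, with zero in-sample error.
constrained_sse = sum((obs - fit) ** 2 for obs, fit in zip(responses, fitted))

# A text stand-in for the graph: the fitted response never decreases.
for pct, obs, fit in zip(discount_pct, responses, fitted):
    print(f"discount {pct:>2}%  observed {obs:>2}  monotone fit {fit:.1f}")
print(f"in-sample squared error under the constraint: {constrained_sse}")
```

The constraint can only raise in-sample error, never lower it. Whether it also hurts accuracy on new data depends on whether the true relationship really is monotonic.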
This is an example of the tradeoff many companies face between solving predictive problems and generating insights. Although solving both types of problems contributes to knowledge management, pursuing inference-based projects will typically result in a broader diffusion of the knowledge created. Ultimately, organizations must strike a balance between prediction and inference, and this balance will likely change over time as the needs of the organization change.
The real question is, what does your organization need right now?
- Highly accurate predictions that improve performance; or
- Deep insights that aid in both tactical and strategic decision making.