J. Prakash - Portfolio (2026)

Decision-Aware Stochastic Forecasting for Small High-Cost Inventory Under Covariate Scarcity

Jithakrishna Prakash | linkedin.com/in/jithakrishna-prakash/


This project involves forecasting demand for a small, high-cost inventory with stochastic consumption patterns and a lack of covariates. Stochastic modeling of demand was performed using a Poisson-Gamma conjugate model. Mock inventory data inspired by real-world biotech QC inventories is used in this analysis. Sensitive information such as material names and expiry dates has been masked.

The potential benefits of this analysis were:

  1. Optimization of restock frequency
  2. Reduction of effort to perform frequent restocks
  3. An optimal stock level that maximizes inventory availability while minimizing expected waste

Stochastic Process

Bayesian analysis involves representing a stochastic process (random process) using known probability distributions called likelihood functions. These likelihood functions need parameter values to work. We start with a range of values, or a probability distribution, for these parameters, called the prior. Combining the likelihood and the prior via Bayes' Theorem yields a new distribution, the posterior probability distribution, which ranks the possible parameter values by plausibility.
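Formally, for parameters \(\theta\) and data \(y\), Bayes' Theorem gives:

\(p(\theta ~|~ y) ~\propto~ p(y ~|~ \theta) \cdot p(\theta)\)

where \(p(y ~|~ \theta)\) is the likelihood, \(p(\theta)\) the prior, and \(p(\theta ~|~ y)\) the posterior.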

In this case, my inventory data are count data. Count data are most commonly modeled using the Poisson Distribution as the likelihood function:

\((y~|~\lambda, ~n_{t}) \sim Poisson(\lambda \cdot n_{t})\)

where \(y\) is the consumption for \(n_{t}\) days (exposure), and \(\lambda\) is the rate for daily count.

However, the underlying usage rate changes from week to week. To capture this, \(\lambda\) is given a Gamma prior:

\((\lambda~|~a,~b) \sim Gamma(shape = a, ~rate=b)\)

With a Poisson likelihood for the weekly consumption \(y\) and a Gamma prior for \(\lambda\) with shape \(a_{prior}\) and rate \(b_{prior}\), we obtain a closed-form expression for the posterior predictive distribution of consumption \(Y\), which is the Negative Binomial Distribution:

\(Y \sim NegativeBinomial(size = a_{prior},~prob = \frac {b_{prior}} {1 ~+~ b_{prior}})\)

This is the posterior predictive distribution of consumption \(Y\) for one day. Exploiting the property of Poisson processes that counts scale linearly with exposure time, the posterior predictive distribution of consumption \(Y\) can be extended to \(H\) days by:

\(Y_{future} \sim NegativeBinomial(size = a_{prior},~prob = \frac {b_{prior}} {b_{prior} ~+~ H})\)

Because the Gamma prior combined with the Poisson likelihood yields a Gamma posterior in the same family, and hence a closed-form posterior predictive distribution, the Gamma distribution is called the Conjugate Prior of the Poisson Distribution.

The shape and rate parameters for the Gamma posterior of \(\lambda\) can be updated with new data using the following simple posterior update rules:

\(a = a_{prior} ~+~ \Sigma(weekly~consumption)\)
\(b = b_{prior} ~+~ \Sigma(days~per~observation)\)

The updated posterior predictive distribution for consumption \(Y\) over \(H\) days is then:

\(Y_{future} \sim NegativeBinomial(size = a, ~prob = \frac {b} {b ~+~ H})\)

The initial priors \(a_{prior}\) and \(b_{prior}\) for the shape and rate of the Gamma prior on \(\lambda\) can be calculated as empirical Bayes priors from the initial data: simply \(Median(days~per~observation)\) for the rate and \(\frac {Median(weekly~consumption)} {Median(days~per~observation)}\) for the shape.
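The following is a minimal sketch of the empirical Bayes priors, the conjugate update, and posterior predictive sampling. Here y and n are assumed numeric vectors of weekly consumption and days per observation; this is illustrative, not the project's utils.R code.

b0 <- median(n)                        # empirical Bayes prior for rate
a0 <- median(y) / b0                   # empirical Bayes prior for shape

a <- a0 + sum(y)                       # posterior shape
b <- b0 + sum(n)                       # posterior rate

H <- 50                                # forecast horizon in days
y_future <- rnbinom(1000, size = a, prob = b / (b + H))
quantile(y_future, probs = c(0.05, 0.5, 0.95))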

The Negative Binomial distribution is right-tailed and overdispersed (its variance is greater than its mean), and its variance and mean have a quadratic relationship.
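Concretely, for a Negative Binomial with size \(a\) and mean \(\mu\), the variance is:

\(Var(Y) = \mu ~+~ \frac {\mu^{2}} {a}\)

so the variance always exceeds the mean and grows quadratically with it.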

The next section presents exploratory data analysis to assess the suitability of the Poisson-Gamma conjugate model for this dataset.

Libraries

knitr::opts_chunk$set(echo=FALSE, message=FALSE, warning=FALSE)
library(readxl)
library(dplyr)
library(ggplot2)
library(lubridate)
library(stats)
library(patchwork)
library(geomtextpath)

source("utils.R")

set.seed(42)

Exploratory Data Analysis

The raw data contains mock weekly stock counts that mimic inventory data from a real-world biotech QC environment. The materials are high-cost and have short expiry windows.

head(import_data)
## # A tibble: 6 × 11
##   material  date       stock     y     n Restock Excess rolling_y_in_cycle
##   <fct>     <date>     <dbl> <dbl> <dbl> <lgl>    <dbl>              <dbl>
## 1 Material1 2024-05-01    29     0     7 TRUE         0                  0
## 2 Material1 2024-05-07    75    11     6 TRUE         0                 11
## 3 Material1 2024-05-14    50    25     7 FALSE        0                 36
## 4 Material1 2024-05-21    36    14     7 FALSE        0                 50
## 5 Material1 2024-05-25   113     0     4 TRUE         0                  0
## 6 Material1 2024-06-04   111     2    10 FALSE        0                  2
## # ℹ 3 more variables: rolling_n_in_cycle <dbl>, total_y_prev_cycle <dbl>,
## #   total_n_prev_cycle <dbl>

The raw data originally contained only the Material Name, Weekly Stock Count, and Expiration Date (not shown). Expiration dates are hidden to respect data privacy.

However, to make any forecast, the number of units used per week is required. For this, an external Excel VBA script calculates the weekly units used y for all materials using combinations of checks such as:

  1. Was a restock performed in the given week?
  2. Is expiry information available for the material in the given week and the previous week?
  3. Did any units expire in the previous week?

From this, all the other metrics in the data were calculated.

Data Visualization

The distribution of weekly use is right-tailed for both materials. There is also a positive correlation between restock cycle length (weeks) and the total number of units used in a given cycle. This indicated that forecasting usage across a specified time horizon could be plausible.

Mean-Variance analysis of rolling windows

For forecasting, rolling-window analysis is a powerful method that can reveal trends in the data. Here, the drift of the mean and the variance across different rolling windows is analyzed.
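A minimal sketch of this computation follows (the window width and column names are assumptions; the actual code is hidden in the source document):

window <- 8
usage <- import_data %>% filter(material == 'Material1') %>% pull(y)
starts <- seq_len(length(usage) - window + 1)

rolling_stats <- data.frame(
  rolling_mean = sapply(starts, function(i) mean(usage[i:(i + window - 1)])),
  rolling_var  = sapply(starts, function(i) var(usage[i:(i + window - 1)]))
)

# Quadratic mean-variance fit, as summarized below
summary(lm(rolling_var ~ rolling_mean + I(rolling_mean^2), data = rolling_stats))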

## 
## Call:
## lm(formula = rolling_var ~ rolling_mean + I(rolling_mean^2), 
##     data = rolling_stats)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -90.15 -38.22 -25.40  20.76 162.79 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       176.9971    93.3982   1.895 0.062397 .  
## rolling_mean      -19.0207     8.6722  -2.193 0.031760 *  
## I(rolling_mean^2)   0.6972     0.1866   3.737 0.000387 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 61.48 on 67 degrees of freedom
## Multiple R-squared:  0.642,  Adjusted R-squared:  0.6313 
## F-statistic: 60.07 on 2 and 67 DF,  p-value: 1.137e-15

## ratio_cv:  0.8948743
## 
## Call:
## lm(formula = rolling_var ~ rolling_mean + I(rolling_mean^2), 
##     data = rolling_stats)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -10.1454  -2.5738  -0.3656   2.5463  13.8005 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)  
## (Intercept)         2.8286    12.0346   0.235   0.8149  
## rolling_mean       -1.9392     4.2102  -0.461   0.6466  
## I(rolling_mean^2)   0.6246     0.3520   1.774   0.0806 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.045 on 67 degrees of freedom
## Multiple R-squared:  0.6839, Adjusted R-squared:  0.6744 
## F-statistic: 72.47 on 2 and 67 DF,  p-value: < 2.2e-16

## ratio_cv:  0.5458759

There is a positive correlation between mean and variance for both materials. In Material1, a quadratic fit for the mean-variance relationship is plausible, whereas in Material2 the plausibility of the quadratic fit diminishes, although a positive correlation is still visible. The variance-to-mean ratio varies across the data for both materials and reaches high values in a large proportion of rolling windows.

In summary:

  1. Weekly usage distributions are right-tailed.
  2. There is a positive correlation between restock cycle length and the total number of units used per cycle.
  3. Usage is overdispersed within rolling windows.
  4. Window usage mean is positively correlated with window usage variance.

This indicates that the Poisson-Gamma conjugate model can be used to model the weekly use.

The prior for the Gamma rate parameter is chosen as the median length of one inventory period (typically 7 days), and the prior for the Gamma shape parameter is chosen as the median weekly use divided by the rate prior.

Negative Binomial Fit

Four functions are designed to perform the negative binomial fit:

  1. fit_nb_model: This function computes the posterior parameters for the Poisson–Gamma conjugate model, which leads to a Negative Binomial posterior predictive distribution.

Given weekly usage counts (y), exposure times (n), and empirical Bayes priors (a0, b0) derived from the training window, the function returns: a = a0 + sum(y); b = b0 + sum(n).

These parameters define the posterior distribution of the weekly usage rate and will be used to generate predictive samples.
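A minimal sketch of such an update (the real fit_nb_model lives in utils.R; the signature here is an assumption based on the description above):

# Sketch: conjugate Poisson-Gamma posterior update (assumed signature,
# not the utils.R implementation)
fit_nb_model <- function(y, n, a0, b0) {
  list(a = a0 + sum(y),    # posterior shape
       b = b0 + sum(n))    # posterior rate
}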

  2. predict_consumption_nb: This function uses the posterior parameters from fit_nb_model to simulate posterior predictive outcomes for the total usage over the forecast horizon H_days.

It draws Monte Carlo samples from:

\(Y_{future} \sim NegativeBinomial(size = a,~ prob = \frac{b}{b ~+~ H_{days}})\)

This distribution represents all plausible future usage consistent with past data. The function returns the median predictive consumption and the raw list of predictive samples (for use in decision-making).
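A minimal sketch, reusing the fit from above (argument names are assumptions; the actual implementation is in utils.R):

# Sketch: Monte Carlo draws from the Negative Binomial posterior predictive
# (illustrative only, not the utils.R implementation)
predict_consumption_nb <- function(fit, H_days, Nmc = 1000) {
  samples <- rnbinom(Nmc, size = fit$a, prob = fit$b / (fit$b + H_days))
  list(median = median(samples), samples = samples)
}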

  3. choose_restock_quantity: This function transforms the predictive distribution into an operational decision.

For a grid of possible inventory stock levels \(Q\), it computes the expected waste fraction:

\(Expected~Waste~Fraction(Q) = \frac {E[max(Q ~-~ Y, ~0)]} {Q}\) , where \(Y\) are the predictive samples.

The function identifies all \(Q\) values where:

\(Expected~Waste~Fraction(Q) \leq Waste~Threshold\)

From these, it selects the maximum feasible stock level, i.e., the largest \(Q\) that still keeps expected waste below the threshold.

This separates forecasting (uncertainty modeling) from decision-making (choosing stock under business constraints).
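A minimal sketch of this decision rule (a simplified stand-in for the utils.R version; the grid of candidate stock levels Q_grid is an assumption):

# Sketch: largest stock level whose expected waste fraction stays
# below the threshold (simplified stand-in for the utils.R version)
choose_restock_quantity <- function(samples, Q_grid, waste_frac_max = 0.15) {
  waste_frac <- sapply(Q_grid, function(Q) mean(pmax(Q - samples, 0)) / Q)
  feasible <- Q_grid[waste_frac <= waste_frac_max]
  if (length(feasible) == 0) NA else max(feasible)
}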

  4. cross_validate_nb: This function performs rolling cross-validation across the time series:

At each time point, using a fixed training window:

  i. Fit the Bayesian model.
  ii. Generate the predictive distribution.
  iii. Apply the decision rule to select the optimal stock level.
  iv. Compare predicted vs. actual usage over the forecast horizon.
  v. Store posteriors, predictions, actuals, and residuals.

This produces a full time-series evaluation of both predictive and decision performance, and illustrates how the forecast behaves under real operational variability.
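A condensed skeleton of the loop, built from the sketches above (the helper actual_usage_over_horizon is hypothetical, standing in for however the real code sums realized usage over the horizon):

# Skeleton: rolling cross-validation (simplified; the real cross_validate_nb
# also stores posteriors, supports level weights, and applies the decision rule)
cross_validate_nb_sketch <- function(data, H_days, train_window, step, Nmc) {
  origins <- seq(train_window, nrow(data) - 1, by = step)
  do.call(rbind, lapply(origins, function(t) {
    train <- data[(t - train_window + 1):t, ]
    b0 <- median(train$n); a0 <- median(train$y) / b0    # empirical Bayes priors
    fit <- fit_nb_model(train$y, train$n, a0, b0)        # conjugate update
    pred <- predict_consumption_nb(fit, H_days, Nmc)     # predictive draws
    actual <- actual_usage_over_horizon(data, t, H_days) # hypothetical helper
    data.frame(forecast_date   = data$date[t],
               actual_consumed = actual,
               prediction      = pred$median,
               residual        = actual - pred$median)
  }))
}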

Rolling-Origin Evaluation of Forecast

Rolling-origin evaluation is a standard validation technique in time-series forecasting for obtaining a series of forecasts over the validation set. Training and test window sizes are chosen, and multiple forecasts are performed by moving the forecast origin along the series. Here, I have used this technique to assess the capacity of this modeling strategy to adapt to and capture shifts in the inventory data as the windows move.

Material 1

Rolling-Origin Evaluation

cross_validate_material1_object <- cross_validate_nb(
  data = import_data %>% filter(material == 'Material1'),
  H_days = 50, train_window = 8, step = 1, Nmc = 1000,
  level_weights = FALSE, waste_frac_max = 0.15
)

Residual Plot

##   forecast_date actual_consumed prediction residual mean_data  var_data
## 1    2024-06-15             144         86       58    10.875  99.55357
## 2    2024-06-21             119        131      -12    16.125 189.55357
## 3    2024-07-02             112        140      -28    18.500 206.85714
## 4    2024-07-09              95        134      -39    17.750 200.21429
## 5    2024-07-13             103        134      -31    16.750 216.78571
## 6    2024-07-20             111        125      -14    16.750 216.78571
##   var_mean_ratio row_id
## 1       9.154351      1
## 2      11.755260      2
## 3      11.181467      3
## 4      11.279678      4
## 5      12.942431      5
## 6      12.942431      6

We observe that the residuals are highly variable and cross tolerable thresholds. There is also no apparent correlation between residual magnitude and window dispersion (variance-to-mean ratio).

Posterior Plots for high residual predictions

Randomly sampled posteriors of high-residual predictions show that the actual values typically fall in the tail of the posterior, indicating that the Negative Binomial posterior alone could not explain the consumption behavior in those windows.

Material 2

Rolling-Origin Evaluation

cross_validate_material2_object <- cross_validate_nb(
  import_data %>% filter(material == 'Material2'),
  H_days = 50, train_window = 8, step = 1, Nmc = 1000,
  level_weights = TRUE, waste_frac_max = 0.15
)

Residual Plots

##   forecast_date actual_consumed prediction residual mean_data var_data
## 1    2024-06-15              28         27        1     3.625 3.125000
## 2    2024-06-21              28         34       -6     4.375 1.410714
## 3    2024-07-02              31         29        2     4.125 2.696429
## 4    2024-07-09              32         29        3     4.125 2.696429
## 5    2024-07-13              38         29        9     3.875 3.839286
## 6    2024-07-20              40         23       17     3.375 5.696429
##   var_mean_ratio row_id
## 1      0.8620690      1
## 2      0.3224490      2
## 3      0.6536797      3
## 4      0.6536797      4
## 5      0.9907834      5
## 6      1.6878307      6

We observe that the residuals are highly variable and cross tolerable thresholds. There is also no apparent correlation between residual magnitude and window dispersion (variance-to-mean ratio).

Posterior Plots for high residual predictions

Randomly sampled posteriors of high-residual points show that the actual values can lie within the central quantiles of the posterior; however, the tolerable threshold for this material is too narrow compared to the spread of the residuals. There is also a fair chance of the actual value lying in the tail of the posterior.

Lessons Learned

The Bayesian analysis did not produce stable decision-grade forecasts under these constraints. It shows that the inherent variability in usage is not well captured by a single stochastic process defined by a fixed prior and likelihood, even with an overdispersed predictive distribution such as the Negative Binomial. The analysis revealed the extent of complexity in the consumption process.

The critical insight is that some processes are better managed primarily through human oversight with timely communication rather than through purely quantitative analysis.