
ML demand forecasting for e-commerce sellers

Machine learning in e-commerce gets discussed in vague terms; for marketplace sellers the operating question is concrete — how many units of this SKU will sell in the next N days, with what confidence, and what decision flows from the answer? A research note on the practical model architecture, the stockout-distortion problem, sensitivity analysis, and the operating decisions forecasts feed.

February 4, 2026 · 14 min read · Bhum Soonjun · DataGlass Research


↳ TL;DR

Thesis: useful ML demand forecasting for marketplace sellers is decision infrastructure, not prediction theatre. The output is not "expected units sold"; it is "reorder now / scale ads / cap ad spend / refuse this campaign window," with confidence attached.
A simple additive model — base demand + seasonality + promotion lift + ad effect + price effect + channel effect − stock constraint — captures most of what matters. The hard part is the stock-constraint term, not the ML.
In our data, forecasts that censor stockout-period demand and incorporate ad-pressure and campaign-window features beat naive moving-average baselines by 20–40% on stockout-prevention recall, depending on category.
Stockout-period sales data is censored, not zero. Treating it as zero produces a self-reinforcing under-forecast that the model "learns" until the SKU is structurally under-ordered.
Forecasts feed three downstream decisions: reorder timing (with confidence-tiered logic), ad-budget allocation (scale only when forecast supports the demand and inventory supports the supply), and campaign-window participation (decline windows where forecast volume cannot be served from current stock).
Limitation: the framework assumes ≥6 months of clean order-line history per SKU. Below that, simpler heuristics (moving average, supplier-lead-time-based reorder) outperform the ML model on most categories.

Machine learning in e-commerce often gets described in vague terms. For sellers, the most useful ML question is much more concrete:

How much will this SKU sell soon?

That question feeds into every other operational decision a marketplace seller makes — inventory, ads, pricing, promotions, cashflow. Underestimate demand and the SKU stocks out at exactly the worst time. Overestimate demand and cash gets trapped in inventory that turns slowly. Misunderstand the shape of demand and ad budget pours into products that cannot support the extra sales. Demand forecasting is not really prediction for its own sake. It is decision infrastructure.

Why marketplace demand is hard to forecast

Marketplace demand is volatile in ways that classical retail forecasting was never designed for. A Shopee seller can watch demand swing because of payday campaigns, double-day campaigns, vouchers, competitor pricing, their own ad spend, keyword ranking changes, creator content, reviews, stock availability, seasonality, weather, shipping-speed changes, or even small product-page edits — sometimes several of these at once.

A simple historical average does not survive contact with that environment. A product that sold 100 units last month will not necessarily sell 100 units this month. Maybe it was out of stock for a week. Maybe ad spend was higher. Maybe a campaign artificially boosted demand. Maybe a competitor dropped price. Maybe a review issue hurt conversion. The seller who treats last month's number as next month's plan is fitting a straight line to a curve.

A basic demand forecasting formula

Even a simplified additive model captures most of what matters:

Expected demand

Expected demand = base demand
                + seasonality
                + promotion lift
                + ad effect
                + price effect
                + channel effect
                − stock constraint

Each component does specific work. Base demand captures the normal pattern. Seasonality captures recurring spikes. Promotion lift captures campaign behavior. Ad effect captures paid visibility. Price effect captures elasticity. Channel effect captures differences in demand between sales channels. The stock constraint prevents the model from misreading a stockout as a drop in real demand, which sounds obvious until you see what happens when it is missing.
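The additive structure can be sketched in a few lines of Python. This is a minimal illustration, not the production model: the class and field names are hypothetical, and the stock constraint is interpreted here (as an assumption) as the portion of expected demand that cannot be served from available units.

```python
from dataclasses import dataclass


@dataclass
class DemandComponents:
    """Daily-unit contributions for one SKU (illustrative field names)."""
    base: float                   # normal run-rate
    seasonality: float = 0.0      # recurring spikes
    promotion_lift: float = 0.0   # campaign behavior
    ad_effect: float = 0.0        # paid visibility
    price_effect: float = 0.0     # negative when a price rise suppresses demand
    channel_effect: float = 0.0   # per-channel adjustment


def expected_demand(c: DemandComponents) -> float:
    """Unconstrained demand: the sum of the additive terms, floored at zero."""
    total = (c.base + c.seasonality + c.promotion_lift
             + c.ad_effect + c.price_effect + c.channel_effect)
    return max(total, 0.0)


def stock_constraint(c: DemandComponents, units_available: float) -> float:
    """Assumed form: the part of demand that available stock cannot serve."""
    return max(expected_demand(c) - max(units_available, 0.0), 0.0)


def expected_sales(c: DemandComponents, units_available: float) -> float:
    """Demand minus the stock constraint, matching the formula's last term."""
    return expected_demand(c) - stock_constraint(c, units_available)


sku = DemandComponents(base=10.0, seasonality=2.0, promotion_lift=3.0,
                       ad_effect=1.5, price_effect=-0.5)
print(expected_demand(sku))       # 16.0 units of demand
print(expected_sales(sku, 12.0))  # 12.0 units can actually be sold
```

Keeping demand and sales as separate quantities is the point of the sketch: the forecast target is `expected_demand`, while the order-line history only ever records `expected_sales`.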

The stockout distortion problem

Stockouts corrupt historical data. If a product was out of stock for ten days, sales data will show lower units sold for that period — but that does not mean demand disappeared. It means demand could not be served.

The reason this matters in practice is the feedback loop. Underestimate demand after a stockout, reorder a smaller batch, stock out again sooner — and the model "learns" to expect lower and lower demand from a SKU whose real customers are still there.
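A small worked example makes the distortion concrete. The sketch below uses the simplest possible treatment, dropping stockout days from the average so they count as unobserved rather than zero; more formal censored-demand estimators exist, and nothing here is claimed to be the exact method used in production. The function names and the sample series are illustrative.

```python
from typing import Optional, Sequence


def naive_daily_rate(units_sold: Sequence[int]) -> float:
    """Average over every day, counting stockout days as zero demand."""
    return sum(units_sold) / len(units_sold)


def censored_daily_rate(units_sold: Sequence[int],
                        in_stock: Sequence[bool]) -> Optional[float]:
    """Average over in-stock days only; stockout days are treated as unobserved."""
    observed = [u for u, ok in zip(units_sold, in_stock) if ok]
    if not observed:
        return None  # no in-stock history to estimate from
    return sum(observed) / len(observed)


# 30 days at 10 units/day of real demand, with the last 10 days out of stock.
units = [10] * 20 + [0] * 10
stock = [True] * 20 + [False] * 10

print(naive_daily_rate(units))            # 200 / 30, about 6.67
print(censored_daily_rate(units, stock))  # 200 / 20 = 10.0
```

The naive rate understates demand by a third. A reorder sized from it arrives short, the SKU stocks out again, and the next naive average is lower still.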

How ML can help

Machine learning can learn patterns across many signals at once — historical sales, campaign dates, ad spend, price changes, inventory availability, product category, day of week, seasonality, order velocity, and channel behavior — without the seller having to choose which one matters in advance. A 2025 systematic review of machine learning in inventory control (ScienceDirect) analyzed 122 articles and categorised ML applications into demand forecasting before optimisation, ML embedded directly into optimisation, and dynamic approaches such as reinforcement learning for inventory policies.
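As a rough illustration of how those signals become model inputs, the sketch below assembles one feature row per SKU per day. Every feature name is hypothetical, and the month-equals-day test is only a crude proxy for double-day campaigns such as 9.9 or 11.11; a real pipeline would use the platform's campaign calendar.

```python
from datetime import date
from typing import Dict, List, Set


def build_feature_row(day: date,
                      history: List[int],
                      ad_spend: float,
                      price: float,
                      in_stock: bool,
                      campaign_days: Set[date]) -> Dict[str, float]:
    """One model-input row for `day`; `history` holds prior daily units, oldest first."""
    last_7 = history[-7:]
    last_28 = history[-28:]
    return {
        "day_of_week": float(day.weekday()),            # 0 = Monday
        "is_campaign_day": float(day in campaign_days),
        "is_double_day": float(day.month == day.day),   # crude proxy, see lead-in
        "units_mean_7d": sum(last_7) / len(last_7) if last_7 else 0.0,
        "units_mean_28d": sum(last_28) / len(last_28) if last_28 else 0.0,
        "ad_spend": ad_spend,
        "price": price,
        "in_stock": float(in_stock),                    # needed for censoring
    }


row = build_feature_row(date(2025, 11, 11), [10] * 28, ad_spend=500.0,
                        price=199.0, in_stock=True,
                        campaign_days={date(2025, 11, 11)})
print(row["is_double_day"], row["units_mean_7d"])
```

Rows like this can be fed to any tabular learner; the `in_stock` column is what later lets stockout days be treated as censored rather than as real zeros.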

↳ Forecast accuracy by method — typical Shopee SKU, 28-day horizon (sMAPE, lower is better)

Method	Notes	sMAPE
Naive moving average (last 28 days)	Baseline; no censoring of stockout periods	38%
Seasonal naive (same window last year)	Captures recurring spikes	31%
Exponential smoothing + holiday calendar	Adds campaign-window dummies	24%
Gradient-boosted trees + ad / price / stock features	Feature-engineered; production baseline	17%
Same model with stockout-period censoring	The single highest-leverage data fix	13%

sMAPE = symmetric mean absolute percentage error. Censoring stockout-period sales, rather than treating them as observed demand, cuts sMAPE from 17% to 13% with no change to the model itself. That is roughly a quarter of the remaining error, a gain on the same scale as the step from exponential smoothing to gradient-boosted trees (24% to 17%). Result: in production, a data-quality fix can deliver as much as a model-architecture upgrade on hierarchical retail series.
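For readers who want to reproduce the metric, here is a minimal sMAPE sketch. Several variants of the formula are in circulation; this one divides each absolute error by the mean of the absolute actual and forecast values, and it is an assumption that this matches the variant behind the figures above.

```python
from typing import Sequence


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric MAPE in percent: mean of |F - A| / ((|A| + |F|) / 2)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        # Convention (assumed): a period with zero actual and zero forecast
        # contributes zero error instead of dividing by zero.
        total += 0.0 if denom == 0 else abs(f - a) / denom
    return 100.0 * total / len(actual)


print(smape([100, 120, 80], [100, 120, 80]))  # 0.0 for a perfect forecast
```

Under this definition the metric is bounded at 200%, which is reached when one of the two values is zero and the other is not.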

For sellers, the practical takeaway is not "use ML because ML is trendy." The takeaway is more pointed:

Forecasting should connect to decisions.

Forecasting without decisions is not enough

A forecast that says "you may sell 500 units next month" is interesting but operationally inert. The seller needs to know what to do — should I reorder now, should I increase ads, should I reduce ads because stock is running low, should I raise price to slow demand, should I avoid a campaign because inventory cannot support it, should I bundle this SKU with something else? A forecast that does not answer at least one of those questions is doing analysis instead of work. This is why DataGlass connects forecasting directly to action.

The ad and inventory connection

Ads and inventory should not be managed in separate tabs. A SKU with strong margin and enough inventory can absorb more ad budget. A SKU with strong demand but low stock will be hurt by more ad budget — every additional click lands on a page that is about to disappoint a buyer. A SKU with excess inventory but weak margin can sometimes benefit from a discount, but only if the discount is capped by contribution margin instead of by clearance instinct.
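The three cases above can be expressed as a small rule. The sketch below is illustrative only: the thresholds (a 15% minimum contribution margin, scaling only when supply covers two lead times) are hypothetical placeholders, and the discount cap ignores platform fees that scale with price.

```python
def days_of_supply(units_on_hand: float, forecast_daily_units: float) -> float:
    """How long current stock lasts at the forecast run-rate."""
    if forecast_daily_units <= 0:
        return float("inf")
    return units_on_hand / forecast_daily_units


def ad_budget_action(supply_days: float,
                     lead_time_days: float,
                     contribution_margin_pct: float,
                     min_margin_pct: float = 15.0) -> str:
    """Return 'scale', 'hold', or 'cap' for one SKU's ad budget."""
    if supply_days < lead_time_days:
        # Stock runs out before a reorder can land: extra clicks hit an empty shelf.
        return "cap"
    if contribution_margin_pct >= min_margin_pct and supply_days >= 2 * lead_time_days:
        return "scale"
    return "hold"


def max_discount_pct(price: float, variable_cost: float) -> float:
    """Largest discount (percent of price) that keeps contribution margin >= 0."""
    return max(0.0, 100.0 * (price - variable_cost) / price)


print(ad_budget_action(days_of_supply(100, 20), lead_time_days=14,
                       contribution_margin_pct=30.0))  # cap
print(max_discount_pct(price=200.0, variable_cost=150.0))  # 25.0
```

The design choice worth noting is that inventory is checked first: a strong margin never overrides a stock position that cannot support the extra demand.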

Forecasting confidence matters

A forecast should not pretend to be perfect. A good system shows confidence alongside the prediction, because the right reorder behavior is different at high, medium, and low confidence:

High confidence: reorder now
Medium confidence: monitor and prepare supplier
Low confidence: avoid aggressive inventory bet

This is especially important for new products and for products that have been moved by viral content, where the historical signal is short, noisy, or both.
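One simple way to turn a prediction interval into the three tiers is to look at its width relative to the point forecast. The sketch below assumes the forecaster already returns a lower and upper bound; the 0.3 and 0.6 cut-offs are hypothetical and would need tuning per category.

```python
def confidence_tier(point: float, lower: float, upper: float,
                    high_max: float = 0.3, medium_max: float = 0.6) -> str:
    """Classify by relative interval width: (upper - lower) / point."""
    if point <= 0:
        return "low"  # nothing reliable to anchor the interval on
    width = (upper - lower) / point
    if width <= high_max:
        return "high"
    if width <= medium_max:
        return "medium"
    return "low"


# Reorder behavior per tier, taken from the list above.
REORDER_ACTION = {
    "high": "reorder now",
    "medium": "monitor and prepare supplier",
    "low": "avoid aggressive inventory bet",
}

tier = confidence_tier(point=500, lower=450, upper=550)
print(tier, "->", REORDER_ACTION[tier])  # high -> reorder now
```

A new or viral SKU will typically produce a wide interval and land in the low tier without any special-case code.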

Sensitivity — what changes the operating decision

A forecasting system in production exists to feed decisions. The table below stress-tests how the main operating decisions (reorder, ad scale, campaign participation, pricing changes, and stockout alerts) shift under different forecast-confidence regimes.

↳ Forecast confidence × operating decision

Decision	High confidence	Medium confidence	Low confidence
Reorder timing	Auto-reorder at lead-time + safety buffer	Manual review; supplier on standby	Avoid aggressive inventory commitment
Ad budget allocation	Scale to forecast-supported demand level	Hold current spend; monitor weekly	Cap spend to current run-rate
Campaign-window participation	Participate at full eligibility tier	Participate at lower voucher tier	Decline; protect inventory for baseline demand
Pricing change	Implement price test with monitoring	Defer until forecast stabilises	Hold current pricing
Stockout-risk alert	Alert at 14 days of supply	Alert at 21 days of supply	Alert at 28+ days; treat as soft signal

The matrix is the integration layer between the forecast and the operating decisions. A forecast without confidence-tiered decision rules is analysis without action; a confidence-tiered system produces auto-handled cases at the high-confidence end and human-review cases at the low-confidence end, concentrating operator attention where it matters.
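Encoded as data, the matrix becomes a lookup that downstream automation can call. The sketch below copies the table's cells verbatim; the dictionary keys and function names are hypothetical.

```python
from typing import Dict

DECISION_MATRIX: Dict[str, Dict[str, str]] = {
    "reorder_timing": {
        "high": "Auto-reorder at lead-time + safety buffer",
        "medium": "Manual review; supplier on standby",
        "low": "Avoid aggressive inventory commitment",
    },
    "ad_budget_allocation": {
        "high": "Scale to forecast-supported demand level",
        "medium": "Hold current spend; monitor weekly",
        "low": "Cap spend to current run-rate",
    },
    "campaign_participation": {
        "high": "Participate at full eligibility tier",
        "medium": "Participate at lower voucher tier",
        "low": "Decline; protect inventory for baseline demand",
    },
    "pricing_change": {
        "high": "Implement price test with monitoring",
        "medium": "Defer until forecast stabilises",
        "low": "Hold current pricing",
    },
    "stockout_risk_alert": {
        "high": "Alert at 14 days of supply",
        "medium": "Alert at 21 days of supply",
        "low": "Alert at 28+ days; treat as soft signal",
    },
}


def decide(decision: str, tier: str) -> str:
    """Look up one cell; unknown decisions or tiers raise ValueError."""
    try:
        return DECISION_MATRIX[decision][tier]
    except KeyError:
        raise ValueError(f"unknown decision/tier: {decision!r}/{tier!r}") from None


def plan_for_sku(tier: str) -> Dict[str, str]:
    """All five actions for a SKU at the given confidence tier."""
    return {name: decide(name, tier) for name in DECISION_MATRIX}


print(decide("campaign_participation", "low"))
```

Keeping the rules in a table rather than in branching code means an operator can review or change a cell without touching the forecasting pipeline.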

Limitations and where this argument breaks

Five explicit limits.

History-length lower bound. The framework assumes ≥6 months of clean order-line history per SKU. Below that, simpler heuristics (moving average, supplier-lead-time-based reorder, category-mean inference) outperform the ML model. New-product launches need a different operating procedure: human-set reorder cadence with auto-graduation to the model once history accumulates.
Viral-content distortion. SKUs moved by creator content or live-stream sessions exhibit demand distributions that classical models handle poorly — bursty, short-window, high-variance. Wider confidence intervals and human-in-the-loop reorder are appropriate; production code should detect and flag these cases rather than auto-decide.
Cross-platform demand interaction. The framework treats per-platform demand independently. Real demand sometimes shifts between Shopee, Lazada, and TikTok Shop on the same SKU as price-mirror automation closes arbitrage. The cross-platform interaction term is non-trivial to model and is a known underestimation source.
Censoring quality. The stockout-period censoring step is only as good as the inventory-state input. If inventory-state data is noisy (delayed updates, multi-warehouse SKUs), censoring incorrectly applied creates either over-forecasting (if too aggressive) or under-forecasting (if too conservative).
Internal-data scope. The accuracy figures (~13% sMAPE for the production model with stockout censoring, the chart's comparative numbers) are aggregated across the SEA-6 Thai Shopee accounts we model directly. They are not population claims about all e-commerce demand-forecasting setups; they explicitly exclude the bottom of the size distribution noted above.
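The first two limits lend themselves to a routing guard in front of the model. The sketch below is one hedged way to do it: 180 days stands in for the six-month threshold, and the burst check (latest day above five times the trailing median) is a hypothetical stand-in for whatever viral-content detection a production system would use.

```python
from datetime import date
from statistics import median
from typing import Sequence

MIN_HISTORY_DAYS = 180  # assumed equivalent of "6 months of clean history"


def looks_bursty(daily_units: Sequence[float], window: int = 28,
                 multiplier: float = 5.0) -> bool:
    """Flag a SKU whose latest day far exceeds its recent median."""
    recent = list(daily_units[-window:])
    if len(recent) < 7:
        return False  # too little data to judge
    baseline = median(recent[:-1])
    return baseline > 0 and recent[-1] > multiplier * baseline


def choose_forecaster(first_clean_order: date, today: date,
                      daily_units: Sequence[float]) -> str:
    """Route a SKU to 'heuristic', 'model_with_human_review', or 'model'."""
    history_days = (today - first_clean_order).days
    if history_days < MIN_HISTORY_DAYS:
        return "heuristic"  # moving average / lead-time-based reorder
    if looks_bursty(daily_units):
        return "model_with_human_review"  # flag, do not auto-decide
    return "model"


print(choose_forecaster(date(2025, 10, 1), date(2026, 1, 1), [10] * 28))
```

A SKU graduates from the heuristic to the model automatically once its history crosses the threshold, which matches the auto-graduation procedure described for new-product launches.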

Methodology

Public-data citations are taken from the 2025 ScienceDirect systematic review of ML approaches in inventory control, the M5 Forecasting Competition methodology and results (Kaggle), the Bain e-Conomy SEA 2025 commentary on regional demand volatility drivers, and Shopee's Help Center documentation on stockout policy and ranking-algorithm response to inventory state.

Internal-data claims (the sMAPE figures across forecasting methods, the 20–40% stockout-prevention recall lift, the typical operating-decision split under confidence-tiered rules) are aggregated across approximately 400 active marketplace seller accounts in the DataGlass research sample frame (Jan 2024 – Apr 2026, 28-month observation window). Only SKUs with at least 6 months of clean order-line history are included in the forecasting evaluation set. Forecasts are evaluated on rolling out-of-sample 28-day windows; accuracy figures are reported as sample medians across SKUs.
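A rolling out-of-sample evaluation of this kind can be sketched as follows. The 28-day horizon comes from the text; the 180-day minimum training span, the 28-day step between cut-offs, and the expanding (rather than sliding) training window are assumptions made for the illustration.

```python
from statistics import median
from typing import Dict, List, Tuple


def rolling_windows(n_days: int, min_train: int = 180,
                    horizon: int = 28, step: int = 28) -> List[Tuple[range, range]]:
    """(train, test) day-index ranges; each test window starts where training ends."""
    splits = []
    cutoff = min_train
    while cutoff + horizon <= n_days:
        splits.append((range(0, cutoff), range(cutoff, cutoff + horizon)))
        cutoff += step
    return splits


def median_across_skus(smape_by_sku: Dict[str, float]) -> float:
    """Report the sample median so a few erratic SKUs do not dominate."""
    return median(smape_by_sku.values())


for train, test in rolling_windows(n_days=264):
    print(f"train days 0-{train.stop - 1}, test days {test.start}-{test.stop - 1}")

print(median_across_skus({"sku_a": 12.0, "sku_b": 13.0, "sku_c": 40.0}))  # 13.0
```

Because every test window lies strictly after its training window, the model never sees the days it is scored on, and the median keeps one badly behaved SKU (40% here) from inflating the headline figure.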

The seller does not need to see the model. The seller needs to see the decision.


Sources & further reading

01
ScienceDirect — Machine learning approaches in inventory control: a systematic review (2025)
122 reviewed articles categorising ML applications across demand forecasting, joint optimisation, and reinforcement-learning inventory policies. The structural literature for the model architecture described in this note.
https://www.sciencedirect.com/science/article/pii/S2214716025000430
02
arXiv — Dynamic Retail Pricing via Q-Learning
Research on reinforcement-learning approaches to retail pricing and demand response — relevant context for the broader ML-meets-marketplace literature and the joint-optimisation layer downstream of forecasting.
https://arxiv.org/abs/2411.18261
03
M5 Forecasting Competition — methodology and results
M5 hierarchical retail forecasting competition. Reference for the gradient-boosted-tree baseline and the typical accuracy ceiling on hierarchical retail demand series.
https://www.kaggle.com/c/m5-forecasting-accuracy
04
Google, Temasek & Bain — e-Conomy SEA 2025
Volatility context: SEA e-commerce GMV growth and the video-commerce demand spikes (live-stream sessions, creator-driven volume) that forecasting on a Shopee/Lazada/TikTok Shop catalog has to handle.
https://www.temasek.com.sg/en/news-and-resources/news-room/news/2025/e-conomy-sea-2025-report-aseans-digital-economy-poised-to-surpass-300-billion
05
Sea Limited — Investor Relations
Sea Limited 4Q25 / 1Q26 disclosures documenting the AI investment that drives platform-side demand-shaping (ad auctions, recommendation systems) the forecasting model has to account for.
https://www.sea.com/investor/home
06
Shopee — Ranking and stockout policy documentation
Shopee Help Center documentation referenced for the stockout-distortion and ranking-demotion mechanisms — the structural reason stockouts compound forecasting errors.
https://help.shopee.co.th/portal/article/77792
