Everything you need to know about SKU Forecaster
SKU Forecaster helps you predict future demand for your products using advanced time series forecasting. Here's how to get started:
For the fastest results, drag and drop a CSV file onto the app. It will auto-detect your columns and you can start forecasting immediately.
Pipelines let you connect to external data sources and automatically refresh your forecasts. Create a pipeline once, run it anytime.
| Connector | Use Case | Status |
|---|---|---|
| 🗄️ SQLite | Local database files (.db, .sqlite) | Full Support |
| 🐘 PostgreSQL | Production databases, data warehouses | Full Support |
| 🔷 SQL Server | Azure SQL, Microsoft SQL Server | Full Support |
| ❄️ Snowflake | Cloud data warehouse | Full Support |
| 🗂️ Parquet | Columnar data files, data lakes | Full Support |
| 📊 CSV/Excel | File uploads (drag & drop) | Full Support |
All database connectors support custom SQL queries. Use one to rename columns, filter by date, or aggregate rows before forecasting, for example:
    SELECT
        product_id AS sku,
        sale_date AS date,
        SUM(quantity) AS value
    FROM sales
    WHERE sale_date >= '2023-01-01'
    GROUP BY product_id, sale_date
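To see what shape of result this query produces, here is a minimal sketch that runs it against an in-memory SQLite database using Python's built-in sqlite3 module. The sample rows are made up for illustration, and the ORDER BY clause is an addition so the output order is stable; neither comes from the app.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product_id TEXT, sale_date TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("A-100", "2022-12-31", 4),  # dropped by the WHERE clause
        ("A-100", "2023-01-02", 3),
        ("A-100", "2023-01-02", 2),  # same SKU and day: summed with the row above
        ("B-200", "2023-01-02", 7),
    ],
)

QUERY = """
SELECT
    product_id AS sku,
    sale_date AS date,
    SUM(quantity) AS value
FROM sales
WHERE sale_date >= '2023-01-01'
GROUP BY product_id, sale_date
ORDER BY sku, date  -- added here only to make the output order deterministic
"""

rows = conn.execute(QUERY).fetchall()
for sku, date, value in rows:
    print(sku, date, value)
```

Each output row is one SKU on one date with its summed quantity, which is the sku / date / value layout the column mapping expects.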
For accurate forecasts, map these columns correctly:
SKU: unique product identifier (SKU, item_id, product_code)
Date: time period (date, week, month, period)
Value: what to forecast (sales, units, revenue, quantity)
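If you want to check a file's headers before uploading, the mapping hints above can be turned into a small pre-flight check. This is a sketch only: the role names (sku, date, value) follow the aliases in the SQL example, the candidate lists are copied from the hints, and the matching rule is an assumption, not the app's actual auto-detection logic.

```python
# Candidate header names taken from the mapping hints above.
CANDIDATES = {
    "sku": ["sku", "item_id", "product_code"],
    "date": ["date", "week", "month", "period"],
    "value": ["sales", "units", "revenue", "quantity"],
}


def guess_mapping(headers):
    """Return {role: original_header}, trying each role's candidates in list order."""
    normalized = {h.strip().lower(): h for h in headers}
    mapping = {}
    for role, names in CANDIDATES.items():
        for name in names:
            if name in normalized:
                mapping[role] = normalized[name]
                break
    return mapping


def missing_roles(mapping):
    """List the roles that no header matched, so you can map them by hand."""
    return [role for role in CANDIDATES if role not in mapping]


mapping = guess_mapping(["Item_ID", "Week", "Units", "store"])
print(mapping)                 # which header fills each role
print(missing_roles(mapping))  # roles still needing a manual mapping
```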
The app includes multiple forecasting methods. Use Auto to let the system choose the best model for each SKU.
Runs a leaderboard of all models below and picks the lowest-MAPE winner per SKU. With covariates enabled, the tree models (RF/GB/LightGBM) see event indicators + price as features and typically dominate.
Exponential smoothing with trend and seasonality. Great for stable demand. State-space, no exogenous regressors.
Autoregressive model. Good for data with trends and autocorrelation.
Prophet, the open-source forecasting model from Facebook (Meta). Handles holidays and multiple seasonalities.
Gradient-boosted trees on lag features. Accepts exogenous covariates (events, price) when enabled.
Tree ensemble on lag features. Accepts exogenous covariates. Robust on noisy data with nonlinear interactions.
sklearn's GBM on lag features. Accepts exogenous covariates. Often wins on smooth nonlinear structure.
Bundled decomposition: level + seasonal + ML weighted blend. Returns named component weights for audit. Accepts covariates inside its ML slot.
For intermittent/sporadic demand (lots of zeros).
For best results, install: pip install statsmodels prophet lightgbm. Without these, the app falls back to simpler methods.
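Auto's selection rule (score every model, keep the lowest-MAPE one per SKU) can be sketched in a few lines. The function names and data layout below are illustrative assumptions, and skipping zero-actual periods is one common convention, not necessarily what the app does.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent. Zero-actual periods are skipped."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        return float("inf")  # nothing to score against
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def pick_winners(actuals_by_sku, forecasts_by_sku):
    """forecasts_by_sku is {sku: {model_name: [forecast, ...]}}.

    Returns {sku: (winning_model, winning_mape)}.
    """
    winners = {}
    for sku, actuals in actuals_by_sku.items():
        scores = {m: mape(actuals, f) for m, f in forecasts_by_sku[sku].items()}
        best = min(scores, key=scores.get)
        winners[sku] = (best, scores[best])
    return winners


# Hypothetical backtest: two models scored on two held-out periods.
actuals = {"A-100": [100, 200]}
forecasts = {"A-100": {"ETS": [110, 180], "ARIMA": [100, 150]}}
winners = pick_winners(actuals, forecasts)
print(winners)
```

In this made-up example ETS misses by 10% in both periods while ARIMA is exact once and 25% off once, so ETS has the lower mean error and wins for that SKU.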
The app handles datasets with millions of rows and thousands of SKUs efficiently.
Process every SKU in the dataset
Focus on highest-volume products
Quick test on random subset
Pick specific SKUs to forecast
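The four selection options above amount to simple filters over the SKU list. A rough sketch follows; the mode names ("all", "top", "sample", "custom") and the function signature are hypothetical, chosen only to mirror the descriptions.

```python
import random


def select_skus(volume_by_sku, mode, n=None, picks=None, seed=0):
    """Choose which SKUs to forecast. volume_by_sku maps SKU -> total volume."""
    skus = sorted(volume_by_sku)
    if mode == "all":      # process every SKU in the dataset
        return skus
    if mode == "top":      # focus on the n highest-volume products
        ranked = sorted(skus, key=lambda s: volume_by_sku[s], reverse=True)
        return ranked[:n]
    if mode == "sample":   # quick test on a random subset of n SKUs
        rng = random.Random(seed)  # seeded so the sample is repeatable
        return sorted(rng.sample(skus, min(n, len(skus))))
    if mode == "custom":   # only the SKUs you picked; unknown ones are ignored
        wanted = set(picks)
        return [s for s in skus if s in wanted]
    raise ValueError(f"unknown mode: {mode}")


volumes = {"A": 50, "B": 900, "C": 300, "D": 10}
print(select_skus(volumes, "top", n=2))
print(select_skus(volumes, "custom", picks=["D", "Z", "A"]))
```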
For large forecasts, jobs run in the background:
Forecast demand for SKUs with no sales history. skuf.ai offers three complementary approaches in the purple New Product Mode panel — pick by what data you have and what question you need answered.
Open New Product Mode → set the new SKU's attributes (category, subcategory, color, brand, price, etc.). The dropdowns pull from your active dataset, so only values actually present in your catalog appear. The badge counter shows how many attributes you've set.
Find historical SKUs that resemble your new one, then weighted-average their forecasts. A toggle above the button lets you pick how analogs get chosen:
Claude reads your target attributes and the catalog, returns analogs with a one-line reason per pick. Catches semantic similarity (e.g. T-shirts as analogs for Polos when subcategory differs). Slower (LLM round-trip) but interpretable.
Deterministic exact-attribute scoring: +2 per exact match, +1.5 for "similar" numerics. Returns the matches per analog so the ranking is defensible. Fast, repeatable, no LLM dependency.
After analogs are returned, adjust the selected set (the top 5 by score are picked by default), then click Run Forecast. Result is a weekly forecast for the new SKU, blended from analog histories.
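To make the deterministic scoring and the blend concrete, here is a minimal sketch. The +2 and +1.5 point values and the top-5 default come from the text above. Everything else is an assumption made for illustration: "similar" is taken to mean within 10% of the target value, and the blend weights each analog's history by its score.

```python
EXACT_POINTS = 2.0         # from the text: +2 per exact match
SIMILAR_POINTS = 1.5       # from the text: +1.5 for "similar" numerics
SIMILAR_TOLERANCE = 0.10   # ASSUMPTION: numerics within 10% count as similar


def score_analog(target, candidate):
    """Return (score, matches) so every ranking can be traced to its attributes."""
    score, matches = 0.0, []
    for attr, want in target.items():
        have = candidate.get(attr)
        if have is None:
            continue
        if have == want:
            score += EXACT_POINTS
            matches.append((attr, "exact"))
        elif (isinstance(want, (int, float)) and isinstance(have, (int, float))
              and want != 0 and abs(have - want) / abs(want) <= SIMILAR_TOLERANCE):
            score += SIMILAR_POINTS
            matches.append((attr, "similar"))
    return score, matches


def top_analogs(target, catalog, k=5):
    """Rank catalog SKUs by score (ties broken by SKU id) and keep the top k."""
    scored = [(sku, *score_analog(target, attrs)) for sku, attrs in catalog.items()]
    scored = [s for s in scored if s[1] > 0]
    return sorted(scored, key=lambda s: (-s[1], s[0]))[:k]


def blend(histories, weights):
    """Weighted average of analog weekly histories (ASSUMPTION: weight = score)."""
    total = sum(weights.values())
    horizon = min(len(histories[sku]) for sku in weights)
    return [sum(weights[sku] * histories[sku][t] for sku in weights) / total
            for t in range(horizon)]


# Hypothetical catalog and weekly histories.
target = {"category": "Tops", "color": "Blue", "price": 20.0}
catalog = {
    "P1": {"category": "Tops", "color": "Blue", "price": 21.0},
    "P2": {"category": "Tops", "color": "Red", "price": 40.0},
    "P3": {"category": "Shoes", "color": "Green", "price": 90.0},
}
histories = {"P1": [10, 20], "P2": [40, 50]}

top = top_analogs(target, catalog)
weekly = blend(histories, {sku: score for sku, score, _ in top})
print(top)
print(weekly)
```

Returning the matched attributes alongside the score is what keeps the ranking auditable: you can see that P1 ranks first because of two exact matches and one near-match on price.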
A Ridge + Random Forest ensemble regression trained on every historical SKU's attributes → first-period sales volume. Predicts a single launch volume for the new item directly from its attributes — no analog matching needed.
Feed implied_total_life into the decay-curve panel to convert the point estimate into a weekly schedule.
Match the new SKU's attributes to a 52-week profile in your seasonality library, then multiply by a Base Sales Rate (BSR) the library carries from analog historical SKUs. Best when you have a strong matching profile and trust the analog calibration. See sections 4 (BSR) and 5 (Seasonality library) of the methodology page for the math.
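The Seasonal Component arithmetic is a per-week multiplication of profile by BSR. The sketch below assumes the profile is a list of 52 multiplicative indices (1.0 meaning an average week) and that BSR is expressed in units per week. The methodology page is the authority on the actual definitions; the function name and parameters here are hypothetical.

```python
def seasonal_forecast(profile, bsr, start_week=0, horizon=None):
    """Weekly forecast = BSR x seasonal index, wrapping around the 52-week profile.

    profile    -- 52 multiplicative indices (ASSUMPTION: 1.0 = average week)
    bsr        -- Base Sales Rate in units per week (ASSUMPTION)
    start_week -- 0-based index into the profile for the launch week
    """
    if len(profile) != 52:
        raise ValueError("expected a 52-week profile")
    horizon = 52 if horizon is None else horizon
    return [bsr * profile[(start_week + i) % 52] for i in range(horizon)]


# Hypothetical profile: flat, with a peak in week 0 and a dip in week 51.
profile = [1.0] * 52
profile[0] = 2.0
profile[51] = 0.5

# Launch in the last profile week, forecast three weeks ahead.
launch = seasonal_forecast(profile, bsr=10.0, start_week=51, horizon=3)
print(launch)
```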
Strong library + matching attributes → Seasonal Component. Need weekly curve + history-based defensibility → Analog Blend. ≥20 historical SKUs + just need launch volume → Attribute-Based Launch Model. Many teams use Attribute-Based to set initial buy quantity and Analog Blend or Seasonal Component to schedule weekly receipts.
Improve forecast accuracy by feeding skuf.ai the upstream signals that drive demand. The platform supports two distinct paths:
When you create planning events and your dataset has a price column, skuf.ai can automatically build a per-SKU exogenous matrix and pass it to the tree-based models in Auto's leaderboard (Random Forest, Gradient Boost, LightGBM) at fit time. Set useExogenousCovariates: true on the job config and the system handles the rest:
Event indicators (has_holiday, has_promotion, has_other): set to 1 in weeks the event covers, 0 otherwise.
On a synthetic specialty-retailer benchmark (45 SKUs × 104 weeks, 5 event types, price elasticity), enabling auto-built covariates dropped Auto's mean MAPE from 33.14% to 24.41% on the same backtest, a 26% relative reduction with no model change, just better inputs.
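For intuition about how weekly event indicators work, here is a rough sketch that builds the three flags from a list of events. The column names come from the text above. The event tuple shape, the event type strings, and the rule that any type other than holiday or promotion falls into has_other are assumptions, not the app's internals.

```python
from datetime import date, timedelta

INDICATORS = ("has_holiday", "has_promotion", "has_other")


def build_event_indicators(week_starts, events):
    """Return one {indicator: 0 or 1} row per week.

    events -- (event_type, start_date, end_date) tuples, end date inclusive
              (hypothetical shape, for illustration only)
    """
    rows = []
    for week_start in week_starts:
        week_end = week_start + timedelta(days=6)
        row = {name: 0 for name in INDICATORS}
        for event_type, start, end in events:
            overlaps = start <= week_end and end >= week_start
            if overlaps:
                key = f"has_{event_type}"
                # ASSUMPTION: unrecognized event types are bucketed as "other".
                row[key if key in row else "has_other"] = 1
        rows.append(row)
    return rows


weeks = [date(2024, 11, 18), date(2024, 11, 25), date(2024, 12, 2)]
events = [
    ("holiday", date(2024, 11, 28), date(2024, 11, 28)),
    ("promotion", date(2024, 11, 29), date(2024, 12, 2)),
    ("store_opening", date(2024, 11, 20), date(2024, 11, 20)),
]
matrix = build_event_indicators(weeks, events)
for week, row in zip(weeks, matrix):
    print(week, row)
```

Because events can carry future dates, the same function covers the forecast horizon as well as history, which is why event covariates need no extra input from you.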
For exogenous signals beyond events and price (weather, marketing spend, macro indicators), include them as columns in your data file and select them in the Covariates tab.
Custom column covariates require values for the forecast horizon. If you include "price" as a covariate, you must provide planned prices for the upcoming weeks. Auto-built event covariates handle this automatically — the planning_events table covers forward dates as well as historical ones.
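Before running a forecast with a custom covariate, you can check that it has a value for every week in the horizon. The helper below is a sketch under stated assumptions: weekly data keyed by week-start date, and a hypothetical function name. It is not part of the app.

```python
from datetime import date, timedelta


def missing_future_weeks(last_history_week, horizon_weeks, covariate_by_week):
    """Return the forecast-horizon week starts that have no covariate value."""
    needed = [last_history_week + timedelta(weeks=i)
              for i in range(1, horizon_weeks + 1)]
    return [w for w in needed if covariate_by_week.get(w) is None]


# Hypothetical planned prices: two of the three upcoming weeks are filled in.
planned_price = {
    date(2024, 1, 8): 19.99,
    date(2024, 1, 15): 17.99,
}
gaps = missing_future_weeks(date(2024, 1, 1), 3, planned_price)
print(gaps)  # weeks you still need to supply a planned price for
```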
AI-powered features help you understand your data and get actionable insights.
Get natural language explanations of your forecast results:
Automatic analysis of your data quality and patterns:
Natural language interface to run complex analyses:
"Forecast the top 50 SKUs by volume for the next 6 months"
"Find products with declining sales trends"
"Compare ETS vs ARIMA accuracy for seasonal items"
AI features require an Anthropic API key. Add it in Settings → API Keys, or set the ANTHROPIC_API_KEY environment variable.
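If you script around the app, a quick check like the following confirms a key is available before you rely on AI features. The environment variable name comes from the text above. The function is hypothetical, and giving the Settings value precedence over the environment variable is an assumption, since the text does not state which wins.

```python
import os


def resolve_api_key(settings_key=None, env=None):
    """Return the first non-blank key found, or None.

    ASSUMPTION: a key entered in Settings takes precedence over the
    ANTHROPIC_API_KEY environment variable.
    """
    env = os.environ if env is None else env
    for candidate in (settings_key, env.get("ANTHROPIC_API_KEY")):
        if candidate and candidate.strip():
            return candidate.strip()
    return None


if resolve_api_key() is None:
    print("No Anthropic API key found; AI features will be unavailable.")
```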
Each forecast result includes:
| MAPE Range | Interpretation |
|---|---|
| < 10% | Excellent - highly accurate |
| 10-20% | Good - reliable for planning |
| 20-30% | Acceptable - use with caution |
| > 30% | Poor - consider more data or different model |
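The table translates directly into a lookup you can apply to exported results. In this sketch the function name is hypothetical, and the handling of values that sit exactly on a boundary (10, 20, 30) is an assumption, since the table leaves it open.

```python
def interpret_mape(mape_percent):
    """Map a MAPE value (in percent) to the label from the table above.

    ASSUMPTION: lower bounds are inclusive, so 10 is "Good", 20 is
    "Acceptable", and exactly 30 is still "Acceptable".
    """
    if mape_percent < 0:
        raise ValueError("MAPE cannot be negative")
    if mape_percent < 10:
        return "Excellent"
    if mape_percent < 20:
        return "Good"
    if mape_percent <= 30:
        return "Acceptable"
    return "Poor"


for value in (7.5, 24.41, 33.14):
    print(value, interpret_mape(value))
```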
Download forecasts as CSV for Excel/BI tools
Save to app for later viewing
Interactive charts with zoom/pan
Access results via REST API
Access previous forecasts from the sidebar under "Forecast History". Each saved result includes:
Install Python 3.9+ and ensure it's in your PATH. Run: python --version to verify.
Check Python dependencies: pip install pandas numpy statsmodels
Verify credentials, check firewall rules, ensure database allows remote connections.
Check that your SKU/date columns are correct. Dates should be parseable (YYYY-MM-DD works best).
Use database connectors instead of CSV for 1M+ rows. Enable background jobs for large forecasts.
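For the date-format advice above, a quick scan with Python's standard library can list the values that will not parse as YYYY-MM-DD before you upload. This is a sketch with a hypothetical function name; the app's own parser may accept more formats than this strict check does.

```python
from datetime import datetime


def unparseable_dates(values, fmt="%Y-%m-%d"):
    """Return (row_index, value) for every value that does not match fmt."""
    bad = []
    for i, value in enumerate(values):
        try:
            datetime.strptime(value.strip(), fmt)
        except ValueError:
            bad.append((i, value))
    return bad


sample = ["2023-01-01", "01/02/2023", "2023-02-30"]
problems = unparseable_dates(sample)
print(problems)  # wrong format, and a calendar date that does not exist
```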
npm run dev shows detailed errors
npm test
HorizonAI SKU Forecaster v2.1