
Help Guide

Everything you need to know about SKU Forecaster

🚀 Getting Started

SKU Forecaster helps you predict future demand for your products using time series forecasting models such as ETS, ARIMA, Prophet, and tree-based learners. Here's how to get started:

  1. Load your data - Upload a CSV/Excel file or connect to a database using Pipelines
  2. Map your columns - Tell the app which columns contain SKU IDs, dates, and values
  3. Select SKUs - Choose which products to forecast (or select all)
  4. Run forecast - Click "Run Forecast" and select your preferred model
  5. Review & export - Analyze results, compare models, and export to CSV

💡 Quick Start

For the fastest results, drag and drop a CSV file onto the app. The app auto-detects your columns, so you can start forecasting immediately.

🔌 Data Sources & Pipelines

Pipelines let you connect to external data sources and automatically refresh your forecasts. Create a pipeline once, run it anytime.

Supported Connectors

Connector | Use Case | Status
🗄️ SQLite | Local database files (.db, .sqlite) | Full Support
🐘 PostgreSQL | Production databases, data warehouses | Full Support
🔷 SQL Server | Azure SQL, Microsoft SQL Server | Full Support
❄️ Snowflake | Cloud data warehouse | Full Support
🗂️ Parquet | Columnar data files, data lakes | Full Support
📊 CSV/Excel | File uploads (drag & drop) | Full Support

Creating a Pipeline

  1. Click + New Pipeline in the sidebar
  2. Select your data source type
  3. Enter connection details (host, credentials, etc.)
  4. Click Test Connection to verify
  5. Preview data and map columns
  6. Save the pipeline for future use

Custom SQL Queries

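All database connectors support custom SQL queries. For example, the query below aggregates sales per product and date since 2023-01-01 and aliases the result columns as sku, date, and value: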

SELECT 
    product_id as sku,
    sale_date as date,
    SUM(quantity) as value
FROM sales
WHERE sale_date >= '2023-01-01'
GROUP BY product_id, sale_date
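
To see what the query above returns, here is a minimal sketch that runs it against an in-memory SQLite database with Python's standard sqlite3 module. The sales table layout and the sample rows are assumptions made up for this example, and the ORDER BY clause is added only so the sample output is deterministic.

```python
import sqlite3

# Same query as above, plus ORDER BY for a stable row order in this example.
QUERY = """
SELECT
    product_id AS sku,
    sale_date AS date,
    SUM(quantity) AS value
FROM sales
WHERE sale_date >= '2023-01-01'
GROUP BY product_id, sale_date
ORDER BY product_id, sale_date
"""


def run_example():
    """Build a tiny hypothetical sales table and return the query result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (product_id TEXT, sale_date TEXT, quantity INTEGER)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("A-100", "2022-12-31", 5),   # excluded by the WHERE clause
            ("A-100", "2023-01-01", 3),
            ("A-100", "2023-01-01", 4),   # summed with the row above
            ("B-200", "2023-01-02", 10),
        ],
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows
```

Each returned row is one (sku, date, value) triple, which is the shape the column mapping step expects.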

📈 Running Forecasts

Basic Forecasting

  1. Load your data (file upload or pipeline)
  2. Select SKUs from the list (or click "Select All")
  3. Set Forecast Horizon (number of periods to predict)
  4. Choose a forecasting model (or use Auto)
  5. Click Run Forecast

Column Mapping

For accurate forecasts, map these columns correctly:

📦 SKU Column

Unique product identifier (SKU, item_id, product_code)

📅 Date Column

Time period (date, week, month, period)

📊 Value Column

What to forecast (sales, units, revenue, quantity)
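
As an illustration of how column roles could be guessed from header names, the sketch below matches headers against the example names listed above. The guess_mapping function and the HINTS table are hypothetical; the app's actual auto-detection logic is not documented here and may work differently.

```python
# Hypothetical name hints, taken from the example column names in this guide.
HINTS = {
    "sku": ("sku", "item_id", "product_code"),
    "date": ("date", "week", "month", "period"),
    "value": ("sales", "units", "revenue", "quantity"),
}


def guess_mapping(headers):
    """Return {role: header} for the first header matching each role's hints."""
    mapping = {}
    for role, hints in HINTS.items():
        for header in headers:
            name = header.strip().lower()
            if any(hint in name for hint in hints):
                mapping[role] = header
                break
    return mapping
```

A role that has no matching header is simply absent from the result, which is a signal to map that column manually.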

Forecast Options

🧮 Forecasting Models

The app includes multiple forecasting methods. Use Auto to let the system choose the best model for each SKU.

🔄 Auto (Recommended)

Runs a leaderboard of all models below and, for each SKU, picks the model with the lowest MAPE. With covariates enabled, the tree models (Random Forest, Gradient Boost, LightGBM) receive event indicators and price as features and typically rank highest.
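
The per-SKU selection idea can be sketched as follows. This is a simplified illustration, not the app's implementation: the two candidate models (naive_last and historical_mean) are stand-ins chosen for brevity, and the holdout length of 3 periods is an assumption.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, skipping periods where actual is zero."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        return float("inf")
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


def naive_last(history, horizon):
    """Stand-in model: repeat the last observed value."""
    return [history[-1]] * horizon


def historical_mean(history, horizon):
    """Stand-in model: repeat the mean of the history."""
    avg = sum(history) / len(history)
    return [avg] * horizon


CANDIDATES = {"naive_last": naive_last, "historical_mean": historical_mean}


def pick_winner(series, holdout=3):
    """Backtest every candidate on the last `holdout` periods; lowest MAPE wins."""
    train, test = series[:-holdout], series[-holdout:]
    scores = {
        name: mape(test, model(train, holdout))
        for name, model in CANDIDATES.items()
    }
    return min(scores, key=scores.get), scores


def leaderboard(series_by_sku, holdout=3):
    """Return the winning model name for each SKU."""
    return {
        sku: pick_winner(series, holdout)[0]
        for sku, series in series_by_sku.items()
    }
```

Because the winner is chosen independently per SKU, different products in the same run can end up with different models.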

📊 ETS

Exponential smoothing with trend and seasonality, fitted as a state-space model. Works well for stable demand. Does not accept exogenous regressors.

📉 ARIMA

Autoregressive integrated moving average model. Good for data with trends and autocorrelation.

🔮 Prophet

Open-source forecasting model from Facebook (Meta). Handles holidays and multiple seasonalities.

🌲 LightGBM

Gradient-boosted trees on lag features. Accepts exogenous covariates (events, price) when enabled.

🌳 Random Forest

Tree ensemble on lag features. Accepts exogenous covariates. Robust on noisy data with nonlinear interactions.

📈 Gradient Boost

scikit-learn's gradient boosting on lag features. Accepts exogenous covariates. Often wins on smooth nonlinear structure.

🌀 Centrifuge

Bundled decomposition: a weighted blend of level, seasonal, and ML components. Returns the named component weights for auditing. Accepts covariates in its ML component.

⚡ Croston / SBA / TSB

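Croston's method, the Syntetos-Boylan Approximation (SBA), and the Teunter-Syntetos-Babai (TSB) method. For intermittent or sporadic demand (many zero periods).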
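
For readers unfamiliar with these methods, here is a minimal sketch of textbook Croston with the optional SBA correction. It smooths non-zero demand sizes and the intervals between them separately, then forecasts size divided by interval. The initialization (first non-zero demand and its position) and the default alpha of 0.1 are assumptions for this example; the app's implementation may differ, and TSB is not covered.

```python
def croston(demand, alpha=0.1, sba=False):
    """Return a per-period demand rate forecast for an intermittent series."""
    size = None        # smoothed size of non-zero demands
    interval = None    # smoothed number of periods between demands
    periods_since = 1  # periods elapsed since the last non-zero demand
    for d in demand:
        if d > 0:
            if size is None:
                # Initialize from the first non-zero demand.
                size, interval = float(d), float(periods_since)
            else:
                size += alpha * (d - size)
                interval += alpha * (periods_since - interval)
            periods_since = 1
        else:
            periods_since += 1
    if size is None:
        return 0.0  # no demand observed at all
    rate = size / interval
    # SBA scales the Croston rate by (1 - alpha / 2) to reduce its upward bias.
    return rate * (1 - alpha / 2) if sba else rate
```

For a series with a demand of 6 units every third period, the Croston rate is 2 units per period, and the SBA variant is slightly lower.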

Model Selection Tips

💡 Install Optional Dependencies

For best results, install: pip install statsmodels prophet lightgbm. Without these, the app falls back to simpler methods.

📦 Large Datasets

The app handles datasets with millions of rows and thousands of SKUs efficiently.

Large Dataset Features

SKU Selection Options

🎯 All SKUs

Process every SKU in the dataset

📊 Top N by Volume

Focus on highest-volume products

🎲 Random Sample

Quick test on random subset

✅ Manual Selection

Pick specific SKUs to forecast
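
The Top N by Volume and Random Sample options can be pictured with the sketch below. It assumes rows are (sku, date, value) tuples and that volume means the sum of the value column per SKU; both function names are hypothetical and not part of the app.

```python
import random
from collections import defaultdict


def top_n_by_volume(rows, n):
    """Return the n SKUs with the highest total value, highest first."""
    totals = defaultdict(float)
    for sku, _date, value in rows:
        totals[sku] += value
    # Sort by volume descending, then by SKU so ties resolve deterministically.
    ranked = sorted(totals, key=lambda sku: (-totals[sku], sku))
    return ranked[:n]


def random_sample(rows, n, seed=0):
    """Return up to n distinct SKUs chosen at random (seeded for repeatability)."""
    skus = sorted({sku for sku, _date, _value in rows})
    rng = random.Random(seed)
    return rng.sample(skus, min(n, len(skus)))
```

Fixing the seed makes a quick test run repeatable, which helps when comparing settings on the same subset.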

Background Jobs

For large forecasts, jobs run in the background:

  1. Configure your forecast settings
  2. Click Start Background Job
  3. Monitor progress in the Jobs panel
  4. Results auto-save when complete
  5. Download or view results anytime

🆕 New Product Forecasting

Forecast demand for SKUs with no sales history. skuf.ai offers three complementary approaches in the purple New Product Mode panel — pick by what data you have and what question you need answered.

🎬 Interactive walkthrough (8 slides, ~3 minutes)
See the full new-product flow with mock UI screenshots: the AI / Rule analog toggle, the Attribute-Based Launch Model output, and the recommended combined workflow.

Step 1 — Define attributes

Open New Product Mode → set the new SKU's attributes (category, subcategory, color, brand, price, etc.). The dropdowns pull from your active dataset, so only values actually present in your catalog appear. The badge counter shows how many attributes you've set.

Step 2 — Pick your method

🔍 Analog Blend (Find Similar SKUs)

Find historical SKUs that resemble your new one, then take a weighted average of their forecasts. A toggle above the button lets you choose how analogs are selected:

🤖 AI-Powered

Claude reads your target attributes and the catalog, then returns analogs with a one-line reason for each pick. It catches semantic similarity (e.g., T-shirts as analogs for Polos when the subcategory differs). Slower because of the LLM round-trip, but interpretable.

📐 Rule-Based

Deterministic exact-attribute scoring: +2 per exact match, +1.5 for "similar" numerics. Returns the matches per analog so the ranking is defensible. Fast, repeatable, no LLM dependency.
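
A rough sketch of this kind of scoring is shown below. The point values (+2 and +1.5) come from the description above; everything else is an assumption for illustration, in particular the 10% relative tolerance used to decide that two numeric values are "similar", and the function names.

```python
def score_analog(target, candidate, numeric_tolerance=0.10):
    """Score one catalog SKU against the target attributes.

    Returns (score, matches), where matches lists which attributes contributed.
    """
    score = 0.0
    matches = []
    for attr, want in target.items():
        have = candidate.get(attr)
        if have is None:
            continue
        if have == want:
            score += 2.0
            matches.append((attr, "exact"))
        elif isinstance(want, (int, float)) and isinstance(have, (int, float)):
            # Hypothetical similarity rule: within 10% of the target value.
            if want != 0 and abs(have - want) / abs(want) <= numeric_tolerance:
                score += 1.5
                matches.append((attr, "similar"))
    return score, matches


def rank_analogs(target, catalog, top_k=5):
    """Return the top_k (sku, score, matches) tuples, best score first."""
    scored = [(sku, *score_analog(target, attrs)) for sku, attrs in catalog.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]
```

Returning the list of matched attributes alongside each score is what makes a ranking like this easy to explain to a reviewer.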

After analogs are returned, adjust the selected set (the top 5 by score are picked by default), then click Run Forecast. The result is a weekly forecast for the new SKU, blended from analog histories.
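
The blending step can be sketched as a weighted average across analog series. This guide does not say how the blend weights are derived, so the example simply takes them as an input (using each analog's similarity score would be one plausible choice). The blend_analogs name is hypothetical.

```python
def blend_analogs(histories, weights):
    """Blend analog weekly series into one series by weighted average.

    histories: dict of sku -> list of weekly values
    weights:   dict of sku -> non-negative weight
    """
    total = sum(weights[sku] for sku in histories)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Only blend the weeks that every analog covers.
    n_weeks = min(len(series) for series in histories.values())
    return [
        sum(weights[sku] * histories[sku][week] for sku in histories) / total
        for week in range(n_weeks)
    ]
```

Dividing by the weight total means the weights do not need to be normalized beforehand.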

🧠 Attribute-Based Launch Model

An ensemble of Ridge regression and Random Forest, trained to map each historical SKU's attributes to its first-period sales volume. It predicts a single launch volume for the new item directly from its attributes, with no analog matching needed.

📅 Seasonal Component Forecast

Match the new SKU's attributes to a 52-week profile in your seasonality library, then multiply by a Base Sales Rate (BSR) the library carries from analog historical SKUs. Best when you have a strong matching profile and trust the analog calibration. See sections 4 (BSR) and 5 (Seasonality library) of the methodology page for the math.
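
In its simplest form, the multiplication described above looks like the sketch below. It assumes the seasonal profile is expressed as indices that average 1.0 across the year (the code normalizes the profile to enforce this) and that BSR is an average weekly sales rate. Both are assumptions for illustration; the methodology page is the authority on the actual definitions.

```python
def normalize_profile(profile):
    """Rescale a seasonal profile so its indices average 1.0."""
    mean = sum(profile) / len(profile)
    if mean == 0:
        raise ValueError("profile must not be all zeros")
    return [value / mean for value in profile]


def seasonal_forecast(profile, base_sales_rate):
    """Weekly forecast = base sales rate x normalized seasonal index."""
    return [base_sales_rate * index for index in normalize_profile(profile)]
```

With a profile normalized this way, the weekly forecasts sum to the base sales rate times the number of weeks, so the BSR controls the overall level and the profile controls only the shape.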

Common attribute columns

💡 Pick the right method

If you have a strong seasonality library with matching attributes, use Seasonal Component. If you need a weekly curve and history-based defensibility, use Analog Blend. If you have at least 20 historical SKUs and only need a launch volume, use the Attribute-Based Launch Model. Many teams use Attribute-Based to set the initial buy quantity and Analog Blend or Seasonal Component to schedule weekly receipts.

📊 Covariates & External Data

Improve forecast accuracy by feeding skuf.ai the upstream signals that drive demand. The platform supports two distinct paths:

Path A — Auto-built event + price covariates (recommended)

When you create planning events and your dataset has a price column, skuf.ai can automatically build a per-SKU exogenous matrix and pass it to the tree-based models in Auto's leaderboard (Random Forest, Gradient Boost, LightGBM) at fit time. Set useExogenousCovariates: true on the job config and the system handles the rest.

📐 Measured impact

On a synthetic specialty-retailer benchmark (45 SKUs × 104 weeks, 5 event types, price elasticity), enabling auto-built covariates dropped Auto's mean MAPE from 33.14% to 24.41% on the same backtest — a 26% relative reduction with no model change, just better inputs.
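
The 26% figure above is a relative reduction, not a difference in percentage points. The short calculation below reproduces it from the two reported MAPE values; the function name is just for illustration.

```python
def relative_reduction(before, after):
    """Percentage reduction of `after` relative to `before`."""
    return (before - after) / before * 100.0


# Reported values: mean MAPE without and with auto-built covariates.
reduction = relative_reduction(33.14, 24.41)
```

The absolute drop is 8.73 percentage points, and 8.73 divided by 33.14 is roughly 0.263, which rounds to the quoted 26%.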

Path B — Custom column covariates (advanced)

For exogenous signals beyond events and price (weather, marketing spend, macro indicators), include them as columns in your data file and select them in the Covariates tab.

  1. Include covariate columns in your data file
  2. Open the Covariates tab
  3. Select which columns to use
  4. Set lag values if needed (e.g., price effect delayed 1 period)
  5. Run forecast — covariates auto-included for the models that support them

Which models use covariates

⚠️ Future Values Required

Custom column covariates require values for the forecast horizon. If you include "price" as a covariate, you must provide planned prices for the upcoming weeks. Auto-built event covariates handle this automatically — the planning_events table covers forward dates as well as historical ones.
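
Two of the points above, lagging a covariate and checking that future values exist, can be sketched as follows. The period labels and function names are hypothetical, and the app's own validation may behave differently.

```python
def lag_series(values, lag):
    """Shift a covariate by `lag` periods; the first `lag` entries become None."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    shifted = [None] * lag + list(values)
    return shifted[: len(values)]


def missing_future_periods(covariate_by_period, forecast_periods):
    """Return the forecast periods that have no covariate value yet."""
    return [
        period
        for period in forecast_periods
        if covariate_by_period.get(period) is None
    ]
```

An empty result from missing_future_periods means the covariate covers the whole forecast horizon; any periods it returns need planned values before the forecast can use that covariate.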

🤖 AI Features

AI-powered features help you understand your data and get actionable insights.

AI Insights

Get natural language explanations of your forecast results.

Dataset Intelligence

Automatic analysis of your data quality and patterns.

AI Agent

Natural language interface to run complex analyses:

"Forecast the top 50 SKUs by volume for the next 6 months"
"Find products with declining sales trends"
"Compare ETS vs ARIMA accuracy for seasonal items"

💡 API Key Required

AI features require an Anthropic API key. Add it in Settings → API Keys, or set the ANTHROPIC_API_KEY environment variable.
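
If you use the environment variable route, a quick way to confirm the key is visible to Python is a check like the one below. The helper name is hypothetical; only the ANTHROPIC_API_KEY variable name comes from this guide.

```python
import os


def anthropic_key_configured(env=None):
    """Return True if ANTHROPIC_API_KEY is set to a non-empty value."""
    env = os.environ if env is None else env
    return bool(env.get("ANTHROPIC_API_KEY", "").strip())
```

Calling anthropic_key_configured() with no arguments checks the current process environment. Note that a variable set in one terminal session is not visible to an app launched from another.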

📋 Results & Export

Understanding Results

Each forecast result includes:

Forecast Accuracy (MAPE)

MAPE Range | Interpretation
< 10% | Excellent - highly accurate
10-20% | Good - reliable for planning
20-30% | Acceptable - use with caution
> 30% | Poor - consider more data or different model
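
For reference, MAPE is the mean of |actual - forecast| / |actual| across periods, expressed as a percentage. The sketch below computes it and maps the result to the bands in the table. Two details are assumptions, since this guide does not specify them: periods with zero actuals are skipped, and values exactly on a boundary (20% or 30%) fall into the better band.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, skipping periods where actual is zero."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when all actuals are zero")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def interpret_mape(value):
    """Map a MAPE percentage to the bands in the table above."""
    if value < 10:
        return "Excellent"
    if value <= 20:
        return "Good"
    if value <= 30:
        return "Acceptable"
    return "Poor"
```

Because the error is divided by the actual value, MAPE can look very large for low-volume or intermittent SKUs even when the absolute error is small.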

Export Options

📊 CSV Export

Download forecasts as CSV for Excel/BI tools

💾 Save Results

Save to app for later viewing

📈 Charts

Interactive charts with zoom/pan

🔗 API Access

Access results via REST API

Saved Results

Access previous forecasts from the sidebar under "Forecast History". Each saved result includes:

🔧 Troubleshooting

Common Issues

❌ "Python not found" error

Install Python 3.9+ and ensure it's in your PATH. Run python --version to verify.

❌ Forecasts fail with "Exit 1"

Install the required Python dependencies: pip install pandas numpy statsmodels
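
To check both the Python version and the dependencies from inside the interpreter the app uses, a small diagnostic like the following may help. The function names are hypothetical. The check uses import names, which match the pip package names for pandas, numpy, and statsmodels.

```python
import importlib.util
import sys


def python_ok(minimum=(3, 9)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


def missing_packages(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

For example, missing_packages(["pandas", "numpy", "statsmodels"]) returns an empty list when all three are installed. If it lists a package you have already installed, pip and the app are probably using different Python installations.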

❌ Database connection failed

Verify your credentials, check firewall rules, and ensure the database allows remote connections.

❌ "No data" after column mapping

Check that your SKU/date columns are correct. Dates should be parseable (YYYY-MM-DD works best).
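
To find the rows that break date parsing, you can pre-check your date column with a sketch like this one. It assumes the YYYY-MM-DD format recommended above; the function name is hypothetical, and the app itself may accept more formats than this strict check does.

```python
from datetime import datetime


def unparseable_dates(values, fmt="%Y-%m-%d"):
    """Return the values that do not parse as dates in the given format."""
    bad = []
    for value in values:
        try:
            datetime.strptime(value.strip(), fmt)
        except ValueError:
            bad.append(value)
    return bad
```

The check also catches impossible dates such as 2023-02-30, not just values in a different format.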

❌ Slow performance with large files

Use database connectors instead of CSV for 1M+ rows. Enable background jobs for large forecasts.

Getting Help

Performance Tips

HorizonAI SKU Forecaster v2.1
