Forecasting Without Writing Python

May 13, 2026 · 18 min read

AI Practitioner · AIF-C01 · part of The Exam Room

A category manager has 18 months of weekly sales data for 400 SKUs and a deadline to forecast next quarter. She doesn’t code. The in-house ML team is booked until Q3. The ask is a tool that lets her build a forecast herself – importable, reviewable, explainable – without waiting for engineering. SageMaker Canvas is AWS’s no-code ML interface, and on paper it’s the correct shape for this job. The interesting question isn’t whether it can produce a forecast – it can – it’s which of its modes fits the shape of the problem, which columns in the data will actually drive the model, and what the business user has to understand for the output to be defensible when finance asks “why this number?”.

The situation

Priya is a category manager at a mid-size retail business. She owns 400 SKUs across homewares. Her CSV export from the data warehouse has 78 weekly rows per SKU (18 months of history), with columns: sku, week_ending, units_sold, avg_unit_price, promo_flag, competitor_promo_flag, stock_out_days, weather_index, category. The ask from finance is a 13-week forward forecast of units sold per SKU, deliverable in two weeks, with enough of an explanation that a director can challenge it without Priya needing a data scientist in the room.

Priya knows Excel well enough to build a naive seasonal-average forecast, but finance has asked for something better: one that accounts for promotions, stock-outs (units-sold is artificially capped in weeks where stock ran out), and the weather-index column the ops team started tracking last year. She knows pivot tables, not Python. Hiring a consultant is on the table but slow; the ML team can help in Q3, which is too late.

The platform team has AWS available. Someone mentioned SageMaker Canvas as “the Excel-user-friendly ML thing.” The question is whether that’s actually true and, if so, how she should use it.

What actually matters

SageMaker Canvas is a visual interface to SageMaker’s AutoML capabilities. It lets non-technical users import data, build models, and generate predictions without writing code. Under the hood, Canvas orchestrates the same services a data scientist would use – SageMaker Autopilot for classical ML, SageMaker Data Wrangler for feature engineering, and for time-series specifically, the Amazon Forecast-style algorithms that were rolled into SageMaker when Forecast was retired as a standalone service.

The first thing worth thinking about is what kind of problem this actually is. Forecasting weekly units-sold from historical data is a time-series forecasting problem: the target is a number over time, history is ordered, seasonality matters, and exogenous variables (promo flag, weather index) may explain variation. That’s one of Canvas’s supported model types. It’s not a classification problem (“will this SKU sell out?”), not a regression on cross-sectional features (“predict price from SKU attributes”), not an image or text problem. The correct mode is Canvas’s time-series forecast type.

The second is the shape of the data. 400 SKUs × 78 weeks is 31,200 rows. That’s small by ML standards but each SKU has only 78 points of history, which isn’t a lot for any individual series. Canvas’s time-series mode trains a single model across all SKUs (with sku as the item identifier), which lets it learn patterns that transfer between series. A SKU with only 20 weeks of history can still be forecast because the model has learned from the other 399. This is a key advantage of the global-model approach over fitting one model per SKU.

The third is exogenous features. promo_flag and competitor_promo_flag are known in advance for future weeks (the promo calendar is set); weather_index is not known in advance (it’s a forecast of its own). Canvas distinguishes between related time series that are known in advance (we can include future values, and the model will use them) and those that only have historical values (the model uses the history to learn correlations, but can’t see future values at prediction time). Getting this distinction correct matters: including weather_index as a known-in-advance feature when it isn’t will leak future information in training and produce optimistic backtests that don’t hold up live.

The fourth is stock-outs. Units-sold in a stock-out week is censored – demand existed, but supply capped what was recorded. A forecast trained on raw units-sold learns that demand drops in those weeks, which is wrong. The fix is data preparation: either exclude stock-out weeks from training, or adjust the target using the stock_out_days column (e.g. if stock_out_days >= 4, flag the row as unreliable). This is a feature engineering step that Canvas doesn’t do automatically; it’s where the business user has to make a modelling choice, and it’s where Data Wrangler becomes useful as a Canvas feature.
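
The censoring fix above can be sketched in a few lines of pandas. This is a minimal illustration using the scenario's column names and the 4-day threshold from the article; the sample rows are invented for demonstration.

```python
import pandas as pd

# Toy extract with the scenario's columns: units_sold is censored
# in weeks where stock ran out for much of the week.
df = pd.DataFrame({
    "sku": ["A1", "A1", "A1", "A1"],
    "week_ending": ["2025-01-05", "2025-01-12", "2025-01-19", "2025-01-26"],
    "units_sold": [120, 35, 110, 125],
    "stock_out_days": [0, 5, 1, 0],
})

# Option 1: exclude heavily censored weeks from training (the cleaner first pass).
train = df[df["stock_out_days"] < 4].copy()

# Option 2: keep the rows but flag them, so the modelling choice stays auditable.
df["unreliable"] = df["stock_out_days"] >= 4
```

Either option encodes the same judgement: recorded sales in a stock-out week are not demand, and the model should not learn from them as if they were.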

The fifth is explainability. Finance will ask “why is the Q2 forecast 20% higher than last year’s Q2?” The model needs to be able to say “promo schedule is denser this year, and last year Q2 had two stock-out weeks that depressed recorded demand.” Canvas produces feature importance charts and per-prediction explanations using SHAP values; understanding these well enough to narrate them to a director is part of the deliverable.

The sixth is operationalisation. A one-off forecast is a CSV download. A repeatable quarterly forecast is a scheduled job. Canvas can register a model to the SageMaker Model Registry and the same model can be called from a scheduled Lambda for batch predictions. If this forecast is going to run every quarter, plumbing it into a scheduled batch rather than re-clicking through Canvas each time is the correct move.

What we’ll filter on

Six filters, applied to the forecast-building tools Priya could use.

  1. No-code interface – does the tool let a non-coder build and run the model?
  2. Handles time-series forecasting natively – as a first-class problem type, not cross-sectional regression?
  3. Supports exogenous features (known-in-advance vs. historical-only)?
  4. Explainability – feature importance and per-prediction explanations?
  5. Repeatable – can the same model run on a schedule without rebuilding in the UI?
  6. Priced appropriately for 400 SKUs × quarterly cadence?

The no-code ML landscape

1. SageMaker Canvas. AWS’s no-code ML workspace. Supports tabular classification and regression (via AutoML), time-series forecasting, image and text classification, and GenAI-backed exploration (ask-your-data via a foundation model). For time-series, it runs a SageMaker Autopilot AutoML job under the hood, trying multiple algorithm families (DeepAR+, CNN-QR, ETS, ARIMA, Prophet) and selecting the best by backtest. Explanation via feature importance charts; registration to Model Registry for scheduled reuse. Priced per session-hour for the UI plus the underlying training and inference costs.

2. QuickSight + ML Insights. QuickSight is AWS’s BI tool; ML Insights adds anomaly detection and a forecasting feature that produces simple time-series forecasts on visualisations using an internal algorithm. Useful for quick “what’s the trend?” answers directly in a dashboard, not for a model-quality forecast with exogenous features or explainability. Good for situational awareness; not the correct tool for a 400-SKU production forecast.

3. Amazon QuickSight Q / Amazon Q in QuickSight. The natural-language interface to QuickSight. Answers questions like “what was last quarter’s top-selling SKU?” in English. Not a forecasting tool; complementary to a forecast once it exists.

4. SageMaker Autopilot (direct). The AutoML backbone Canvas uses. Callable via the SageMaker SDK or Studio UI. Produces the same models Canvas does but requires a user comfortable enough with notebooks to trigger jobs, inspect candidates, and call endpoints. The path a data scientist would take; not the path for a non-coder.

5. Amazon Forecast (retired as standalone). Was a dedicated time-series forecasting service. Functionality folded into SageMaker Canvas’s time-series forecast type. Mentioned for historical context; don’t plan new work against it as a separate service.

6. A third-party Excel add-in or a spreadsheet model. Priya could build an ETS or seasonal-naive model in Excel or a forecasting add-in. Cheap, familiar, but limited – hard to include exogenous features, hard to evaluate honestly, hard to explain beyond “I used a trend line.” Not a scaling answer for 400 SKUs with exogenous drivers.

Side by side

Tool                          No-code   Time-series native   Exogenous features   Explainability   Repeatable   Sized for this
SageMaker Canvas              ✓         ✓                    ✓                    ✓                ✓            ✓
QuickSight ML Insights        ✓         Partial              ✗                    ✗                ✗            ✗
Amazon Q in QuickSight        ✓         ✗                    N/A                  ✗                ✗            ✗
SageMaker Autopilot (direct)  ✗         ✓                    ✓                    ✓                ✓            ✓ (wrong audience)
Excel add-in                  ✓         Partial              Manual               Partial          Limited      ✗

Only one tool ticks every box for the scenario: SageMaker Canvas in time-series forecast mode. The others either can’t handle the problem shape (QuickSight, Excel) or are the wrong audience (Autopilot directly).

Canvas in time-series mode

  1. Import. CSV from S3, Snowflake, Redshift, Athena, or local upload up to 5 GB.
  2. Prepare (Data Wrangler). Exclude or flag stock-out weeks; cast types, fill gaps, join tables; export the clean dataset for training.
  3. Configure forecast. Target = units_sold; item_id = sku; timestamp = week_ending; horizon = 13 weeks; frequency = W. Classify promo_flag as known-ahead and weather_index as historical-only.
  4. AutoML training. Canvas runs Autopilot and evaluates candidates (DeepAR+, CNN-QR, ETS, ARIMA, Prophet) on a rolling-origin backtest; typically 2-4 hours for this volume.
  5. Review. Model leaderboard (winning algorithm, often DeepAR+ on pooled retail data); backtest metrics (wQL, MAPE, RMSE at P10/P50/P90 quantiles); feature importance showing which columns actually drove predictions; per-SKU predicted-vs-actual charts on held-out weeks; P10/P50/P90 quantile forecasts for every SKU × future week; a "what-if" panel to override promo_flag for a future week and see the impact. This is where a business user catches a wrong answer before finance does.
  6. Predict & operationalise. Batch prediction (CSV out, one row per SKU × future week × quantile); share the model with Studio users for inspection and extension; register to SageMaker Model Registry for an approval workflow; schedule via EventBridge + Lambda -> SageMaker batch transform; export feature importance and SHAP values for the finance slide deck.

Priya spends most of her time on stages 2 (data preparation) and 5 (review), which is the correct place for business judgement.
Six stages; Canvas automates four of them. The business judgement lives in preparation (what's a stock-out worth?) and review (does this backtest make sense?).

The pick in depth

Canvas time-series forecast, trained on the prepared dataset, registered for quarterly reuse.

The import is a two-click exercise: Canvas reads from S3 (or Snowflake, Redshift, Athena, or a direct upload up to 5 GB). Priya’s CSV lands in a dataset that Canvas can inspect.

Preparation in Data Wrangler. Canvas has an embedded Data Wrangler view – a visual transform builder. Priya’s required transforms:

  • Exclude unreliable rows. A filter step: stock_out_days < 4. Weeks where stock was out for more than half the week are removed from training. The alternative – scaling units_sold up to impute demand – is defensible but introduces assumptions; excluding is cleaner for a first pass.
  • Parse timestamp. Confirm week_ending is recognised as a date with weekly frequency.
  • Derive features. Add week_of_year and month columns (Canvas offers one-click “extract date parts”). These give the model explicit seasonality signals.
  • Confirm types. promo_flag and competitor_promo_flag should be categorical (not numeric), units_sold should be numeric, sku should be categorical as the item identifier.

The prepared dataset gets exported as the training input.
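
The Data Wrangler steps above have a direct pandas equivalent, shown here as a minimal sketch on a two-row invented extract. The column names come from the scenario; "extract date parts" is done by hand with `dt` accessors.

```python
import pandas as pd

# Invented two-row extract with the scenario's columns.
df = pd.DataFrame({
    "sku": ["A1", "A1"],
    "week_ending": ["2025-03-02", "2025-03-09"],
    "units_sold": [90, 104],
    "promo_flag": [0, 1],
    "competitor_promo_flag": [0, 0],
})

# Parse the timestamp so weekly frequency is recognisable.
df["week_ending"] = pd.to_datetime(df["week_ending"])

# Derive explicit seasonality features (Canvas's "extract date parts").
df["week_of_year"] = df["week_ending"].dt.isocalendar().week.astype(int)
df["month"] = df["week_ending"].dt.month

# Confirm types: flags and identifiers categorical, target numeric.
for col in ["sku", "promo_flag", "competitor_promo_flag"]:
    df[col] = df[col].astype("category")
df["units_sold"] = pd.to_numeric(df["units_sold"])
```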

Forecast configuration. In Canvas’s time-series flow:

  • Target column: units_sold
  • Item identifier: sku (the column that distinguishes one time series from another)
  • Timestamp: week_ending
  • Frequency: Weekly
  • Forecast horizon: 13 (weeks)
  • Forecast quantiles: P10, P50, P90 (this is the spread of the probabilistic forecast; finance can see downside and upside, not just a point estimate)
  • Related time series – known in advance: promo_flag, competitor_promo_flag, week_of_year, month. These are all known for future weeks because the promo calendar is set and calendar features are deterministic.
  • Related time series – historical only: avg_unit_price, weather_index. Unknown for future weeks; the model uses their history to learn correlations but must impute them for prediction.
  • Item metadata: category. Static attributes of each SKU – useful for the model to learn category-level patterns.
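
Behind the UI, roughly this configuration goes into a SageMaker AutoML V2 time-series job. The sketch below expresses it as a `create_auto_ml_job_v2` request body; the job name, bucket, and role ARN are placeholders, and the field names should be verified against the current API before relying on them. Note the API has no explicit known-ahead flag in this sketch: in Forecast-style services the distinction is conveyed by whether the input data contains future-dated values for a column, which is an assumption worth checking for Canvas.

```python
# Hedged sketch of the AutoML V2 request body Canvas assembles.
# Placeholders: job name, S3 URIs, role ARN.
time_series_job_config = {
    "AutoMLJobName": "sku-forecast-q2",                  # placeholder
    "AutoMLProblemTypeConfig": {
        "TimeSeriesForecastingJobConfig": {
            "ForecastFrequency": "W",                    # weekly cadence
            "ForecastHorizon": 13,                       # 13 weeks ahead
            "ForecastQuantiles": ["p10", "p50", "p90"],
            "TimeSeriesConfig": {
                "TargetAttributeName": "units_sold",
                "TimestampAttributeName": "week_ending",
                "ItemIdentifierAttributeName": "sku",
                "GroupingAttributeNames": ["category"],  # item metadata
            },
        }
    },
    "AutoMLJobInputDataConfig": [
        {
            "ChannelType": "training",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://example-bucket/prepared/",   # placeholder
            }},
        }
    ],
    "OutputDataConfig": {"S3OutputPath": "s3://example-bucket/output/"},
    "RoleArn": "arn:aws:iam::123456789012:role/CanvasRole",  # placeholder
}
```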

Training. Canvas runs a SageMaker Autopilot job that tries several algorithms: DeepAR+ (a deep-learning autoregressive model that pools across items), CNN-QR (convolutional quantile regression), ETS (exponential smoothing), ARIMA, and Prophet. For 400 items × 78 weeks, typical training time is 2-4 hours. The job backtests each candidate on a rolling-origin split (train on weeks 1-65, predict 66-78; train on weeks 1-52, predict 53-65; etc.) and scores each on weighted quantile loss (wQL) at the chosen quantiles.
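
The wQL metric that scores each candidate is simple enough to state from scratch. This sketch follows the standard Forecast-style definition (twice the pinball loss, normalised by total absolute demand); the numbers are invented held-out weeks for one SKU, not real output.

```python
# Weighted quantile loss at quantile tau:
#   wQL(tau) = 2 * sum(tau * max(y - q, 0) + (1 - tau) * max(q - y, 0)) / sum(|y|)
def wql(actuals, quantile_preds, tau):
    num = sum(
        2 * (tau * max(y - q, 0.0) + (1 - tau) * max(q - y, 0.0))
        for y, q in zip(actuals, quantile_preds)
    )
    den = sum(abs(y) for y in actuals)
    return num / den

# Invented held-out weeks for one SKU: actuals vs the model's P50 path.
actuals = [100.0, 120.0, 90.0]
p50 = [110.0, 115.0, 95.0]
print(round(wql(actuals, p50, 0.5), 4))  # prints 0.0645
```

At tau = 0.5 the formula collapses to total absolute error over total demand, which is why the P50 wQL reads like a weighted absolute percentage error; the P10 and P90 versions penalise under- and over-forecasting asymmetrically.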

The winning model is usually DeepAR+ for retail-style data with many items, because it pools information across items – a SKU with 20 weeks of history benefits from what the model has learned about the other 399. For smaller datasets or single-item forecasts, classical methods (ETS, ARIMA) often win.

Review. Canvas presents a dashboard:

  • Accuracy metrics: wQL, MAPE (mean absolute percentage error), and RMSE on the backtest. Priya compares to her naive seasonal-average baseline – if Canvas’s model doesn’t beat it by a material margin, the added complexity isn’t earning its keep.
  • Feature importance: a bar chart showing which columns drove predictions. If promo_flag and week_of_year dominate, the story is coherent; if category alone dominates, the model may be learning a category-level average and ignoring within-category variation.
  • Per-SKU plots: historical vs. forecast on held-out weeks. Priya clicks through a sample of 20 SKUs and eyeballs whether the forecasts look reasonable. This is the human judgement step that no backtest metric captures.
  • What-if: for a chosen future week, override promo_flag from 0 to 1 and see the forecast shift. This is the explainability story for finance: “if we don’t run the Q2 promo, the forecast drops 15%.”
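
The baseline comparison in the first bullet is worth making concrete. A minimal sketch, with invented numbers standing in for a Canvas backtest and a seasonal-naive forecast (same week, one year earlier):

```python
# Mean absolute percentage error over held-out weeks.
def mape(actuals, preds):
    return sum(abs(a - p) / abs(a) for a, p in zip(actuals, preds)) / len(actuals)

actuals        = [100, 120, 90, 110]
model_preds    = [104, 113, 95, 108]   # illustrative Canvas backtest output
seasonal_naive = [95, 130, 80, 120]    # same weeks, one year earlier

model_err = mape(actuals, model_preds)
naive_err = mape(actuals, seasonal_naive)

# The model earns its keep only if it beats the naive baseline materially.
print(model_err < naive_err)  # prints True
```

If the two errors land within a point or two of each other across the full backtest, the honest conclusion is that the seasonal average was already doing most of the work.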

Prediction. Canvas generates a forecast CSV: one row per SKU × future week × quantile, written back to S3. For a quarterly cadence, Priya registers the model to SageMaker Model Registry and an engineering partner wires up an EventBridge Scheduler rule that calls a Lambda that triggers a SageMaker batch-transform job on the registered model each quarter. Priya re-uses the same model for three quarters, retrains in Canvas when accuracy starts drifting or when new SKUs enter the catalogue.
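
The quarterly plumbing the engineering partner wires up might look like the Lambda sketch below: EventBridge fires the function, which starts a SageMaker batch transform against the registered model. Model name, bucket, and instance type are placeholders, and the request parameters should be checked against the current `create_transform_job` API.

```python
import datetime

def build_transform_request(model_name, bucket, run_date=None):
    """Assemble a create_transform_job request for the quarterly run.
    model_name and bucket are placeholders from the Model Registry setup."""
    run_date = run_date or datetime.date.today()
    return {
        "TransformJobName": f"sku-forecast-{run_date:%Y%m%d}",
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": f"s3://{bucket}/forecast-input/",
            }},
            "ContentType": "text/csv",
        },
        "TransformOutput": {"S3OutputPath": f"s3://{bucket}/forecast-output/"},
        "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
    }

def handler(event, context):
    # boto3 ships with the Lambda runtime; imported here so the request
    # builder above stays testable without AWS credentials.
    import boto3
    request = build_transform_request("sku-forecast-approved", "example-bucket")
    boto3.client("sagemaker").create_transform_job(**request)
```

Separating the request builder from the AWS call keeps the date-stamping and naming scheme reviewable without an account, which is the part most likely to need a tweak each quarter.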

The honest limits

Canvas isn’t magic. A few things worth naming:

Small-history SKUs are still hard. A SKU with 12 weeks of data has no seasonal history; the model imputes from category peers, but confidence is low. Priya flagged these as fallback-to-human for the first forecast, which is the correct call. Trust the model where the data supports it.

Exogenous-feature honesty. Classifying weather_index as known-in-advance would leak future information into training – Canvas would learn to “use next month’s weather” and produce a spuriously good backtest that fails in production. Classifying historical-only is the correct answer; accept that the model uses history of weather, not future, and that it might miss a forecast-able weather-driven shift.

Stock-out handling is a modelling choice, not a tool choice. Canvas can’t know what a stock-out week’s “true” demand was. Priya chose to exclude them; someone else might scale up using stock_out_days as a censoring indicator. Either is defensible; the choice should be documented so the next quarter’s forecast is consistent.

The quantile spread is real information. A P10-to-P90 range that’s narrow says the model is confident; wide says the model doesn’t know. Finance should not be given only the P50 – the range is part of the story. If the width is embarrassingly wide, the honest answer is “this forecast is a rough guide, not a commitment.”

What’s worth remembering

  1. SageMaker Canvas is AWS’s no-code ML interface. Tabular classification and regression via Autopilot; time-series forecasting as a first-class mode; image and text classification; GenAI-backed data exploration. Business analysts, product managers, and category managers are the audience.
  2. Time-series forecasting is a distinct problem type. Target over time, ordered history, seasonality, exogenous features. Don’t solve it with cross-sectional regression; Canvas’s time-series mode is the correct tool.
  3. The global-model approach pools across items. 400 SKUs × 78 weeks is better trained as one model over 400 series than as 400 separate models. Short-history items benefit from patterns learned on richer series.
  4. Classify exogenous features correctly. Known-in-advance (promo calendar, calendar features) go into future predictions directly. Historical-only (weather, price) inform via lag correlations but aren’t known for future weeks. Misclassifying leaks future information and produces optimistic backtests.
  5. Data preparation is where business judgement lives. Canvas doesn’t know what a stock-out week means. Filtering or adjusting target values is a modelling choice; document it.
  6. Backtest metrics plus per-item plots plus feature importance is the review triangle. Don’t trust one metric alone; eyeball a sample of forecasts, confirm the drivers look sensible, and compare to a naive baseline before trusting the model.
  7. Quantile forecasts give finance downside and upside. P10/P50/P90 is more honest than a single point estimate. Narrow quantile spread means confidence; wide means the model doesn’t know, and saying so is better than faking precision.
  8. Model Registry + EventBridge + batch transform is the quarterly-cadence plumbing. Canvas builds the model; an engineering partner wires the schedule. One Canvas build can serve several quarters before retraining is warranted.

A category manager with a spreadsheet-level skill set and two weeks can produce a defensible 13-week forecast for 400 SKUs, with quantile uncertainty and feature-attribution explanations, using Canvas. The ML team stays free for the harder problems. What Canvas gives up – the last few percentage points of accuracy a hand-tuned model might squeeze – is usually worth trading for the months of analyst time it returns.

These posts are LLM-aided. Backbone, original writing, and structure by Craig. Research and editing by Craig + LLM. Proof-reading by Craig.