Choosing Between QuickSight SPICE and Direct Query

December 20, 2027 · 15 min read

The situation

We operate a suite of five dashboards in QuickSight.

Daily Sales Review, used by the commercial team, ~50 viewers a day, refreshed once a morning against yesterday’s closed sales. Viewers want “yesterday’s number”, not live.
Ops Live, used by the ops floor, ~40 viewers with the dashboard open continuously, backed by a pager-duty-adjacent view of active incidents. Viewers want now; a minute of staleness is a minute of wrong answer.
Exec Weekly, ten viewers, a trend view refreshed weekly. Used on laptops, flown through, then closed until next Friday.
Customer Detail Drill-through, used by support, ~100 viewers, filters by customer ID and shows that customer’s live order history. Very variable per-query, very cardinal (a million customers, filter by one).
Finance Closing, used once per month-end, heavy calculations, thirty viewers on the closing day all hitting the dashboard within a two-hour window.

Five dashboards, five shapes of use, one warehouse underneath, and one QuickSight tenant to configure. The question for each is whether to pull data into SPICE or query direct. The wrong choice on each side has a specific cost: cache what should be live and the ops floor sees stale data; query direct when the traffic is bursty and the warehouse capacity becomes the bottleneck and every user waits in the queue.

What actually matters

Before reaching for either mode, it’s worth separating three different kinds of “fast” that affect dashboard performance.

Query latency is how long a single user’s click-to-render takes. Depends on the query plan, the data size, and the load on the underlying engine. A purpose-built in-memory cache tends to deliver predictable sub-second latency; live queries against the warehouse give whatever the warehouse gives you, which ranges from hundreds of milliseconds to tens of seconds depending on the engine, workload management, and how well the query tunes.

Concurrency is what happens when many users query at once. Live-query mode pushes every user’s click to the warehouse; if the warehouse doesn’t have spare concurrency, users queue. A cache-backed mode evaluates queries in compute that scales independently of the warehouse, a hundred concurrent reads against the cache don’t bother the warehouse at all.

Freshness is how recent the data underlying the query is. Live query is always current. Cached data is as fresh as its last ingestion, which can be scheduled hourly, incremental, or triggered on demand.

Those three pull in different directions. A live-query dashboard is maximally fresh and can be very fast if the warehouse is idle, but concurrency scales with warehouse capacity. A cached dashboard is maximally concurrent and query-latency-predictable but only as fresh as the last refresh. The correct mode for a given dashboard is the one that makes the correct trade for that dashboard’s users.

A second consideration is cost shape. A pre-bought cache is priced by capacity, you pay for the storage footprint whether the cache is used or not. Live query has no caching cost but adds load to the warehouse, which is usually where the real cost is. The accounting isn’t obvious: a dashboard that’s cheap on the BI side might be expensive in warehouse-concurrency terms, and vice versa.

A third is what the dataset looks like when it’s pulled into a cache. The dataset in a cache is a snapshot; joins have to be expressed against the snapshot, not against live data. If the dashboard logic assumes “join to the latest customer record”, and the customer record is in the warehouse but the orders are in the cache, either both have to be cached or the dashboard goes live.

What we’ll filter on

Distilling that exploration into filters we can score each mode against:

Freshness, how recent is the data when the user sees it?
Concurrency scaling, does the dashboard’s scale depend on the warehouse or on QuickSight itself?
Query latency, predictable sub-second (SPICE) vs warehouse-dependent (direct)?
Dataset size limits, are there practical constraints on how much data fits?
Refresh mechanics, if caching, how does new data arrive in the cache?

The two-mode landscape

SPICE (Super-fast, Parallel, In-memory Calculation Engine). QuickSight’s in-memory columnar cache. Datasets are ingested into SPICE capacity (pre-purchased per-GB), held in memory, and queried by dashboards without touching the source. Dataset size limits are practical (up to 1 billion rows per dataset in Enterprise edition, with a hard ceiling that’s been rising). Refresh modes: full refresh (drop and reload the whole dataset), incremental refresh (pull new rows by a lookback window on a date column, merge), on-demand refresh (API call triggers a refresh). Scheduling: hourly, daily, weekly, or cron-based.
Direct query. The dashboard sends the user’s query to the source engine at view time. The source engine (Redshift, Athena, RDS, Snowflake, various JDBC sources) runs the query, returns results, QuickSight renders. Freshness is “whatever’s current in the source”. Concurrency is “whatever the source supports”; the dashboard’s scale depends entirely on the source’s ability to handle concurrent queries.
Mixed-mode datasets. A dataset can be configured as direct query with a query-level cache that holds recently-run query results for a configurable window. Not SPICE exactly, but softens the “every view hits the warehouse” problem for repeated identical queries.
Row-level security with SPICE. SPICE supports row-level security (RLS) by attaching a permissions dataset, same SPICE or direct query, that maps users to the rows they can see. RLS with SPICE is faster than RLS with direct query, but the permissions dataset itself has to be fresh too.

Side by side

Mode	Freshness	Concurrency	Latency	Size limits	Refresh mechanics
SPICE full refresh	As of last refresh	QuickSight-scaled	Sub-second typical	~1B rows per dataset	Scheduled or on-demand; drops and reloads
SPICE incremental	As of last refresh	QuickSight-scaled	Sub-second typical	Same	Lookback window, merges new rows
Direct query	Live	Source-capacity-limited	Source-dependent	Source-dependent	No cache (per-query)
Direct query + result cache	Within cache TTL	Source + cache	Source-dependent	N/A	Per-query cached

Reading by dashboard shape:

High concurrency, tolerable staleness. SPICE. The ops floor with 40 concurrent viewers of a live-but-tolerable-1-min-lag dashboard is the wrong shape; 40 viewers on a daily sales review that’s refreshed once a morning is exactly SPICE.
Low concurrency, freshness critical, direct query. Exec weekly with ten viewers on a Friday afternoon hits the warehouse fine; no point paying SPICE capacity for it.
Very high cardinality, unpredictable filters, direct query usually wins. A drill-through by customer ID where each user filters to a single customer is hard to cache usefully. SPICE would pull millions of rows only to use one each per query.
Spiky concurrency with tolerable staleness. SPICE. Finance closing on month-end with thirty concurrent users is the exact profile that breaks warehouse concurrency; SPICE absorbs it.

SPICE in depth

Capacity. Purchased per-GB per-month, per-Region. A dataset’s SPICE consumption is roughly its compressed columnar size plus overhead. You can over-purchase and keep a buffer; running out of capacity means new ingestions fail until you buy more or delete datasets. Cost is predictable and non-variable (not per-query); the variable cost is warehouse time during refresh.

Full refresh. The whole dataset is dropped and re-ingested. Simple; sufficient for small datasets; expensive for large ones because it re-reads everything from the source every time. Full refresh is what you want for datasets where new data mixes with old in non-append-only ways, lookup tables, slowly-changing dimensions, anything that isn’t reliably add-only.

Incremental refresh. Configured on datasets with a date/time column. Pick a lookback window (e.g. “7 days”). Each refresh queries the source for rows with the date column within the lookback, and merges them into the SPICE dataset, new rows added, existing rows (matched by the dataset’s primary identifier) updated. Useful for event or fact tables where rows close over time but can be updated within a window. Drawbacks: rows outside the window are frozen; deletions before the window don’t propagate; requires the source to have the correct shape.

On-demand refresh. The CreateIngestion API call triggers a refresh. Combined with a Lambda on an EventBridge event (“the ETL just published a new partition”), you get cache freshness tied to data availability rather than a schedule. The dashboard refresh IS the ingest’s “published” event.

SPICE scheduling. Cron-based, per-Region, per-dataset. The usual gotcha: time zones (schedules run in UTC), and too-tight schedules causing overlapping refreshes that queue and delay each other. Monitor refresh duration and keep the schedule interval at least 2x the typical refresh time.

Incremental refresh failure modes. Missed refreshes don’t automatically catch up, if the incremental refresh runs daily with a 7-day window and a refresh fails silently for 3 days, the next refresh still only pulls 7 days back. Set up CloudWatch alarms on refresh failures; don’t assume “it’ll fix itself”.

Direct query in depth

Query-level caching. QuickSight can cache query results for 5 minutes to 24 hours per dataset. Two users hitting the dashboard with the same filters in that window share a result; two users with different filters don’t. Useful at dashboard-open time (the initial load is the common-case heavy query) but doesn’t solve the “every drill-through is unique” shape.

Concurrency with Redshift. Redshift’s WLM (Workload Management) controls concurrent query slots. If the direct-query dashboard runs in a queue with 10 slots and the dashboard has 20 concurrent users, 10 users wait. QuickSight has no insight into the warehouse queue; the viewer just sees “this dashboard is slow”. Redshift Serverless handles bursty concurrency better than provisioned Redshift; Redshift’s Concurrency Scaling adds transient clusters on demand for read queries, which helps but adds cost.

Concurrency with Athena. Athena’s per-account query concurrency default (historically 20, now higher for DDL/DML; workgroups can raise) is the constraint. Direct-query dashboards in Athena hit this quickly at scale; SPICE is usually the correct answer for Athena-backed dashboards with more than a handful of concurrent viewers.

Auto-narrow. QuickSight pushes filters into the source query where possible. A dashboard filtered to a single customer generates a WHERE customer_id = ? at the source, so the direct-query cost is the filtered query cost, not a full-table scan. If the source schema supports predicate pushdown (indexes, sort keys, partition pruning), direct-query can be very efficient for highly-selective dashboards.

The decision flow

Freshness drives the first choice; concurrency and refresh mechanics refine it. The result-cache middle ground exists for direct-query dashboards that repeat queries without tolerating staleness.

The picks, dashboard by dashboard

Daily Sales Review → SPICE, full refresh, nightly. ~50 viewers on a dataset refreshed once a morning; warehouse load during business hours is a waste. Full refresh (small dataset, closed sales) scheduled 06:00 UTC; CloudWatch alarm on refresh failures; over-provision SPICE capacity slightly so the morning read is instant.

Ops Live → Direct query + result cache (5 minute). Freshness is critical; staleness is wrong answers. 40 concurrent viewers is within Redshift’s reasonable direct-query concurrency if WLM is configured; the 5-minute result cache absorbs the repeated-identical-query case (everyone staring at the overview page) without staleness beyond the cache window.

Exec Weekly → Direct query. Ten viewers once a week; paying SPICE capacity for this is waste. Direct query against Redshift runs fine at ten concurrent.

Customer Detail Drill-through → Direct query. Per-user filter to a single customer out of a million means SPICE would need the whole million-customer dataset in memory for each viewer to filter to one, wasteful. Direct query with predicate pushdown to the warehouse’s sort key on customer_id returns the filtered data fast and cheap.

Finance Closing → SPICE, full refresh, on-demand via Lambda. The month-end 30-viewer concurrent spike is exactly what breaks warehouse queues. SPICE absorbs it. The on-demand trigger means refresh happens when the close-process is complete (the ETL emits an event), not on an arbitrary schedule that might beat the data.

A worked incremental: the orders dataset

An orders_fact table in Redshift, 80 million rows growing by 200,000 per day. A dashboard needs “rolling 365 days” aggregated. Full-refreshing 80M rows nightly is eight minutes of warehouse time and about 6 GB of SPICE. Incremental refresh with a 7-day lookback runs in 90 seconds and pulls only ~1.4M rows per refresh.

Dataset configuration: source is a Redshift query SELECT * FROM orders_fact WHERE order_date >= dateadd(day, -366, current_date). Primary key: order_id. Lookback: 7 days on order_date. Schedule: hourly (picks up late-arriving rows through the 7-day window). Ingestion bumps the existing SPICE dataset rather than replacing it.

Caveats: orders updated outside the 7-day window (returns, corrections) won’t propagate. Pair a weekly full refresh with the hourly incremental, so the weekly catches stragglers, cheap insurance at the cost of one weekly full-refresh window.

What’s worth remembering

SPICE trades freshness for concurrency and latency. Queries are fast and scale with QuickSight, not the warehouse. Data is as of the last refresh.
Direct query trades concurrency for freshness. Queries are live but scale with the source’s ability to handle them. Warehouse capacity becomes the dashboard’s concurrency ceiling.
Full vs incremental refresh is a cost and coverage trade. Full is simple and covers every row; incremental is cheap but only as recent as the lookback window. Combine them (full weekly + incremental hourly) when you need both coverage and cost control.
On-demand refresh via Lambda beats rigid schedules. When refresh should track an ETL event, use CreateIngestion from an EventBridge target. Freshness tracks the data, not the clock.
Direct query + result cache is the middle ground. 5 minutes to 24 hours of shared results across users. Softens “everyone clicks the same overview” without the freshness cost of SPICE.
Row-level security works in both modes. With SPICE, the permissions dataset has to be refreshed too. With direct query, RLS is enforced at query time against the source.
Cost is capacity-based for SPICE, warehouse-based for direct. SPICE is a known-cost buffer; direct query is warehouse time. Estimate both before picking.
Dataset size has practical limits. SPICE scales to around a billion rows per dataset; past that, consider aggregating upstream or breaking into multiple datasets with dashboard-level joins.

The correct answer per dashboard falls out of three questions: how fresh does the data need to be, how many concurrent viewers are expected, and are the queries repeated or unique-per-user. SPICE wins when staleness is tolerable and concurrency is high. Direct query wins when freshness is essential or queries are too unique to cache. Most real suites are a mix; don’t try to force every dashboard into one mode.

These posts are LLM-aided. Backbone, original writing, and structure by Craig. Research and editing by Craig + LLM. Proof-reading by Craig.