Choosing Between RA3, DC2, and Spectrum for Redshift Storage

January 19, 2028 · 15 min read

Data Engineer · DEA-C01 · part of The Exam Room

The situation

We operate a Redshift cluster that grew up alongside the company.

  • The historical core: 6 × dc2.8xlarge nodes, local-SSD storage, ~15 TB (6 × 2.56 TB per node), 80% full. Loaded by an ETL pipeline nightly. Serves the main warehouse queries.
  • A data lake in S3 that’s grown to 80 TB of Parquet, partitioned by day, catalogued in Glue. Some of it is cold (web logs over 90 days), some of it is lightly curated data the warehouse hasn’t absorbed, some is experimental data science sandbox.
  • A newer team that’s reaching for Redshift Serverless for ad-hoc analytics without operational overhead.

The tensions are visible. The DC2 cluster is running out of local storage; scaling compute for performance also means paying for storage we don’t need, and scaling storage means paying for compute we don’t use. The data lake is accessible via Athena but every cross-system join (say, “show me warehouse sales joined against S3 web traffic”) requires a bespoke pipeline. The Serverless team’s usage is spiky and under-forecast, and the DBA wants to avoid buying a second cluster for them.

The three questions on the table:

  • Should we migrate the DC2 cluster to RA3 and stop coupling storage to compute?
  • Should we query the lake from the warehouse via Spectrum instead of loading everything in?
  • Should Serverless replace or complement the provisioned cluster?

These are three different questions, and they interact. The three storage/compute shapes Redshift offers are RA3 (managed storage, compute scales independently), DC2 (local storage tied to node size), and Spectrum (S3 storage, Redshift compute reaches out).

What actually matters

Before reaching for a shape, it’s worth being explicit about what storage/compute relationships are actually on the table.

The first thing to name is whether storage and compute are bound together. When they are, growing storage means growing nodes, and growing nodes means paying for compute capacity you don’t need just because the data grew. When they’re decoupled, storage scales on its bytes and compute scales on its query load. A warehouse whose data volume grows faster than its query load wants decoupling; one whose query load is the binding constraint can live with either.

The second is where the data physically lives at query time. The fastest path is data sitting on local SSD attached to the compute node: microseconds to first byte, no network hop. A managed-storage tier keeps the bulk of the data on remote object storage and uses local SSD as a cache; warm queries stay fast, cold queries pay one network round-trip to rehydrate. Querying data directly out of an object store skips the warehouse-storage tier entirely; the latency floor rises but the data never has to be loaded.

The third is the boundary between compute clusters. A single cluster that handles ingest, ETL, and reads is the simplest shape. Once multiple teams want their own compute against the same data, the question is whether the platform can share a single physical storage tier across many compute workgroups, or whether each cluster needs its own copy. Sharing means one ingestion path and many readers; copying means ETL pipelines between clusters and the consistency bugs that come with them.

The fourth is how compute is provisioned and billed. Some shapes ask the team to size and run a permanent cluster (good for steady workloads where you know the load). Some hide node management behind elastic capacity units that scale with traffic (good for spiky or unforecasted workloads). The trade is “pay a predictable bill for known capacity” versus “pay per second of actual work, with a floor.”

The fifth is what each shape disqualifies you from. Some capabilities (cross-cluster data sharing, serverless consumer workgroups) only show up when the storage tier supports them. A choice that ties storage to compute today is also a choice not to share that data with another compute workgroup tomorrow without re-architecting.

What we’ll filter on

Distilling into filters we can score each shape against:

  1. Storage scaling (can storage grow without adding compute?)
  2. Compute elasticity (can compute scale independently of storage?)
  3. Query locality (is data queried from local fast storage or from S3?)
  4. Cross-cluster data sharing (can data be shared read-only across clusters?)
  5. Best-fit workload (what data shape does this suit?)

The Redshift-storage landscape

1. DC2 (Dense Compute). Local SSD, fixed per-node storage: dc2.large (160 GB/node) and dc2.8xlarge (2.56 TB/node). Fast query performance because all data sits on local SSD. Storage scales only by adding nodes (and therefore compute). Legacy; AWS has been encouraging migration to RA3 for years.

2. RA3 (Redshift Managed Storage). Separate compute and storage. Node sizes: ra3.large, ra3.xlplus, ra3.4xlarge, ra3.16xlarge. Local SSD is a cache (sized per node type); primary storage is RMS, S3-backed, scales automatically, billed per GB-month. Enables cross-cluster Data Sharing (including sharing through AWS Data Exchange); AQUA hardware acceleration on the larger sizes.

3. Spectrum. Not a node type; a query feature. CREATE EXTERNAL SCHEMA attaches a Glue catalog; SELECT against external tables pushes scan down to S3 via a separate fleet. Billed per-TB scanned (plus the Redshift cluster’s usual cost). Fits data that’s in S3 and shouldn’t be loaded into Redshift: cold data, experimental data, data that’s too big or too rarely queried to earn warehouse storage.

4. Redshift Serverless. RPU-based elastic capacity; no node management. Storage is RMS. Data Sharing between provisioned clusters and Serverless workgroups is native, so data loaded once into RMS can be read by multiple compute workgroups.

5. Data Sharing (cross-cluster). Not a separate shape but a capability: a “producer” cluster shares read-only access to RMS storage; “consumer” clusters query it directly. Lets you have one cluster loading data and another cluster (or Serverless workgroup) querying it, without ETL between them.

Side by side

Shape | Storage scaling | Compute elasticity | Query locality | Cross-cluster sharing | Best fit
DC2 | Coupled (add nodes) | Coupled | Local SSD | ✗ | Legacy; small cluster that fits in local storage
RA3 + RMS | Independent | Independent | Local SSD cache + RMS | ✓ (Data Sharing) | Default for new clusters; growing storage
Spectrum | N/A (S3) | Query-time fleet | S3 scanned per query | ✓ (via Glue catalog) | Cold/experimental data living in S3
Serverless | Independent (RMS) | Elastic (RPU) | Local cache + RMS | ✓ (Data Sharing) | Spiky or unpredictable workloads

Reading by use case:

  • Existing DC2 cluster running out of storage: migrate to RA3. The compute-storage decoupling is the reason to do it; upgrade path is well-trodden.
  • Mature warehouse of a few TB, steady workload: RA3 provisioned. Best cost for a steady predictable load.
  • Spiky ad-hoc workload: Serverless. Don’t buy a cluster for a team whose demand you can’t forecast.
  • 80 TB of S3 data that’s lightly queried from the warehouse: Spectrum. Don’t load it; query it where it lives.
  • Multi-cluster story: central ETL, several analytical consumers: Data Sharing on RMS. Producer cluster loads; consumers query.

The three shapes, laid out

[Figure: Three shapes of Redshift storage / compute. Panels: DC2 (legacy), compute + storage fused per node; RA3 + Redshift Managed Storage, compute scales independently of storage; Spectrum, query S3 without loading to Redshift.]
DC2 fuses compute and storage; RA3 decouples them via RMS and enables data sharing and Serverless consumers; Spectrum keeps data in S3 and queries it via a separate fleet. The three are not mutually exclusive; a well-designed stack uses each for what it fits.

The three questions, answered

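Should we migrate DC2 to RA3? Yes. The cluster is at 80% local storage with growth still happening. RA3 lets storage grow without paying for more compute, which is what the cluster actually needs. The migration paths: elastic resize (typically minutes of unavailability, and able to change node type as well as node count for supported configurations), classic resize (longer, with the cluster read-only or unavailable for part of it), or restoring a snapshot into a new RA3 cluster. Data Sharing depends on RMS, so it is available on RA3 and Serverless but not on DC2; migrating unblocks sharing the warehouse with Serverless consumers too.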
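Before resizing, it helps to know which tables account for that 80%. A minimal sketch, assuming the standard SVV_TABLE_INFO system view (which reports size in 1 MB blocks and only covers non-empty tables in the current database); the 20-row limit is an arbitrary choice:

```sql
-- Largest tables in the current database, biggest first.
-- size is in 1 MB blocks; pct_used is the share of available space the table occupies.
SELECT
    "schema",
    "table",
    size                     AS size_mb,
    ROUND(size / 1024.0, 1)  AS size_gb,
    tbl_rows,
    pct_used
FROM svv_table_info
ORDER BY size DESC
LIMIT 20;
```

Tables that are large but rarely read are the first candidates to leave in (or move back to) S3 rather than carry over to RA3.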

Should the warehouse reach into S3 via Spectrum? Yes, for the correct data. The 80 TB of S3 data isn’t all equally valuable to Redshift. Cold web logs that get queried once a month for compliance are exactly Spectrum’s shape: leave them in S3, create external tables, pay per TB scanned for the rare query. Actively-joined dimensional data might still be worth loading into the warehouse. The split: decide per dataset, not as a blanket policy.
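For a cold dataset that is not yet in the Glue catalog, an external table can be declared from the Redshift side. A sketch under assumptions: the external schema lake_cold, the table and column names, and the bucket path are all hypothetical, and the schema is assumed to already exist via CREATE EXTERNAL SCHEMA.

```sql
-- Hypothetical cold web-log table; names, columns, and S3 paths are illustrative only.
-- CREATE EXTERNAL TABLE cannot run inside a transaction block.
CREATE EXTERNAL TABLE lake_cold.web_logs (
    request_id   VARCHAR(64),
    user_id      BIGINT,
    url          VARCHAR(2048),
    status_code  INT,
    logged_at    TIMESTAMP
)
PARTITIONED BY (log_date DATE)
STORED AS PARQUET
LOCATION 's3://example-bucket/web-logs/';

-- Partitions must be registered before they are visible to queries.
ALTER TABLE lake_cold.web_logs
ADD IF NOT EXISTS PARTITION (log_date = '2027-01-01')
LOCATION 's3://example-bucket/web-logs/log_date=2027-01-01/';
```

Nothing is copied: the table is metadata over files that stay in S3, which is why the per-dataset decision can be revisited later without a reload.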

Should Serverless replace the provisioned cluster? Depends. For the new team with spiky workloads, Serverless is the correct answer: create a Serverless workgroup, share the RA3 cluster’s RMS via Data Sharing, and the team queries the warehouse data without owning a cluster. For the provisioned cluster’s steady nightly ETL, Serverless isn’t cheaper than a right-sized RA3 provisioned cluster; keep it as-is.
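The sharing step can be sketched in SQL. The datashare name, the consumer database name, and the two namespace placeholders are assumptions for illustration; the warehouse schema and dim_customer table are borrowed from the worked query in this post.

```sql
-- On the producer (the RA3 provisioned cluster).
CREATE DATASHARE warehouse_share;
ALTER DATASHARE warehouse_share ADD SCHEMA warehouse;
ALTER DATASHARE warehouse_share ADD ALL TABLES IN SCHEMA warehouse;
-- Replace the placeholder with the Serverless workgroup's namespace ID.
GRANT USAGE ON DATASHARE warehouse_share TO NAMESPACE '<consumer-namespace-id>';

-- On the consumer (the Serverless workgroup).
CREATE DATABASE warehouse_shared
FROM DATASHARE warehouse_share
OF NAMESPACE '<producer-namespace-id>';

-- Shared objects are addressed as database.schema.table and are read-only.
SELECT country, COUNT(*) AS customers
FROM warehouse_shared.warehouse.dim_customer
GROUP BY country;
```

The consumer pays for its own RPUs while reading the producer's data in place, so the nightly ETL on the provisioned cluster is not competing with the new team's ad-hoc queries.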

The stack: RA3 provisioned as the producer (ETL loads here); Serverless workgroup(s) as consumers (teams with spiky read workloads); Spectrum external tables for cold S3 data queryable from either. One logical data estate, three compute shapes matched to workload.

A worked Spectrum query

An events_s3 external schema attached to the Glue catalog:

CREATE EXTERNAL SCHEMA events_s3
FROM DATA CATALOG
DATABASE 'acme_warehouse_external'
IAM_ROLE 'arn:aws:iam::111122223333:role/redshift-spectrum-role';

Glue already catalogs events_s3.web_sessions (80 billion rows across 3 years, partitioned by event_date). A query joining S3 events against warehouse dimensions:

SELECT
    d.country,
    COUNT(DISTINCT e.session_id) AS sessions,
    COUNT(DISTINCT e.user_id) AS users
FROM events_s3.web_sessions e
JOIN warehouse.dim_customer d
    ON e.user_id = d.user_id
WHERE e.event_date BETWEEN '2027-10-01' AND '2027-10-31'
  AND d.segment = 'enterprise'
GROUP BY d.country
ORDER BY sessions DESC;

What happens: the query planner sees the external table and generates a Spectrum sub-plan that pushes the event_date filter and the column projection down to the Spectrum fleet. Because event_date is the partition column, the filter prunes partitions: the fleet reads only October’s Parquet files, projects only the columns the query needs (session_id, user_id), and returns the filtered rows. Spectrum can push aggregation down in some queries, but here the grouping key (country) lives in the warehouse table, so the grouping has to happen after the join. The main cluster joins the returned rows against dim_customer (a small dim, in-warehouse), applies the segment = 'enterprise' filter, and computes the distinct counts per country.
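One way to check how much of this actually gets pushed down is to ask for the plan. A sketch only: it reuses the query above unchanged, and the exact plan text depends on the cluster and data.

```sql
-- Show the plan without running the query.
EXPLAIN
SELECT
    d.country,
    COUNT(DISTINCT e.session_id) AS sessions,
    COUNT(DISTINCT e.user_id) AS users
FROM events_s3.web_sessions e
JOIN warehouse.dim_customer d
    ON e.user_id = d.user_id
WHERE e.event_date BETWEEN '2027-10-01' AND '2027-10-31'
  AND d.segment = 'enterprise'
GROUP BY d.country
ORDER BY sessions DESC;
```

To our understanding, plan steps prefixed with S3 (for example an S3 scan step on web_sessions) are the parts executed by the Spectrum layer; everything above them runs on the cluster. If the event_date condition does not show up against the S3 scan, the partition filter is probably not being applied.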

The data in S3 is scanned once (for October), and only the columns queried. The warehouse storage isn’t asked to hold 80B rows it doesn’t need. The query cost: Spectrum per-TB-scanned for the projected columns of October’s partitions, plus the usual cluster cost.
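The scan volume can be checked after the fact rather than assumed. A minimal sketch for a provisioned cluster, run in the same session right after the Spectrum query; the 5.0 USD-per-TB rate is an assumption to replace with the current regional price, and Serverless exposes this information through different system views.

```sql
-- Bytes and rows Spectrum scanned for the most recent query in this session.
-- SVL_S3QUERY_SUMMARY has one row per segment, so sum across them.
SELECT
    query,
    SUM(s3_scanned_rows)        AS scanned_rows,
    SUM(s3_scanned_bytes)       AS scanned_bytes,
    SUM(s3query_returned_rows)  AS returned_rows,
    SUM(files)                  AS files_read,
    -- 1099511627776 = 1024^4 bytes per TB; 5.0 is an assumed price per TB scanned.
    SUM(s3_scanned_bytes)::FLOAT / 1099511627776 * 5.0 AS est_cost_usd
FROM svl_s3query_summary
WHERE query = pg_last_query_id()
GROUP BY query;
```

A large gap between scanned_rows and returned_rows is a good sign (filtering happened in the Spectrum layer); a scanned_bytes figure close to the whole table's size means partition pruning or column projection is not doing its job.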

What’s worth remembering

  1. DC2 couples storage to compute; RA3 decouples them. DC2 is legacy; RA3 with Redshift Managed Storage is the contemporary default. Migrate DC2 clusters to RA3 unless there’s a specific reason not to.
  2. Redshift Managed Storage is S3-backed and autoscales. Per-GB billing for actual usage. Local SSD on RA3 nodes is a cache, not primary storage; working sets are cached automatically.
  3. Spectrum queries S3 without loading. Separate compute fleet reads Parquet/ORC/Avro/JSON via Glue catalog. Per-TB-scanned billing; use for cold or experimental data that shouldn’t earn warehouse storage.
  4. Data Sharing enables multi-cluster, one-storage. A producer cluster shares RMS read-only to consumer clusters (provisioned or Serverless). One copy of data; multiple compute workgroups.
  5. Serverless packages RMS-backed compute without node management. RPU-based elastic capacity. Fits spiky or unpredictable workloads; doesn’t beat provisioned RA3 on cost for steady-state loads.
  6. Predicate pushdown is the Spectrum performance win. Partitioning, columnar formats, and selective queries keep per-TB scan costs manageable. Full-table scans against unpartitioned data are where Spectrum gets expensive.
  7. The three shapes coexist. A mature Redshift stack often has RA3 provisioned for the hot warehouse, Serverless workgroups for teams, Spectrum for the cold S3 tail. They’re not mutually exclusive; each fits a different shape.
  8. IAM and Glue catalog plumbing for Spectrum matters. External schemas reference Glue databases; the cluster’s IAM role needs S3 read and Glue catalog access; permissions via Lake Formation add per-column/row constraints if configured.
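For points 6 and 8, a quick sanity check is to confirm that the cluster can see the external table’s partitions at all. A sketch assuming the events_s3.web_sessions table from the worked query and the SVV_EXTERNAL_PARTITIONS system view; an empty result usually points at either missing partitions in the catalog or missing Glue permissions on the IAM role.

```sql
-- List a sample of the partitions Redshift sees for the external table.
-- "values" is quoted because it collides with a SQL keyword.
SELECT
    schemaname,
    tablename,
    "values",
    location
FROM svv_external_partitions
WHERE schemaname = 'events_s3'
  AND tablename  = 'web_sessions'
ORDER BY location
LIMIT 20;
```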

The question “where does the data live” has three Redshift-native answers: in the warehouse on local cache backed by RMS (RA3), in S3 queried on demand (Spectrum), or on legacy local SSD (DC2). Matching each dataset to its correct shape, rather than forcing everything into one tier, is what keeps a Redshift estate from becoming either too expensive or too slow.

These posts are LLM-aided. Backbone, original writing, and structure by Craig. Research and editing by Craig + LLM. Proof-reading by Craig.