DynamoDB On-Demand or Provisioned Capacity

The situation

A team owns three DynamoDB tables across different services:

sessions: user session state. 24/7 traffic pattern, ~400 read units and ~80 write units per second during the day, dropping to ~30 reads / 5 writes overnight. Predictable daily shape for two years.
events-stream: a write-heavy landing table for IoT device telemetry. ~2,000 writes per second average, spikes to 12,000 during morning device wake-ups, quiet overnight. Reads are minimal (downstream processors use DynamoDB Streams, not table queries). Growing 20% quarter-over-quarter.
analytics-raw: data-science ad-hoc access. Usage is zero 95% of the time, with bursts of 5,000-15,000 read units during analysis sessions that might last minutes to hours. Nobody can predict when the next burst lands.

All three are currently on-demand. The last monthly bill: sessions $840, events-stream $3,200, analytics-raw $180. The finance team asked whether provisioned capacity would be cheaper, and the engineering lead asked the more interesting question: which of the three should be on provisioned, given their traffic shapes?

What actually matters

Before pricing the options, it’s worth being precise about what each DynamoDB billing mode actually commits to.

The first thing is on-demand billing per request. You pay per read or write request unit consumed. No commitment, no minimum; the table scales to zero when idle and handles bursts without pre-provisioning. The per-request rate is several times the provisioned rate for equivalent capacity: that is the premium you pay for “scale by default, no thinking.”

The second thing is provisioned billing per capacity unit. You commit to a sustained read and write throughput (units per second) that the table can carry continuously, and you are charged per unit-hour whether or not that capacity is used. Auto Scaling can adjust those values up and down between min/max bounds based on utilisation. Capacity is metered in units that correspond to a fixed-size read or write at a given consistency model; that is a useful detail when sizing, but not the design choice.

The third thing is reserved capacity for provisioned tables. This is a one-year or three-year commitment to a baseline amount of capacity in a Region in exchange for a meaningful discount. It is useful for tables whose baseline is truly baseline: if auto-scaling never drops below a certain floor, reserving that floor saves real money.

The fourth thing is when per-request wins, economically. The crossover is roughly at a table’s sustained utilisation: if the table’s average capacity usage is above ~18% of its peak, provisioned tends to be cheaper (the 7× per-request premium outweighs the waste of idle provisioned capacity); if it’s below, per-request wins. This is a rough rule; the exact crossover depends on how spiky the traffic is and whether auto-scaling can chase the shape.
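The crossover can be sketched as a one-line ratio. This is a back-of-envelope model, not a pricing calculator: it assumes provisioned capacity is sized for peak, takes the 7× per-request premium from the text, and uses a hypothetical 1.2× overhead factor for auto-scaling headroom. The function name `breakeven_util` is made up for illustration.

```shell
# breakeven_util PREMIUM [OVERHEAD]
#   PREMIUM:  on-demand cost per unit of work relative to provisioned (7 in the text)
#   OVERHEAD: extra provisioned cost from headroom and scale lag (assumed, default 1)
# Prints the average-to-peak utilisation (%) at which both modes cost the same,
# assuming the provisioned allocation is sized for peak.
breakeven_util() {
    awk -v p="$1" -v o="${2:-1}" 'BEGIN { printf "%.1f\n", 100 * o / p }'
}

breakeven_util 7        # no overhead: prints 14.3
breakeven_util 7 1.2    # with the assumed 1.2x overhead: prints 17.1
```

Under these assumptions the break-even lands between roughly 14% and 17%, which is consistent with treating ~18% as a rough rule rather than a precise threshold.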

The fifth thing is burst accommodation. Provisioned tables that spike beyond their capacity get throttled (reads and writes return ProvisionedThroughputExceededException), though DynamoDB banks up to 5 minutes (300 seconds) of unused capacity as burst capacity. With a full bucket that absorbs roughly 5 minutes at 2× the provisioned rate, at no charge but with no guarantee that the bucket is full when you need it. On-demand tables can absorb up to double their previous peak immediately, but a sudden 10× increase on a cold table may be throttled at first: on-demand throttling is “soft” (the service ramps capacity internally) but not instant. For bursts that matter, either run provisioned with auto-scaling and enough headroom, or pre-warm a table that sees sudden large bursts.
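To make the burst-bucket arithmetic concrete, here is a small sketch. It assumes the bucket is completely full (300 seconds of unused capacity banked) and that the spike rate is constant; real tables rarely meet both conditions, so treat the result as a best case. `burst_seconds` is a hypothetical helper name.

```shell
# burst_seconds PROVISIONED SPIKE [BANKED_SECONDS]
# How long a full burst bucket can cover a spike above provisioned capacity.
# Banked units = BANKED_SECONDS * PROVISIONED; they drain at (SPIKE - PROVISIONED) per second.
burst_seconds() {
    awk -v p="$1" -v s="$2" -v b="${3:-300}" 'BEGIN {
        if (s + 0 <= p + 0) { print "no burst needed"; exit }
        printf "%.0f\n", b * p / (s - p)
    }'
}

burst_seconds 500 1000   # 2x spike on a full bucket: prints 300
burst_seconds 500 3000   # 6x spike: prints 60
```

The 2× case reproduces the "about five minutes" figure; a 6× spike drains the same bucket in about a minute, which is why large spikes need real capacity rather than burst credit.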

The sixth thing is that switching modes is rate-limited. DynamoDB caps how often a table can change billing mode within a 24-hour window (historically one switch per day; check the current quota before relying on it). Switching provisioned → on-demand needs no extra parameters; switching on-demand → provisioned requires specifying initial RCU/WCU, which you then grow via auto-scaling. It is not free to flip back and forth, but it is available as a knob.
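For reference, the switch back to on-demand is a single `update-table` call. The sketch below wraps it in a hypothetical `run` helper that only prints the commands unless `DRY_RUN=0` is set, so it can be read and tested without AWS credentials; the table name `sessions` is taken from the scenario.

```shell
# Dry-run by default: print each command instead of calling AWS.
# Set DRY_RUN=0 to execute for real (requires the AWS CLI and credentials).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Switch a table to on-demand, then read back its billing mode.
switch_to_on_demand() {
    run aws dynamodb update-table --table-name "$1" --billing-mode PAY_PER_REQUEST
    run aws dynamodb describe-table --table-name "$1" \
        --query Table.BillingModeSummary.BillingMode --output text
}

switch_to_on_demand sessions
```

The `describe-table` query is there because the mode change is asynchronous from the caller's point of view; checking the reported mode before changing anything else avoids acting on a half-applied switch.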

What we’ll filter on

Filters for each table:

Predictability of demand: can we commit to a capacity number?
Sustained utilisation: what percentage of peak capacity is the average?
Spike tolerance: does the table see sudden large traffic increases?
Scales to zero: are there long idle periods?
Cost at current shape: actual monthly bill in each mode.
Operational overhead: auto-scaling tuning, reserved-capacity management.

The billing-mode landscape

On-demand. Per-request pricing. No commitment. Scales to zero. Handles moderate bursts automatically; very large bursts over a cold table may throttle briefly. Default for new tables since 2024. Simplest operationally; most expensive per unit of work.
Provisioned with no auto-scaling. Pick RCU/WCU values manually. Cheap if sustained utilisation is high. Throttles if traffic exceeds the provisioned rate plus burst bucket. Requires human attention to adjust as the workload grows.
Provisioned with auto-scaling. RCU/WCU values float between min and max bounds based on target utilisation (default 70%). Auto-scaling reads CloudWatch metrics, issues UpdateTable calls every few minutes, and keeps the table roughly correctly-sized. The usual operational choice among the provisioned options; close to on-demand in effective shape, at provisioned prices.
Provisioned + Reserved Capacity. Commit to a baseline in RCU/WCU for 1 or 3 years; get ~30-54% off that baseline. Auto-scaling still applies above the reserved baseline, billed at standard (non-reserved) provisioned rates. Good for tables with genuinely-fixed minimum capacity.
On-demand + Reserved Capacity. Reserved Capacity doesn’t apply to on-demand. You can’t mix.

Side by side

Option	Predictability required	Util threshold	Spike tolerance	Scales to zero	Cost vs provisioned	Ops overhead
On-demand	✗ (any)	Good at <18%	Soft throttling on huge bursts	✓	~7× per unit	Minimal
Provisioned no auto-scaling	✓✓	Good at >30% sustained	Throttles above rate	✗	1× (baseline)	Adjust manually
Provisioned + auto-scaling	✓	Good at >18% sustained	Throttles during scale lag	✗ (min > 0)	1.2× (auto-scale premium)	Tuning min/max, target
Provisioned + Reserved	✓✓	Good at >30% sustained	Same as above-baseline	✗	0.5-0.7× (discounted)	+ RC management

Reading this against the three tables:

sessions: 24/7, predictable shape, 400 peak / 30 trough reads. Sustained utilisation around 40-50% against a “peak-sized” provisioned allocation. Provisioned with auto-scaling, likely with RC covering the baseline 100 RCUs and 30 WCUs that never go away. Expected savings: 50-60%.
events-stream: 12,000 peak writes, ~2,000 average, quiet overnight. Sustained utilisation ~15-20%. Provisioned with auto-scaling is probably cheaper than on-demand but not by a huge margin; the 12,000 WCU spikes during morning wake-ups are the operational risk (auto-scaling needs to chase). Worth piloting either way.
analytics-raw: zero 95% of the time, unpredictable bursts. On-demand is the honest answer. Provisioning anything means paying for capacity that’s used 5% of the time.
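The reading above can be sketched as a utilisation check. The 15% and 20% band edges are assumptions chosen to bracket the rough 18% crossover, the 180 reads/s average for sessions is a hypothetical figure inside the stated 40-50% range, and the analytics-raw average assumes a 5% duty cycle at the largest 15,000-unit burst. `classify` is an illustrative name, not a real tool.

```shell
# classify AVG PEAK
# Prints average-to-peak utilisation and a rough billing-mode hint.
# Band edges (15% and 20%) are assumptions around the ~18% rule of thumb.
classify() {
    awk -v a="$1" -v p="$2" 'BEGIN {
        u = 100 * a / p
        if (u < 15)       mode = "on-demand"
        else if (u <= 20) mode = "borderline"
        else              mode = "provisioned"
        printf "%.1f%% %s\n", u, mode
    }'
}

classify 180 400      # sessions reads (hypothetical average): 45.0% provisioned
classify 2000 12000   # events-stream writes: 16.7% borderline
classify 750 15000    # analytics-raw reads at a 5% duty cycle: 5.0% on-demand
```

The hint is only a first filter; spike shape and idle periods still decide the borderline cases.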

Matching traffic shape to billing mode

Same decision framework, three different shapes. sessions has enough sustained floor to reward provisioned plus reserved capacity; events-stream is a borderline case worth piloting; analytics-raw's shape is exactly what on-demand was built for.

The `sessions` migration in depth

Target state. Provisioned capacity, auto-scaling between 100/30 (min) and 500/120 (max) for RCUs/WCUs. Reserved Capacity covering the minimum: 100 RCU and 30 WCU for a 1-year commitment.

Setting min/max. Min is the floor the table stays at even at 03:00 when traffic is quiet. Setting min too low means auto-scaling has to scale up every morning and throttles briefly during the ramp; setting min too high means paying for idle capacity. The sessions trough is around 30 reads/5 writes per second, so mins of 100 RCU / 30 WCU give roughly 3× headroom on reads and 6× on writes for the quiet period. Max needs to be above the peak plus some buffer: peak is 400/80, so a max of 500/120 gives 25% headroom on reads and 50% on writes.
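The headroom figures are simple ratios, sketched below. The sketch assumes one request per second consumes one capacity unit, which only holds for small items at the consistency level the units are defined for; `headroom_pct` is a hypothetical helper.

```shell
# headroom_pct LIMIT DEMAND
# How far LIMIT sits above DEMAND, as a percentage of DEMAND.
headroom_pct() {
    awk -v l="$1" -v d="$2" 'BEGIN { printf "%.0f\n", 100 * (l - d) / d }'
}

headroom_pct 500 400   # read max vs read peak:      prints 25
headroom_pct 120 80    # write max vs write peak:    prints 50
headroom_pct 100 30    # read min vs read trough:    prints 233
headroom_pct 30 5      # write min vs write trough:  prints 500
```

A min that sits 233% above the trough is the same thing as a floor of about 3.3× the quiet-period load.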

Target utilisation. Default 70% means auto-scaling targets 70% utilisation; when actual utilisation stays above 70% for ~2 minutes, it scales up. Lower values (50%) mean more headroom, higher cost, and lower throttling risk during spikes; higher values (85%) mean a tighter fit and more risk. 70% is a reasonable default for smooth daily shapes.
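A simplified model of what target tracking aims for: provisioned capacity equal to consumed capacity divided by the target, clamped to the min/max bounds. The real policy also applies cooldowns and alarm evaluation periods, so this is a sketch of the steady state only; `desired_capacity` is a hypothetical name.

```shell
# desired_capacity CONSUMED TARGET_PCT MIN MAX
# Steady-state capacity a target-tracking policy would aim for (simplified model).
desired_capacity() {
    consumed=$1
    target=$2
    min=$3
    max=$4
    # Integer ceiling of consumed / (target / 100).
    want=$(( (consumed * 100 + target - 1) / target ))
    if [ "$want" -lt "$min" ]; then want=$min; fi
    if [ "$want" -gt "$max" ]; then want=$max; fi
    echo "$want"
}

desired_capacity 30 70 100 500    # overnight trough: floor applies, prints 100
desired_capacity 280 70 100 500   # mid-day average: prints 400
desired_capacity 400 70 100 500   # peak: wants 572, capped at max, prints 500
```

Under this model the 400-unit peak against a max of 500 runs at 80% utilisation rather than 70%, so the max bound, not the target, is what limits headroom at peak.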

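Reserved Capacity purchase. RC is bought in the DynamoDB console or via API. For sessions, buy 100 RCU and 30 WCU for 1 year; the payment options trade upfront cash for discount depth (no upfront gives a partial discount, all-upfront a deeper one), so the choice depends on cash-flow preference. Confirm the minimum purchase size and the available payment options on the current pricing page before committing, since they constrain how closely a reservation can match a small floor such as 30 WCU. RC applies to any provisioned capacity in the Region up to the reserved amount: the commitment is regional, not table-specific, so if another table needs 50 RCU of provisioned capacity, that 50 draws from the same RC pool.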
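The regional pooling can be sketched as plain subtraction: sum the provisioned units of every table in the Region, subtract the reservation, and whatever is left bills at the standard provisioned rate. `uncovered_units` is a hypothetical helper, and the 280 and 50 figures are illustrative values for two tables sharing one pool.

```shell
# uncovered_units RESERVED PROVISIONED...
# Units left over after the regional reservation is applied across all tables.
uncovered_units() {
    reserved=$1
    shift
    total=0
    for units in "$@"; do
        total=$(( total + units ))
    done
    extra=$(( total - reserved ))
    if [ "$extra" -lt 0 ]; then extra=0; fi
    echo "$extra"
}

uncovered_units 100 280 50   # 330 provisioned against 100 reserved: prints 230
uncovered_units 100 60       # provisioned below the reservation: prints 0
```

The second case is the one to avoid: the reservation is paid for in full even though only 60 units are in use.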

Switching the mode.

aws dynamodb update-table \
    --table-name sessions \
    --billing-mode PROVISIONED \
    --provisioned-throughput ReadCapacityUnits=100,WriteCapacityUnits=30

Initial provisioned capacity is the minimum; auto-scaling policy comes next:

aws application-autoscaling register-scalable-target \
    --service-namespace dynamodb \
    --resource-id table/sessions \
    --scalable-dimension dynamodb:table:ReadCapacityUnits \
    --min-capacity 100 --max-capacity 500

aws application-autoscaling put-scaling-policy \
    --service-namespace dynamodb \
    --resource-id table/sessions \
    --scalable-dimension dynamodb:table:ReadCapacityUnits \
    --policy-name sessions-read-autoscaling \
    --policy-type TargetTrackingScaling \
    --target-tracking-scaling-policy-configuration '{
      "TargetValue": 70.0,
      "PredefinedMetricSpecification": {"PredefinedMetricType": "DynamoDBReadCapacityUtilization"}
    }'

Same for WriteCapacityUnits. The policy watches ConsumedReadCapacityUnits / ProvisionedReadCapacityUnits and adjusts provisioned to keep utilisation near 70%.
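For completeness, the write-side equivalent might look like the sketch below, using the 30/120 bounds from the target state. The `run` wrapper is a hypothetical dry-run helper that prints the commands by default, and the policy name `sessions-write-autoscaling` is an assumed name mirroring the read policy.

```shell
# Dry-run by default: print each command instead of calling AWS.
# Set DRY_RUN=0 to execute for real (requires the AWS CLI and credentials).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# register_write_scaling TABLE MIN MAX TARGET
register_write_scaling() {
    table=$1
    min=$2
    max=$3
    target=$4
    policy="{\"TargetValue\": $target, \"PredefinedMetricSpecification\": {\"PredefinedMetricType\": \"DynamoDBWriteCapacityUtilization\"}}"
    run aws application-autoscaling register-scalable-target \
        --service-namespace dynamodb \
        --resource-id "table/$table" \
        --scalable-dimension dynamodb:table:WriteCapacityUnits \
        --min-capacity "$min" --max-capacity "$max"
    run aws application-autoscaling put-scaling-policy \
        --service-namespace dynamodb \
        --resource-id "table/$table" \
        --scalable-dimension dynamodb:table:WriteCapacityUnits \
        --policy-name "$table-write-autoscaling" \
        --policy-type TargetTrackingScaling \
        --target-tracking-scaling-policy-configuration "$policy"
}

register_write_scaling sessions 30 120 70.0
```

Any Global Secondary Indexes on the table would need their own scalable targets under the `dynamodb:index:*` dimensions; that is not shown here.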

Monitoring during the transition. Set CloudWatch alarms on ReadThrottleEvents and WriteThrottleEvents (both should stay at or near zero), and watch ConsumedReadCapacityUnits against the provisioned value to verify that auto-scaling is chasing the shape. The first month after the change is the budget-watching month.
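One way to set up the throttle alarms is sketched below. The alarm names, the 60-second period, and the zero threshold are assumptions for illustration, and no notification action is attached, so the alarms only change state. The `run` wrapper is a hypothetical dry-run helper that prints the commands by default.

```shell
# Dry-run by default: print each command instead of calling AWS.
# Set DRY_RUN=0 to execute for real (requires the AWS CLI and credentials).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# create_throttle_alarms TABLE
# One alarm per throttle metric; fires when any throttled request is seen in a minute.
create_throttle_alarms() {
    table=$1
    for metric in ReadThrottleEvents WriteThrottleEvents; do
        run aws cloudwatch put-metric-alarm \
            --alarm-name "$table-$metric" \
            --namespace AWS/DynamoDB \
            --metric-name "$metric" \
            --dimensions "Name=TableName,Value=$table" \
            --statistic Sum --period 60 --evaluation-periods 1 \
            --threshold 0 --comparison-operator GreaterThanThreshold \
            --treat-missing-data notBreaching
    done
}

create_throttle_alarms sessions
```

Treating missing data as not breaching matters here because DynamoDB emits no throttle data points when nothing is throttled.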

A worked week

We flip sessions to provisioned on a Monday morning. The bill watch:

Before (on-demand week, priced at the daytime rate as an upper bound):
  Read:  400 RPS × 604,800 s = ~242M reads × $0.25/M = ~$60.50
  Write: 80 WPS × 604,800 s  = ~48M writes × $1.25/M = ~$60.50
  Request charges: ~$121/week at most
  Billed total: ~$200/week -> ~$840/month (includes charges beyond these two lines)

After (provisioned + auto-scaling + RC):
  Provisioned RCU: averaged ~280 (auto-scaled)
    100 RC-covered + 180 at the standard provisioned rate
    RC cost: reserved upfront, amortised ~$0.000004/RCU-hour
    Extra provisioned: 180 × 720 hr × $0.00013 = $16.85/month
  Provisioned WCU: averaged ~80
    30 RC-covered + 50 at the standard provisioned rate
    Extra: 50 × 720 hr × $0.00065 = $23.40/month
  + RC amortised: ~$80/month across the shared RC pool
  Total: ~$320/month (billed; the capacity lines above account for only part of it)

The monthly bill drops from $840 to ~$320. Throttling stays at or near zero; the auto-scaling policy adjusts capacity every few minutes, keeping utilisation in the 60-75% band. One operational caveat: on the first Monday morning after the change, traffic ramped faster than auto-scaling could react (users came back faster than CloudWatch’s metric period could register). A few seconds of minor throttling appeared on the metrics; users saw nothing because the AWS SDK clients retry throttled requests automatically. We lowered the target utilisation to 60% to give more headroom during the morning ramp.
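The per-line arithmetic in a worked week like this is easy to script. The sketch below assumes list prices of $0.25 and $1.25 per million on-demand read and write request units and $0.00013 and $0.00065 per provisioned RCU-hour and WCU-hour; these are assumptions, since rates vary by Region and change over time. It prices requests at a constant rate, so the on-demand figures are upper bounds for a table with a daily trough. Function names are hypothetical.

```shell
# on_demand_week RATE_PER_SEC PRICE_PER_MILLION
# Request charges for one week at a constant request rate.
on_demand_week() {
    awk -v r="$1" -v p="$2" 'BEGIN { printf "%.2f\n", r * 604800 / 1e6 * p }'
}

# provisioned_month UNITS PRICE_PER_UNIT_HOUR [HOURS]
# Capacity charges for one month (720 hours by default).
provisioned_month() {
    awk -v u="$1" -v p="$2" -v h="${3:-720}" 'BEGIN { printf "%.2f\n", u * h * p }'
}

on_demand_week 400 0.25        # reads at 400/s:  prints 60.48
on_demand_week 80 1.25         # writes at 80/s:  prints 60.48
provisioned_month 180 0.00013  # 180 RCU above the reservation: prints 16.85
provisioned_month 50 0.00065   # 50 WCU above the reservation:  prints 23.40
```

Running real CloudWatch averages through the same two functions, instead of peak rates, gives a closer estimate before committing to a mode.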

What’s worth remembering

On-demand vs provisioned is a commitment question. Per-request billing buys simplicity; provisioned billing buys discount in exchange for telling DynamoDB what to expect.
Crossover is roughly 18% sustained utilisation. Below that, on-demand usually wins; above, provisioned + auto-scaling usually wins. Exact numbers vary; run your actual traffic through both calculators before flipping.
Auto-scaling is the usual operational shape for provisioned. Min/max bounds, target utilisation (70% is the default; 60% if spikes are sudden). The policy reacts to CloudWatch metrics; expect ~2-minute reaction time.
Reserved Capacity discounts the baseline. 1- or 3-year commits to fixed RCU/WCU; up to ~54% off. Only worth buying if that baseline is truly always there: if auto-scaling drops provisioned capacity below the reserved amount, you still pay for the full reservation.
Switching modes is a rate-limited knob. DynamoDB caps how often a table can change billing mode within a 24-hour window (historically once per day). Chasing the traffic shape with mode switches is not practical; pick the mode that fits the shape for a quarter at a time.
Bursts on on-demand can still throttle briefly. On-demand handles routine bursts automatically but a cold table seeing a 10× jump may throttle as DynamoDB internally ramps. For workloads that need guaranteed burst behaviour, provisioned with auto-scaling and a higher max is more predictable.
DynamoDB Streams, TTL, and Global Secondary Indexes each have their own cost profile. GSIs consume their own provisioned capacity; Streams reads are billed separately; TTL deletes don’t consume WCUs but do show up in metrics. Budget for all of them.
The “shape” is usually more informative than the average. A flat 200 RCU average and a spiky 200 RCU average behave very differently under provisioned mode. Plot the traffic before picking.

Paying by the request is the correct answer when the traffic is unpredictable or sparse; paying for the seat is the correct answer when it isn’t. Three tables, three shapes, three different billing modes: that’s the honest answer, not “provisioned is cheaper” or “on-demand is simpler.” Fit the pricing mode to the shape each table actually has.