The situation
We’re running three workloads on EC2 and the finance partner has asked the question finance partners eventually ask: can we get this bill down without losing anything?
- A web tier: roughly 40 m6i.large instances behind an ALB in eu-west-1, autoscaling between 30 and 60 depending on time of day. Three years of traffic data says the daily shape is going to keep looking like this. Somebody is always on call, so “always on” means always on.
- A nightly ETL: around 20 c6i.xlarge instances that spin up at 01:00 UTC, crunch a day of transactions into the warehouse, and shut down around 04:00. Finishing by 06:00 is the SLA. The pipeline already checkpoints between stages, so a killed instance means a restart, not a missed night.
- A dev and demo fleet: some number of t3.medium instances that engineers bring up for prototypes, experiments, and the occasional customer demo. Last month was 200 hours. The month before was 20. The quarter-end month was 2,000. Nobody is forecasting this; nobody could.
Everything runs on-demand today, billed at sticker price. The question isn’t whether that’s wrong (on-demand is the correct answer for something); it’s whether it’s correct for all three.
What actually matters
Before reaching for a pricing page, it’s worth asking what we’re actually trading.
The core trade in AWS pricing is commitment in exchange for discount. We can pay a premium hourly rate to commit to nothing, or we can promise AWS a specific amount of compute for one or three years and get a discount, sometimes a very big one. The discount is AWS buying predictability from us; predictability is worth money to them because they plan data-centre capacity years ahead. The deeper the discount we want, the more precisely we have to commit.
So the first thing to ask about each workload is: how predictable is it, really? The web tier has three years of consistent history; if we can’t commit to that workload, we can’t commit to anything. The ETL runs every night at the same hours on the same instance family. The dev fleet is the opposite: we have no idea what next quarter looks like, and pretending otherwise would cost real money.
The second thing to ask is: can this workload survive being taken away mid-run? Some of the deepest discounts come from buying capacity AWS hasn’t sold yet: spare capacity at up to roughly 90% off, with a two-minute eviction notice when AWS needs it back. That’s fine for the ETL because it checkpoints. It’s career-limiting for the web tier.
The third thing worth thinking about is coverage breadth. Some commitments lock us into a specific instance family, size, and Region; others let us slide across families, sizes, Regions, and even across EC2, Fargate, and Lambda. The broader the commitment, the less it restricts us when something changes, but the smaller the discount. The correct answer depends on how confident we are that we’ll still want an m6i.large in eu-west-1 in 2030.
The fourth is what the commitment applies to. Some commitment models follow the workload off EC2 onto serverless compute; others are pinned to a specific instance shape and silently keep billing for it even after the service migrates away. That matters if any of the three workloads might land somewhere other than EC2 inside the commitment window.
And finally, a softer one: the planning horizon of the team holding the commitment. A three-year commitment at 66% off is only cheaper than a one-year at 40% off if we actually stay on AWS, in this Region, running approximately this workload, for the full three years. Engineering doesn’t usually hold teams to bets that long. Finance does.
What we’ll filter on
Distilled into filters we can score each pricing model against:
- Commitment duration: how long are we locking ourselves in?
- Discount depth: how much off on-demand?
- Shape flexibility: does the discount follow us if we change instance family, size, or Region?
- Interruption tolerance required: does AWS ever reclaim the capacity mid-run?
- Covers Fargate and Lambda: does the commitment also apply beyond EC2?
The pricing landscape
- On-demand. The sticker price. No commitment, no discount, no interruption. Pay per second (Linux, and some other OS families) or per hour, stop when we like. The benchmark every other model is measured against, and the correct answer for any workload we can’t predict.
- Reserved Instances (Standard). Commit to a specific instance family, size, OS, tenancy, and Region for one or three years; get up to ~72% off on-demand for a three-year all-upfront reservation. The commitment is rigid: it’s a billing discount that applies to matching usage. If nothing in the account matches during a given hour, the money for that hour is spent and the discount goes unused. Standard RIs can be listed on the Reserved Instance Marketplace, which takes the edge off the lock-in, but the process is manual and liquidity isn’t guaranteed.
- Reserved Instances (Convertible). Same rough shape as Standard, with the ability to exchange a reservation for a different instance family, size, OS, or tenancy as long as the new reservation is of equal or greater value. Because AWS is buying less predictability, the discount ceiling drops to roughly 66% at three-year all-upfront, the same ceiling as a Compute Savings Plan. Useful when we believe in the amount of compute but not the shape.
- Savings Plans (Compute). Commit to a dollar-per-hour spend on compute for one or three years; get up to ~66% off on-demand for three-year all-upfront. Coverage is broad: any EC2 instance family, any size, any Region, any OS, plus Fargate and Lambda. Spend $10 an hour on compute however we like; the discount applies. AWS absorbs the shape risk; we absorb the total-spend risk. The deepest-discounting model that still lets us change our mind about what we’re running.
- Savings Plans (EC2 Instance). Narrower variant: commit to a specific instance family in a specific Region, one or three years, up to ~72% off, matching Standard RI depth. Still lets us change size and OS within the family (Standard RIs lock size as well). Doesn’t cover Fargate or Lambda.
- Spot. Run on AWS’s unused capacity at up to ~90% off on-demand. No commitment in either direction: we pay the current Spot price for as long as the instance lives, and AWS reclaims it with a two-minute warning when the capacity is needed elsewhere. Ideal for workloads that checkpoint, tolerate restart, or run in fleets where losing one node doesn’t matter. Career-limiting for anything stateful that can’t recover.
Side by side
| Option | Commitment | Discount | Shape flexibility | Interruption | Covers Fargate / Lambda |
|---|---|---|---|---|---|
| On-demand | None | 0% | Total | None | Native |
| RI Standard | 1 or 3 yr | Up to ~72% | Locked to family, size, Region | None | ✗ |
| RI Convertible | 1 or 3 yr | Up to ~66% | Exchangeable | None | ✗ |
| SP Compute | 1 or 3 yr | Up to ~66% | Any family, any Region | None | ✓ |
| SP EC2 Instance | 1 or 3 yr | Up to ~72% | Any size in family, locked Region | None | ✗ |
| Spot | None | Up to ~90% | Total | 2-min warning | ✗ |
Reading the table by workload rather than by model:
- Web tier: always on, fixed shape, zero interruption tolerance. RI Standard and SP EC2 Instance tie on discount; SP Compute trades ~6 percentage points for mobility across families and into Fargate. Pick between them based on how sure we are that we’ll still be running m6i.large three years from now.
- ETL: predictable hours, fixed family, survives eviction cleanly because of the existing checkpointing. Spot wins at up to ~90% off. A small Savings Plan underneath can cover the irreducible hours that have to complete on-demand when Spot capacity isn’t available.
- Dev fleet: unpredictable, short-lived, disposable. Committing anything to it would mean committing on a guess. On-demand is the honest answer; stopping instances when they’re idle is the cost-control lever, not the pricing model.
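
The same reading can be sketched as a filter over the table. This is a minimal illustration, not a sizing tool: the `PricingModel` fields, the `candidates` function, and the three boolean inputs are names invented here, the discounts are the "up to" ceilings from the table, and only the five models the workload readings mention are included. Treating the dev fleet as not interruption-tolerant (because of customer demos) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PricingModel:
    name: str
    commitment_years: tuple   # allowed terms; empty tuple means no commitment
    max_discount: float       # "up to" ceiling from the table, as a fraction
    interruptible: bool       # can AWS reclaim the capacity mid-run?
    covers_serverless: bool   # also applies to Fargate and Lambda


MODELS = [
    PricingModel("On-demand", (), 0.00, False, True),
    PricingModel("RI Standard", (1, 3), 0.72, False, False),
    PricingModel("SP Compute", (1, 3), 0.66, False, True),
    PricingModel("SP EC2 Instance", (1, 3), 0.72, False, False),
    PricingModel("Spot", (), 0.90, True, False),
]


def candidates(predictable, tolerates_interruption, may_leave_ec2=False):
    """Return the models that pass every filter, deepest discount first."""
    picks = []
    for model in MODELS:
        if model.commitment_years and not predictable:
            continue  # committing here would be committing on a guess
        if model.interruptible and not tolerates_interruption:
            continue  # a two-minute eviction would hurt this workload
        if may_leave_ec2 and not model.covers_serverless:
            continue  # the discount would not follow the workload
        picks.append(model)
    return sorted(picks, key=lambda m: m.max_discount, reverse=True)


if __name__ == "__main__":
    workloads = {
        "web tier": dict(predictable=True, tolerates_interruption=False),
        "nightly ETL": dict(predictable=True, tolerates_interruption=True),
        "dev fleet": dict(predictable=False, tolerates_interruption=False),
    }
    for label, traits in workloads.items():
        print(label, "->", [m.name for m in candidates(**traits)])
```

Passing `may_leave_ec2=True` for the web tier narrows its list to the models whose discount would survive a move to Fargate, which is the trade the web-tier bullet describes.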
Matching rhythms to models
The picks in depth
Web tier → Savings Plan (Compute). The web tier has three years of history saying it burns, on average, something like 40 m6i.large instance-hours per hour. At roughly $0.096 on-demand, that baseline costs about $3.84/hour. A Savings Plan commitment is expressed in discounted dollars, so at ~66% off the same baseline costs about $1.31/hour (40 × $0.096 × 0.34). Commit to, say, $1.20/hour for three years: roughly the first 36 instance-hours in each hour land at the discounted rate, and anything above that runs on-demand, which is fine because the autoscaler only goes above baseline when traffic justifies it. Choosing Compute over EC2 Instance gives up roughly six percentage points of discount in exchange for the ability to move the workload (to a newer generation such as m7i.large, to Fargate if the team containerises, to us-east-1 if we open a second Region) without stranding the commitment. That optionality is cheap insurance against the three-year planning horizon being wrong.
The commitment is a dollar-per-hour figure, not an instance count, which is the point. AWS applies the plan first to whichever eligible usage earns the highest savings percentage, then works down the list until the commitment is used up. Under-commit slightly rather than over-commit: a $1.20 commit against a baseline that would consume $1.31 at Savings Plan rates means every committed dollar is used whenever the fleet is at baseline or above; a $1.40 commit leaves about $0.09/hour unused even at baseline.
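
The sizing arithmetic can be sketched in a few lines. This is a simplified model under stated assumptions: a single instance type, the article's illustrative $0.096 on-demand rate and ~66% discount, and a hypothetical `sp_hourly_bill` helper that ignores other usage in the account. The one fact it leans on is that a Savings Plan commitment is measured in discounted dollars per hour and is charged whether or not there is usage to apply it to.

```python
def sp_hourly_bill(instances, od_rate, discount, commitment):
    """Return (total_cost, unused_commitment) for one hour of usage.

    instances  -- instance-hours running in this hour
    od_rate    -- on-demand price per instance-hour
    discount   -- Savings Plan discount as a fraction (0.66 means 66% off)
    commitment -- committed spend in dollars per hour, at discounted rates
    """
    sp_rate = od_rate * (1.0 - discount)             # discounted price per instance-hour
    covered = min(instances, commitment / sp_rate)   # instance-hours the plan pays for
    unused = commitment - covered * sp_rate          # committed dollars with nothing to cover
    overflow = (instances - covered) * od_rate       # the rest is billed on-demand
    return commitment + overflow, unused


if __name__ == "__main__":
    od_rate, discount = 0.096, 0.66
    full_cover = 40 * od_rate * (1.0 - discount)
    print(f"commitment that exactly covers 40 instances: ${full_cover:.2f}/h")
    for commitment in (1.20, 1.40):
        for fleet in (30, 40, 60):
            cost, unused = sp_hourly_bill(fleet, od_rate, discount, commitment)
            print(f"commit ${commitment:.2f}, fleet {fleet}: "
                  f"bill ${cost:.2f}/h, unused ${unused:.2f}/h")
```

Running it across the autoscaler's 30 to 60 range makes the under-commit advice concrete: unused commitment only appears in hours when the fleet is smaller than what the plan covers.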
Nightly ETL → Spot, with a thin Savings Plan underneath. The ETL runs three hours a night on 20 c6i.xlarge instances. The pipeline already checkpoints between stages and the fleet can tolerate losing a node mid-batch: exactly the profile Spot is designed for. At up to ~90% off, three hours of 20 instances a night becomes cheap enough that the savings compound into something finance will notice on the quarterly review.
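
To make use of the two-minute warning, a worker can poll the instance metadata service for a pending interruption and checkpoint before it lands. The sketch below assumes IMDSv2 and the `spot/instance-action` metadata path, which returns 404 until an interruption is scheduled. `run_stage`, `process`, and `checkpoint` are hypothetical stand-ins for the pipeline's own stage loop, not its real interface.

```python
import json
import urllib.error
import urllib.request

IMDS = "http://169.254.169.254/latest"


def fetch_instance_action(timeout=1.0):
    """Return the pending Spot interruption as a dict, or None if there is none."""
    # IMDSv2: obtain a short-lived session token first.
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()

    action_req = urllib.request.Request(
        f"{IMDS}/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": token},
    )
    try:
        with urllib.request.urlopen(action_req, timeout=timeout) as resp:
            return json.loads(resp.read())
    except urllib.error.HTTPError as err:
        if err.code == 404:  # nothing scheduled
            return None
        raise


def run_stage(batches, process, checkpoint, poll=fetch_instance_action):
    """Process batches in order; checkpoint and stop early if eviction is pending.

    Returns (batches_completed, interrupted).
    """
    done = 0
    for batch in batches:
        if poll() is not None:
            checkpoint(done)      # persist progress so a restart resumes here
            return done, True
        process(batch)
        done += 1
    checkpoint(done)
    return done, False
```

Passing `poll` as a parameter keeps the loop testable off EC2: a stub that returns `None` a few times and then a dict simulates an eviction without touching the network.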
The risk with Spot isn’t the discount; it’s capacity availability. On a bad night the target Spot pool has no capacity and the fleet launches at on-demand prices. Two guardrails: diversify the Spot request across multiple instance families (c6i, c7i, m6i all work for the ETL’s CPU shape) and across all three AZs, and sit a small Compute Savings Plan underneath, enough to cover, say, 10 instances at the discounted rate on nights when Spot capacity runs short. A Savings Plan is billed every hour of the day, not just during the ETL window, so this only makes sense because a Compute plan is account-wide: outside 01:00 to 04:00 the same commitment is absorbed by other compute, such as web-tier usage above its own plan. The result is an expected bill dominated by 90%-off hours with a worst-case cap of on-demand minus whatever the SP covers.
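
One way to express the diversification guardrail is an Auto Scaling group with a mixed instances policy. The sketch below only builds the request dictionary. The group name, launch template name, and subnet IDs are placeholders, the specific sizes (`c7i.xlarge`, `m6i.xlarge`) are assumptions extrapolated from the families named above, and the on-demand fallback is deliberately not modelled here.

```python
def spot_etl_group_request(
    name,
    launch_template,
    subnet_ids,
    capacity=20,
    instance_types=("c6i.xlarge", "c7i.xlarge", "m6i.xlarge"),
):
    """Build keyword arguments for an all-Spot, diversified Auto Scaling group."""
    return {
        "AutoScalingGroupName": name,
        "MinSize": 0,
        "MaxSize": capacity,
        "DesiredCapacity": capacity,
        # One subnet per AZ spreads the request across Availability Zones.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        "MixedInstancesPolicy": {
            "LaunchTemplate": {
                "LaunchTemplateSpecification": {
                    "LaunchTemplateName": launch_template,
                    "Version": "$Latest",
                },
                # Each override is another Spot pool the group may draw from.
                "Overrides": [{"InstanceType": t} for t in instance_types],
            },
            "InstancesDistribution": {
                "OnDemandBaseCapacity": 0,
                "OnDemandPercentageAboveBaseCapacity": 0,  # 100% Spot
                "SpotAllocationStrategy": "price-capacity-optimized",
            },
        },
    }


if __name__ == "__main__":
    request = spot_etl_group_request(
        name="nightly-etl",                      # placeholder
        launch_template="etl-worker",            # placeholder
        subnet_ids=["subnet-a", "subnet-b", "subnet-c"],  # placeholders
    )
    print(request["MixedInstancesPolicy"]["LaunchTemplate"]["Overrides"])
```

The dictionary is shaped to be passed as `boto3.client("autoscaling").create_auto_scaling_group(**request)`. Three instance types across three subnets gives the group nine Spot pools to choose from instead of one.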
Dev/demo fleet → on-demand, and idle-shutdown automation. The fleet’s total hours vary by an order of magnitude month-to-month. Committing to 200 hours means paying for them in the 20-hour month; committing to 2,000 means paying anyway in the 200-hour month. Both lose money on average. On-demand is the honest pricing model for a workload we can’t forecast.
Where the cost control lives for this fleet is not the pricing model but the operational discipline: EventBridge Scheduler rules that stop dev instances at 18:00 Friday and start them 08:00 Monday, a tag-based policy that terminates anything older than 30 days without an exception tag, and a monthly Cost Explorer review by the team that owns the fleet. On-demand is cheap when the instances aren’t running; the pricing model isn’t the lever, idle time is.
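
The schedule and the 30-day rule are small enough to sketch. The two cron strings use the EventBridge Scheduler `cron(minutes hours day-of-month month day-of-week year)` form and are evaluated in whatever time zone the schedule is configured with. The `keep` tag name and the `should_terminate` helper are hypothetical; the real policy would use whatever exception tag the team agrees on.

```python
from datetime import datetime, timedelta, timezone

# Schedule expressions for the weekend stop/start described above.
STOP_SCHEDULE = "cron(0 18 ? * FRI *)"   # 18:00 every Friday
START_SCHEDULE = "cron(0 8 ? * MON *)"   # 08:00 every Monday

MAX_AGE = timedelta(days=30)
EXCEPTION_TAG = "keep"  # hypothetical tag key that exempts an instance


def should_terminate(launch_time, tags, now=None):
    """Apply the 30-day rule: old instances go unless they carry the exception tag."""
    now = now or datetime.now(timezone.utc)
    if EXCEPTION_TAG in tags:
        return False
    return now - launch_time > MAX_AGE


if __name__ == "__main__":
    now = datetime(2025, 6, 30, tzinfo=timezone.utc)
    fleet = [
        ("prototype", datetime(2025, 5, 1, tzinfo=timezone.utc), {}),
        ("demo-env", datetime(2025, 5, 1, tzinfo=timezone.utc), {"keep": "q3-demo"}),
        ("experiment", datetime(2025, 6, 20, tzinfo=timezone.utc), {}),
    ]
    for name, launched, tags in fleet:
        print(name, "terminate" if should_terminate(launched, tags, now) else "keep")
```

Keeping the decision in a pure function means the policy can be unit-tested with fixed timestamps before it is wired to anything that actually terminates instances.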
A worked example: one month of bill shape
Steady state, after the picks have been in place for a quarter. Same three workloads; same month; on-demand reference for comparison:
| Workload | Line item | Calculation | Monthly cost |
|---|---|---|---|
| Web tier | Baseline (SP-covered) | 40 inst × 720 h × $0.096 × 0.34 | $940 |
| Web tier | Peak over baseline (on-demand) | avg 8 inst × 180 h × $0.096 | $138 |
| Web tier | Subtotal | | $1,078 |
| Nightly ETL | Spot (on good nights) | 20 inst × 90 h × $0.17 × 0.10 | $31 |
| Nightly ETL | On-demand (bad-night fallback, ~5 h) | 20 inst × 5 h × $0.17 | $17 |
| Nightly ETL | SP Compute underneath | 10 inst-h × 30 nights × $0.17 × 0.34 | $17 |
| Nightly ETL | Subtotal | | $65 |
| Dev / demo | On-demand (avg month, idle-shutdown enforced) | ~15 inst × 120 h × $0.0416 | $75 |
| Dev / demo | Subtotal | | $75 |
| Total with new pricing mix | | | $1,218 |
| Comparable on-demand-everywhere bill | | | $3,450 |
| Saving | | | ~65% |
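
The same arithmetic as a short script, for anyone who wants to swap in their own rates and hours. Every rate, hour count, and multiplier is copied from the illustrative figures above, and the $3,450 on-demand reference is taken as given rather than derived. Sums computed this way can differ by a dollar from figures rounded line by line.

```python
# Illustrative on-demand rates per instance-hour, as used in the worked example.
M6I_LARGE = 0.096
C6I_XLARGE = 0.17
T3_MEDIUM = 0.0416

SP_MULTIPLIER = 0.34    # pay 34% of on-demand at ~66% off
SPOT_MULTIPLIER = 0.10  # pay 10% of on-demand at ~90% off


def monthly_bill():
    """Return each line item of the new pricing mix in dollars."""
    return {
        "web baseline (SP)": 40 * 720 * M6I_LARGE * SP_MULTIPLIER,
        "web peak (on-demand)": 8 * 180 * M6I_LARGE,
        "etl spot": 20 * 90 * C6I_XLARGE * SPOT_MULTIPLIER,
        "etl on-demand fallback": 20 * 5 * C6I_XLARGE,
        "etl SP underneath": 10 * 30 * C6I_XLARGE * SP_MULTIPLIER,
        "dev on-demand": 15 * 120 * T3_MEDIUM,
    }


def saving(new_total, reference_total):
    """Fraction saved relative to the reference bill."""
    return 1.0 - new_total / reference_total


if __name__ == "__main__":
    items = monthly_bill()
    for label, cost in items.items():
        print(f"{label:<26} ${cost:>9.2f}")
    total = sum(items.values())
    print(f"{'total':<26} ${total:>9.2f}")
    print(f"saving vs $3,450 reference: {saving(total, 3450):.1%}")
```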
The numbers are illustrative (real prices change, real usage varies), but the shape is the point. The biggest line on the bill is the web tier, and it’s where the discount saves the most in absolute dollars. The ETL and the dev fleet together are a rounding error by comparison, so spending weeks optimising them before the web tier is on a Savings Plan is the wrong order of operations.
What’s worth remembering
- The pricing question is a commitment question. AWS will sell the same compute hour for roughly $0.10 or roughly $0.01 depending on how much predictability we give them. The work is matching each workload’s actual predictability to a model, not finding “the cheapest”.
- On-demand is a real answer for unpredictable workloads. Committing on a guess is worse than paying sticker, because the commitment charges whether we use it or not. Save the commitment budget for the workloads that earn it.
- Savings Plans are the default for anything predictable and un-interruptible. Compute Savings Plans for breadth and serverless coverage; EC2 Instance Savings Plans when the deepest discount matters and the family and Region are genuinely locked in for the full term.
- Spot is the default for anything predictable and interruption-tolerant. The workload has to checkpoint, retry, or run in a fleet that can lose members without failing. Diversify across instance families and AZs; keep a small on-demand or SP capacity underneath as the floor.
- Reserved Instances still exist; they’re just narrower. Standard RIs match SP EC2 Instance Plans for depth but are stricter about size. Convertible RIs gave mobility before Compute Savings Plans launched; these days Compute SPs do the same job more cleanly.
- Breadth costs discount depth. Compute SP at ~66% vs EC2 Instance SP at ~72%: about six percentage points to buy the right to move families and Regions and pick up Fargate and Lambda. Usually worth it.
- Three-year commitments only pay back if the bet holds for three years. Match the term to the confidence. A one-year Savings Plan at ~40% off is often the correct first step when a workload’s long-term shape is still being argued about.
- Pricing is not the only cost lever. Idle-shutdown automation, right-sizing, and Graviton migrations all compound with pricing choices. The best Savings Plan in the world won’t rescue an instance that’s running at 5% CPU all day.
On-demand is the sticker price; Savings Plans and RIs buy discount in exchange for commitment; Spot buys the deepest discount in exchange for interruption. Each of our three workloads has a different rhythm, and each rhythm has a pricing model that fits it. The work isn’t picking a favourite; it’s pairing each workload with the model that matches its shape.