The situation
A media company runs www.example.media through CloudFront. Behind the distribution:
- Primary origin: an S3 bucket `media-prod-eu-west-1` in `eu-west-1`, holding ~8 TB of video manifests, HLS segments, and thumbnail images. Bucket versioning on. Served via OAC (Origin Access Control).
- Dynamic origin: an ALB in `eu-west-1` fronting an API for auth, catalog search, and recommendation. Lives at `api.example.media` as a separate CloudFront behavior.
- Cache hit ratio: ~94% on the distribution overall, peaks near 99% for HLS segments.
Last Tuesday at 09:20 UTC: eu-west-1 S3 had a regional incident (the rare kind). CloudFront cache hits kept serving from edge locations; cache misses started returning 503. The 6% miss traffic became the 6% of viewers seeing a broken page loader. The incident lasted 43 minutes. The CDN dashboard showed “origin 5xx rate: 72%” during the event.
Leadership’s question: can we configure CloudFront so the next time eu-west-1 S3 has an incident, cache misses fall back to something else automatically?
What actually matters
Before drawing the config, it’s worth being clear about what CloudFront origin failover actually is.
The first thing is what triggers failover. CloudFront’s origin failover is a setting on an origin group: two origins, primary and secondary, plus a list of HTTP status codes that should cause CloudFront to try the secondary. The codes usually considered are 500, 502, 503, 504, 404, and 403. Only the status codes you select trigger failover; a 200 response doesn’t, nor does a slow response that eventually succeeds. Connection-level failures (TCP timeout, DNS resolution failure) also count as “failed primary” and trigger failover. One limit to keep in mind: CloudFront fails over only for GET, HEAD, and OPTIONS requests, so other methods never reach the secondary.
The second thing is that failover is per-request, not per-origin-state. This is the critical nuance. CloudFront doesn’t have a persistent notion of “primary origin is unhealthy, route everything to secondary for the next five minutes.” Each request that misses cache tries the primary, and on a connection failure or a configured status code, tries the secondary. The primary is tried on the next request too. This is different from Route 53 health checks (which have a state machine), or from load-balancer-style health checks (which evict unhealthy backends for an interval). Every cache miss pays the primary-failure cost before getting to the secondary.
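To make the per-miss cost concrete, here is a rough model of the worst-case delay before CloudFront gives up on an unreachable primary. The function name is illustrative. The defaults of 3 connection attempts and a 10-second connection timeout are CloudFront’s default origin settings, and both can be lowered per origin.

```python
def worst_case_failover_delay(connection_attempts: int = 3,
                              connection_timeout_s: int = 10) -> int:
    """Upper bound, in seconds, on time spent on an unreachable primary.

    Defaults mirror CloudFront's origin settings (3 attempts, 10 s timeout);
    both are configurable per origin within the ranges checked below.
    """
    if not 1 <= connection_attempts <= 3:
        raise ValueError("connection_attempts must be between 1 and 3")
    if not 1 <= connection_timeout_s <= 10:
        raise ValueError("connection_timeout_s must be between 1 and 10")
    return connection_attempts * connection_timeout_s


print(worst_case_failover_delay())      # 30 seconds with the defaults
print(worst_case_failover_delay(2, 3))  # 6 seconds with tighter settings
```

This bound only applies when the primary cannot be reached at all. When the primary answers quickly with a 503, the extra cost per miss is closer to one origin round trip.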
The third thing is what the secondary should be. Options: a replica bucket in a different Region (with Cross-Region Replication, CRR, keeping it up to date), an ALB in a different Region, a different S3 bucket entirely (a static maintenance page), or even a different AWS service like API Gateway. The secondary has to be able to serve the same request paths; it doesn’t have to be a byte-identical replica, but it has to be functionally able to respond.
The fourth thing is the consistency story on failover. If the secondary is a replica of the primary, there’s a replication lag: typically seconds, with paid tiers offering tighter SLAs. For object-heavy media (HLS segments named by timestamp, thumbnail hashes) that’s usually acceptable, because the old segment on the secondary is still a valid segment. For scenarios where staleness matters (a ticketing system serving “inventory: 3 left” cached for 60 seconds), the replication lag can serve wrong data during failover.
The fifth thing is what origin failover doesn’t replace. Origin failover is a request-path fallback inside CloudFront. It doesn’t give you a DNS-level regional failover; if CloudFront itself is degraded, origin failover does nothing for you. It’s one tool in a larger resiliency toolbox: origin failover inside CloudFront, Route 53 failover across distributions, multi-Region active-active at the application level. The biggest distributions typically use all three in layers.
The sixth thing is cost shape. The failover feature itself is free. The cost is whatever the secondary costs to keep ready and to serve traffic: a replica bucket adds storage cost plus inter-Region replication fees on top of whatever the primary already costs. At our footprint that’s a small monthly add-on, cheap insurance compared to the impact of leaving cache misses to fail.
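A back-of-the-envelope version of that cost shape, as a sketch only. The function name is hypothetical, the monthly upload volume is an invented figure, and both per-GB prices are placeholder assumptions, not quoted prices; substitute current pricing for your Regions. Request charges for replication are left out.

```python
def monthly_replica_cost(stored_gb: float,
                         replicated_gb_per_month: float,
                         storage_price_per_gb: float,
                         transfer_price_per_gb: float) -> float:
    """Replica storage plus inter-Region transfer for newly replicated data."""
    storage = stored_gb * storage_price_per_gb
    transfer = replicated_gb_per_month * transfer_price_per_gb
    return storage + transfer


# Every input except the ~8 TB footprint is an assumption for illustration.
cost = monthly_replica_cost(
    stored_gb=8 * 1024,           # ~8 TB already held in the primary
    replicated_gb_per_month=500,  # hypothetical new uploads per month
    storage_price_per_gb=0.023,   # placeholder price
    transfer_price_per_gb=0.02,   # placeholder price
)
print(f"${cost:,.2f} per month")  # $198.42 per month
```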
What we’ll filter on
Filters for each failover topology:
- Triggers on primary failure: does the setup detect an unavailable origin and switch?
- Recovery time on failover: how fast does a viewer start getting successful responses?
- Consistency during failover: is the secondary content correct, or stale/different?
- Blast radius of a broken secondary: if the secondary is also wrong, how bad is it?
- Covers region-level failure: does the secondary live in a different failure domain?
- Operational overhead: how much ongoing work does it take to keep the two in sync?
The failover landscape
- No origin failover (status quo). One origin. Primary unavailable → 5xx to viewer. Simple, and it’s what got us bitten on Tuesday.
- Origin failover to CRR replica bucket in another Region. Primary S3 in `eu-west-1`, secondary S3 in `us-east-1`, CRR keeping them in sync. Origin failover triggers on 500/502/503/504 and connection errors. Viewer sees the secondary content; CRR lag may mean objects that were uploaded to `eu-west-1` in the last few seconds aren’t yet in `us-east-1`. For media segments, that’s usually acceptable.
- Origin failover to a maintenance-page bucket. Primary serves real content; secondary serves a static “we’ll be back shortly” page. Every cache miss during an outage lands on the maintenance page. Acceptable for some services (B2B tools); unacceptable for a media streaming service that users expect to Just Work.
- Origin failover to an ALB in another Region. Primary is dynamic (ALB+ECS in `eu-west-1`), secondary is a replica stack in `us-east-1`. Both serve the same API. CRR-like replication at the application layer (database replicas, eventual consistency). Higher complexity, but handles both dynamic-API and static-content paths.
- Origin failover to a different service entirely. Primary is an ALB, secondary is API Gateway with a simple Lambda that serves degraded responses (read-only, no write paths). The business logic of “what does degraded mode look like” lives in Lambda. More design work; more honest UX during a partial outage.
- Route 53 failover on top of CloudFront. Two CloudFront distributions (or one distribution with origin failover and a failover alias), Route 53 health checks routing between them. More moving parts; covers CloudFront-level regional failures that origin failover alone can’t.
Side by side
| Option | Triggers on failure | Recovery time | Consistency | Blast radius | Region-level | Ops overhead |
|---|---|---|---|---|---|---|
| No failover | ✗ | n/a | — | — | ✗ | — |
| S3 + CRR replica | ✓ | Per-request | Seconds of lag | Low | ✓ | CRR rules |
| Static maintenance page | ✓ | Per-request | n/a (degraded) | Very visible | ✓ | None once set |
| ALB + replica stack | ✓ | Per-request | Depends on DB replication | Medium | ✓ | Full 2-region deploy |
| API Gateway + Lambda | ✓ | Per-request | Degraded by design | Low (transparent) | ✓ | Lambda code |
| Route 53 + 2× distributions | ✓ (health check) | ~TTL + detection | Depends | Low | ✓✓ | Health checks, DNS |
For the media company: options 2 and 4 in combination (the CRR replica bucket and the replica ALB stack). Static content (S3) gets option 2, a CRR replica in another Region. Dynamic content (ALB API) gets option 4, a replica stack in another Region, sized smaller but kept warm enough to absorb failover traffic.
How origin failover routes a cache miss
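The routing can be sketched as a small Python model. This is an illustration of the behavior described in this article, not CloudFront’s implementation: origins are modeled as plain functions, and the names `route_cache_miss`, `broken_primary`, and `healthy_secondary` and the request paths are made up for the example.

```python
from typing import Callable, Optional, Tuple

# Status codes and methods eligible for failover in this sketch.
FAILOVER_CODES = frozenset({500, 502, 503, 504})
FAILOVER_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})

# An origin is modeled as a function from path to HTTP status,
# or None when the connection itself fails.
Origin = Callable[[str], Optional[int]]


def route_cache_miss(method: str, path: str,
                     primary: Origin, secondary: Origin) -> Tuple[str, Optional[int]]:
    """Return (origin_used, status) for one cache miss; no state is kept."""
    status = primary(path)  # the primary is always tried first
    primary_failed = status is None or status in FAILOVER_CODES
    if primary_failed and method in FAILOVER_METHODS:
        # Whatever the secondary returns goes back to the viewer.
        return "secondary", secondary(path)
    return "primary", status


primary_calls = []


def broken_primary(path: str) -> Optional[int]:
    primary_calls.append(path)  # record every attempt against the primary
    return 503


def healthy_secondary(path: str) -> Optional[int]:
    return 200


for n in range(3):
    print(route_cache_miss("GET", f"/hls/seg{n}.ts", broken_primary, healthy_secondary))
print(len(primary_calls))  # 3: every miss still tried the primary first
print(route_cache_miss("POST", "/api/login", broken_primary, healthy_secondary))
```

Each GET lands on the secondary with a 200, yet the primary is attempted every time because nothing remembers that it is down. The POST is returned as the primary’s 503, since only GET, HEAD, and OPTIONS are eligible for failover.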
The setup in depth
Origin group configuration. On the distribution, create an origin group referencing two origins, the eu-west-1 primary bucket and the us-east-1 secondary bucket. The failover criteria are commonly set to 403, 404, 500, 502, 503, 504. Some teams drop 403 and 404 from the list because those can legitimately mean “not found here” rather than “origin is broken”; for this media scenario, keep 500, 502, 503, and 504 and drop 403/404. The reason is that a 404 on the primary can be valid: replication never propagates deletes of specific object versions, and delete markers replicate only when the rule enables it, so failing over on 404 could return a stale object that should have been removed.
```json
{
  "Id": "media-origin-group",
  "FailoverCriteria": {
    "StatusCodes": {
      "Quantity": 4,
      "Items": [500, 502, 503, 504]
    }
  },
  "Members": {
    "Quantity": 2,
    "Items": [
      { "OriginId": "media-prod-eu-west-1" },
      { "OriginId": "media-replica-us-east-1" }
    ]
  }
}
```
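An origin group does nothing until a cache behavior targets it: the behavior’s `TargetOriginId` has to be the group’s `Id` rather than an individual origin. The sketch below builds the same structure as a Python dict and points the default behavior at it. The helper names are hypothetical, it assumes the media paths are served by the default cache behavior, and it works on a plain dict shaped like a distribution config instead of calling AWS.

```python
import json


def build_origin_group(group_id, primary_id, secondary_id, status_codes):
    """Build one origin group entry, keeping each Quantity in step with Items."""
    codes = sorted(set(status_codes))
    return {
        "Id": group_id,
        "FailoverCriteria": {"StatusCodes": {"Quantity": len(codes), "Items": codes}},
        "Members": {
            "Quantity": 2,
            # Order matters: the first member is the primary.
            "Items": [{"OriginId": primary_id}, {"OriginId": secondary_id}],
        },
    }


def use_origin_group(distribution_config, group):
    """Return a copy of the config whose default behavior targets the group."""
    updated = dict(distribution_config)
    updated["OriginGroups"] = {"Quantity": 1, "Items": [group]}
    behavior = dict(updated.get("DefaultCacheBehavior", {}))
    behavior["TargetOriginId"] = group["Id"]
    updated["DefaultCacheBehavior"] = behavior
    return updated


group = build_origin_group(
    "media-origin-group",
    "media-prod-eu-west-1",
    "media-replica-us-east-1",
    [500, 502, 503, 504],
)
# Minimal stand-in for the relevant part of an existing distribution config.
config = {"DefaultCacheBehavior": {"TargetOriginId": "media-prod-eu-west-1"}}
print(json.dumps(use_origin_group(config, group), indent=2))
```

Deriving `Quantity` from the list avoids the easy mistake of editing `Items` and leaving a stale count behind.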
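Cross-Region Replication setup. On media-prod-eu-west-1, a replication rule replicates all objects (or a filtered subset) to media-replica-us-east-1. Both buckets must have versioning enabled. A replication rule only picks up objects written after it is enabled, so the existing ~8 TB needs a one-time backfill (for example with S3 Batch Replication). Replication Time Control (RTC), the higher-priced tier, is designed to replicate 99.99% of objects within 15 minutes; for non-critical media objects, plain CRR (no SLA, typically seconds to minutes) is usually enough.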
```bash
# Replicate every object (empty prefix) from the primary bucket to the replica,
# including delete markers.
aws s3api put-bucket-replication \
  --bucket media-prod-eu-west-1 \
  --replication-configuration '{
    "Role": "arn:aws:iam::111122223333:role/s3-replication-role",
    "Rules": [{
      "Status": "Enabled",
      "Priority": 1,
      "DeleteMarkerReplication": { "Status": "Enabled" },
      "Filter": { "Prefix": "" },
      "Destination": {
        "Bucket": "arn:aws:s3:::media-replica-us-east-1",
        "StorageClass": "STANDARD"
      }
    }]
  }'
```
DeleteMarkerReplication is set to Enabled so that a delete on the primary also takes effect on the secondary. Without it, deleted objects stay visible on the secondary, which keeps growing, and failover might serve objects that no longer exist on the primary.
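The interaction between delete replication and failing over on 404 can be seen in a toy model. This is a deliberately simplified sketch: the `Bucket` class, the helper names, and the object key are invented, and versioning is reduced to “visible or not”.

```python
class Bucket:
    """Toy bucket: a key is either visible (200) or not (404)."""

    def __init__(self):
        self.visible = set()

    def get(self, key):
        return 200 if key in self.visible else 404


def put(primary, replica, key):
    # Replication copies every new object to the replica.
    primary.visible.add(key)
    replica.visible.add(key)


def delete(primary, replica, key, replicate_delete_markers):
    primary.visible.discard(key)
    if replicate_delete_markers:
        replica.visible.discard(key)


def status_after_delete(replicate_delete_markers, failover_on_404):
    """Status a viewer gets for an object that was uploaded, then deleted."""
    primary, replica = Bucket(), Bucket()
    key = "thumbs/abc123.jpg"
    put(primary, replica, key)
    delete(primary, replica, key, replicate_delete_markers)
    status = primary.get(key)
    if status == 404 and failover_on_404:
        status = replica.get(key)  # the secondary answers instead
    return status


print(status_after_delete(replicate_delete_markers=False, failover_on_404=True))   # 200
print(status_after_delete(replicate_delete_markers=True, failover_on_404=True))    # 404
print(status_after_delete(replicate_delete_markers=False, failover_on_404=False))  # 404
```

Only the first combination serves the deleted object. Either replicating delete markers or keeping 404 out of the failover criteria closes the gap, and the setup in this article does both.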
Origin Access Control on both buckets. OAC is the modern replacement for Origin Access Identity. Each origin in the origin group needs an OAC attached (the same OAC can be shared by both), and each bucket’s policy must allow the CloudFront service principal to call s3:GetObject, with an aws:SourceArn condition matching the distribution’s ARN. Both buckets need this; otherwise failover hits the secondary and gets 403 (AccessDenied). There is no third origin to try, so that 403 goes straight to the viewer.
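A sketch of generating the matching bucket policy for both buckets, so the two cannot drift apart. The function name is hypothetical, the account ID is the one from the replication role above, and `EDFDVBD6EXAMPLE` is a placeholder distribution ID, not a real one.

```python
import json


def oac_bucket_policy(bucket, account_id, distribution_id):
    """Bucket policy letting one CloudFront distribution read objects via OAC."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowCloudFrontServicePrincipalReadOnly",
            "Effect": "Allow",
            "Principal": {"Service": "cloudfront.amazonaws.com"},
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
            "Condition": {
                "StringEquals": {
                    # Restricts access to this one distribution.
                    "AWS:SourceArn": (
                        f"arn:aws:cloudfront::{account_id}:distribution/{distribution_id}"
                    )
                }
            },
        }],
    }


ACCOUNT_ID = "111122223333"
DISTRIBUTION_ID = "EDFDVBD6EXAMPLE"  # placeholder, replace with the real ID

policies = {
    bucket: oac_bucket_policy(bucket, ACCOUNT_ID, DISTRIBUTION_ID)
    for bucket in ("media-prod-eu-west-1", "media-replica-us-east-1")
}
for bucket, policy in policies.items():
    print(bucket)
    print(json.dumps(policy, indent=2))
```

Each printed document could then be applied to its bucket, for example with `aws s3api put-bucket-policy`.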
Custom error response vs origin failover. CloudFront has two related-but-different features:
- Origin failover tries a second origin on specific status codes.
- Custom error response serves a specific status code with a specific body (often 200 OK, with a maintenance page body) on viewer-facing errors.
They layer. Origin failover fires first; if both origins return 5xx, the custom error response catches the double failure and serves the maintenance page. For a belt-and-braces setup, configure both: origin failover to the CRR replica, and a custom error response on the final 5xx that serves /maintenance.html from a third origin, so the error page doesn’t depend on either failing bucket.
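As a sketch, the custom error response half of that setup can be expressed as the `CustomErrorResponses` part of a distribution config. The helper name is hypothetical and the 10-second error caching TTL is an assumed value chosen for the example. Note that `ResponseCode` is a string in this structure while `ErrorCode` is an integer.

```python
import json


def custom_error_responses(error_codes, page_path, response_code, error_ttl_s):
    """Map each origin error code to the same maintenance page."""
    items = [
        {
            "ErrorCode": code,
            "ResponsePagePath": page_path,
            "ResponseCode": str(response_code),  # the API expects a string here
            "ErrorCachingMinTTL": error_ttl_s,
        }
        for code in sorted(set(error_codes))
    ]
    return {"Quantity": len(items), "Items": items}


error_config = custom_error_responses(
    error_codes=[500, 502, 503, 504],
    page_path="/maintenance.html",
    response_code=503,  # keep the failure visible to clients
    error_ttl_s=10,     # assumed value: how long the error stays cached
)
print(json.dumps(error_config, indent=2))
```

The example returns 503 rather than 200 as a design choice: a video player that requests a segment and receives an HTML page with a 200 may try to decode it. A 200 can still make sense for page-level routes.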
The dynamic side. The ALB-fronted API gets a second origin group pointing at a replica ALB in us-east-1. The replica stack runs at lower capacity in steady state (2 ECS tasks instead of 20) with autoscaling that can ramp on traffic. The database layer uses Aurora Global Database, with the read replica in us-east-1 promoted if a full failover is required; for transient regional incidents, the read replica serves enough of the API read path to keep viewers browsing even if write paths degrade. Writes degrade by design here: origin failover applies only to GET, HEAD, and OPTIONS, so POST, PUT, and DELETE requests are only ever sent to the primary.
A worked incident
The next regional event hits at 14:47 UTC on a Friday. Ravi is on call.
14:47:12 CloudWatch alarm: OriginLatency p99 on media-prod-eu-west-1 > 5000 ms
14:47:15 CloudFront ResponseFromPrimary metric dropping
14:47:21 ResponseFromSecondary metric climbing; origin group working as designed
14:47:28 S3 Health Dashboard confirms eu-west-1 degradation
14:50:00 Cache-hit ratio on distribution: 94% (unchanged)
Cache-miss ratio served by secondary: 98.4% of misses get 200
Viewer-observed error rate: 0.6%
Ravi’s job during this window is to watch, not to intervene; origin failover is handling it. He does three things:
- Silence the pager for CloudFront origin latency alarms (expected during the incident).
- Scale out the ECS service behind the `us-east-1` API replica ALB to accept more traffic (from 2 tasks to 20).
- Update the status page: “degraded service in eu-west-1 region, failover active, viewers may see occasional errors.”
Without origin failover, the cache-miss traffic (6% of all requests) would all have failed. With origin failover, the viewer-observed error rate stays under 1% because the failover itself misses only occasionally (when the secondary itself returns an error, which is rare but happens).
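The arithmetic behind that comparison, as a sketch. It assumes cache hits always succeed and that the only failures are cache misses that neither origin can serve; the function name is illustrative. A measured viewer error rate can sit above this figure if it also counts errors unrelated to the origin.

```python
def viewer_error_rate(hit_ratio: float, miss_success_ratio: float) -> float:
    """Share of all requests that fail: misses that no origin could serve."""
    return (1 - hit_ratio) * (1 - miss_success_ratio)


without_failover = viewer_error_rate(hit_ratio=0.94, miss_success_ratio=0.0)
with_failover = viewer_error_rate(hit_ratio=0.94, miss_success_ratio=0.984)

print(f"without failover: {without_failover:.2%}")  # 6.00%
print(f"with failover:    {with_failover:.2%}")     # 0.10%
```

A higher hit ratio shrinks both numbers, which is why the same outage hurts a poorly cached distribution far more.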
The incident clears at 16:10 UTC. The primary comes back; CloudFront’s origin group starts succeeding on the primary again automatically on the next miss. No human action required to “fail back.”
What’s worth remembering
- Origin failover is a CloudFront-level fallback per cache miss. An origin group has two origins and a set of failover-eligible status codes; CloudFront tries the secondary when the primary returns one of them or can’t be reached.
- The usual candidate status codes are 500, 502, 503, 504, 404, 403. Most teams keep the 5xx codes and consider 403/404 case by case, since they can legitimately mean “not here” rather than “origin broken.”
- CloudFront doesn’t cache the failover decision. Every cache miss retries the primary first, so a broken primary adds latency to every miss until it recovers: little when it fails fast, much more when connections hang until they time out. For a steady 94% hit ratio, the 6% miss traffic is where the impact lives.
- The secondary needs the same access permissions. OAC on both buckets, bucket policies allowing the distribution on both, replication role scoped to write into the secondary. Easy to miss and manifests as 403s on failover.
- Cross-Region Replication has a lag. Plain CRR typically takes seconds; Replication Time Control is designed to replicate 99.99% of objects within 15 minutes. For most media content the lag is fine; for consistency-sensitive data it isn’t.
- `DeleteMarkerReplication: Enabled` keeps the two buckets consistent on deletes. Without it, deletes stay on primary only and failover can serve content that’s supposed to be gone.
- Origin failover and custom error responses compose. Failover handles origin errors by trying a backup; custom error response handles viewer-facing errors with a configured body. Both together catch the case where both origins fail.
- For dynamic APIs, the “backup origin” is a whole stack, not just a bucket. Replica ALB, replica ECS tasks, replica database (Aurora Global Database or DynamoDB Global Tables). Origin failover at the CloudFront layer is one piece; regional failover at the app layer is the rest.
CloudFront origin failover is the cheapest form of insurance against a regional origin outage for a well-cached distribution. It doesn’t save you from a bad week in AWS; it softens the minutes to hours when a single Region has a bad hour. The next time the origin doesn’t answer, most viewers will never notice.