How to Set Up S3 Cross-Region Replication for Compliance

January 25, 2027 · 15 min read

Solutions Architect Pro · SAP-C02 · part of The Exam Room

The situation

A platform stores PDF documents, XML statements, and JSON audit events, roughly 40TB of new data per month, growing. The regulator’s requirements are specific:

  • Jurisdictional redundancy: every object must exist in at least two Regions, and one of those Regions must be in the EU.
  • Seven-year immutability: once written, objects must not be deletable, not by the application, not by an operator, not by a compromised root account, for seven years from creation.
  • Audit trail: every write, every access attempt, every administrative action logged.
  • Recoverability from corruption: if the primary bucket is accidentally bulk-deleted or encrypted by ransomware, we must be able to recover the last known-good state.

Today the primary bucket is in eu-west-1. There is no replica. There is versioning but no object lock. Backups are a nightly Lambda that copies new objects to a second bucket in the same Region, which is neither cross-Region nor tamper-proof.

What actually matters

The core trade in S3 replication is durability breadth in exchange for cost and complexity. Single-Region S3 is already eleven nines durable. Adding a second Region doesn’t make objects more durable in the engineering sense; it makes them durable against a different class of failure: a Region-level incident, a jurisdictional takedown, or a badly scoped bucket policy that nukes the primary.

The first thing to ask is: what are we protecting against? Bit rot and single-AZ failure, S3 already handles. Whole-Region failure, replication handles. Accidental deletion by an application bug, versioning handles. Malicious deletion by a compromised role, Object Lock handles. Ransomware encrypting objects in place, Object Lock + versioning handles. Each of these is a different control.

The second is: what do we actually want replicated? All objects, some objects (by prefix or tag), some storage classes, some versions? Filtering matters; getting the filter wrong means the important objects aren’t in the replica or the unimportant ones are doubling the bill.

The third is: do we replicate deletes? If the primary bucket deletes an object and we replicate the delete, the replica loses it too, which is wrong if the replica is meant to survive a bulk-delete incident. Whether delete markers travel is a configuration choice with a real consequence.

The fourth is: what about objects that already exist? A continuous replication stream typically catches new writes only. Existing objects need a one-shot backfill. This is a common surprise: turning on replication doesn’t populate the replica; it just starts catching new writes.

The fifth is: which storage class does the destination use? A hot tier in the primary with a colder tier in the replica saves money; matching the primary makes sense if the replica is a live failover target rather than just an archive.

What we’ll filter on

  1. Replication scope: all objects, or filtered by prefix/tag/storage class?
  2. Delete behaviour: do delete markers replicate?
  3. Existing-object handling: how do we populate the replica for what’s already there?
  4. Destination storage class: same as source, or a cheaper tier?
  5. Immutability: can the replica be deleted by a compromised principal?

The replication landscape

  1. S3 Cross-Region Replication (CRR). Asynchronous replication to a bucket in a different Region. Versioning must be enabled on both source and destination. Replication runs in the background; most objects replicate within seconds, 99% within 15 minutes under normal load. S3 RTC (Replication Time Control) is designed to replicate 99.99% of objects within 15 minutes and backs that with an SLA, at an additional cost. Supports prefix and tag filters, KMS encryption translation, storage class change at destination, and replica-side Object Lock.

  2. S3 Same-Region Replication (SRR). Same mechanism, same Region. Useful for log aggregation or compliance separation (e.g. a bucket with stricter access controls in the same Region), not for jurisdictional redundancy. Mentioned for completeness; not the answer here.

  3. S3 Multi-Region Access Points (MRAP). A global endpoint that routes requests to the nearest replicated bucket, with active-active replication between the members. Turns “two buckets with CRR” into “one logical bucket” from the application’s view. Costs extra per request; earns it when applications in multiple Regions read the same dataset.

  4. AWS Backup for S3. A backup-and-restore model over S3: AWS Backup creates point-in-time backups of a bucket, restorable to a new bucket. It is not replication: there is no continuous stream to a second bucket, and it adds no latency to new writes. Useful as an extra layer on top of CRR for the corruption case, since the backup can be restored to a point before the corruption event.

  5. Batch Replication. A one-shot job that replicates existing objects (those created before CRR was enabled, or those that failed replication, or those matching a new rule). Used once, at the start, to seed the replica from the existing primary. Paid per object processed.

  6. S3 Object Lock. Not a replication feature, but the partner control for the immutability requirement. Writes a retention period or legal hold to each object version; during that period the version cannot be deleted or overwritten, not by any principal. Two modes: Governance (can be overridden by a principal with s3:BypassGovernanceRetention) and Compliance (cannot be overridden by anyone, not even the root account, until the retention period expires).
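
The difference between the two modes is easier to see as a decision function. The sketch below is a simplified model of the rules described above, not S3's actual evaluation logic; the function name and parameters are our own, and `has_bypass` stands in for a principal that holds s3:BypassGovernanceRetention and asks for the bypass explicitly.

```python
from datetime import datetime, timezone


def can_delete_version(mode, retain_until, now, has_bypass=False, legal_hold=False):
    """Simplified model: may this object version be deleted right now?

    mode         -- "GOVERNANCE" or "COMPLIANCE"
    retain_until -- timezone-aware datetime, or None if no retention is set
    has_bypass   -- caller holds s3:BypassGovernanceRetention and requests bypass
    legal_hold   -- a legal hold blocks deletion regardless of retention dates
    """
    if legal_hold:
        return False
    if retain_until is None or now >= retain_until:
        # No retention, or the retention period has expired.
        return True
    if mode == "GOVERNANCE":
        # Governance mode has a break-glass path.
        return has_bypass
    # Compliance mode: nobody deletes until the date passes.
    return False
```

Called with a retain-until date seven years out, the Compliance branch returns False whatever permissions the caller holds, which is the property the regulator is asking for.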

Side by side

| Option | Replication scope | Delete behaviour | Existing objects | Dest storage class | Tamper-proof |
|---|---|---|---|---|---|
| CRR | Rule-based (prefix/tag) | Configurable | Batch Replication | Configurable | Requires Object Lock |
| SRR | Same | Configurable | Batch Replication | Configurable | Requires Object Lock |
| MRAP | All members | Configurable | Batch Replication | Per member | Requires Object Lock |
| AWS Backup for S3 | Bucket-level | Backup is immutable | Full first backup | Cold storage | Native |
| Batch Replication | One-shot per rule | N/A | Per rule | N/A | |
| Object Lock | N/A | Deletes blocked | All versions | N/A | ✓ (Compliance mode) |

The table doesn’t give us one answer because the problem has multiple parts. CRR handles jurisdictional redundancy. Object Lock on both buckets handles immutability. Batch Replication seeds the existing 40TB. AWS Backup sits underneath as the recover-from-corruption layer.

The configuration

eu-west-1 (source): S3 bucket records-primary

  • versioning: enabled
  • Object Lock: Compliance, 7 years default
  • SSE-KMS
  • storage class: Standard
  • public access: blocked
  • KMS key: records-eu-west-1, grants the S3 replication role kms:Decrypt
  • CRR rule: prefix /, delete markers off

eu-central-1 (dest): S3 bucket records-dr

  • versioning: enabled
  • Object Lock: Compliance, 7 years
  • SSE-KMS with the DR key
  • storage class: Glacier IR
  • public access: blocked
  • KMS key: records-eu-central-1, grants the replication role kms:Encrypt

Source to destination: CRR with RTC, KMS translation.

AWS Backup (separate account): backup vault records-backup, fed by a daily backup

  • daily point-in-time, 35-day + 7-year cold tier

Logging account: CloudTrail S3 data events

  • every GetObject, PutObject, DeleteObject

Batch Replication job (one-shot, existing 40TB)

  • operates on a manifest of existing object versions
  • replicates into the same destination with the same rules
  • priced per object; run once, delete the job

CRR for ongoing jurisdictional redundancy, Object Lock on both sides for tamper-proof retention, AWS Backup for point-in-time recovery, CloudTrail data events for audit. Batch Replication seeds the existing 40TB once.

The picks in depth

CRR from eu-west-1 to eu-central-1. Both buckets are created with versioning enabled and Object Lock in Compliance mode with a seven-year default retention. The replication rule replicates all objects (no prefix or tag filter; we want every record in both Regions), does not replicate delete markers (we don’t want a delete in the primary to cascade), uses S3 Replication Time Control (RTC), which is designed to replicate 99.99% of objects within 15 minutes and is backed by an SLA, and translates the SSE-KMS encryption from the source’s KMS key to the destination’s KMS key.
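
As a rough sketch, the rule described above could be expressed as the dict that boto3's `put_bucket_replication` takes as its `ReplicationConfiguration` argument. The rule ID, the account number, the role name and the key ARN in the usage note are placeholders of ours, and the GLACIER_IR default simply mirrors the destination storage class in the configuration summary.

```python
def build_replication_config(role_arn, dest_bucket_arn, dest_kms_key_arn,
                             dest_storage_class="GLACIER_IR"):
    """Build a ReplicationConfiguration dict for s3.put_bucket_replication."""
    rule = {
        "ID": "replicate-all-records",      # hypothetical rule name
        "Priority": 1,
        "Status": "Enabled",
        "Filter": {"Prefix": ""},           # empty prefix: every object
        # Deletes in the primary must not cascade to the replica.
        "DeleteMarkerReplication": {"Status": "Disabled"},
        # Opt in to replicating SSE-KMS encrypted objects.
        "SourceSelectionCriteria": {
            "SseKmsEncryptedObjects": {"Status": "Enabled"}
        },
        "Destination": {
            "Bucket": dest_bucket_arn,
            "StorageClass": dest_storage_class,
            # KMS translation: replicas are re-encrypted with the DR key.
            "EncryptionConfiguration": {"ReplicaKmsKeyID": dest_kms_key_arn},
            # RTC and its metrics are enabled together, both at 15 minutes.
            "ReplicationTime": {"Status": "Enabled", "Time": {"Minutes": 15}},
            "Metrics": {"Status": "Enabled", "EventThreshold": {"Minutes": 15}},
        },
    }
    return {"Role": role_arn, "Rules": [rule]}


config = build_replication_config(
    role_arn="arn:aws:iam::111122223333:role/records-replication",
    dest_bucket_arn="arn:aws:s3:::records-dr",
    dest_kms_key_arn="arn:aws:kms:eu-central-1:111122223333:key/EXAMPLE-KEY-ID",
)
```

With boto3 installed and credentials in place, this would be applied with `boto3.client("s3").put_bucket_replication(Bucket="records-primary", ReplicationConfiguration=config)`.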

The IAM role for replication is the piece people get wrong. It needs:

  • s3:GetReplicationConfiguration, s3:ListBucket on the source.
  • s3:GetObjectVersionForReplication, s3:GetObjectVersionAcl, s3:GetObjectVersionTagging on source objects.
  • s3:ReplicateObject, s3:ReplicateDelete, s3:ReplicateTags on destination objects.
  • kms:Decrypt on the source KMS key, kms:Encrypt and kms:GenerateDataKey on the destination KMS key.
  • Each KMS key’s key policy must also trust the role; KMS doesn’t defer to IAM alone.
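
The permission list above maps fairly directly onto an IAM policy document. The following is a minimal sketch that generates one; the statement IDs and the key ARNs in the example call are placeholders, and a production policy would likely add conditions we have left out.

```python
import json


def build_replication_role_policy(source_bucket, dest_bucket,
                                  source_key_arn, dest_key_arn):
    """Return an IAM policy document covering the permissions listed above."""
    src = f"arn:aws:s3:::{source_bucket}"
    dst = f"arn:aws:s3:::{dest_bucket}"

    def allow(sid, actions, resource):
        return {"Sid": sid, "Effect": "Allow",
                "Action": actions, "Resource": resource}

    return {
        "Version": "2012-10-17",
        "Statement": [
            allow("SourceBucket",
                  ["s3:GetReplicationConfiguration", "s3:ListBucket"], src),
            allow("SourceObjects",
                  ["s3:GetObjectVersionForReplication",
                   "s3:GetObjectVersionAcl",
                   "s3:GetObjectVersionTagging"], f"{src}/*"),
            allow("DestinationObjects",
                  ["s3:ReplicateObject", "s3:ReplicateDelete",
                   "s3:ReplicateTags"], f"{dst}/*"),
            allow("DecryptSource", ["kms:Decrypt"], source_key_arn),
            allow("EncryptDestination",
                  ["kms:Encrypt", "kms:GenerateDataKey"], dest_key_arn),
        ],
    }


policy = build_replication_role_policy(
    "records-primary", "records-dr",
    "arn:aws:kms:eu-west-1:111122223333:key/SOURCE-KEY-ID",
    "arn:aws:kms:eu-central-1:111122223333:key/DEST-KEY-ID",
)
print(json.dumps(policy, indent=2))
```

Note that this covers only the IAM half. The key policies on both KMS keys still have to trust the role separately.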

The KMS grants are the most frequent CRR failure mode. Objects appear in the source, the replication status shows FAILED, and the log line says AccessDenied on the KMS action. The culprit is usually the destination key policy and only rarely the source.
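
When chasing that AccessDenied, a quick offline check of the key policy can save time. The helper below is a deliberately naive sketch of our own: it only looks for Allow statements that name the role ARN directly, and ignores wildcards, conditions, Deny statements and KMS grants, so treat a clean result as a hint rather than proof.

```python
def _as_list(value):
    """Policy fields may be a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def missing_kms_actions(key_policy, role_arn, required_actions):
    """Return the required actions that no Allow statement grants to role_arn."""
    granted = set()
    for stmt in key_policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        # Only handle the {"AWS": ...} form; "*" and service principals are skipped.
        principals = (_as_list(principal.get("AWS", []))
                      if isinstance(principal, dict) else [])
        if role_arn not in principals:
            continue
        granted.update(_as_list(stmt.get("Action", [])))
    return sorted(set(required_actions) - granted)


ROLE = "arn:aws:iam::111122223333:role/records-replication"  # placeholder ARN
dest_key_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "Admin", "Effect": "Allow",
         "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
         "Action": "kms:*", "Resource": "*"},
    ],
}
print(missing_kms_actions(dest_key_policy, ROLE,
                          ["kms:Encrypt", "kms:GenerateDataKey"]))
```

The policy document itself can be fetched with boto3's `kms.get_key_policy` and parsed with `json.loads` before being passed in.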

Object Lock on both buckets. Compliance mode, seven-year retention. Every object version gets the retention stamp at write time. No principal can delete a locked version during the retention period, not the application role, not an admin, not the root account of the paying account, not AWS Support. An object older than seven years can be deleted normally; the lock expires with the retention date.

Object Lock has a subtlety: the retention period travels with the object version when it’s replicated, but the default lock configuration is bucket-level and must be set on the destination bucket independently. Both buckets get the same default retention. If an attacker changes the source bucket’s default, new objects are written with the new retention; the replica’s own default is the safety net.
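
Because the default is configured per bucket, it is worth checking that the two sides have not drifted apart. A minimal sketch, assuming the dict shape that boto3's `put_object_lock_configuration` accepts as `ObjectLockConfiguration` (and that `get_object_lock_configuration` returns under the same key); the helper names are ours.

```python
def default_lock_config(years=7, mode="COMPLIANCE"):
    """Bucket-level Object Lock default: every new version gets this retention."""
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": mode, "Years": years}},
    }


def default_retention(lock_config):
    """Extract the default retention rule, or None if the bucket has none."""
    return lock_config.get("Rule", {}).get("DefaultRetention")


def retention_drift(source_config, dest_config):
    """True if the two buckets no longer share the same default retention."""
    return default_retention(source_config) != default_retention(dest_config)


source_cfg = default_lock_config()
dest_cfg = default_lock_config()
print(retention_drift(source_cfg, dest_cfg))
```

Run on a schedule against both buckets, a check like this would flag a shortened source default before many objects are written under it.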

Batch Replication for the existing 40TB. A one-shot job pointed at the existing objects. It’s driven by an inventory manifest (S3 Inventory can produce one of the entire bucket, or a filter can produce a targeted manifest). The job runs for hours to days depending on object count and size; throughput is typically limited by the destination bucket’s request rate, not the source. Cost is per-object-replicated plus data transfer.

Two things worth knowing: (1) Batch Replication respects existing replication rules, so running it twice is idempotent; (2) it’s the only way to replicate objects created before the rule existed. Turning the rule on doesn’t retroactively replicate history.
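
For orientation, here is roughly what the job request might look like when submitted through boto3's `s3control.create_job`. This sketch swaps the inventory manifest described above for an S3-generated manifest (the `ManifestGenerator` option), which selects objects that are eligible for replication and have not yet been replicated. The report bucket name, role ARN and account number are placeholders, and the field names reflect our understanding of the CreateJob request shape.

```python
import uuid


def build_batch_replication_job(account_id, source_bucket, role_arn,
                                report_bucket, token=None):
    """Build keyword arguments for s3control.create_job (Batch Replication)."""
    return {
        "AccountId": account_id,
        "ConfirmationRequired": True,            # review before the job runs
        "Operation": {"S3ReplicateObject": {}},  # the Batch Replication operation
        "Priority": 1,
        "RoleArn": role_arn,
        "ClientRequestToken": token or str(uuid.uuid4()),
        "Description": "Seed the replica with pre-existing objects",
        "Report": {
            "Bucket": f"arn:aws:s3:::{report_bucket}",
            "Format": "Report_CSV_20180820",
            "Enabled": True,
            "Prefix": "batch-replication",
            "ReportScope": "AllTasks",
        },
        "ManifestGenerator": {
            "S3JobManifestGenerator": {
                "ExpectedBucketOwner": account_id,
                "SourceBucket": f"arn:aws:s3:::{source_bucket}",
                "EnableManifestOutput": False,
                "Filter": {
                    "EligibleForReplication": True,
                    # Never replicated, or failed on an earlier attempt.
                    "ObjectReplicationStatuses": ["NONE", "FAILED"],
                },
            }
        },
    }


job = build_batch_replication_job(
    "111122223333", "records-primary",
    "arn:aws:iam::111122223333:role/records-batch-replication",
    "records-batch-reports", token="seed-replica-2027-01",
)
```

The request would be submitted with `boto3.client("s3control").create_job(**job)`. Filtering on the NONE and FAILED statuses is one way to make a second run skip objects that already replicated.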

AWS Backup for point-in-time recovery. Separate account, vault with vault-lock to prevent the backup itself from being tampered with. Daily backup of both the primary and the replica buckets (backup is a cheaper-per-object form of Batch Replication into a Backup-managed store). If the primary bucket is bulk-deleted or encrypted by a compromised role, we can restore from yesterday’s backup to a new bucket, re-enable Object Lock, repoint the application.
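
A sketch of the backup plan and vault lock as boto3 request shapes, under assumptions of ours: the plan name, rule name and 03:00 UTC schedule are invented for illustration, and the single rule uses the 35-day retention from the configuration summary while the vault lock's maximum allows for the seven-year tier.

```python
def build_backup_plan(vault_name, retention_days=35):
    """BackupPlan argument for backup.create_backup_plan: one daily rule."""
    return {
        "BackupPlanName": "records-daily",            # hypothetical name
        "Rules": [{
            "RuleName": "daily",
            "TargetBackupVaultName": vault_name,
            "ScheduleExpression": "cron(0 3 * * ? *)",  # every day, 03:00 UTC
            "Lifecycle": {"DeleteAfterDays": retention_days},
        }],
    }


def build_vault_lock(vault_name, min_days, max_days, changeable_for_days=3):
    """Keyword arguments for backup.put_backup_vault_lock_configuration."""
    if changeable_for_days < 3:
        # AWS Backup requires a cooling-off period of at least 3 days.
        raise ValueError("changeable_for_days must be at least 3")
    if min_days > max_days:
        raise ValueError("min_days cannot exceed max_days")
    return {
        "BackupVaultName": vault_name,
        "MinRetentionDays": min_days,
        "MaxRetentionDays": max_days,
        "ChangeableForDays": changeable_for_days,
    }


plan = build_backup_plan("records-backup")
lock = build_vault_lock("records-backup", min_days=35, max_days=7 * 365)
```

These would be applied with `backup.create_backup_plan(BackupPlan=plan)` and `backup.put_backup_vault_lock_configuration(**lock)`. Once the cooling-off period passes, the lock can no longer be changed or removed, so the same care applies here as with Compliance mode on the buckets.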

This is the layer CRR doesn’t cover: if the source bucket’s objects are overwritten with ransomware-encrypted content, CRR dutifully replicates the encrypted versions to the replica. Object Lock keeps the old versions un-deleted, which is the primary defence, but Backup is the “restore to bucket-level point-in-time” insurance on top.

CloudTrail data events. S3 data events (object-level GetObject, PutObject, DeleteObject) are logged to a centralised trail in a separate logging account. These satisfy the audit-trail requirement. They are not cheap: pricing is per API call, and 40TB of new objects per month means millions of PUTs. But the regulator asked for audit, and nothing else gives us a request-level record.
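
One way to scope the trail to just the two records buckets is with advanced event selectors. The sketch below builds the list that boto3's `cloudtrail.put_event_selectors` accepts as `AdvancedEventSelectors`; the selector name is ours, and leaving out a `readOnly` filter is a deliberate choice so that both reads and writes are captured.

```python
def s3_data_event_selectors(bucket_names):
    """Advanced event selectors logging all object-level events on the buckets."""
    return [{
        "Name": "S3 object-level events for records buckets",
        "FieldSelectors": [
            {"Field": "eventCategory", "Equals": ["Data"]},
            {"Field": "resources.type", "Equals": ["AWS::S3::Object"]},
            # Trailing slash limits the match to objects inside each bucket.
            {"Field": "resources.ARN",
             "StartsWith": [f"arn:aws:s3:::{name}/" for name in bucket_names]},
        ],
    }]


selectors = s3_data_event_selectors(["records-primary", "records-dr"])
```

Applying it would look like `cloudtrail.put_event_selectors(TrailName=..., AdvancedEventSelectors=selectors)`, with the trail itself living in the logging account.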

A worked ransomware trace

A compromised IAM role encrypts every object in the primary bucket with the attacker’s key, a classic S3 ransomware pattern.

  1. CRR dutifully replicates the encrypted versions to eu-central-1. The replica bucket now has encrypted versions too.
  2. However: versioning is on, Object Lock is on, and the original versions of every object are retained and locked. The attacker cannot delete them; the retention period prevents it.
  3. To recover: list object versions (aws s3api list-object-versions), identify the latest non-corrupted version per object, PUT those versions back as new “current” versions. The encrypted versions remain but are no longer current.
  4. Or: restore from yesterday’s AWS Backup to a new bucket, repoint the application.

Recovery time is hours for the scripted version-restoration approach, or tens of minutes for the backup-restore approach. Data loss is whatever was written after the last backup (for the backup path) or nothing (for the version-restoration path).
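
The version-restoration path in step 3 boils down to picking, for each key, the newest version written before the incident began. A minimal sketch, assuming "known-good" can be approximated by a cutoff timestamp and that the input dicts look like the `Versions` entries returned by `list_object_versions`; the function name is ours.

```python
from datetime import datetime, timezone


def pick_restore_targets(versions, incident_start):
    """Map each key to the VersionId of its newest version before incident_start.

    Keys whose only versions were written during or after the incident are
    left out, since there is nothing trustworthy to restore for them.
    """
    best = {}
    for version in versions:
        if version["LastModified"] >= incident_start:
            continue  # written during or after the attack: not trusted
        current = best.get(version["Key"])
        if current is None or version["LastModified"] > current["LastModified"]:
            best[version["Key"]] = version
    return {key: v["VersionId"] for key, v in best.items()}


def day(d):
    return datetime(2027, 1, d, tzinfo=timezone.utc)


sample_versions = [
    {"Key": "a.pdf", "VersionId": "v1", "LastModified": day(1)},
    {"Key": "a.pdf", "VersionId": "v2", "LastModified": day(10)},
    {"Key": "a.pdf", "VersionId": "v3-encrypted", "LastModified": day(20)},
    {"Key": "b.xml", "VersionId": "v9-encrypted", "LastModified": day(21)},
]
targets = pick_restore_targets(sample_versions, incident_start=day(15))
print(targets)
```

Each selected version can then be copied over itself to become the new current version, for example with `s3.copy_object` and a `CopySource` that includes the `VersionId`. The encrypted versions stay in the history but are no longer what readers get.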

Without Object Lock, the attacker would also delete old versions; versioning alone isn’t enough. Without versioning, overwrites replace originals; Object Lock alone isn’t enough. Both together is the pattern.

A worked jurisdictional cutover

The regulator informs us that eu-west-1 is being deprecated for our class of data and we have 90 days to serve from within the EU proper. The replica is already in eu-central-1; the cutover is:

  1. Application’s S3 client starts writing new objects to records-dr in eu-central-1 instead of records-primary.
  2. CRR rule reversed: eu-central-1 now replicates to eu-west-1 for the 90-day transition.
  3. After 90 days, decommission records-primary, keep the audit trail.

The replica was never idle: it was the insurance policy, and now it’s the primary. Batch Replication handles any objects that were written to the old primary during the cutover window.

What’s worth remembering

  1. CRR replicates new objects only, by default. Existing data needs Batch Replication. Turning on the rule and walking away leaves the replica empty for everything created before the rule.
  2. Versioning is mandatory on both sides. Without versioning, CRR cannot start; with versioning, overwrites become new versions and the old ones survive.
  3. Delete-marker replication is a choice. For DR, usually off: deletes in the primary shouldn’t cascade to the replica. For other use cases (keeping buckets in lockstep), usually on.
  4. KMS encryption translation requires key-policy grants. The replication role needs kms:Decrypt on the source key and kms:Encrypt/kms:GenerateDataKey on the destination key, and both key policies must trust the role. Most “replication failed” incidents come back to this.
  5. Object Lock Compliance mode means no one deletes. Not even the root account. Choose the retention period carefully; there’s no escape hatch. Governance mode keeps the same controls with a break-glass bypass permission.
  6. S3 RTC buys an SLA, not just faster replication. It is designed to replicate 99.99% of objects within 15 minutes, is backed by an SLA, and adds CloudWatch metrics for missed objects. It costs extra per GB and earns that cost when the business needs to prove replication latency.
  7. Replication does not protect against overwrites. Versioning + Object Lock protect against overwrites. Replication protects against Region loss. Each control maps to a specific failure mode.
  8. AWS Backup sits below CRR as the point-in-time layer. Separate account, vault lock, compliance-mode retention. Handles the corruption-in-place case that CRR can’t.

Replication isn’t a single control; it’s one layer in a stack that includes versioning, Object Lock, encryption, audit, and backup. Get the stack right and the 40TB is both in two jurisdictions and genuinely safe from everything the regulator is asking about.

These posts are LLM-aided. Backbone, original writing, and structure by Craig. Research and editing by Craig + LLM. Proof-reading by Craig.