The situation
A platform engineering team of four owns the security baseline for a mid-sized AWS Organization. The Org has fifty member accounts today, with new accounts being spun up for product teams roughly twice a month. The baseline is the same every time: AWS Config enabled with managed rules for encryption and identity compliance, GuardDuty on in every Region, an IAM account password policy, default EBS encryption, and account-level S3 block-public-access. All of it has to coexist with an existing CloudTrail organisation trail run from the management account.
The team wants the baseline to deploy identically to every current member account, enrol new accounts automatically, detect drift from a central view, offer a remediation path, and not become the team’s full-time job. Four engineers, fifty accounts, twice-a-month new arrivals: the arithmetic decides what “reasonable overhead” has to mean.
What actually matters
Before picking a tool, it’s worth sitting with what’s actually being asked of the deployment mechanism, because several plausible options look correct until one of the requirements bites.
The first question is ownership. A security baseline is a long-lived artefact, not a one-shot script. It will be updated, audited, and occasionally blamed. A mechanism that works today but lives in someone’s workstation deploy history is a mechanism that will drift, silently, as the team changes. We want the baseline to be a declared artefact, in a repository, deployed by a reproducible process, not a sequence of console clicks or a weekly recurring Jira ticket.
Blast radius is the next thing. A bad baseline template pushed to fifty accounts simultaneously can break fifty accounts simultaneously. The deployment mechanism needs to expose the knobs that control concurrency and tolerance for failure: how many accounts at a time, how many Regions at a time, how many failures before the rollout halts. A mechanism that treats the fleet as one big button is a mechanism we daren’t push.
The cost shape here is unusual: the baseline itself is mostly feature flags and configuration, not compute. What we’re really paying is engineering time per deployment. A mechanism that costs half a day per new account loses on arithmetic; the team has fifty accounts to start with and adds another every fortnight, so the per-account overhead has to round to zero.
Rollback is subtle because there are two rollback shapes. One is “undo the deployment”: a bad template went out, so put the old one back. The other is “enforce the declared state”: someone in an account changed a baseline resource and we want it snapped back. The first is a deployment mechanism concern. The second is a drift-enforcement concern, and it may or may not be the same tool.
Observability matters across the fleet more than within any single account. “Did the rollout succeed everywhere?” should be a single query, not fifty. “Which accounts have drifted?” should be a single dashboard. If we have to log in to each account to find out, we’ve rebuilt the problem the central mechanism was meant to solve.
Finally, coupling to the account-vending process matters. The Org gains accounts faster than the team can hand-enrol them. Any mechanism that treats enrolment as “an engineer does something, once, per new account” is going to fall out of sync within a month. The enrolment path has to be event-driven: either Organizations hands the account to the deployment mechanism automatically, or we’re back to the ticket queue.
What we’ll filter on
- N-account deployment: push the same declared state to fifty accounts from one action.
- Automatic enrolment of new accounts: Organizations tells the deployment mechanism when an account joins the targeted scope.
- Drift detection: compare deployed state against declared state across the fleet in one query.
- Remediation path: enforce the declared state, or hand off cleanly to something that does.
- Low overhead: no per-account wiring, no bespoke control plane, no ticket per new arrival.
The multi-account deployment landscape
AWS offers several answers to “deploy this across many accounts.”
CloudFormation StackSets, self-managed permissions. Template plus target account list plus target Region list. Self-managed requires an execution IAM role in every target account with a trust policy pointing at an administration role. Works for any account you have the IAM footprint to touch, including accounts outside an Organization. No auto-enrolment of new accounts: adding an account means engineer action.
CloudFormation StackSets, service-managed permissions. Same StackSet primitive, but trust is hosted by AWS Organizations. Enable trusted access between CloudFormation and Organizations; AWS manages the execution roles across the Org. Targeting is by OU, not account list. Auto-deployment is first-class: any account added to a targeted OU gets its stack instances automatically.
AWS CDK in CodePipeline, one stage per account. CDK generates the baseline; CodePipeline deploys it into each account via cross-account CloudFormation stages. Flexible but not account-aware: a new account means a new stage, new cross-account roles, and a pipeline edit.
Terraform with provider aliases, per-account state files. A root module where each provider alias points at a different account, each with its own state. Same mechanical problem: adding an account means editing the root and running apply. Drift detection exists via terraform plan, but the state has to be consulted per-account.
AWS Control Tower customisations. Control Tower is AWS’s opinionated landing-zone product. Customisations for Control Tower (CfCT) and Account Factory Customization (AFC) attach custom CloudFormation or Terraform to the account-vending workflow. Elegant when Control Tower is the account-vending path; adopting it purely to solve a deployment problem is a bigger programme than a StackSet.
AWS Config conformance packs. A YAML template bundling Config rules and remediation actions. Deployed via organisation-wide APIs, lands in every member account’s Config service with a single compliance view. Scoped to Config’s domain: a pack can check GuardDuty is enabled; it cannot enable GuardDuty.
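To make the per-account cost of the self-managed option concrete, here is a minimal sketch of the trust policy the execution role needs in each target account. The role names are the defaults AWS documents for self-managed StackSets; the account ID is a hypothetical placeholder, and real setups may scope the principal differently.

```python
import json

# Default role names for self-managed StackSets (custom names are also allowed).
ADMIN_ROLE_NAME = "AWSCloudFormationStackSetAdministrationRole"
EXECUTION_ROLE_NAME = "AWSCloudFormationStackSetExecutionRole"


def execution_role_trust_policy(admin_account_id: str) -> dict:
    """Build the trust policy for the execution role in ONE target account.

    It lets the administration role in the admin account assume the
    execution role. Every target account needs its own copy of this role.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": f"arn:aws:iam::{admin_account_id}:role/{ADMIN_ROLE_NAME}"
                },
                "Action": "sts:AssumeRole",
            }
        ],
    }


# "111111111111" is a hypothetical admin account ID used for illustration.
policy = execution_role_trust_policy("111111111111")
print(json.dumps(policy, indent=2))
```

Creating this role once is cheap; creating it in every new account before the StackSet can reach it is the recurring engineer action that the service-managed model removes.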
Side by side
| Mechanism | Deploy to 50 | Auto new accts | Drift detect | Remediate | Low overhead |
|---|---|---|---|---|---|
| StackSets (self-managed) | ✓ | ✗ | ✓ | partial | ✗ |
| StackSets (service-managed) | ✓ | ✓ | ✓ | partial | ✓ |
| CDK in CodePipeline | ✓ | ✗ | partial | partial | ✗ |
| Terraform (aliased providers) | ✓ | ✗ | partial | partial | ✗ |
| Control Tower customisations | ✓ | ✓ | partial | partial | ✗ |
| Config conformance packs | partial | ✓ | ✓ | ✓ | ✓ |
Matching the baseline layers to mechanisms
Service-managed StackSets, in depth
A StackSet is a parent resource in the admin account describing a template, parameters, a target scope, and deployment preferences. When deployed, CloudFormation creates stack instances, one per (account, Region) pair. Each instance is a real CloudFormation stack in that account’s Region, managed by the parent StackSet.
Service-managed permissions. Enable trusted access between CloudFormation and AWS Organizations (once, from the management account). CloudFormation provisions and rotates the IAM roles it needs in every member account. The administration side runs from either the management account or a delegated administrator registered for CloudFormation. Delegated admin is the correct choice for most Orgs: it keeps the management account reserved for Org-level operations and lets the platform team run StackSets from their own account.
Targeting by OU. The target isn’t a list of account IDs; it’s a tree node, and membership is resolved by Organizations at deployment time. If all 50 member accounts sit in a Workloads OU, the target is “Workloads” and StackSets expands the account list. Individual accounts can be filtered out of an OU target; typically these are sandbox or break-glass accounts.
Targeting by Region. Region list is separate from account list. A StackSet with Regions [eu-west-1, us-east-1] creates two stack instances per account. Account-global resources (like account-level S3 PAB) only need one Region’s stack instance to take effect.
Auto-deployment. When enabled, AWS Organizations informs CloudFormation whenever an account joins or moves into a targeted OU; CloudFormation creates stack instances across all targeted Regions automatically. When an account leaves the OU, the companion setting RetainStacksOnAccountRemoval decides whether instances are deleted or retained.
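The permission model, OU targeting, Region list, and auto-deployment settings described above map onto two CloudFormation API calls. The sketch below only builds the request dictionaries, using parameter names from the `CreateStackSet` and `CreateStackInstances` APIs; the StackSet name, OU ID, and excluded account ID are hypothetical placeholders.

```python
def create_stack_set_request(name, template_body, delegated_admin=True):
    """Keyword arguments for CloudFormation CreateStackSet (service-managed)."""
    return {
        "StackSetName": name,
        "TemplateBody": template_body,
        "PermissionModel": "SERVICE_MANAGED",
        "AutoDeployment": {
            "Enabled": True,
            # False: stack instances are deleted when an account leaves the OU.
            "RetainStacksOnAccountRemoval": False,
        },
        # DELEGATED_ADMIN when run from the platform account, SELF from management.
        "CallAs": "DELEGATED_ADMIN" if delegated_admin else "SELF",
    }


def create_stack_instances_request(name, ou_ids, regions,
                                   excluded_accounts=(), delegated_admin=True):
    """Keyword arguments for CreateStackInstances, targeting OUs rather than accounts."""
    targets = {"OrganizationalUnitIds": list(ou_ids)}
    if excluded_accounts:
        # DIFFERENCE: every account in the OUs except the ones listed.
        targets["AccountFilterType"] = "DIFFERENCE"
        targets["Accounts"] = list(excluded_accounts)
    return {
        "StackSetName": name,
        "DeploymentTargets": targets,
        "Regions": list(regions),
        "CallAs": "DELEGATED_ADMIN" if delegated_admin else "SELF",
    }


# Hypothetical identifiers, for illustration only.
stack_set = create_stack_set_request("security-baseline", "<template body>")
instances = create_stack_instances_request(
    "security-baseline",
    ou_ids=["ou-abcd-11111111"],
    regions=["eu-west-1", "us-east-1"],
    excluded_accounts=["222222222222"],
)
print(stack_set["PermissionModel"], instances["DeploymentTargets"])
```

With boto3 these would be passed as `client.create_stack_set(**stack_set)` and `client.create_stack_instances(**instances)`; keeping the builders pure makes the declared targeting easy to review and unit-test before anything is deployed.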
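Operation preferences. MaxConcurrentCount (1–1000, default 1) for concurrent accounts; MaxConcurrentPercentage as a percentage alternative; FailureToleranceCount or FailureTolerancePercentage for how many account-level failures are tolerated before the operation aborts; RegionConcurrencyType (SEQUENTIAL or PARALLEL); RegionOrder for explicit sequential Region ordering; ConcurrencyMode for how failure tolerance and concurrency interact. Under the default ConcurrencyMode (STRICT_FAILURE_TOLERANCE), actual concurrency is capped at FailureToleranceCount + 1, so zero tolerance means one account at a time unless ConcurrencyMode is set to SOFT_FAILURE_TOLERANCE. Sensible defaults for a security baseline: low concurrency (5–10), zero failure tolerance, sequential Regions with a canary first.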
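The interaction between concurrency and failure tolerance is easy to get wrong, so a small model helps. This sketch builds an `OperationPreferences` dictionary with the API's field names and estimates the starting concurrency under each `ConcurrencyMode`. The "waves" figure is a simplification introduced here for illustration; StackSets does not promise discrete batches, and throttling can lower real concurrency further.

```python
import math


def operation_preferences(max_concurrent, failure_tolerance, region_order,
                          concurrency_mode="STRICT_FAILURE_TOLERANCE"):
    """OperationPreferences for a cautious, Region-by-Region rollout."""
    return {
        "RegionConcurrencyType": "SEQUENTIAL",
        "RegionOrder": list(region_order),  # first entry acts as the canary Region
        "MaxConcurrentCount": max_concurrent,
        "FailureToleranceCount": failure_tolerance,
        "ConcurrencyMode": concurrency_mode,
    }


def initial_concurrency(prefs):
    """Starting account concurrency implied by the preferences.

    Strict mode caps concurrency at FailureToleranceCount + 1;
    soft mode runs at MaxConcurrentCount regardless of tolerance.
    """
    if prefs["ConcurrencyMode"] == "STRICT_FAILURE_TOLERANCE":
        return min(prefs["MaxConcurrentCount"], prefs["FailureToleranceCount"] + 1)
    return prefs["MaxConcurrentCount"]


def waves_per_region(account_count, prefs):
    """Rough number of batches needed per Region (illustrative model only)."""
    return math.ceil(account_count / initial_concurrency(prefs))


regions = ["eu-west-1", "us-east-1"]
strict = operation_preferences(10, 0, regions)
soft = operation_preferences(10, 0, regions, "SOFT_FAILURE_TOLERANCE")
print(initial_concurrency(strict), waves_per_region(50, strict))
print(initial_concurrency(soft), waves_per_region(50, soft))
```

The practical consequence: if the goal is "ten accounts at a time, stop on the first failure", the soft mode has to be requested explicitly; otherwise the rollout is serialised.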
Drift detection. StackSet drift detection compares each stack instance’s current resource state against the expected state and reports per-instance drift. It is on-demand: triggered from the console, from aws cloudformation detect-stack-set-drift, or from a schedule you build. AWS doesn’t run it on its own schedule. Only one drift detection operation can run on a given StackSet at a time.
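Because drift detection only runs when asked, "a schedule you build" usually means a small function on a timer. The following is a minimal sketch of such a function, written so the CloudFormation client can be injected for testing. `detect_stack_set_drift` is the real boto3 method; the handler shape, the event key `stack_set_name`, the default name, and the `FakeCloudFormation` stand-in are assumptions for illustration.

```python
def start_drift_scan(cfn, stack_set_name, call_as="DELEGATED_ADMIN"):
    """Kick off a drift detection operation and return its operation ID."""
    response = cfn.detect_stack_set_drift(
        StackSetName=stack_set_name,
        CallAs=call_as,
    )
    return response["OperationId"]


def handler(event, context, cfn=None):
    """Lambda-style entry point; a scheduler would invoke this on a timer."""
    if cfn is None:
        # Imported lazily so the module can be tested without boto3 installed.
        import boto3
        cfn = boto3.client("cloudformation")
    name = event.get("stack_set_name", "security-baseline")  # hypothetical default
    return {"operation_id": start_drift_scan(cfn, name)}


class FakeCloudFormation:
    """Test double that records calls instead of talking to AWS."""

    def __init__(self):
        self.calls = []

    def detect_stack_set_drift(self, **kwargs):
        self.calls.append(kwargs)
        return {"OperationId": "op-123"}


fake = FakeCloudFormation()
print(handler({"stack_set_name": "security-baseline"}, None, cfn=fake))
```

Since only one drift operation can run per StackSet at a time, a production version would also need to handle the error raised when a scan is already in progress; that handling is left out of this sketch.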
StackSets does not auto-remediate drift. Detecting that someone turned off default EBS encryption in Account 42 produces a report, not a re-deployment. The enforcement move is to re-deploy the template: an UpdateStackSet operation that pushes declared state back over whatever’s there. StackSets is detection plus enforcement; the decision to enforce is yours.
A worked deployment trace
The team builds security-baseline.yaml with resources for GuardDuty, the Config recorder, an IAM password policy, default EBS encryption, and account-level S3 PAB. They register their platform account as a delegated administrator for CloudFormation and create a StackSet with service-managed permissions, the Workloads OU as the target, two Regions, auto-deployment enabled, MaxConcurrentCount: 10, FailureToleranceCount: 0, ConcurrencyMode: SOFT_FAILURE_TOLERANCE (so that zero tolerance does not cap concurrency at one account), RegionConcurrencyType: SEQUENTIAL.
T+0. StackSet created. CloudFormation fans out to the fifty members, ten at a time, eu-west-1 first.
T+~15 minutes. eu-west-1 done. StackSets moves to us-east-1.
T+6 days. A product team vends a new account into Workloads. Organizations notifies CloudFormation; auto-deployment fires; within minutes the new account has stack instances in both Regions. No ticket.
T+30 days. The scheduled drift scan runs. EventBridge fires nightly at 02:00 UTC, invoking a Lambda that calls detect-stack-set-drift. Twenty minutes later one account reports MODIFIED on AWS::S3::AccountPublicAccessBlock; an operator disabled BlockPublicPolicy for a vendor bucket. Drift count 1 out of 102. An alarm on the drift count transitions to ALARM. The on-call engineer reads the drift diff, decides it isn’t an approved exception, and runs update-stack-set to re-push the declared state. Public-access block snaps back; drift count returns to 0.
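The "1 out of 102" figure in the trace is the kind of fleet-wide number an alarm can sit on. As a rough sketch, the function below reduces per-instance summaries to that count. The `Account`, `Region`, and `DriftStatus` keys follow the shape returned by `ListStackInstances`; the sample data and account IDs are fabricated placeholders for illustration.

```python
from collections import Counter


def summarise_drift(instance_summaries):
    """Reduce stack instance summaries to a fleet-wide drift report."""
    by_status = Counter(
        s.get("DriftStatus", "NOT_CHECKED") for s in instance_summaries
    )
    drifted = [
        (s["Account"], s["Region"])
        for s in instance_summaries
        if s.get("DriftStatus") == "DRIFTED"
    ]
    return {
        "total": len(instance_summaries),
        "drifted": len(drifted),
        "drifted_instances": drifted,
        "by_status": dict(by_status),
    }


# 51 hypothetical accounts (the original fifty plus one new arrival), two Regions.
accounts = [f"{n:012d}" for n in range(1, 52)]
summaries = [
    {"Account": a, "Region": r, "DriftStatus": "IN_SYNC"}
    for a in accounts
    for r in ("eu-west-1", "us-east-1")
]
summaries[7]["DriftStatus"] = "DRIFTED"  # one instance drifts

report = summarise_drift(summaries)
print(f"{report['drifted']} of {report['total']} stack instances drifted")
```

Publishing `report["drifted"]` as a metric gives the single number the alarm watches, while `drifted_instances` tells the on-call engineer where to look.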
T+60 days. A new baseline requirement: add a Config rule for IMDSv2-required. The team edits the template, merges, runs update-stack-set. Thirty minutes later every account has the new rule. No one logged into any account.
Config conformance packs as a complement
For the Config-rule portion of the baseline, conformance packs are a strong complement, not a replacement, for the StackSet.
A pack bundles Config rules and remediation actions (via SSM Automation). Deployed via put-organization-conformance-pack from the management or delegated admin account, it lands in every targeted member account’s Config service with a single aggregated compliance dashboard. Where a StackSet gives “the template is deployed and hasn’t drifted,” a pack gives “the resources the rules check are compliant, and when they aren’t, here’s the remediation action that runs.”
The dividing line: StackSets deploys resources; conformance packs deploy rules and remediations that evaluate resources. The account-level S3 PAB is a resource; “no bucket shall allow public ACLs” is a rule. StackSet for the first, pack for the second. The two layers coexist.
The pack’s auto-remediation is where the remediation attribute lands properly. StackSets’ re-push is blunt (“restore the declared state of everything”); Config rule remediation is targeted (“this specific resource, fix this specific property”). A realistic baseline programme uses both.
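For the rule side of the dividing line, an organisation-wide pack deployment can be sketched as follows. The request keys match the `PutOrganizationConformancePack` API and the rule identifier is an AWS managed rule; the pack name, rule choice, and excluded account are assumptions for illustration, and the remediation configuration a real pack would carry is omitted for brevity.

```python
# A one-rule conformance pack template (YAML), embedded as a string.
PACK_TEMPLATE = """\
Resources:
  S3BucketPublicReadProhibited:
    Type: AWS::Config::ConfigRule
    Properties:
      ConfigRuleName: s3-bucket-public-read-prohibited
      Source:
        Owner: AWS
        SourceIdentifier: S3_BUCKET_PUBLIC_READ_PROHIBITED
"""


def organization_pack_request(name, template_body, excluded_accounts=()):
    """Keyword arguments for AWS Config PutOrganizationConformancePack."""
    request = {
        "OrganizationConformancePackName": name,
        "TemplateBody": template_body,
    }
    if excluded_accounts:
        # Accounts that should not receive the pack (for example, sandboxes).
        request["ExcludedAccounts"] = list(excluded_accounts)
    return request


# Hypothetical pack name and account ID.
request = organization_pack_request(
    "security-baseline-rules", PACK_TEMPLATE, excluded_accounts=["222222222222"]
)
print(sorted(request))
```

With boto3 this would be passed as `boto3.client("config").put_organization_conformance_pack(**request)` from the management or delegated administrator account.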
When CDK pipelines are the correct answer
Service-managed StackSets fits when deployment logic is a single CloudFormation template. CDK in a per-account CodePipeline earns its complexity when per-account configuration varies in ways parameters can’t cleanly express, when cross-account references need orchestration StackSets can’t produce, or when the rollout needs the deploy-test-promote staging that CDK’s pipeline construct handles natively. For a uniform security baseline, CDK-in-CodePipeline pays for complexity it doesn’t need.
What’s worth remembering
- StackSets has two permission models. Self-managed needs per-account IAM roles; service-managed uses trusted access with Organizations and AWS manages the roles.
- Service-managed StackSets target by OU, not by account list. Membership is resolved at deployment time.
- Auto-deployment is the feature that earns “new accounts join automatically.” Only service-managed StackSets have it.
- Operation preferences control blast radius: MaxConcurrentCount (1–1000, default 1), FailureToleranceCount, RegionConcurrencyType, RegionOrder.
- Delegated administrator lets the platform team run StackSets from their own account rather than the Org management account.
- Drift detection is on-demand, not scheduled. Only one drift operation per StackSet can run concurrently.
- StackSets does not auto-remediate drift. Re-deploying the template is the enforcement move.
- Conformance packs deploy at the Org level and cover rule-level governance. They complement StackSets: resources via StackSet, rules via pack.
- Control Tower is correct when the Org is already built on it. Adopting it purely to solve deployment is a bigger programme than a StackSet.
- CDK in CodePipeline is correct when the deployment logic exceeds CloudFormation. Not the correct answer for a uniform baseline.