The situation
Acme has 600 EC2 instances, 40 RDS instances, 15 Lambda functions of consequence, and a monthly bill that the CFO has politely asked about. FinOps wants a programme of cost reviews: what are we running, what’s it costing, what should we change, and how do we know the recommendation is real.
The console has two obvious starting points. Trusted Advisor offers a dashboard of checks across five pillars (cost, performance, security, fault tolerance, service limits) with a traffic-light rating for each. Compute Optimizer sits separately and offers right-sizing recommendations for EC2, EBS, Auto Scaling Groups, Lambda, and ECS-on-Fargate based on observed utilisation.
The FinOps lead’s first instinct is to pick one. Which one is better? That question has no useful answer, because the two services are solving different problems.
What actually matters
The mental model to hold: Trusted Advisor checks account-level configuration against AWS best practice. Compute Optimizer analyses workload-level utilisation against instance shape. They overlap only at the boundary of “this EC2 instance is idle,” and even there they give different advice: Trusted Advisor flags Low Utilization EC2 Instances as a configuration issue (“consider stopping or downsizing”); Compute Optimizer recommends a specific smaller instance type based on CPU, memory, network, and disk observations over the last 14 days.
The difference matters because the two services need different data to be useful, and the recommendations they produce are actionable at different points in the workflow.
Trusted Advisor’s checks are rule-based on account state. Does this EBS volume have a snapshot? Is this S3 bucket public? Is this security group open on 22 to 0.0.0.0/0? Is this RI under-utilised? Is this service approaching its account quota? The data it needs is the AWS API’s view of the account. It doesn’t need CloudWatch metrics or process-level utilisation; it needs the configuration of the resources, and it gets that from the same APIs the console uses. Checks run periodically (most daily, some every five minutes for service limits) and the result is a list per check with “OK”, “warning”, or “error” and a count of affected resources.
Compute Optimizer’s recommendations are utilisation-based per workload. For an EC2 instance, it pulls 14 days of default CloudWatch metrics (CPU, network, disk) plus, if the CloudWatch agent is installed, memory and disk-space-used, and produces one of three findings: Under-provisioned, Optimized, or Over-provisioned (the API’s NotOptimized value applies to other resource types such as Auto Scaling groups). Each finding comes with a ranked list of alternate instance types with projected CPU, projected cost, and performance risk (Very Low, Low, Medium, High, Very High). The richer the input data, the better the recommendation; memory data in particular closes the biggest blind spot of CPU-only recommendations.
So the first question is: what kind of advice do we need? “Are we leaving money on the table via misconfiguration?” is Trusted Advisor. “Is this specific workload on the correct instance?” is Compute Optimizer.
The second question: what tier of Trusted Advisor do we have? The free tier (available to all accounts) gives a fixed subset of checks: 7 security checks and the service-limits check. The full check set (230-plus, across all five pillars) requires Business, Enterprise On-Ramp, or Enterprise Support. A FinOps programme that depends on the cost-pillar checks needs a Support plan that unlocks them.
The third question: what data has Compute Optimizer had to learn from? CPU-only data will produce recommendations that don’t understand memory-bound workloads. Enabling the enhanced-infrastructure-metrics setting and installing the CloudWatch agent for memory (and for Linux, disk-space-used) on targeted instances is the step that turns “reasonable” recommendations into “confident” recommendations. Without memory metrics, a JVM that’s using 90% of 32 GiB looks the same to Compute Optimizer as one using 10%: both show 20% CPU.
The fourth: how does this plug into an actual process? A recommendation is a suggestion, not an action. Either human judgement reviews and implements it, or an automation reads it from the API (compute-optimizer:GetEC2InstanceRecommendations) and files a ticket, or, in the most ambitious shops, runs it through a gated pipeline that right-sizes. The recommendations are the starting point; what happens next is an organisational choice.
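The “automation reads it from the API and files a ticket” path might look something like the sketch below. It assumes a boto3 `compute-optimizer` client is passed in; `iter_ec2_recommendations` and `to_ticket` are hypothetical helper names, and the ticket fields are illustrative rather than any tracker’s real schema.

```python
def iter_ec2_recommendations(client):
    """Yield every EC2 recommendation, following nextToken pagination."""
    kwargs = {}
    while True:
        page = client.get_ec2_instance_recommendations(**kwargs)
        yield from page.get("instanceRecommendations", [])
        token = page.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token


def to_ticket(rec):
    """Turn one Over-provisioned recommendation into a ticket-shaped dict.

    Returns None for anything that is not an Over-provisioned finding
    with at least one recommendation option.
    """
    options = rec.get("recommendationOptions", [])
    if rec.get("finding") != "Overprovisioned" or not options:
        return None
    best = min(options, key=lambda o: o["rank"])  # rank 1 is the top option
    instance_id = rec["instanceArn"].split("/")[-1]
    return {
        "summary": f"Right-size {instance_id}: "
                   f"{rec['currentInstanceType']} -> {best['instanceType']}",
        "performance_risk": best["performanceRisk"],
    }

# Usage, assuming AWS credentials and an opted-in account:
#   import boto3
#   client = boto3.client("compute-optimizer")
#   tickets = [t for r in iter_ec2_recommendations(client) if (t := to_ticket(r))]
```

Passing the client in, rather than creating it inside the helper, keeps the pagination logic testable without touching a real account.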
The fifth: where does this show up in a dashboard? Both services expose findings through CloudWatch Events / EventBridge and through their own APIs. Trusted Advisor checks can be pulled into Security Hub and into custom dashboards via the Support API. Compute Optimizer exports a CSV to S3 on demand and integrates with Cost Explorer so the projected savings show up in the rightsizing view.
What we’ll filter on
- Scope of analysis: account configuration or workload utilisation?
- Input data source: resource config via API, or CloudWatch metrics over 14 days?
- Support-plan gating: does it need Business/Enterprise Support to be useful?
- Memory-aware: does it consider RAM utilisation or only CPU/network/disk?
- Breadth: how many service domains does it cover?
- Actionable output: does it suggest specific replacements or flag issues to investigate?
The advisory landscape
- Trusted Advisor (free tier). Seven security checks (e.g. root MFA, S3 bucket public access, security group open on sensitive ports) plus the service-limits check. No cost pillar to speak of. Fine as a security floor; not a FinOps tool.
- Trusted Advisor (Business/Enterprise On-Ramp/Enterprise Support). Full set: 230-plus checks including the entire cost pillar (Low Utilization EC2 Instances, Idle Load Balancers, Underutilized EBS Volumes, Underutilized RDS Database Instances, Amazon RDS Idle DB Instances, Savings Plans and Reserved Instance Optimization rollups, and so on). Daily evaluation. Enterprise tiers also get the Trusted Advisor Priority view and proactive recommendations from the TAM team.
- Compute Optimizer. Opt-in service that analyses CPU, network, disk (and memory / disk-space-used with enhanced metrics), produces Under-provisioned, Optimized, or Over-provisioned findings per resource, and ranks up to three recommended alternative types with projected monthly cost. Covers EC2, Auto Scaling Groups, EBS volumes, Lambda, ECS services on Fargate, and RDS (MySQL/Postgres engines) for the Graviton and right-sizing cases. Free.
- Cost Explorer rightsizing recommendations. Shares the Compute Optimizer engine for EC2 recommendations and layers savings estimates on top. Useful for finance-facing reports; the underlying recommendations are the same.
- AWS Cost Anomaly Detection. Different class of tool: anomaly alerts on spend, not right-sizing. Adjacent, worth having, not what this scenario asks for.
- Third-party tooling (Cloudability, Flexera, nOps, etc.). Wrap the AWS APIs and often add chargeback, commitment management, and multi-cloud. Out of scope unless the shop already uses them.
Side by side
| Option | Scope | Input data | Support-gated | Memory-aware | Breadth | Actionable output |
|---|---|---|---|---|---|---|
| TA (free) | Account config | Resource APIs | ✗ | ✗ | 7 security + limits | Issue list |
| TA (Business+) | Account config | Resource APIs | ✓ | ✗ | 230+ across 5 pillars | Issue list per check |
| Compute Optimizer | Workload utilisation | CloudWatch 14 days | ✗ | ✓ (with agent) | EC2 / ASG / EBS / Lambda / Fargate / RDS | Specific type recommendations |
| Cost Explorer rightsizing | Workload utilisation | CO engine | ✗ | ✓ | EC2 only | Type + cost delta |
| Cost Anomaly Detection | Spend patterns | Bill | ✗ | n/a | All services | Anomaly alerts |
| Third-party | Varies | Varies | ✗ | Varies | Varies | Varies |
Reading by question:
- “Are we leaving money on the table via misconfiguration?” Trusted Advisor (Business+). The Idle Load Balancer check alone typically pays for a year of Business Support in a month.
- “Are these specific workloads on the correct shape?” Compute Optimizer with enhanced metrics. CPU-only is a starting point; memory-aware is the FinOps tool.
- “Are we drifting on spend?” Cost Anomaly Detection, separate from either.
The programme wants all three. The two in the title are complementary, not competitive.
Where each advisor fits in the cost review
Trusted Advisor in depth
The Support API is the programmatic surface. Each check has an ID (e.g. Qch7DwouX1 for Low Utilization EC2 Instances), a category (cost_optimizing), and a status; DescribeTrustedAdvisorCheckResult returns the affected resources. A daily Lambda that pulls every cost-pillar check and writes to a Slack channel is a two-hour integration:
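    import boto3
    # The Support API, and with it Trusted Advisor, is served from us-east-1 only.
    support = boto3.client('support', region_name='us-east-1')

    def to_dollars(cell):
        """Parse a metadata cell such as '$1,234.56'; anything non-numeric counts as 0."""
        try:
            return float(str(cell).replace('$', '').replace(',', ''))
        except ValueError:
            return 0.0

    checks = support.describe_trusted_advisor_checks(language='en')['checks']
    for check in (c for c in checks if c['category'] == 'cost_optimizing'):
        result = support.describe_trusted_advisor_check_result(checkId=check['id'], language='en')
        flagged = result['result'].get('flaggedResources', [])
        # Metadata columns differ per check, so locate the savings column by its heading.
        col = next((i for i, h in enumerate(check['metadata']) if 'estimated monthly savings' in h.lower()), None)
        savings = sum(to_dollars(r['metadata'][col]) for r in flagged) if col is not None else 0.0
        if flagged:
            print(f"{check['name']}: {len(flagged)} resources, est. savings ${savings:,.2f}/mo")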
Two things worth knowing. The Support API is only available in us-east-1 regardless of which Region the checks cover, because it’s a global service. And Trusted Advisor Priority (Enterprise Support) layers TAM-curated prioritisation on top of the raw check list, which is useful when the raw list runs to several hundred findings and the question is “which five should we work on first.”
For the org-wide view, Trusted Advisor also feeds a per-organisation dashboard when AWS Organizations is configured with Trusted Advisor enabled as a trusted service: findings across all accounts then appear in the management (or delegated admin) account.
Compute Optimizer in depth
Opt-in is a single call from the management account: UpdateEnrollmentStatus with status set to Active and includeMemberAccounts set to true. After that, Compute Optimizer needs 14 days of CloudWatch data before it produces useful recommendations; instances that have been running less than 30 hours of the last 14 days are marked InsufficientData.
The lever that transforms recommendations from useful to reliable is enhanced infrastructure metrics plus memory utilisation. Enhanced metrics is a paid tier that extends the analysis window from 14 to 93 days, which matters for workloads with monthly or quarterly peaks. Memory utilisation requires the CloudWatch agent to publish CWAgent/mem_used_percent; once Compute Optimizer sees it, recommendations that would otherwise right-size away RAM the workload actually needs go away.
    aws compute-optimizer update-enrollment-status \
        --status Active \
        --include-member-accounts
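For the memory metric described above, a minimal CloudWatch agent configuration could be generated as in the sketch below. The helper name `memory_metrics_config` and the choice of `/` as the disk path are assumptions for illustration; the `InstanceId` dimension is included on the understanding that Compute Optimizer matches agent metrics to instances through it, which is worth confirming against the current documentation before rollout.

```python
import json


def memory_metrics_config(disk_paths=("/",)):
    """Build a minimal CloudWatch agent config publishing memory and disk usage."""
    return {
        "metrics": {
            # CWAgent is the agent's default namespace; stated explicitly here.
            "namespace": "CWAgent",
            # Tag each datapoint with the instance it came from.
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": {
                # Published as mem_used_percent.
                "mem": {"measurement": ["mem_used_percent"]},
                # Published as disk_used_percent (Linux paths assumed).
                "disk": {
                    "measurement": ["used_percent"],
                    "resources": list(disk_paths),
                },
            },
        }
    }


# Write this output to the agent's JSON config file on the instance.
print(json.dumps(memory_metrics_config(), indent=2))
```

Generating the file from code rather than hand-editing it makes it easier to push one consistent config across the instances that matter.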
Reading a recommendation (GetEC2InstanceRecommendations) produces a structured object:
    {
      "instanceArn": "arn:aws:ec2:eu-west-1:111122223333:instance/i-0abc1234",
      "currentInstanceType": "m6i.2xlarge",
      "finding": "Overprovisioned",
      "utilizationMetrics": [
        { "name": "CPU", "statistic": "Maximum", "value": 12.4 },
        { "name": "MEMORY", "statistic": "Maximum", "value": 38.1 }
      ],
      "recommendationOptions": [
        {
          "instanceType": "m6i.large",
          "projectedUtilizationMetrics": [
            { "name": "CPU", "statistic": "Maximum", "value": 49.6 },
            { "name": "MEMORY", "statistic": "Maximum", "value": 76.2 }
          ],
          "performanceRisk": 1.0,
          "rank": 1,
          "savingsOpportunity": { "estimatedMonthlySavings": { "value": 87.40 } }
        }
      ]
    }
performanceRisk on a 1-5 scale (Very Low through Very High) is the single most useful field. A risk-1 recommendation for an over-provisioned instance is almost always worth taking; a risk-4 recommendation on an under-provisioned workload is a sign the recommended type might not have enough headroom for the next week’s traffic and deserves a human look.
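The risk-based rule of thumb can be written down as a small triage function. This is a sketch under the article’s own thresholds, not an AWS-defined policy: `triage`, the `auto_max_risk` cut-off of 2, and the three labels are all hypothetical.

```python
def triage(rec, auto_max_risk=2):
    """Classify a recommendation as 'ticket', 'review', or 'skip'.

    'ticket' : Over-provisioned and the top-ranked option is low risk.
    'review' : anything else that is not already optimised (needs a human).
    'skip'   : already optimised, or no options to act on.
    """
    options = rec.get("recommendationOptions", [])
    if rec.get("finding") == "Optimized" or not options:
        return "skip"
    top = min(options, key=lambda o: o["rank"])  # rank 1 is the top option
    if rec["finding"] == "Overprovisioned" and top["performanceRisk"] <= auto_max_risk:
        return "ticket"
    return "review"


sample = {
    "finding": "Overprovisioned",
    "recommendationOptions": [
        {"instanceType": "m6i.large", "rank": 1, "performanceRisk": 1.0},
    ],
}
print(triage(sample))  # ticket
```

Keeping the threshold as a parameter lets the programme start conservative and loosen it once the first few sprints have shown how the recommendations behave.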
The recommendations feed Cost Explorer’s rightsizing report, which adds a monthly-savings-projection column and sorts by biggest wins. For programmatic pipelines, the S3 export (ExportEC2InstanceRecommendations) writes a CSV to a bucket on demand, useful for feeding a spreadsheet, a Jira import, or a Config conformance-pack-style review.
Handling contradictions between the two
The interesting overlap: an instance that Trusted Advisor flags as Low Utilization EC2 Instances (daily CPU of 10% or less and network I/O of 5 MB or less on 4 or more of the last 14 days) is almost always the same instance Compute Optimizer flags as Over-provisioned. The advice converges but the action differs.
Trusted Advisor’s check is binary: the instance is flagged or it isn’t, with a suggested action of “consider stopping or downsizing.” Compute Optimizer is specific: “the workload would run on m6i.large instead of m6i.2xlarge with projected max CPU of 49% and estimated savings of $87/month.” Use Trusted Advisor to find the candidates; use Compute Optimizer to pick the replacement.
Where they genuinely disagree: Trusted Advisor’s check needs only four quiet days in a 14-day window and looks at CPU and network I/O, never memory; Compute Optimizer uses 14 days (or 93 with enhanced metrics) and, with the agent, memory. A workload that peaks monthly (a payroll run, an end-of-quarter ETL) can look low-utilisation to Trusted Advisor between peaks and correctly sized to Compute Optimizer across the whole window. The longer look-back wins; downsizing something that spikes once a month because Trusted Advisor only saw the quiet days between peaks is how right-sizing programmes earn a bad reputation.
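The look-back disagreement is easy to reproduce with synthetic numbers. The sketch below uses a deliberately simplified, CPU-only version of the low-utilisation rule and a made-up payroll-style series; neither function is how the services compute their findings internally, they only illustrate why window length matters.

```python
def ta_low_utilisation(daily_cpu, threshold=10.0, min_days=4, window=14):
    """Simplified Trusted Advisor-style rule (CPU only).

    Flags the instance if daily CPU was at or below `threshold`
    on at least `min_days` of the most recent `window` days.
    """
    recent = daily_cpu[-window:]
    return sum(1 for cpu in recent if cpu <= threshold) >= min_days


def peak_over(daily_cpu, window):
    """Highest daily CPU seen in the most recent `window` days."""
    return max(daily_cpu[-window:])


# Hypothetical workload: 93 days at 4% CPU, with an 85% spike on days 15, 45 and 75.
series = [85.0 if day % 30 == 15 else 4.0 for day in range(1, 94)]

print(ta_low_utilisation(series))  # True
print(peak_over(series, 14))       # 4.0
print(peak_over(series, 93))       # 85.0
```

The last 14 days (days 80 to 93) contain no spike, so the short window sees an idle machine, while the 93-day window sees the 85% peak that a smaller instance would have to absorb.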
A worked cost review
One quarter in. The FinOps programme runs on a simple cadence: monthly, pull Trusted Advisor cost findings into a dashboard; monthly, pull Compute Optimizer Over-provisioned recommendations where performanceRisk <= 2; match the two; file Jira tickets for each match with both sources linked; implement; measure the next month’s bill.
Month 1 results, rough numbers across 600 EC2 instances:
- Trusted Advisor flagged 47 Low Utilization EC2 instances, 8 Idle Load Balancers, 23 Underutilized EBS Volumes, 4 Idle RDS Databases. Estimated savings roll-up: $4,200/month.
- Compute Optimizer produced 112 Over-provisioned findings, 9 Under-provisioned, 6 Insufficient Data. Estimated savings roll-up on the over-provisioned set: $8,900/month.
- Intersection of TA-flagged and CO-over-provisioned: 39 instances, every one of which got a specific target type from Compute Optimizer. These went into the first sprint of right-sizing.
- TA-only (flagged by TA but Compute Optimizer showed Optimized): 8 instances, investigated; 5 were pre-production environments running below threshold by design, 3 were peak-monthly workloads the 4-day check was misreading. None got resized.
- CO-only (CO flagged Over-provisioned, TA didn’t flag): 73 instances. These ran above TA’s 10% threshold but below Compute Optimizer’s peak-projection threshold. Most got ticketed; the ones with performanceRisk >= 3 got a human memory-profile check first.
Net result after implementation: the shared list and the CO-only list together saved about $6,500/month on the first pass, and the programme has a repeating monthly cadence running on the two services’ APIs.
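The matching step in this review is plain set arithmetic on instance IDs. The sketch below uses synthetic IDs sized to reproduce the month-1 counts (47 flagged by Trusted Advisor, 112 Over-provisioned in Compute Optimizer, 39 shared); `reconcile` is a hypothetical helper, and its `ta_only` bucket is an approximation, since it means “not Over-provisioned in Compute Optimizer” rather than strictly “Optimized”.

```python
def reconcile(ta_flagged, co_overprovisioned):
    """Split instance IDs into shared, TA-only, and CO-only buckets."""
    ta, co = set(ta_flagged), set(co_overprovisioned)
    return {
        "both": ta & co,      # flagged by both: first right-sizing sprint
        "ta_only": ta - co,   # investigate before touching
        "co_only": co - ta,   # ticket, with a human check on higher-risk ones
    }


# Synthetic IDs only; real ones would come from the two services' APIs.
ta_ids = [f"i-{n:04d}" for n in range(47)]      # i-0000 .. i-0046
co_ids = [f"i-{n:04d}" for n in range(8, 120)]  # i-0008 .. i-0119

buckets = reconcile(ta_ids, co_ids)
print({name: len(ids) for name, ids in buckets.items()})
# {'both': 39, 'ta_only': 8, 'co_only': 73}
```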
What’s worth remembering
- Trusted Advisor checks account configuration; Compute Optimizer checks workload utilisation. Misconfiguration versus shape: different data in, different advice out.
- Full Trusted Advisor needs Business Support or higher. The free tier is 7 security checks plus service limits. Every interesting cost check is behind the Support tier.
- Compute Optimizer is free and opt-in from the management account. Enable it org-wide on day one; the 14-day clock starts as soon as it’s on.
- Memory metrics unlock real right-sizing. CPU-only recommendations will down-size RAM-bound workloads incorrectly. Install the CloudWatch agent and publish mem_used_percent on anything that matters.
- Enhanced infrastructure metrics extends the window to 93 days. For workloads with monthly or quarterly peaks, the 14-day default is too short. Enable it on production.
- performanceRisk is the decision lever. Risk 1-2 Over-provisioned recommendations are low-stakes. Risk 3+ deserve a human look before implementation.
- Trusted Advisor and Compute Optimizer agree more often than they disagree. When they agree, act; when they disagree, CO’s longer look-back and memory awareness usually makes it the more reliable voice.
- The Support API is us-east-1 only. Global service, single endpoint. Automation that pulls Trusted Advisor data needs to know it lives there regardless of which Region the findings concern.
Trusted Advisor is the account-level misconfiguration detector with a cost-pillar side gig; Compute Optimizer is the per-workload right-sizing engine that actually tells you what to run instead. Neither replaces the other. Both together, feeding a monthly cost-review cadence, are the foundation most FinOps programmes end up building: one bill, two advisors, each speaking about the layer it knows best.