How to Investigate One Incident With GuardDuty, Flow Logs, and Detective

October 16, 2028 · 17 min read

The situation

A platform security team in a mid-size SaaS company gets paged at 05:17. The GuardDuty console in the delegated-administrator account shows a new High-severity finding attached to account 444455556666, region eu-west-1:

Trojan:EC2/DropPoint
Severity: 8 (High)
Resource: Instance i-0abc123def4567890 (role: app-worker-prod)
Action: Outbound network connection
Remote IP: 203.0.113.47  (AS64512, listed on Proofpoint ET Drop)
Observed: 02:14:32 UTC -- 05:08:11 UTC
Total bytes (approx): 4.1 GB outbound

The instance is a background-job worker processing customer uploads. Its security group normally permits outbound HTTPS only to a handful of VPC endpoints and two vendor APIs; egress goes via a NAT gateway from a private subnet. The remote IP is not on any allow-list the team has published.

Three tools are available. VPC Flow Logs are enabled at the ENI level for every subnet in the production VPC, flowing into CloudWatch Logs with 90-day retention in the default v2 format. Amazon GuardDuty is enabled across the organisation via the delegated administrator, with S3 Protection and Malware Protection on and findings from sixteen member accounts converging there. Amazon Detective was enabled at the organisation level six weeks ago, so the behaviour graph has 42 days of CloudTrail, Flow Logs, and GuardDuty findings.

What actually matters

Before mapping services to the four questions that need answers before the 06:30 incident call, it is worth separating what kind of work each question is.

The first observation is that an investigation like this doesn’t have a single answer; it has a sequence of questions, each with a natural home. “Is this real?” is a question about raw evidence: did the bytes actually leave the network interface on the wire? “What exactly left?” is the same raw evidence, queried differently: volume, duration, port, shape. “How did the attacker get in and what else did they touch?” is a question about relationships between entities: this instance’s IAM role, what that role did in CloudTrail, what DNS queries happened, whether the same external IP is talking to anything else. “Is anyone else in the org talking to this IP?” is the same relationship question, at organisational scale.

The mistake is to assume one tool answers all four. A log search does raw evidence well; it does relationships badly. A detection engine raises alarms well; it doesn’t hold the evidence. A graph service links entities well; it didn’t notice the attack. The work isn’t picking a favourite; it’s knowing which question each tool was designed for and running them in the right order.

The second thing worth weighing is what “raw” means. VPC Flow Logs are per-flow metadata at the ENI (source IP, destination IP, port, protocol, bytes, action), aggregated at a configurable interval. They don’t hold packet payload. They don’t see traffic to the Amazon DNS resolver, the instance metadata service, Time Sync, DHCP, or ARP. So “proof the bytes left” from Flow Logs is proof of metadata, not proof of content. That’s usually enough: if the byte count matches and the destination matches, corroboration is strong. But it isn’t always enough, and it’s worth knowing the boundary.
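To make the shape of that metadata concrete, here is a minimal Python sketch that parses one default-format (v2) record into a handful of its 14 fields. The sample line, the private address 10.0.3.25, the epoch values, and the names FlowRecord and parse_v2 are illustrative assumptions, not output from the incident.

```python
from dataclasses import dataclass
from typing import Optional

# Field order of the default (version 2) flow log format: 14 space-separated fields.
V2_FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


@dataclass(frozen=True)
class FlowRecord:
    account_id: str
    interface_id: str
    srcaddr: str
    dstaddr: str
    dstport: int
    protocol: int
    bytes: int
    start: int   # epoch seconds, first packet of the flow in the aggregation interval
    end: int     # epoch seconds, last packet of the flow in the aggregation interval
    action: str  # ACCEPT or REJECT


def parse_v2(line: str) -> Optional[FlowRecord]:
    """Parse one v2 record; return None for NODATA/SKIPDATA rows, which carry no flow."""
    parts = line.split()
    if len(parts) != len(V2_FIELDS):
        raise ValueError(f"expected {len(V2_FIELDS)} fields, got {len(parts)}")
    raw = dict(zip(V2_FIELDS, parts))
    if raw["log_status"] != "OK":
        return None
    return FlowRecord(
        account_id=raw["account_id"],
        interface_id=raw["interface_id"],
        srcaddr=raw["srcaddr"],
        dstaddr=raw["dstaddr"],
        dstport=int(raw["dstport"]),
        protocol=int(raw["protocol"]),
        bytes=int(raw["bytes"]),
        start=int(raw["start"]),
        end=int(raw["end"]),
        action=raw["action"],
    )


# Hypothetical record: every value is metadata; nothing here describes the payload.
SAMPLE = ("2 444455556666 eni-0123abcd 10.0.3.25 203.0.113.47 "
          "49152 443 6 1200 1500000 1723860872 1723861472 ACCEPT OK")
record = parse_v2(SAMPLE)
```

Everything the parser returns is addressing, counters, and timestamps, which is the boundary the paragraph above describes: the record can show that 1,500,000 bytes went to port 443, but not what those bytes were.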

The third is the significance of independent, duplicate ingest. GuardDuty pulls its own copies of Flow Logs, CloudTrail, and Route 53 Resolver DNS query logs; Detective pulls its own copies of Flow Logs and CloudTrail, plus the GuardDuty findings. That matters in two ways. First, detection doesn’t depend on your Flow Log configuration: if the team disabled Flow Logs last quarter to save cost, GuardDuty still sees the traffic. Second, investigation doesn’t depend on your retention: if the records rolled out of CloudWatch 30 days ago, Detective still has up to a year of history in its graph. The tools are resilient to changes in the customer’s logging plumbing because they don’t share it.

The fourth is about timelines. Detective is useful precisely because it has history. If the team switched it on at 05:17 in response to the finding, the graph would have no baseline, no “normal behaviour for this role over the last 42 days” to compare against. Detective is a pre-investment, not a runtime tool. That’s the difference between “enable before an incident and pay a flat per-GB fee against enrolment” and “enable during an incident and pay nothing useful.”

Finally, there’s the question of attribution. Network metadata can attribute traffic to AWS services in general, but not to a specific bucket or object. For “the role used its instance profile to read objects from S3,” the API-call log is the answer, because it knows the bucket and the object. That’s why a behaviour graph pivots from network to API-call logs naturally: the graph links the network question to the audit-log question for you.

What we’ll filter on

Raw evidence. Per-flow records proving bytes moved.
Automated detection. The service itself raises the alarm.
Pivot across sources. Given one entity, walk outward to CloudTrail events, DNS queries, other findings.
Org-wide scope. The whole organisation’s data, from one place.
History. Baseline to compare today’s behaviour against.

The investigation-toolchain landscape

Five services sit in the AWS security-investigation area.

1. VPC Flow Logs. Metadata for every accepted or rejected IP flow at the elastic network interface, aggregated at a configurable interval (default 10 minutes; minimum 1 minute; Nitro always 1 minute or less). Published to CloudWatch Logs, S3, or Amazon Data Firehose. Default v2 has 14 fields; v3 adds vpc-id, subnet-id, instance-id, tcp-flags, type, pkt-srcaddr, pkt-dstaddr; v4 adds region and AZ; v5 adds pkt-src-aws-service, pkt-dst-aws-service, flow-direction, traffic-path. Metadata only, no packet payload. Misses traffic to the Amazon DNS resolver, the instance metadata service at 169.254.169.254, Time Sync, DHCP, ARP, and Windows licence activation.

2. Amazon GuardDuty. Managed threat detection consuming an independent, duplicate stream of VPC Flow Logs, Route 53 Resolver DNS query logs, and CloudTrail management events. Optional protection plans add S3 data events, EKS audit logs, runtime monitoring on EC2/EKS/ECS-Fargate, RDS login activity, Lambda Flow Logs, and malware scanning. Findings combine ML anomaly detection with threat-intel feeds. EC2 exfiltration types include Trojan:EC2/DropPoint, Backdoor:EC2/C&CActivity.B, Trojan:EC2/DNSDataExfiltration, UnauthorizedAccess:EC2/MaliciousIPCaller.Custom, UnauthorizedAccess:EC2/TorClient. Priced per GB analysed.

3. Amazon Detective. Managed behaviour-graph service that ingests VPC Flow Logs, CloudTrail management events, GuardDuty findings, and optionally EKS audit logs and Security Hub findings, through its own independent stream. Builds a linked graph of AWS accounts, IAM roles and users, EC2 instances, Kubernetes pods, IP addresses, and findings, with up to one year of history. UI pivots from a finding to the instance, from the instance to the role it assumed, from the role to the CloudTrail actions, from an external IP to every other resource that talked to it, with baseline comparisons. Tiered flat rate per GB analysed.

4. Amazon Macie. Classifies sensitive data (PII, credentials, intellectual property) in S3. It is an adjacent capability for “what class of data might have been exfiltrated if the attacker reached S3?”, not a tool for reconstructing a network exfiltration from an EC2 instance.

5. AWS Security Hub. Aggregates findings from GuardDuty, Inspector, Macie, IAM Access Analyzer, and third-party sources in ASFF, plus CSPM controls against CIS, PCI, NIST. The cross-service aggregator and dashboard. Doesn’t pull evidence together and doesn’t hold raw flows.

Side by side

Service	Raw evidence	Automated detection	Pivot across sources	Cost shape	Investigation UI
VPC Flow Logs (CloudWatch)	✓	✗	✗	Per-GB ingest + storage	✗
GuardDuty	✗	✓	✗	Per-GB analysed	✗
Detective	Partial	✗	✓	Per-GB analysed (tiered)	✓
Macie	✗	✓ (S3 data)	✗	Per-GB classified	✗
Security Hub	✗	Partial (aggregator)	✗	Per-finding	✗

No single row is all-green. The investigation needs all three of Flow Logs, GuardDuty, and Detective, in different roles. Flow Logs answer “what moved.” GuardDuty answers “is this worth investigating at all.” Detective answers “how did we get here, and what else is involved.”

The three-layer investigation

Evidence, detection, and investigation form three layers over the same underlying data. Each ingests its own independent stream; none replaces another.

Running the four questions

Q1: is the finding real? Flow Logs. Only Flow Logs hold the raw per-flow evidence; GuardDuty’s finding is a claim and Flow Logs are where the claim is corroborated. A CloudWatch Logs Insights query:

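# Field names are the ones Logs Insights discovers for default-format (v2) flow logs.
# Epoch seconds cover 02:14:32 to 05:08:11 UTC; the calendar date is a placeholder.
fields @timestamp, srcAddr, dstAddr, srcPort, dstPort, protocol, bytes, action
| filter interfaceId = "eni-0123abcd"
| filter dstAddr = "203.0.113.47"
| filter startTime >= 1723860872 and endTime <= 1723871291
| stats sum(bytes) as total_bytes by dstPort, action
| sort total_bytes desc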

Filter by ENI, by remote IP, and by time window, then sum. If the total is ~4 GB on ACCEPT, the claim is corroborated. Caveats: payload isn’t in Flow Logs (bytes over tcp/443 could be anything; for payload you would need Traffic Mirroring); the aggregation interval is 10 minutes by default (1 minute on Nitro), which gives 18 to 180 records over three hours for a continuous flow; and DNS traffic to the Amazon resolver is invisible to Flow Logs by design.
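The same corroboration logic can be sketched offline in Python, for instance against records exported from the log group. This is a rough illustration under stated assumptions: records are plain dicts keyed by the v2 field names, and the 10% tolerance is an arbitrary choice to absorb the "approx" in the finding, not a GuardDuty or AWS threshold.

```python
import math
from typing import Iterable, Mapping, Tuple


def expected_record_range(window_seconds: int) -> Tuple[int, int]:
    """Record count for one continuous flow at 10-minute and at 1-minute aggregation."""
    return math.ceil(window_seconds / 600), math.ceil(window_seconds / 60)


def accepted_bytes(records: Iterable[Mapping], eni: str, remote_ip: str,
                   window_start: int, window_end: int) -> int:
    """Sum bytes of ACCEPT records from one ENI to one remote IP inside the window."""
    return sum(
        r["bytes"] for r in records
        if r["interface_id"] == eni
        and r["dstaddr"] == remote_ip
        and r["action"] == "ACCEPT"
        and r["start"] >= window_start
        and r["end"] <= window_end
    )


def corroborates(observed_bytes: int, claimed_bytes: float,
                 tolerance: float = 0.10) -> bool:
    """True when the Flow Log total is within `tolerance` of the finding's figure."""
    return abs(observed_bytes - claimed_bytes) <= tolerance * claimed_bytes


# Three hours of continuous traffic: 18 records at 10 minutes, 180 at 1 minute.
print(expected_record_range(3 * 3600))  # (18, 180)
```

A record count far outside that range for a three-hour window is itself a signal: many more records suggests multiple flows or ports, far fewer suggests the transfer was not continuous.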

Q2: what exactly left? Still Flow Logs, extended:

stats sum(bytes) as total_bytes,
      count(*) as record_count,
      min(startTime) as first_flow,
      max(endTime)   as last_flow
by bin(5m)

A 5-minute time series distinguishes one long HTTPS session (bytes climbing steadily) from short, repeating bursts (classic beaconing). A second query filtering by srcAddr = <instance private IP> without a dstAddr filter lists every destination the instance reached, which serves as a scope check.
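As a rough illustration of that steady-versus-bursty distinction, the Python sketch below bins records by start time and labels the series by how many bins are idle. The 0.5 idle ratio and the function names are assumptions chosen for the example. It also assumes the aggregation interval is no wider than the bin (for example 1-minute records on Nitro); with 10-minute records and 5-minute bins, a steady flow would leave every other bin empty and look bursty.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def bytes_per_bin(records: Iterable[Mapping], bin_seconds: int = 300) -> Dict[int, int]:
    """Sum bytes into fixed-width bins keyed by the bin's starting epoch second."""
    bins: Dict[int, int] = defaultdict(int)
    for r in records:
        bins[r["start"] - r["start"] % bin_seconds] += r["bytes"]
    return dict(sorted(bins.items()))


def traffic_shape(bins: Mapping[int, int], bin_seconds: int = 300,
                  idle_ratio: float = 0.5) -> str:
    """Label the series 'bursty' when at least `idle_ratio` of the bins between
    the first and last active bin carry no bytes, otherwise 'steady'."""
    active = [start for start, total in bins.items() if total > 0]
    if not active:
        return "no traffic"
    span_bins = (max(active) - min(active)) // bin_seconds + 1
    idle_bins = span_bins - len(active)
    return "bursty" if idle_bins / span_bins >= idle_ratio else "steady"
```

The label is only a first-pass hint for the incident call; a real beaconing check would also look at how regular the gaps between bursts are.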

Q3: how did the attacker get in, and what else was touched? Detective. The answer requires pivoting across Flow Logs, CloudTrail management events, and GuardDuty findings in a way that reflects relationships between entities. CloudWatch Logs Insights could in principle join across three log groups, but the ergonomics are poor and the time to first useful answer is long.

Start on the GuardDuty finding in the Detective console. One click opens the compromised resource view: who assumed the role in the last 42 days and from where; which CloudTrail actions the role’s credentials made; which IPs the instance talked to, with baselines; and which related findings mention the same entities. The concrete pivots are: instance to IAM role (did another principal assume it during the exfil window? A spike of sts:AssumeRole from outside the VPC suggests credential theft); role to CloudTrail actions (a spike in s3:GetObject against buckets the instance doesn’t normally touch suggests data reconnaissance; Flow Logs can’t see this because S3 traffic goes to AWS endpoints); external IP to every entity that touched it; and finding to related findings (Detective clusters related high-severity findings).

Detective’s value isn’t unique data; it ingests the same Flow Logs, CloudTrail events, and findings the other services expose. The value is linking the data into a graph the investigator can traverse.

Q4: anything else in the org talking to that IP? Detective again, because the graph spans every enrolled account. A CloudWatch query would need cross-account access to 16 log groups, permissions configured in each, and a federated query layer on top. Detective runs the query across the whole graph from the delegated-administrator account.

If the answer is “no other hosts,” the compromise is single-host: isolate, rotate credentials, image for forensics, review the deployment chain. If the answer is “four other hosts across two accounts,” the response is fleet-wide: quarantine all implicated hosts, assume shared credentials or CI/CD supply-chain compromise, widen the investigation to the pipeline.
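The single-host versus fleet-wide decision can be sketched in Python over flow records that have already been collected from several accounts. This is not how Detective answers the question; it is a local illustration using dicts keyed by v2 field names, and the function names are hypothetical. One caveat specific to this scenario: because egress goes through a NAT gateway, the NAT gateway’s own ENI will also log flows to the remote IP, so the sketch takes a set of ENIs to ignore.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, Mapping, Set


def hosts_talking_to(records: Iterable[Mapping], remote_ip: str,
                     ignore_enis: FrozenSet[str] = frozenset()) -> Dict[str, Set[str]]:
    """Map account_id -> ENIs with ACCEPTed flows to or from remote_ip."""
    hits: Dict[str, Set[str]] = defaultdict(set)
    for r in records:
        if r["interface_id"] in ignore_enis or r["action"] != "ACCEPT":
            continue
        if remote_ip in (r["srcaddr"], r["dstaddr"]):
            hits[r["account_id"]].add(r["interface_id"])
    return dict(hits)


def response_scope(hits: Mapping[str, Set[str]], known_eni: str) -> str:
    """'single-host' if only the already-known ENI matched, else 'fleet-wide'."""
    others = {eni for enis in hits.values() for eni in enis} - {known_eni}
    return "fleet-wide" if others else "single-host"
```

Only ACCEPT records count here, on the assumption that a rejected connection attempt is worth a note but does not by itself show data leaving another host.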

Where GuardDuty sits in the chain

GuardDuty doesn’t appear in any of the four questions directly, which undersells it. Its job happened at 05:17, when the finding fired. Without GuardDuty, the Flow Logs would still be in CloudWatch, Detective would still have its graph, and nobody would be investigating because nobody would know there was anything wrong.

Three particulars are worth holding.

Independent stream of data. GuardDuty pulls its own copies of Flow Logs, CloudTrail management events, and Route 53 Resolver DNS query logs, doesn’t require the customer to enable them, and doesn’t charge for the read. Disabling Flow Logs to save cost doesn’t blind GuardDuty, but it does blind Flow-Log-based forensics.
Threat intel plus ML. Trojan:EC2/DropPoint is a threat-intel match against known-malicious IPs; Trojan:EC2/DNSDataExfiltration combines threat-intel domain lists with statistical patterns in DNS query payload sizes; the IAMUser/AnomalousBehavior family (for example Exfiltration:IAMUser/AnomalousBehavior) is pure ML.
No evidence archive. A finding names the implicated resource and the remote IP, and summarises the event. It doesn’t give you the flows. For those you go to Flow Logs or Detective.

Where Macie and Security Hub fit

Macie becomes relevant if the Detective walk finds the attacker pivoted to S3, for example through an Exfiltration:S3/AnomalousBehavior finding or similar. Enable it on the implicated buckets, run a one-time sensitive-data discovery job, and the classification will tell you whether the objects held customer PII, credentials, or neither. Macie answers “what class of data” after the fact.

Security Hub is where the GuardDuty finding appears alongside findings from Inspector, Macie, IAM Access Analyzer, and third-party products. In a mature setup the analyst’s first stop at 05:17 is Security Hub, not the GuardDuty console directly: it gives a unified view with deep links out to the source service. CSPM controls evaluate posture against CIS, PCI, and NIST, which is useful for “was this instance exposed in a way we could have caught earlier,” not for the active investigation.

A timing note on Detective

Detective is priced as a tiered flat rate per GB analysed, per member account, per region, with a 30-day free trial. The graph needs history to be useful, and history only exists from the moment Detective was enabled. At six weeks of enrolment, the graph has enough baseline to tell today’s behaviour apart from normal behaviour for app-worker-prod over the last 42 days. If Detective had been switched on in response to the finding, the graph would hold only a few hours of history at best and the baselines would be useless. Detective is a pre-investment, not a runtime tool.
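To show why a baseline needs history, here is a deliberately simple Python sketch that compares today’s outbound volume with a list of prior daily totals. It is not Detective’s method, which the source does not describe; the z-score test, the threshold of 3, and the function name are all assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import Sequence


def exceeds_baseline(baseline_daily_bytes: Sequence[float], today_bytes: float,
                     threshold: float = 3.0) -> bool:
    """True when today's outbound volume sits more than `threshold` standard
    deviations above the mean of the baseline days."""
    if len(baseline_daily_bytes) < 2:
        raise ValueError("need at least two days of history to form a baseline")
    mu = mean(baseline_daily_bytes)
    sigma = pstdev(baseline_daily_bytes)
    if sigma == 0:
        # Perfectly flat history: any increase over the mean stands out.
        return today_bytes > mu
    return (today_bytes - mu) / sigma > threshold
```

With 42 daily totals the comparison has something to stand on; with an empty list, which is what enabling the service mid-incident amounts to, the function can only refuse to answer.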

What’s worth remembering

VPC Flow Logs hold the raw per-flow metadata: 14 fields in v2, with versions up to v5 adding AWS-service attribution and traffic direction. No payload. Delivered to CloudWatch Logs, S3, or Firehose; default aggregation 10 minutes (1 minute on Nitro).
Flow Logs miss traffic to the Amazon DNS resolver, the instance metadata service, Time Sync, DHCP, ARP, and Windows licence activation. DNS-tunnelled exfiltration through the Amazon resolver is invisible to them.
GuardDuty uses independent, duplicate streams of VPC Flow Logs, CloudTrail management events, and Route 53 Resolver DNS query logs; no customer Flow Log configuration required; GuardDuty does not charge for that underlying data.
GuardDuty finding types for exfiltration include Trojan:EC2/DropPoint, Backdoor:EC2/C&CActivity.B, Trojan:EC2/DNSDataExfiltration, UnauthorizedAccess:EC2/MaliciousIPCaller.Custom, UnauthorizedAccess:IAMUser/InstanceCredentialExfiltration.InsideAWS, and UnauthorizedAccess:IAMUser/InstanceCredentialExfiltration.OutsideAWS.
Detective ingests VPC Flow Logs, CloudTrail management events, and GuardDuty findings through its own independent streams, with up to one year of history.
Detective’s value is the pivot across sources, not unique data: the graph turns a four-query manual join into one click.
The three services are layered, not alternatives: Flow Logs evidence, GuardDuty detection, Detective investigation; each ingests its own stream of the same data.
Detective needs to be pre-enabled: history starts at enrolment, so switching it on during an incident leaves no baseline.
Macie is adjacent, not an alternative: it classifies sensitive data in S3 to answer “what class of data was touched” after the investigation reaches S3.
Security Hub is the aggregator, not the evidence layer or the investigation workspace; it’s the landing page for findings across services and accounts.

These posts are LLM-aided. Backbone, original writing, and structure by Craig. Research and editing by Craig + LLM. Proof-reading by Craig.