SQS Standard or FIFO for Ordered Workloads

The situation

Two event-driven workloads live in the same account.

Click tracking. Every click on the marketing site publishes a small event (~200 bytes) into a queue. A consumer fleet (Lambda) writes them to Kinesis Firehose, which lands them in S3 and Redshift for analytics. Peak: 50,000 events/second during campaigns. Occasional duplicates are fine; the analytics team deduplicates at ingest time using a client-supplied event_id. Ordering between events doesn’t matter; the timestamp on each event is the ordering signal.
Payment state updates. Every state change on a payment (authorised, captured, refunded, chargeback_opened, chargeback_closed) publishes a message into a queue. A consumer service applies the state transition to the payment record. Peak: 800 messages/second, well within any queue’s capability. But two messages for the same payment_id must be processed in the order they were sent, and a message must never be processed twice, because double-capturing a $200 charge is a real problem.

Two very different contracts. The first needs throughput and a cheerful tolerance for mess; the second needs strict per-group ordering and deduplication at the queue level.

What actually matters

Before picking queue flavours it’s worth naming what “queue” actually guarantees.

SQS is a managed queue service. Producers send messages; consumers receive, process, and delete them. Between receive and delete, the message is hidden from other consumers for the length of the visibility timeout. If the consumer crashes or doesn’t delete in time, the message becomes visible again and another consumer can pick it up. This is the at-least-once delivery model that makes queues resilient to consumer failure.
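
The receive/delete cycle can be sketched with a toy in-memory queue. This only illustrates the visibility-timeout idea and is not how SQS is implemented; `ToyQueue` and its methods are hypothetical names, and the clock is passed in explicitly so the behaviour is deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class ToyQueue:
    """Toy model of a visibility timeout; not the real SQS implementation."""
    visibility_timeout: float
    _messages: dict = field(default_factory=dict)  # id -> (body, invisible_until)
    _next_id: int = 0

    def send(self, body: str) -> int:
        self._next_id += 1
        self._messages[self._next_id] = (body, 0.0)
        return self._next_id

    def receive(self, now: float):
        """Return the first visible message and hide it until now + timeout."""
        for msg_id, (body, invisible_until) in self._messages.items():
            if now >= invisible_until:
                self._messages[msg_id] = (body, now + self.visibility_timeout)
                return msg_id, body
        return None

    def delete(self, msg_id: int) -> None:
        self._messages.pop(msg_id, None)


queue = ToyQueue(visibility_timeout=60)
queue.send("click-1")

first = queue.receive(now=0)     # consumer A takes the message, then crashes
hidden = queue.receive(now=30)   # still inside the timeout: nothing is visible
again = queue.receive(now=61)    # timeout elapsed: the message is redelivered
queue.delete(again[0])           # consumer B finishes and deletes it
empty = queue.receive(now=200)   # deleted messages never come back
```

The second delivery at `now=61` is the at-least-once behaviour in miniature: the work done by consumer A may already have taken effect, which is why duplicate tolerance matters on the consumer side.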

SQS comes in two flavours. Standard queues optimise for throughput: nearly unlimited messages/second, but delivery is at-least-once and order is not guaranteed even for messages sent successively by the same producer. Duplicates are rare but possible; reordering is expected. FIFO queues preserve ordering within a MessageGroupId and deduplicate on MessageDeduplicationId for 5 minutes after the first send is accepted. Without High Throughput mode they cap each API action at 300 calls/second per queue, which is 3,000 messages/second with 10-message batches. That is plenty for many workloads but insufficient for high-volume firehose use cases.

First question per workload: does order matter within any group of related messages? For click tracking: no, analytics queries tolerate any order because timestamps are in the events themselves. For payment updates: yes, state transitions for a given payment must apply in the order emitted by the producer. FIFO preserves that; Standard does not.

Second: does the consumer tolerate duplicates? For click tracking: yes, the downstream dedupe using event_id turns at-least-once into effectively exactly-once. For payment updates: not without making the consumer idempotent, which is achievable but adds complexity. SQS FIFO’s deduplication window (5 minutes) catches producer-side duplicates automatically, simplifying the consumer.

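Third: throughput ceiling. Standard: nearly unlimited API calls per second per queue. FIFO: 300 API calls/second per action by default, so 300 messages/second unbatched or 3,000 messages/second with 10-message batches. High Throughput mode raises the per-queue quota substantially (up to 70,000 messages/second in the largest Regions; the figure varies by Region), but each message group lives in one partition, and a partition is still limited to about 3,000 messages/second with batching. For click tracking at 50,000/s, even high-throughput FIFO is borderline; Standard is the simpler call. For payment updates at 800/s, FIFO is comfortably within limits provided sends are batched; 800 unbatched sends per second would exceed the default 300 calls/second and need High Throughput mode.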
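
The throughput check is simple arithmetic and can be sketched as follows. The quota constants are assumptions for a FIFO queue without High Throughput mode (300 API calls per second per action, 10 messages per batch call); the function names are illustrative, and current quotas should be checked for your Region.

```python
import math

# Assumed quota for a FIFO queue without High Throughput mode.
FIFO_DEFAULT_CALLS_PER_SECOND = 300
MAX_BATCH_SIZE = 10  # messages per SendMessageBatch call


def send_calls_per_second(messages_per_second: int, batch_size: int = MAX_BATCH_SIZE) -> int:
    """API calls per second needed to send at the given message rate."""
    return math.ceil(messages_per_second / batch_size)


def fits_default_fifo(messages_per_second: int, batch_size: int = MAX_BATCH_SIZE) -> bool:
    """True if the send rate stays within the assumed default FIFO quota."""
    return send_calls_per_second(messages_per_second, batch_size) <= FIFO_DEFAULT_CALLS_PER_SECOND


payments_batched = fits_default_fifo(800)                  # 80 calls/s
payments_unbatched = fits_default_fifo(800, batch_size=1)  # 800 calls/s
clicks_batched = fits_default_fifo(50_000)                 # 5,000 calls/s
```

Under these assumptions the payment workload fits only when sends are batched, and the click workload does not fit a default FIFO queue at all.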

Fourth: price per million messages. Standard: ~$0.40 per million requests. FIFO: ~$0.50 per million requests. The price difference is a small fraction of the total cost for most workloads and is not usually the deciding factor.
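
To put the price gap in context, here is a rough cost sketch. It uses the per-million prices quoted above, assumes three unbatched requests per message (send, receive, delete), and ignores the free tier and payload-size billing; all of these are simplifying assumptions, and `monthly_request_cost` is an illustrative name.

```python
# USD per million requests, taken from the text; verify against current pricing.
PRICE_PER_MILLION_REQUESTS = {"standard": 0.40, "fifo": 0.50}

SECONDS_PER_DAY = 86_400


def monthly_request_cost(queue_type: str, messages_per_second: float,
                         requests_per_message: float = 3.0, days: int = 30) -> float:
    """Rough monthly request cost in USD.

    requests_per_message defaults to 3 (one send, one receive, one delete,
    all unbatched); batching lowers it.
    """
    messages = messages_per_second * SECONDS_PER_DAY * days
    requests = messages * requests_per_message
    return requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS[queue_type]


# Pessimistic assumption: the payment workload runs at its 800 msg/s peak all month.
fifo_cost = monthly_request_cost("fifo", 800)
standard_cost = monthly_request_cost("standard", 800)
premium = fifo_cost - standard_cost
```

Even at a sustained peak, the FIFO premium is a fixed 25% on the queue line item, which is why the guarantees, not the price, usually decide the choice.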

Fifth: message group granularity. FIFO ordering is preserved within a message group, not across the whole queue. Two messages with the same MessageGroupId are strictly ordered; two messages with different MessageGroupIds have no ordering relationship. For payment updates, the MessageGroupId is the payment_id: each payment’s transitions are ordered, but transitions for different payments can be processed in parallel.
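
A minimal way to picture message groups is one independent FIFO lane per MessageGroupId. The sketch below is a mental model only (the helper name `group_lanes` and the sample data are made up), not a description of SQS internals.

```python
from collections import defaultdict, deque


def group_lanes(messages):
    """Split (group_id, body) pairs into one FIFO lane per group."""
    lanes = defaultdict(deque)
    for group_id, body in messages:
        lanes[group_id].append(body)
    return lanes


# Interleaved sends for two payments, in the order the producer emitted them.
sent = [
    ("PAY-1", "authorised"),
    ("PAY-2", "authorised"),
    ("PAY-1", "captured"),
    ("PAY-2", "captured"),
    ("PAY-1", "refunded"),
]

lanes = group_lanes(sent)
pay_1_order = list(lanes["PAY-1"])
pay_2_order = list(lanes["PAY-2"])
```

Each lane keeps its own send order, while nothing constrains how PAY-1 and PAY-2 interleave with each other; that independence is what allows different payments to be consumed in parallel.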

Sixth: fan-out patterns. If the same event needs to be delivered to multiple consumers, a queue by itself doesn’t do that; queues are point-to-point. The SNS + SQS pattern (publish to an SNS topic, subscribe multiple SQS queues to the topic) solves fan-out. SNS has its own Standard vs FIFO distinction that mirrors SQS.

What we’ll filter on

Ordering: per-group strict, or no guarantee?
Delivery semantics: at-least-once, or exactly-once within a 5-minute dedupe window?
Throughput ceiling: per queue and per message group?
Price per million requests?
Fan-out pattern compatibility: SNS topic subscriptions, Lambda triggers, etc.?
Message size and retention: a 256 KB message cap and a 14-day retention maximum (4 days by default), with large-message extensions via S3?

The SQS flavour landscape

SQS Standard. At-least-once delivery; best-effort ordering (often in-order, occasionally not). Nearly unlimited throughput per queue. $0.40 / million requests (first 1M/month free). Integrates with Lambda, EventBridge Pipes, SNS subscriptions, Step Functions activities. The default queue; used unless FIFO’s guarantees are explicitly needed.
SQS FIFO. Strict per-MessageGroupId ordering; exactly-once delivery within a 5-minute deduplication window (set via MessageDeduplicationId or content-based dedupe via SHA-256 of the body). Default throughput: 300 API calls/s per action, or 3,000 msg/s with batching; High Throughput FIFO: up to 70,000 msg/s per queue depending on Region, with about 3,000 msg/s per partition. $0.50 / million requests. Integrates with Lambda FIFO event source (concurrency equals number of active message groups), Step Functions, EventBridge Pipes.
SQS Standard with content-based consumer dedupe. The “poor man’s FIFO”: use Standard for throughput, attach an event_id to every message at the producer, and have the consumer deduplicate against a DynamoDB table or Redis set. Works for many workloads at lower cost than FIFO but requires consumer-side storage and correctness discipline.
Kinesis Data Streams. Not SQS, but in the same problem space. Strict ordering per shard, parallel consumers via the Kinesis Client Library, longer retention (up to 365 days), replay capability. Cheaper per million messages at high volume but costs per shard-hour even when idle. The correct tool when consumers need to rewind or when multiple applications consume the same stream independently.
Amazon MQ (RabbitMQ/ActiveMQ). Managed message broker for workloads tied to AMQP, STOMP, or JMS semantics (existing on-prem applications migrating as-is). More features than SQS (topic exchanges, TTL per message, priority queues), more operational complexity, and not serverless, billed per broker instance. Not the first choice for greenfield AWS design.

Side by side

Option	Ordering	Delivery	Throughput	Price / M req	Dedupe
SQS Standard	best-effort	at-least-once	unlimited	$0.40	consumer-side
SQS FIFO (default)	strict per-group	exactly-once (5-min window)	3,000 msg/s batched (300 calls/s per action)	$0.50	built-in
SQS FIFO (High Throughput)	strict per-group	exactly-once (5-min window)	up to 70,000 msg/s, Region-dependent (3,000 per partition)	$0.50	built-in
Kinesis Data Streams	strict per-shard	at-least-once	shard-capped	shard-based	consumer-side
Amazon MQ	broker-specific	broker-specific	broker-specific	per-instance	broker-specific

Reading the table by workload:

Click tracking (50,000 msg/s, order not required, duplicates deduped downstream): Standard. The throughput ceiling on FIFO would bite even with High Throughput mode, and the duplicates-are-fine contract makes Standard’s weaker guarantees a non-issue.
Payment state (800 msg/s, strict per-payment order, no duplicates): FIFO with MessageGroupId = payment_id. Default throughput (3,000 msg/s with batched sends) is plenty, and a single payment emits only a handful of transitions, so no one message group comes near a throughput limit.

The guarantee matrix

Two questions per workload (does order matter? do duplicates matter?) and the queue shape falls out. Most real workloads land in the top-right (FIFO) or bottom-left (Standard) cells.
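
The two-question decision can be written down as a tiny function. The two clear-cut cases follow the text; the off-diagonal answers (order matters but duplicates are fine, and the reverse) are this sketch's own assumptions, as is the name `pick_queue`.

```python
def pick_queue(order_matters: bool, duplicates_matter: bool) -> str:
    """Map the two questions to a queue shape (a rule of thumb, not a rule)."""
    if order_matters:
        # Only FIFO offers per-group ordering, whatever the duplicate tolerance.
        return "fifo"
    if duplicates_matter:
        # Assumption: keep Standard's throughput and dedupe in the consumer.
        return "standard + consumer-side dedupe"
    return "standard"


click_tracking = pick_queue(order_matters=False, duplicates_matter=False)
payment_state = pick_queue(order_matters=True, duplicates_matter=True)
```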

The picks in depth

Click tracking → SQS Standard. The queue is created with defaults and a generous visibility timeout:

aws sqs create-queue --queue-name clicks \
    --attributes '{
        "VisibilityTimeout": "60",
        "MessageRetentionPeriod": "86400",
        "ReceiveMessageWaitTimeSeconds": "20"
    }'

ReceiveMessageWaitTimeSeconds: 20 enables long polling on consumer receive: the API call waits up to 20 seconds for a message rather than returning immediately empty, which sharply cuts the number of empty receives when the queue is quiet. VisibilityTimeout: 60 gives consumers a minute to process; MessageRetentionPeriod: 86400 (24 hours) is enough for a consumer to recover from a multi-hour outage without losing events.

A dead-letter queue catches messages that fail processing repeatedly:

aws sqs set-queue-attributes --queue-url <url> \
    --attributes '{
        "RedrivePolicy": "{\"deadLetterTargetArn\":\"<dlq-arn>\",\"maxReceiveCount\":\"5\"}"
    }'

Five receive attempts; on the sixth, SQS moves the message to the DLQ. A CloudWatch alarm on ApproximateNumberOfMessagesVisible on the DLQ wakes the on-call when poison messages accumulate.
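
The RedrivePolicy value is a JSON document embedded inside a JSON string, which is easy to mis-escape by hand. A small helper can build it instead; `redrive_attributes` is a hypothetical name and the ARN below is a placeholder.

```python
import json


def redrive_attributes(dlq_arn: str, max_receive_count: int = 5) -> dict:
    """Build the queue attributes dict for a dead-letter redrive policy."""
    policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": str(max_receive_count),
    }
    # The attribute value must itself be a JSON string, hence the dumps().
    return {"RedrivePolicy": json.dumps(policy)}


# Placeholder ARN for illustration only.
attrs = redrive_attributes("arn:aws:sqs:us-east-1:123456789012:clicks-dlq")
decoded = json.loads(attrs["RedrivePolicy"])
```

The resulting dict has the shape that boto3's `set_queue_attributes(QueueUrl=..., Attributes=attrs)` expects, so the escaping in the CLI example above is handled by `json.dumps` rather than by hand.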

Producer and consumer code is standard SDK calls; no MessageGroupId, no MessageDeduplicationId. Lambda receives in batches via the SQS event source mapping: up to 10,000 messages per batch for a Standard queue (the default is 10, and sizes above 10 require a batching window), subject to Lambda’s payload size limit.
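
A consumer for such batches might look like the sketch below. It assumes the event source mapping has ReportBatchItemFailures enabled, so that only the failed records are retried rather than the whole batch; `process_click` is a hypothetical stand-in for the real work (for example, forwarding to Firehose).

```python
import json


def process_click(click: dict) -> None:
    """Hypothetical stand-in for the real processing step."""
    if "event_id" not in click:
        raise ValueError("missing event_id")


def handler(event, context):
    """Lambda entry point for an SQS batch, reporting per-message failures."""
    failures = []
    for record in event["Records"]:
        try:
            process_click(json.loads(record["body"]))
        except Exception:
            # Only this message is retried (and eventually sent to the DLQ).
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


# Local smoke test with a hand-built event in the SQS record shape.
sample_event = {
    "Records": [
        {"messageId": "m1", "body": json.dumps({"event_id": "e-1"})},
        {"messageId": "m2", "body": json.dumps({"page": "/pricing"})},
    ]
}
result = handler(sample_event, None)
```

Without partial batch responses enabled, the return value is ignored and one bad message causes the entire batch to be redelivered, which at batch sizes in the thousands multiplies duplicates considerably.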

Payment state updates → SQS FIFO, message group per payment. The queue name ends in .fifo (required) and content-based deduplication is enabled:

aws sqs create-queue --queue-name payment-state.fifo \
    --attributes '{
        "FifoQueue": "true",
        "ContentBasedDeduplication": "true",
        "VisibilityTimeout": "60",
        "MessageRetentionPeriod": "604800"
    }'

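ContentBasedDeduplication: true computes a SHA-256 of the message body and, within 5 minutes, treats a send with an identical body as a duplicate: the send succeeds but the message is not delivered again. Alternatively, the producer supplies a MessageDeduplicationId explicitly (e.g. a UUID per state transition). This is the safer pattern: a retry whose payload differs trivially (a fresh timestamp, say) is still recognised as a duplicate, and two genuinely distinct messages that happen to share a body are not collapsed into one.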
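
The difference between the two dedupe strategies can be seen with a few lines of hashing. This is an illustration of the idea, assuming the body hash behaves like a plain SHA-256 over the body text; the helper names and the `sent_at` field are made up for the example.

```python
import hashlib
import json


def content_dedupe_key(body: str) -> str:
    """Stand-in for content-based dedupe: a SHA-256 over the message body."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def explicit_dedupe_id(payload: dict) -> str:
    """Producer-chosen ID that depends only on the transition's identity."""
    return f"{payload['payment_id']}:{payload['transition_id']}"


first_try = {"payment_id": "PAY-123", "transition_id": "trans-7882", "sent_at": "14:02:15"}
retry = {"payment_id": "PAY-123", "transition_id": "trans-7882", "sent_at": "14:02:30"}

# The retry carries a fresh timestamp, so the body hash changes...
content_keys_match = (
    content_dedupe_key(json.dumps(first_try)) == content_dedupe_key(json.dumps(retry))
)
# ...while the explicit ID stays the same and the retry is recognised.
explicit_ids_match = explicit_dedupe_id(first_try) == explicit_dedupe_id(retry)
```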

Send side:

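import json
import boto3

sqs = boto3.client("sqs")

# payment_queue_url, payload, payment_id and state_transition_id come from the caller
sqs.send_message(
    QueueUrl=payment_queue_url,
    MessageBody=json.dumps(payload),
    MessageGroupId=payment_id,                 # ordering scope
    MessageDeduplicationId=f"{payment_id}:{state_transition_id}",  # dedupe scope
)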

MessageGroupId = payment_id means every message for the same payment is strictly ordered and processed one at a time; different payments process in parallel. Lambda’s FIFO event source mapping concurrency is bounded by the number of active message groups, so with hundreds of distinct payments in flight at any moment there is plenty of parallelism without weakening the per-group ordering guarantee.

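Consumer idempotence is still a good idea (defence in depth against retries that arrive after the 5-minute dedupe window, and against redelivery when a consumer fails to delete a message before its visibility timeout expires), but the queue does the main work of keeping duplicate state transitions away from the consumer.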

A worked trace: one payment’s path

A customer disputes a charge; the payment system generates two state transitions over 30 seconds, and one of them is sent twice.

14:02:00  producer  send_message payment_id=PAY-123 state=chargeback_opened
          MessageGroupId=PAY-123  MessageDeduplicationId=PAY-123:trans-7881

14:02:15  producer  send_message payment_id=PAY-123 state=evidence_uploaded
          MessageGroupId=PAY-123  MessageDeduplicationId=PAY-123:trans-7882

14:02:30  producer  send_message payment_id=PAY-123 state=evidence_uploaded
          MessageGroupId=PAY-123  MessageDeduplicationId=PAY-123:trans-7882  [duplicate: retry]

14:02:02  consumer  processed trans-7881 → state chargeback_opened (stored)
14:02:17  consumer  processed trans-7882 → state evidence_uploaded (stored)
14:02:30  SQS dedupe: trans-7882 suppressed (within 5-min window, same dedupe id)

Three sends, two processes. The third send is a producer retry (the producer’s HTTP client timed out on the second send and retried it); the queue’s deduplication catches it before any consumer sees it. Order is preserved – chargeback_opened before evidence_uploaded, always, because they share a MessageGroupId.
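
The trace can be replayed against a toy model of the deduplication window. `ToyFifoDedupe` is a hypothetical class that mimics only the "same ID within 5 minutes" rule; times are seconds after 14:02:00.

```python
DEDUPE_WINDOW_SECONDS = 300  # the 5-minute window described in the text


class ToyFifoDedupe:
    """Toy model of the FIFO dedupe window; not the real SQS implementation."""

    def __init__(self):
        self._first_seen = {}  # dedupe id -> time of the first accepted send

    def accept(self, dedupe_id: str, now: float) -> bool:
        """Return True if the message would be delivered, False if suppressed."""
        first = self._first_seen.get(dedupe_id)
        if first is not None and now - first < DEDUPE_WINDOW_SECONDS:
            return False
        self._first_seen[dedupe_id] = now
        return True


dedupe = ToyFifoDedupe()
sends = [
    (0, "PAY-123:trans-7881"),
    (15, "PAY-123:trans-7882"),
    (30, "PAY-123:trans-7882"),  # producer retry
]
delivered = [dedupe_id for t, dedupe_id in sends if dedupe.accept(dedupe_id, t)]

# A retry arriving after the window has closed is delivered again.
late_retry_delivered = dedupe.accept("PAY-123:trans-7882", now=15 + 301)
```

The last line shows the limit of queue-level deduplication: once the window has passed, the same ID is treated as a new message.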

If the dedupe window had elapsed (more than 5 minutes) and the retry arrived anyway, the consumer’s idempotence check (transition_id stored in the payment record) would catch it on the second processing. Both layers are required; neither is sufficient alone.
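
The consumer-side check might look like the following in-memory sketch. `PaymentRecord` and `apply_transition` are illustrative names; in a real datastore the check and the write would need to be a single atomic operation (a conditional write or a transaction), which this sketch does not attempt.

```python
from dataclasses import dataclass, field


@dataclass
class PaymentRecord:
    payment_id: str
    state: str = "authorised"
    applied_transitions: set = field(default_factory=set)


def apply_transition(record: PaymentRecord, transition_id: str, new_state: str) -> bool:
    """Apply a state transition once; return False if it was already applied."""
    if transition_id in record.applied_transitions:
        return False  # duplicate delivery: leave the record untouched
    record.state = new_state
    record.applied_transitions.add(transition_id)
    return True


payment = PaymentRecord("PAY-123")
first = apply_transition(payment, "trans-7881", "chargeback_opened")
second = apply_transition(payment, "trans-7882", "evidence_uploaded")
late_duplicate = apply_transition(payment, "trans-7882", "evidence_uploaded")
```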

When Kinesis fits better than SQS

Two workloads in the wild where the answer swings to Kinesis Data Streams rather than SQS:

Multiple consumers of the same stream. SQS queues are point-to-point; once a consumer deletes a message, it’s gone. Kinesis shards hold messages for up to 365 days and let each application maintain its own position (checkpoint). Analytics, real-time dashboards, and archive-to-S3 consuming the same stream is easier on Kinesis.
Replay capability. “Reprocess everything from 2 hours ago because the consumer had a bug” is trivial on Kinesis (rewind the iterator) and impossible on SQS (messages past the consumer are gone).

SQS is cheaper at moderate volumes and simpler. Kinesis wins when the stream is a first-class durable log rather than a transport between producer and consumer.

What’s worth remembering

Standard vs FIFO is an ordering-and-duplicates question. Standard: unlimited throughput, at-least-once, best-effort order. FIFO: strict per-group order, exactly-once within a 5-minute window, capped throughput.
MessageGroupId scopes ordering. FIFO preserves order within a group, not across the queue. Choose the group ID to match the entity whose transitions must be ordered (payment ID, customer ID, account ID).
MessageDeduplicationId scopes exactly-once. Content-based dedupe hashes the body; explicit dedupe IDs are safer because they survive benign payload variations. Window is 5 minutes.
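FIFO throughput is per-queue and per-partition. Default: 300 API calls/s per action, or 3,000 msg/s with 10-message batches. High Throughput mode: up to 70,000 msg/s per queue depending on Region, with each partition (and so each message group) still limited to about 3,000 msg/s with batching. If a single group becomes the bottleneck, split the group (e.g. per-customer-per-day).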
Lambda FIFO concurrency equals active group count. Parallelism in consumption comes from more distinct groups; a single group is single-threaded.
Dead-letter queues catch poison messages. RedrivePolicy with maxReceiveCount plus a CloudWatch alarm on the DLQ size is the baseline for any production queue.
SQS + SNS for fan-out. Queues are point-to-point. For one event to reach multiple consumers, publish to an SNS topic and subscribe a queue per consumer. SNS has its own Standard vs FIFO story that pairs with SQS.
Kinesis is the answer when consumers are many, or replay is required. SQS is a transport; Kinesis is a durable log. Pick by the needs of the consumer side, not just the producer.

Two workloads, two flavours of the same service. Click tracking runs on Standard because throughput is the constraint and duplicates don’t matter; payments run on FIFO because ordering and exactly-once are the contract. The same SDK, the same console, very different guarantees.