Series
The Exam Room
Exploring AWS, one service or situation at a time.
Exam Room · Architecture
Routing to the Closest Healthy Region
A multi-region application needs to route requests to the closest healthy region, failing over automatically when the preferred one drops out -- with no client-side retries and no extra health-check plumbing to maintain. Route 53 can do all of that in a single record set. Finding the correct combination means touring all seven routing policies and the attributes that separate them.
Exam Room · Architecture
Choosing an S3 Storage Class for Cold Archives
Some data exists for compliance, not for use: tens of terabytes of records sitting untouched until an auditor wants them. S3 has eight storage classes; only one of them is built for that pattern, and getting it wrong can cost an order of magnitude more in a year when you weren't paying attention to the bill.
Exam Room · GenAI
Picking the AWS AI Service Tier for Each Feature
A product manager with no ML background has been told to add AI to a SaaS product, and has heard of Bedrock, SageMaker, Comprehend, Translate, Textract, and Rekognition. AWS has three different shapes of AI offering, and the shortest path depends entirely on whether a ready-made service already does the job.
Exam Room · GenAI
How to Take a Foundation Model from Pick to Production Endpoint
A product team wants a chatbot that summarises support tickets. They have the tickets, a cloud account, and no ML background. Somebody says 'use a foundation model'. Between that sentence and a working endpoint sit roughly seven distinct stages, each with its own AWS service and its own decisions. Picking the model is the easy part; the real work is figuring out which stages this team can skip, which they absolutely cannot, and what AWS gives them at each step.
Exam Room · GenAI
Choosing Between Prompting, RAG, and Fine-Tuning
A legal-ops team wants a model that answers questions about their 4,000 in-house contract templates. The first prototype, a plain Claude call with the question in the prompt, hallucinates clause numbers. Someone suggests fine-tuning; someone else suggests RAG. They solve different problems, so 'which is better' is the wrong frame; what matters is which problem the team actually has, and what each adaptation technique costs in time, data, and recurring spend.
Exam Room · GenAI
Grounding a Chatbot in Your Own PDFs
A facilities team has 600 PDFs -- equipment manuals, safety procedures, maintenance schedules -- sitting on a SharePoint drive. Engineers want a chatbot that answers 'how do I reset the chiller on floor 4?' in seconds instead of a ten-minute PDF hunt. Retrieval-augmented generation can do this; whether it does it well depends on what the corpus actually looks like, what kinds of questions the engineers really ask, and which configuration knobs decide whether the answers are any good once a managed service is on the table.
Exam Room · GenAI
Forecasting Without Writing Python
A category manager has 18 months of weekly sales data for 400 SKUs and a deadline to forecast next quarter. She doesn't code. The ML team is booked until Q3. The ask is a tool that lets her build a forecast herself -- importable, reviewable, explainable -- without waiting for engineering. Which AWS box she clicks matters less than what kind of problem this actually is, what features of the data can honestly feed into a model, and what the business user has to understand for the output to be defensible when finance asks 'why this number?'.
Exam Room · GenAI
How to Make a Bedrock Chatbot Audit-Ready with Guardrails and Watermarks
A fintech ships a customer-facing chatbot on Bedrock. Legal asks: can it give financial advice? Risk asks: can it leak customer account numbers? Compliance asks: if an auditor requests proof a response came from our model, can we demonstrate it? Three questions, three different controls, all of them Bedrock-native. The controls exist; the work is matching the right one to each question and figuring out what the shape of a 'responsible AI' configuration actually looks like when the auditor arrives.
Exam Room · GenAI
Choosing Between Chains, Retrieval, and Agents for a GenAI Assistant
A product manager wants a 'GenAI assistant' for internal operations -- something that can answer questions, look up customer records, draft emails, and file Jira tickets. Three architectural patterns keep coming up: chains, retrieval, and agents. They sound similar, they all use foundation models, and teams routinely reach for the most elaborate one when a simpler pattern would do. There's no single 'best' here; what matters is which one fits each piece of the assistant's workload, and when elaboration costs more than it earns.
Exam Room · GenAI
Choosing Between SageMaker, Bedrock, and Purpose-Built AI APIs
A platform team has five AI-shaped requests landing in a single sprint: transcribe call centre audio, detect anomalies in sensor data, extract text from scanned forms, summarise customer emails, and detect faces in CCTV. Someone has already typed 'use SageMaker' into three design docs. Someone else insists Bedrock is the answer. A third voice mutters about purpose-built services. AWS has at least three answers to every AI problem, so there's no single platform that wins; what matters is how to tell which layer of the stack each request lands on, and what that choice costs in time, money, and flexibility.
Exam Room · Advanced GenAI
Picking a Bedrock Model for High-Volume RAG
A million LLM requests a day, peaking at thirty per second, split across US and EU customers, with a P99 first-token target under 1.5 seconds and real reasoning over retrieved context. Bedrock has seven model families and four ways to buy capacity. Most of the landscape falls away once you name what actually decides it -- and the real trick is what you do *after* you've picked the model.
Exam Room · Advanced GenAI
How to Build a Citations-Required RAG Over 50K Internal Documents
Fifty thousand internal documents, five gigabytes of text, weekly churn, a three-second latency budget, per-user access control, and a citation in every single answer. The RAG landscape on Bedrock is bigger than one product, and the interesting part of the design is what falls away once you name the five things that actually decide it.
Combining RAG and Fine-Tuning for a Legal Contract Assistant
A legal-tech team wants a contract review assistant that understands two hundred thousand past matters, speaks in the firm's voice with clause-by-section citations, and refuses anything off-domain. Fifty thousand pounds, three months. RAG, fine-tuning, and continued pre-training each solve a different part of that sentence -- and the interesting answer is which two to pick, not which one.
Generative AI Developer Professional · AIP-C01
Coming soon