Keeping PII Out of LLM Prompts and Logs

August 05, 2026 · 15 min read

Generative AI Developer · AIP-C01 · part of The Exam Room

The situation

The claims-processing assistant from earlier in the year now handles personally identifiable information at every step. Customer names, addresses, phone numbers, dates of birth, policy numbers, national-insurance numbers, medical diagnoses, and banking details all flow through prompts into Bedrock and back. The business requires them to flow: the assistant has to say “Hi Sarah, your claim on 15 April has been approved” to be useful. The compliance team requires them not to leak: no PII in CloudWatch logs, no PII in S3 buckets visible to the wrong principals, no PII in the training corpus if the vendor ever decides to train on customer data (Bedrock explicitly doesn’t, but the compliance team still wants layered defences).

Concrete requirements:

  • Ingress: PII in input must be classified, tagged, and tracked. Free-form customer messages (text, transcribed voicemail) can contain PII in unpredictable forms.
  • Inference: the model sees what it needs to produce a useful answer but nothing more. An assistant answering “when is my next payment due?” doesn’t need the national-insurance number in the prompt even if it’s in the session context.
  • Egress: model outputs must not hallucinate PII (e.g., inventing a policy number), must not return PII that wasn’t in scope for this user, and must route any PII through the right logging posture.
  • Logs: CloudWatch logs and S3 session archives must have PII masked before they hit storage; “after-the-fact” redaction isn’t enough if the raw data sits in a log for 10 minutes first.
  • Audit: for every PII touch, an audit record showing who (principal), what (PII class), when, and why (business reason tag).

What actually matters

PII handling in LLM pipelines isn’t a single operation; it’s a lifecycle. Every stage has different threats and different tools.

The first decision is detection. Something has to recognise a national-insurance number, a postal code, a name, an email, a phone number in raw text. Pattern-matching works for format-constrained PII (emails, SSNs, credit-card numbers) but fails for names and addresses. ML-based classifiers cover the long tail (names, locations, organisations, free-form identifiers) across multiple languages.

The second is action on detection. Detection gives locations; action is what’s done with them. Options: redact (replace with a class marker), tokenise (replace with a reversible token that can be de-tokenised later), drop (remove and refuse), or flag (annotate without changing the text). Different stages call for different actions.
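The difference between these actions is easier to see in code. The following is a minimal sketch, not a prescribed implementation: the `Span` type and `apply_action` function are hypothetical names, and the detector that produces the spans is assumed to exist elsewhere. Tokenise is left out because it needs a mapping store.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Span:
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    kind: str   # PII class, e.g. "NAME"


def apply_action(text: str, spans: List[Span], action: str) -> Optional[str]:
    """Apply redact, flag, or drop to the detected spans."""
    if action == "drop":
        # Refuse the whole message if any PII was detected.
        return None if spans else text
    # Work right to left so earlier offsets stay valid after each edit.
    for s in sorted(spans, key=lambda s: s.start, reverse=True):
        original = text[s.start:s.end]
        if action == "redact":
            replacement = f"[{s.kind}]"
        elif action == "flag":
            replacement = f"<pii class={s.kind}>{original}</pii>"
        else:
            raise ValueError(f"unknown action: {action}")
        text = text[:s.start] + replacement + text[s.end:]
    return text
```

Redact suits logs, flag suits an internal review queue where the text must stay readable, and drop suits channels that should never carry PII at all.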

The third is where in the pipeline redaction sits. Redact at ingress (before the message hits the model at all)? At invocation time, by a guardrail layer that the model sees? At egress (before logging)? All of the above? The answer depends on who sees what at each stage.

The fourth is what additional safety layer the model inference itself can carry. Many inference platforms now offer an attached policy layer that filters both input and output, with PII among the things it filters, along with content policies and topic restrictions. Layering one of those on top of application-level redaction catches anything the application missed and anything the model invents.

The fifth is what the model is allowed to see in the first place. Not all PII needs to flow to the model. If the question is “when is my next payment due?” and the session has access to a customer record, the assistant doesn’t need to see the customer’s full name to answer. Minimising what goes in is cheaper than redacting what came out.

The sixth is logging and observability. Every PII touch is an audit event. Both API-call records and application logs need to be configured so PII doesn’t appear in plaintext. Encrypted storage destinations, write-time masking on log streams, and retention policies aligned to legal requirements are all part of the same posture.

And a softer one: the distinction between identifiable and sensitive. A customer’s name is identifiable but low-sensitivity; a medical diagnosis is highly sensitive but may or may not be identifiable on its own. Good handling treats these differently: masking names for log hygiene is one thing; masking diagnoses is a different class of concern.

What we’ll filter on

  1. PII coverage: which classes of PII are detected (names, addresses, emails, SSNs, medical, financial, etc.)?
  2. Stage coverage: does this tool act at ingress, during invocation, at egress, on logs?
  3. Language coverage: does detection work beyond English?
  4. Reversibility: can redacted PII be re-hydrated for authorised callers?
  5. Operational burden: what do we run and maintain?

The PII-in-LLM-pipelines landscape

  1. Amazon Comprehend (DetectPiiEntities + ContainsPiiEntities). A managed NER-based service that identifies PII in free-form text. PII detection covers English and Spanish text and 30+ PII entity types, including country-specific identifiers such as the UK national-insurance number. Returns offsets, types, and confidence scores. Caller takes action (redact, tokenise, drop). Cheap: fractions of a cent per unit of analysis. Integrates cleanly with Step Functions and Lambda pipelines.

  2. Bedrock Guardrails (PII policy). A Guardrail configured with PII detection acts at invocation time on both input and output. Can mask, block, or allow per-class. Covers the common PII classes and supports custom regex patterns. Fits when the protection needs to live with the model invocation, not in application code.

  3. Macie (for S3-stored documents). Classifies and alerts on PII found in S3 buckets. Not a real-time filter; a monitor. Useful for knowing what PII sits in content stored in S3 (chunks in Knowledge Bases, document archives, session transcripts) and for setting up guardrails against over-permissioned buckets.

  4. Custom regex-based redaction. A library in the application that applies patterns for known-format PII (emails, SSNs, credit cards). Cheap, predictable, brittle. Misses names, addresses, and anything in unusual formats. Useful as a backstop for high-confidence formats; not sufficient on its own.

  5. Prompt-level PII minimisation. Design the prompt so it doesn’t include PII the model doesn’t need. Instead of pasting the customer’s full record, pass only the fields the current question requires. Reduces PII surface at source; cheapest redaction is the bit you never send.

  6. CloudWatch Logs data protection. CloudWatch Logs supports native data-protection policies that mask PII in logs as they’re written. Configure once per log group; applies automatically. Complements application-layer redaction by catching leaks that got past the application.
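To make the regex backstop from item 4 concrete, here is a minimal sketch under stated assumptions. The two patterns and the function names are illustrative, not taken from any particular library, and the card check uses the standard Luhn checksum to cut false positives on long digit runs.

```python
import re

EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_known_formats(text: str) -> str:
    """Replace emails and Luhn-valid card numbers with class markers."""
    text = EMAIL_RE.sub("[EMAIL]", text)

    def card_sub(match):
        digits = re.sub(r"\D", "", match.group())
        return "[CARD]" if luhn_ok(digits) else match.group()

    return CARD_RE.sub(card_sub, text)
```

Run over “Sarah Patel, card 4111 1111 1111 1111”, this masks the card number and leaves the name untouched, which is exactly the gap the ML-based detectors are there to fill.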

Side by side

Tool | PII coverage | Stage | Language | Reversibility | Ops burden
Comprehend DetectPII | Broad (30+ types) | Ingress, egress | English, Spanish | App-level tokenise | Low (API calls)
Bedrock Guardrails PII | Common types + custom regex | Invocation | Limited beyond EN | Mask / block | Low (managed)
Macie | Broad (S3 content) | Monitor | Limited | No | Low (managed)
Custom regex | Format-constrained only | Anywhere | Regex-dependent | App-level | Moderate (maintenance)
Prompt minimisation | N/A | Ingress (design) | N/A | N/A | High (prompt design)
CloudWatch log protection | Common types | Logs | Limited | Mask-only | Low (one-time config)

Every real system uses several of these together. The question is the composition, not picking one.

The redaction lifecycle, layered

Figure: Layered PII handling, ingress → logs

  1. Ingress. User message plus context (“When is Sarah Patel's next payment due, policy ABC123?”) → Comprehend DetectPii (name and policy number detected; offsets and types returned) → Tokenise ([NAME:tk_abc], [POLICY:tk_xyz]; mapping → DynamoDB, KMS-encrypted) → Prompt minimisation (only the needed record fields: next_payment_due, plan_name).
  2. Invocation. Bedrock Guardrails input filter (PII policy + custom regex, mask-or-block) → Bedrock Converse → model → output, also filtered by the Guardrail PII policy.
  3. Egress. Model response with tokens (“Payment for [NAME:tk_abc] is due 21 June”) → de-tokenise for the authed user (look up tokens in the KMS-encrypted map) → deliver to user (“Payment for Sarah Patel is due 21 June”).
  4. Logs & audit. CloudWatch Logs (data-protection policy masks residual PII; KMS-encrypted; retention 90d). S3 session archive (tokenised transcripts; de-tokenisation gated by IAM; Macie monitors the bucket). Audit stream (who · what class of PII · when · why (tag); immutable, separate account, long retention, queryable by compliance).
Four bands of protection, with the audit stream (dashed red) collecting events from every stage. Each layer protects against a different failure mode; none is sufficient alone.

The picks in depth

Ingress: Comprehend + tokenisation. Every user message and every retrieved context chunk flows through Comprehend’s DetectPiiEntities call. The call returns entity types (NAME, EMAIL, SSN, PHONE, ADDRESS, DATE_TIME, BANK_ACCOUNT_NUMBER, CREDIT_DEBIT_NUMBER, and roughly 30 others) with offsets and confidence scores. For each detected entity above a confidence threshold (say 0.85), the application generates a reversible token ([NAME:tk_abc123]) and stores the mapping in a KMS-encrypted DynamoDB table scoped to the current session. The prompt that reaches Bedrock contains the tokens, not the PII.

This costs a Comprehend call per message: one extra network round trip of latency and fractions of a cent. The DynamoDB table is session-scoped and TTL’d to a few hours. Detokenisation for the session’s authorised user happens on the response path.
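A minimal sketch of that ingress step follows, assuming boto3 and AWS credentials are available for the Comprehend call. The function names are hypothetical, the token format mirrors the [NAME:tk_abc123] example above, and writing the mapping to the session-scoped DynamoDB table is left out.

```python
import secrets
from typing import Dict, List, Tuple


def detect_pii(text: str, language: str = "en") -> List[dict]:
    """Call Comprehend DetectPiiEntities; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so tokenise() below runs without AWS

    client = boto3.client("comprehend")
    response = client.detect_pii_entities(Text=text, LanguageCode=language)
    return response["Entities"]


def tokenise(
    text: str, entities: List[dict], threshold: float = 0.85
) -> Tuple[str, Dict[str, str]]:
    """Swap each confident entity for a random token; return text and mapping."""
    mapping: Dict[str, str] = {}
    kept = [e for e in entities if e["Score"] >= threshold]
    # Replace right to left so earlier offsets stay valid.
    for e in sorted(kept, key=lambda e: e["BeginOffset"], reverse=True):
        begin, end = e["BeginOffset"], e["EndOffset"]
        token = f"[{e['Type']}:tk_{secrets.token_hex(4)}]"
        mapping[token] = text[begin:end]
        text = text[:begin] + token + text[end:]
    return text, mapping
```

In use, `tokenise(message, detect_pii(message))` gives the text to put in the prompt plus the mapping to persist. Entities below the threshold are left in place here; a stricter posture might redact them instead.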

Prompt minimisation. Before the ingress redaction even runs, the application trims the prompt to what’s necessary. If the question is “when is my next payment due?” and the customer record has 40 fields, the prompt carries only the two or three fields that answer the question, not the entire record. This is a prompt-design practice, not a tool: the cheapest PII is the bit you never include. Structured context (JSON with named fields) makes this easy; free-text blobs make it hard.
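With structured context, minimisation can be as small as an allow-list per intent. This is a sketch under assumptions: the intent names and the `claim_status` field list are hypothetical, and how the intent is classified is out of scope here.

```python
# Allow-list of record fields per intent (illustrative values).
FIELDS_BY_INTENT = {
    "next_payment": ("plan_name", "next_payment_due", "next_payment_amount"),
    "claim_status": ("claim_id", "claim_status", "claim_updated_at"),
}


def minimise(record: dict, intent: str) -> dict:
    """Keep only the fields the intent needs; unknown intents get nothing."""
    allowed = FIELDS_BY_INTENT.get(intent, ())
    return {key: record[key] for key in allowed if key in record}
```

Failing closed on an unknown intent is a deliberate choice in this sketch: an empty context produces a less helpful answer, while a full record produces a leak.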

Invocation: Bedrock Guardrails. A Guardrail attached to the model invocation runs its PII policy on both input and output. This is belt-and-braces: if the ingress tokenisation missed something (a novel PII format, a name variant Comprehend didn’t catch), Guardrails catches it at the model boundary. Set Guardrails to mask rather than block for input (we’d rather strip an unexpected piece of PII than fail the request) and to block for output (to prevent model-hallucinated PII reaching the user).
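As a rough sketch of that configuration, the policy below masks on input and blocks on output. The field names follow the Bedrock CreateGuardrail request shape as we understand it (in particular the per-direction `inputAction` and `outputAction` fields), so check them against the current API reference before relying on them. The policy-number regex is an assumption based on the ABC123 example, and the guardrail name and messages are placeholders.

```python
PII_TYPES = (
    "NAME",
    "EMAIL",
    "PHONE",
    "UK_NATIONAL_INSURANCE_NUMBER",
    "CREDIT_DEBIT_CARD_NUMBER",
)


def pii_policy() -> dict:
    """Build a sensitive-information policy: mask on input, block on output."""
    def entity(pii_type: str) -> dict:
        return {
            "type": pii_type,
            "action": "ANONYMIZE",       # default action
            "inputAction": "ANONYMIZE",  # mask PII found in the prompt
            "outputAction": "BLOCK",     # block PII found in the response
            "inputEnabled": True,
            "outputEnabled": True,
        }

    return {
        "piiEntitiesConfig": [entity(t) for t in PII_TYPES],
        "regexesConfig": [{
            "name": "policy-number",
            "description": "Hypothetical policy-number format, e.g. ABC123",
            "pattern": r"\b[A-Z]{3}\d{3}\b",
            "action": "BLOCK",
        }],
    }


def create_guardrail(name: str) -> dict:
    """Create the guardrail; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so pii_policy() runs without AWS

    client = boto3.client("bedrock")
    return client.create_guardrail(
        name=name,
        blockedInputMessaging="Sorry, I can't process that request.",
        blockedOutputsMessaging="Sorry, I can't share that response.",
        sensitiveInformationPolicyConfig=pii_policy(),
    )
```

The tokens produced at ingress do not match these patterns, so a correctly tokenised prompt passes through untouched.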

Egress: detokenisation. The model’s response contains tokens ([NAME:tk_abc123]). The egress step looks up each token in the session’s mapping table and replaces it with the PII. This happens only for responses bound for the authorised user; responses stored in audit logs keep the tokens. The detokenisation call is gated by IAM; only principals with the session’s scope can resolve the mapping.
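The egress step can be sketched as a single substitution pass. This assumes the token format shown above and an in-memory mapping; the `authorised` flag is a stand-in for the IAM-gated lookup, which is not modelled here.

```python
import re
from typing import Dict

# Matches tokens such as [NAME:tk_abc123].
TOKEN_RE = re.compile(r"\[[A-Z_]+:tk_[0-9A-Za-z]+\]")


def detokenise(text: str, mapping: Dict[str, str], authorised: bool) -> str:
    """Replace known tokens with their PII, but only for authorised callers."""
    if not authorised:
        return text  # logs and archives keep the tokens
    # Tokens missing from the mapping are left as-is rather than guessed.
    return TOKEN_RE.sub(lambda m: mapping.get(m.group(), m.group()), text)
```

Leaving unknown tokens in place also covers the case where the model invents a token: it surfaces as an obviously unresolved placeholder rather than as someone else’s data.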

Logs: CloudWatch data-protection policy + KMS-encrypted S3 archive. Application logs (request parameters, session state) go to CloudWatch Logs with a data-protection policy that masks common PII classes at write time, as a second layer behind application-level redaction. The full session archive (tokenised) goes to a KMS-encrypted S3 bucket; Macie monitors the bucket for any leaked PII that slipped through. S3 object ACLs and bucket policies prevent direct download by unauthorised principals; de-tokenisation for audit requires a privileged pipeline.
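A data-protection policy is a JSON document attached to the log group. The sketch below builds one with an audit statement and a masking statement over the same identifiers, which is the shape the service expects as far as we know. The identifier names (EmailAddress, CreditCardNumber, Ssn-US) are managed identifiers we believe exist; treat the exact list, and the policy name, as assumptions to verify.

```python
import json
from typing import Iterable


def identifier_arn(name: str) -> str:
    """ARN of an AWS-managed data identifier."""
    return f"arn:aws:dataprotection::aws:data-identifier/{name}"


def data_protection_policy(identifiers: Iterable[str]) -> dict:
    """Build a policy that audits and masks the given identifiers."""
    arns = [identifier_arn(name) for name in identifiers]
    return {
        "Name": "mask-pii",
        "Version": "2021-06-01",
        "Statement": [
            {   # audit statement: report findings
                "Sid": "audit",
                "DataIdentifier": arns,
                "Operation": {"Audit": {"FindingsDestination": {}}},
            },
            {   # de-identify statement: mask matches in the log events
                "Sid": "mask",
                "DataIdentifier": arns,
                "Operation": {"Deidentify": {"MaskConfig": {}}},
            },
        ],
    }


def attach(log_group: str, policy: dict) -> None:
    """Attach the policy to a log group; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the policy builder runs without AWS

    boto3.client("logs").put_data_protection_policy(
        logGroupIdentifier=log_group,
        policyDocument=json.dumps(policy),
    )
```

The managed identifiers cover format-constrained PII, so this layer is a backstop for the same classes the regex layer handles, not a replacement for tokenising names at ingress.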

Audit stream. Every PII touch (Comprehend call, token creation, detokenisation, Guardrail hit, log mask) emits an audit event to a separate immutable stream (Kinesis Firehose → S3 Glacier, or a dedicated audit account). Compliance can query “every access to a name or SSN in the last 90 days by principal X” without needing the raw application logs.
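One possible shape for such an event is sketched below. The field names are hypothetical, chosen to match the who / what class / when / why requirement, and delivery to the stream is left out. Note that the record carries PII classes, never PII values.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List


def _now() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class AuditEvent:
    action: str             # e.g. "token.resolve"
    principal: str          # who
    pii_classes: List[str]  # what class of PII, never the value itself
    reason: str             # why (business reason tag)
    session: str
    at: str = field(default_factory=_now)  # when


def to_record(event: AuditEvent) -> str:
    """Serialise as one JSON line, ready to put on the audit stream."""
    return json.dumps(asdict(event), sort_keys=True) + "\n"
```

One JSON object per line keeps the archive queryable with ordinary tooling once it lands in S3.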

A worked example: Sarah’s question

Customer Sarah Patel, policy ABC123, asks via chat: “When is my next payment due?”

  1. Ingress: session context loaded from customer record. Application trims to {plan_name, next_payment_due, next_payment_amount}. User message goes through Comprehend; the name “Sarah Patel” and the policy number “ABC123” are already in the session, not in the message, so the message itself contains no PII this turn. Prompt assembled with tokenised context: {customer_name: [NAME:tk_1], plan: "Basic", next_payment_due: "2028-06-21"}.

  2. Invocation: Bedrock Guardrails scans input. No PII in plaintext (only tokens). Passes. Model generates response: "Hi [NAME:tk_1], your next payment is due on 21 June 2028." Output scanned by Guardrails; tokens pass; no hallucinated PII. Returned.

  3. Egress: detokenisation on response. [NAME:tk_1] → Sarah. Response delivered: “Hi Sarah, your next payment is due on 21 June 2028.”

  4. Logs: CloudWatch receives the invocation log. Application already tokenised the PII, so the log line contains tokens; CloudWatch data-protection runs as a backstop and masks anything that looks like an SSN, email, or credit card if it slipped through. Log line persists for 90 days.

  5. Audit: events for this turn land in the audit stream: comprehend.detect_pii_entities (session=s123, entities_found=2), token.create (types=[NAME, POLICY]), bedrock.invoke_model (guardrail_hits=0), token.resolve (types=[NAME], principal=sarah@example.com, reason=user_response).

The customer saw her first name; the model saw a token; the logs saw a token; compliance can trace every PII-touching action back to this session.

What’s worth remembering

  1. PII handling is a lifecycle, not a feature. Ingress, invocation, egress, logs, audit: five stages, each with its own tooling.
  2. Comprehend detects; the application decides. DetectPiiEntities returns locations; redact, tokenise, or drop is an application choice.
  3. Tokenisation preserves referential integrity across a turn. The model sees placeholders, the user sees their data, and logs see placeholders, reversible for authorised callers only.
  4. Bedrock Guardrails is the invocation-time safety net. Mask or block PII at model input and output; belt-and-braces behind application-level redaction.
  5. Prompt minimisation is the cheapest PII control. Don’t send what the model doesn’t need. Design prompts structured, not dumps.
  6. CloudWatch Logs data-protection policies catch the residuals. Configure once per log group; automatic masking at write time.
  7. Macie watches S3 buckets for content-at-rest. Session archives, RAG chunks, anything that lives in S3 should be scanned.
  8. Audit is separate from application logs. An immutable stream in a separate account gives compliance the query surface without exposing application logs.

Names, addresses, policy numbers, diagnoses: the assistant uses them, nobody leaks them. Five layers of handling, each narrow enough to own and understand, together covering the lifecycle from “customer typing” to “compliance subpoena six years later.”

These posts are LLM-aided. Backbone, original writing, and structure by Craig. Research and editing by Craig + LLM. Proof-reading by Craig.