AppSync or Apollo on Lambda

June 16, 2027 · 17 min read

Developer · DVA-C02 · part of The Exam Room

The situation

A logistics platform needs a single GraphQL endpoint that covers three resources: a Shipment type backed by a DynamoDB table keyed by shipment_id, a Customer type backed by an Aurora PostgreSQL cluster behind RDS Proxy, and a VendorStatus type backed by an internal HTTPS service with a REST API.

A shipment’s full detail view on the mobile app is one query resolving all three. A dashboard tile on the operations portal subscribes to shipment updates and must re-render within a second when a shipment’s status changes. The platform runs in eu-west-1 with a small team of four full-stack engineers.

Two candidates are on the table. The first is AppSync with managed resolvers to DynamoDB, to Aurora through the RDS Data API, and to an HTTP data source for the vendor service, plus AppSync subscriptions for real-time updates. The second is a Lambda running Apollo Server behind API Gateway (HTTP API), with resolvers that call the three backends themselves (Aurora through RDS Proxy), plus a custom WebSocket story for subscriptions. Both serve the same schema; the difference is what the team ends up owning.

What actually matters

Before picking a service it’s worth asking what a four-person team is actually buying when they pick a GraphQL runtime.

The first thing to ask is how much of the GraphQL execution surface we want to own. Parsing, validation, field resolution order, error shaping, response formatting: all of it is protocol plumbing the team neither wrote nor wants to debug at 2am. A managed runtime hides that surface; a self-hosted one exposes it. The value isn’t the code itself (Apollo and other parsers are well-tested); it’s the blast radius when something misbehaves in production and the on-call engineer has to trace a five-hop resolver chain.

The second is what subscriptions cost in practice. A GraphQL subscription is a long-lived WebSocket, and WebSockets are where serverless stops being serverless. Connection state has to live somewhere; fan-out has to reach the right subset of clients; reconnection has to be idempotent. A runtime that treats subscriptions as a schema directive is doing real operational work for us; a runtime that leaves subscriptions as “use your imagination” is handing us a separate project.

The third is how authorization lands. Per-field auth (“only Admins see the Customer.email field”) is the kind of rule product teams invent weekly. If the runtime reads that as a schema annotation, we edit the schema and redeploy. If the runtime expects middleware, we write the middleware, test it, and own every edge case, including the one where a field added in a hurry forgets to thread through the middleware at all.

The fourth is cold-start surface and latency floor. A managed resolver that talks directly to DynamoDB adds a few milliseconds; a Lambda wrapping Apollo Server adds a warm-start tax plus cold-start spikes when concurrency climbs. For a mobile app where p99 latency matters to the product team, that floor is a lever we should care about before we start shaving ms off backend queries.

The fifth is ownership vs steering. Hand-rolled Apollo is more code but also more control: custom directives, shared resolvers with existing services, unusual middleware pipelines. Managed AppSync is less code but a more opinionated container: resolvers in JS or Lambda, auth modes from a fixed list, cache keyed in a fixed way. Small teams usually benefit from opinion; large Apollo-native teams often don’t.

And finally, a softer one: the deployment story. One Lambda zip with schema + resolvers is an atomic unit; an AppSync API with ten resolvers is ten CloudFormation resources with their own data-source IAM roles. Which shape the team finds easier to reason about depends on the deployment muscle they’ve already built.

What we’ll filter on

Distilling that exploration into filters we can score each option against:

  1. Protocol plumbing ownership: does the team write or maintain the GraphQL execution runtime?
  2. Subscription mechanism: is real-time first-class, or a separate project?
  3. Per-field authorization: schema directive, or middleware code?
  4. Cold-start exposure: does every request pay a Lambda warm-start tax?
  5. Caching model: built-in managed cache, or BYO?
  6. Team size fit: scales down to a small team, or assumes an Apollo-native org?

The GraphQL runtime landscape

  1. AppSync (managed). A managed GraphQL service. The team uploads a schema; defines data sources (DynamoDB tables, RDS clusters, HTTP endpoints, Lambda functions, OpenSearch domains, EventBridge buses); attaches resolvers to schema fields; AppSync handles the GraphQL execution, the HTTP front door, the WebSocket subscription pipe, and auth enforcement. Resolvers are either JS functions running in AppSync’s own runtime (no server, no cold-start) or Lambda data sources (full language flexibility, Lambda cost and cold-start surface). Subscriptions are declared with @aws_subscribe; AppSync publishes to subscribed clients when a matching mutation completes. Authorization is a fixed menu (API key, Cognito user pools, OIDC, IAM, Lambda authorizer), with per-field tightening via schema directives. A built-in cache with per-resolver TTLs sits at the door.

  2. Lambda + Apollo Server behind API Gateway. A Lambda function wraps Apollo Server and exposes it through API Gateway (HTTP API) at POST /graphql. Resolvers are ordinary TS functions that call the AWS SDK for DynamoDB, a pg client for Aurora via RDS Proxy, and fetch for the vendor service. Deployments are CDK or SAM; observability is the Lambda + API Gateway combo; the team owns the GraphQL execution runtime (via Apollo) and every piece of middleware. Subscriptions are not first-class: Apollo’s subscription support assumes a long-running server, which Lambda is not. Options are API Gateway WebSockets with connection state in DynamoDB, IoT Core MQTT, or dropping to polling. Authorization is middleware in Apollo’s context builder plus a permission layer such as graphql-shield, or custom directives, for per-field rules. No built-in cache; the team wires Redis for shared caching or DataLoader for per-request batching if it needs one.

  3. Neptune GraphQL / other niche managed GraphQL. Worth mentioning to dismiss: Neptune’s GraphQL path is for graph-database fronting, not general-purpose API layering, and doesn’t cover this scenario’s backends. The real choice is between AppSync and Lambda + Apollo.

Side by side

Option | Protocol plumbing | Subscriptions | Per-field auth | Cold-start exposure | Cache | Small-team fit
AppSync (JS resolvers) | ✓ managed | ✓ native | ✓ directives | ✗ (none) | ✓ built-in |
AppSync (Lambda data sources) | ✓ managed | ✓ native | ✓ directives | Partial (only DS Lambdas) | ✓ built-in |
Lambda + Apollo | ✗ team owns | ✗ separate stack | ✗ middleware | ✓ every request | ✗ BYO | ✗ stretched

Reading the table against the six attributes: AppSync covers five cleanly; Lambda + Apollo covers the shape of a GraphQL endpoint but hands the team subscriptions, per-field auth, and caching as separate projects, plus Lambda invocation overhead on every query and cold-start spikes when concurrency climbs.

Matching scenario to runtime

Three workload shapes, each answering a couple of yes/no questions:

  • Logistics platform (small team, subscriptions; this scenario). 4 engineers, 3 backends, a real-time dashboard with a 1s budget, per-field Cognito auth. Subscriptions required? Yes. Small team? Yes. Per-field auth is schema, no WebSocket code, no GraphQL runtime to maintain. Runtime: AppSync (JS resolvers), the shortest path: @aws_subscribe for real-time, Cognito directives per field, no WebSocket, no cold-start.
  • Apollo-heavy platform (Apollo-native org; heavy middleware, shared resolvers). Existing Apollo Server estate, shared resolvers, custom directives, subscriptions optional. Apollo estate already? Yes. Subscriptions optional? Yes. Full control over middleware, shared schema with other services, cold-start tax accepted. Runtime: Lambda + Apollo, full control: TS resolvers, custom directives, one-zip atomic deploys; subscriptions become a project.
  • Mixed-fit API (mostly AppSync, one odd field; the escape-hatch shape). Most fields fit JS resolvers, one field needs complex TS, subscriptions still first-class. Most fields fit JS resolvers? Yes. One field needs an escape hatch? Yes. JS resolvers for the 90%, Lambda data source for the odd one, subscriptions stay native. Runtime: AppSync + Lambda DS escape hatch: JS resolvers by default, Lambda DS for the odd field, keeps the rest managed.

Each workload shape answers a small handful of questions (subscriptions? Apollo estate? escape hatch?) and the runtime falls out the bottom.

AppSync, in depth

For this scenario, AppSync is the shorter path. Three primitives repay understanding.

Data sources. A named configuration pointing at a backend: a DynamoDB table ARN, an Aurora cluster ARN plus the RDS Data API, an HTTP endpoint, a Lambda function, an OpenSearch domain, an EventBridge bus. The data source has its own IAM role that AppSync assumes when invoking it, so the blast radius of a misbehaving resolver is scoped to what that role can do, not to a blanket Lambda execution role with access to everything.

Resolvers. Attached to schema fields. A resolver has a request handler and a response handler, both JS functions running in AppSync’s managed runtime. The request handler builds a request to the data source; the response handler formats what comes back. For DynamoDB, the request is a GetItem / Query / Scan / PutItem operation expressed as JSON; for HTTP, a method, path, headers, and body; for Lambda, the payload the function receives.

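// Query.shipment: resolver for shipment(id: ID!)
import { util } from '@aws-appsync/utils';

// Build the DynamoDB GetItem request from the GraphQL argument.
export function request(ctx) {
  return {
    operation: 'GetItem',
    key: { shipment_id: { S: ctx.args.id } },
  };
}

// Surface data-source errors to the client; otherwise return the item.
export function response(ctx) {
  if (ctx.error) { util.error(ctx.error.message, ctx.error.type); }
  return ctx.result;
}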

A handful of lines for the request and a handful for the response; no Lambda to cold-start, no AWS SDK client to bundle, no boilerplate to package. The resolver runs in AppSync and talks to DynamoDB directly.

Pipeline resolvers extend this with multiple steps: a field resolver can chain auth then loadUser then loadShipment then returnShape, each step a function, each step allowed to read the accumulated ctx. Useful for “check a permission, then fetch the data” without writing a Lambda.
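
The chaining can be modelled locally to see how steps share state. This is a minimal sketch, not AppSync's implementation: runPipeline and the step bodies are hypothetical, and the only ideas borrowed from AppSync are that each step sees the previous step's output as ctx.prev.result and that ctx.stash is shared across steps.

```javascript
// Local model of a pipeline resolver: each step reads the accumulated ctx.
// runPipeline and the step bodies are illustrative, not AppSync internals.
function runPipeline(steps, args) {
  const ctx = { args, stash: {}, prev: { result: null } };
  for (const step of steps) {
    // The next step sees this step's return value as ctx.prev.result.
    ctx.prev = { result: step(ctx) };
  }
  return ctx.prev.result;
}

// Step 1: permission check; records the decision in the shared stash.
function auth(ctx) {
  ctx.stash.allowed = ctx.args.groups.includes('Admins');
  return ctx.stash.allowed;
}

// Step 2: fetch the data only if the earlier step allowed it.
function loadShipment(ctx) {
  if (!ctx.stash.allowed) return null;
  return { shipment_id: ctx.args.id, status: 'IN_TRANSIT' }; // stubbed fetch
}

const result = runPipeline([auth, loadShipment], { id: 's-1', groups: ['Admins'] });
console.log(result);
```

In a real pipeline each step would be its own AppSync function with a request and response handler against a data source; the sketch only shows the order of execution and the shared context.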

Lambda data sources are the escape hatch: when the logic is too complex for the JS resolver runtime or shares code with existing Lambda-based services, a Lambda becomes the data source and gets the full GraphQL context as input. Pay Lambda’s cold-start and invocation costs; get Lambda’s flexibility. Use it when it’s earned, not by default.
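
A minimal sketch of what such a Lambda might look like, assuming the resolver forwards AppSync's context object unchanged so the function sees source, arguments, and info.parentTypeName / info.fieldName. The estimatedDeliveryDays field and its rules are hypothetical, invented only to stand in for logic that is awkward in a JS resolver.

```javascript
// Hypothetical business rule standing in for "too complex for a JS resolver".
function estimateDeliveryDays(shipment) {
  const base = shipment.express ? 1 : 4;
  return base + (shipment.international ? 3 : 0);
}

// Handler for a Lambda data source. One function can serve several fields by
// dispatching on the parent type and field name passed in by AppSync.
async function handler(event) {
  const field = `${event.info.parentTypeName}.${event.info.fieldName}`;
  switch (field) {
    case 'Shipment.estimatedDeliveryDays':
      // event.source is the parent Shipment object resolved upstream.
      return estimateDeliveryDays(event.source);
    default:
      throw new Error(`No handler for ${field}`);
  }
}
```

Dispatching on the field name keeps the number of Lambdas small, which matters when each one is a separate cold-start surface.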

Subscriptions. Declare type Subscription { onShipmentUpdated(shipmentId: ID!): Shipment @aws_subscribe(mutations: ["updateShipment"]) }. When updateShipment runs, AppSync publishes the result to every client subscribed to onShipmentUpdated with a matching shipmentId. No WebSocket code to write; the client SDK manages the socket; AppSync manages the fan-out. The contrast with the Apollo path here is stark: one line of schema versus a WebSocket stack with connection state in DynamoDB.
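
The matching behaviour can be pictured with a small local model. This is a sketch under stated assumptions rather than AppSync's implementation: matchSubscribers and the client IDs are hypothetical, and it assumes the mutation result carries a shipmentId field with the same name as the subscription argument, since the comparison is made by name.

```javascript
// Local model of argument-filtered fan-out; not AppSync's implementation.
// Each entry records the arguments a client subscribed with.
const subscriptions = [
  { clientId: 'dash-1', args: { shipmentId: 's-1' } },
  { clientId: 'dash-2', args: { shipmentId: 's-2' } },
];

// Return the clients whose subscription arguments all match the mutation result.
function matchSubscribers(subs, mutationResult) {
  return subs
    .filter((sub) =>
      Object.entries(sub.args).every(([name, value]) => mutationResult[name] === value))
    .map((sub) => sub.clientId);
}

const notified = matchSubscribers(subscriptions, { shipmentId: 's-1', status: 'DELIVERED' });
console.log(notified);
```

On the hand-rolled path, this lookup plus the connection table behind it is code the team writes and operates.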

Authorization. Set per API (API_KEY, AWS_IAM, AMAZON_COGNITO_USER_POOLS, OPENID_CONNECT, or AWS_LAMBDA for custom). Directives on types and fields narrow further: @aws_cognito_user_pools(cognito_groups: ["Admins"]) restricts a field to a group. Per-field authorization is one line of schema; on the Apollo path, it’s a middleware library plus its test suite.
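
For contrast, the same rule on the Apollo path becomes code. A minimal sketch of a hand-rolled field guard, assuming the context builder has already placed the caller's groups at context.user.groups; that shape and the requireGroup helper are assumptions of this example, not part of Apollo Server or any library.

```javascript
// Wrap a resolver so it only runs for callers in one of the allowed groups.
// The context.user.groups shape is an assumption of this sketch.
function requireGroup(allowedGroups, resolve) {
  return (parent, args, context, info) => {
    const groups = (context.user && context.user.groups) || [];
    if (!groups.some((group) => allowedGroups.includes(group))) {
      throw new Error('Not authorized');
    }
    return resolve(parent, args, context, info);
  };
}

const resolvers = {
  Customer: {
    // Forgetting this wrapper on a newly added field silently leaves it open.
    email: requireGroup(['Admins'], (customer) => customer.email),
  },
};
```

The guard itself is short; the ongoing cost is remembering to apply it and testing every field that should have it.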

Caching. An optional AppSync cache (Redis under the hood) with per-resolver TTLs and keying. Useful for resolvers that return stable data for many readers: the VendorStatus HTTP call, for example, which the vendor updates every 30 seconds. Lambda + Apollo has no built-in cache; the team builds it.
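
What "the team builds it" means on the Apollo path can be sketched as a small TTL cache. This is an illustrative in-memory version with hypothetical names (createTtlCache, cachedVendorStatus); inside Lambda it would only live as long as one execution environment, which is why a shared store such as Redis usually replaces the Map.

```javascript
// Minimal in-memory TTL cache; `now` is injectable so expiry can be tested.
function createTtlCache(ttlMs, now = Date.now) {
  const entries = new Map();
  return async function cached(key, load) {
    const hit = entries.get(key);
    if (hit && hit.expiresAt > now()) return hit.value;
    // Miss or expired: call the loader and remember the result.
    const value = await load();
    entries.set(key, { value, expiresAt: now() + ttlMs });
    return value;
  };
}

// The vendor updates status every 30 seconds, so a 30-second TTL bounds staleness.
const cachedVendorStatus = createTtlCache(30_000);
```

A resolver would call cachedVendorStatus(vendorId, () => fetchStatus(vendorId)), where fetchStatus is whatever function performs the HTTP call.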

Gotchas worth naming. JS resolvers have a real-but-limited runtime: no arbitrary npm imports, a curated set of utility functions, a size limit. When a resolver wants to do something creative, a Lambda data source is the way out. The AppSync cache is a convenience but not a CDN; for heavy geographic fan-out, CloudFront in front of AppSync still earns its keep. And per-field auth via directives is strict: a field without any directive falls back to the API-level auth mode, so “I thought this was locked down” bugs come from the absence of a directive, not the presence of a wrong one.

When the hand-rolled path is right

Lambda + Apollo earns its complexity when:

  • The team already runs a substantial Apollo Server elsewhere and wants to share schema, resolvers, or directives.
  • Resolver logic is heavy TypeScript with custom middleware, directive processing, or a client library AppSync can’t model.
  • The team wants deployment atomicity: one Lambda zip contains the schema and every resolver, one change redeploys everything.
  • GraphQL subscriptions aren’t needed, or are pushed to a separate real-time product (Ably, Pusher, Pub/Sub).

The hand-rolled path trades managed convenience for direct control. For a large Apollo-native team, that trade often makes sense. For four engineers building a GraphQL API from zero with subscriptions and per-field auth, AppSync shortens the path by months.
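
To make "direct control" concrete, here is a sketch of the resolver map the hand-rolled path would carry for this schema. The three backends are injected as plain objects so the example stays runnable; their method names (getById, getStatus) are hypothetical, and in a real deployment they would wrap the DynamoDB SDK, a pg client through RDS Proxy, and fetch.

```javascript
// Resolver map for the hand-rolled path. Backends are injected; their method
// names are hypothetical stand-ins for the real clients.
function buildResolvers({ shipments, customers, vendor }) {
  return {
    Query: {
      shipment: (_parent, args) => shipments.getById(args.id),
    },
    Shipment: {
      // Child fields read foreign keys off the already-resolved parent.
      customer: (shipment) => customers.getById(shipment.customerId),
      vendorStatus: (shipment) => vendor.getStatus(shipment.vendorId),
    },
  };
}

// Usage with in-memory fakes in place of the real backends.
const fakeResolvers = buildResolvers({
  shipments: { getById: async (id) => ({ id, customerId: 'c-1', vendorId: 'v-1' }) },
  customers: { getById: async (id) => ({ id, name: 'Acme' }) },
  vendor: { getStatus: async (id) => ({ carrier: 'DemoCarrier', vendorId: id }) },
});
```

Injecting the backends is a design choice that keeps the resolvers unit-testable; it is also a reminder that every one of those clients, and its error handling, belongs to the team.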

A worked example: the shipment detail query

Client:

query FullShipment($id: ID!) {
  shipment(id: $id) {
    id status
    customer { id name email }
    vendorStatus { carrier lastPing }
  }
}

AppSync resolution path:

  1. shipment resolver. DynamoDB GetItem on shipments keyed by shipment_id, taken from the id argument. AppSync returns the item to the GraphQL runtime.
  2. Shipment.customer. RDS Data API query SELECT * FROM customers WHERE id = :customerId, binding from the parent shipment.customerId. Response shaped to the Customer type.
  3. Shipment.vendorStatus. HTTP data source GET /status/{vendorId} with a signed request. Response parsed and shaped.

Three resolvers, three data sources, no Lambda cold-starts in the hot path. Latency is dominated by the slowest backend (Aurora at roughly 20ms through the RDS Data API); the GraphQL runtime adds under 5ms. The subscription onShipmentUpdated rides the same schema: clients subscribe via the SDK, mutations publish, and the dashboard tile re-renders inside the one-second budget without any WebSocket code on either side.
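
The vendorStatus step is the one most worth sketching, since HTTP resolvers look least like the DynamoDB one shown earlier. A hedged example: in AppSync the two functions below would be the exported request and response handlers of the resolver, and util would come from '@aws-appsync/utils'; here a local stand-in replaces it so the sketch runs under Node. The vendorId field on the parent, the Accept header, and the VendorError type are assumptions.

```javascript
// Stand-in for `util` from '@aws-appsync/utils' so the sketch runs under Node.
const util = {
  error(message, type) {
    const err = new Error(message);
    err.type = type;
    throw err;
  },
};

// Shipment.vendorStatus: build the HTTP request from the parent shipment.
function request(ctx) {
  return {
    method: 'GET',
    // ctx.source is the parent Shipment; vendorId is assumed to be on it.
    resourcePath: `/status/${ctx.source.vendorId}`,
    params: { headers: { Accept: 'application/json' } },
  };
}

// Shape the HTTP response into the VendorStatus type.
function response(ctx) {
  if (ctx.error) util.error(ctx.error.message, ctx.error.type);
  if (ctx.result.statusCode !== 200) {
    util.error(`Vendor service returned ${ctx.result.statusCode}`, 'VendorError');
  }
  // The HTTP data source hands the body back as a string.
  return JSON.parse(ctx.result.body);
}
```

Checking the status code explicitly matters because a 4xx or 5xx from the vendor arrives as a normal result rather than as ctx.error.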

What’s worth remembering

  1. AppSync is GraphQL-as-a-service. Managed HTTP front door, managed WebSocket subscription pipe, JS resolvers attached to data sources, per-field auth via directives, no GraphQL runtime to own.
  2. Data sources scope the blast radius. Each one has its own IAM role AppSync assumes; a buggy resolver can’t reach beyond what that role allows.
  3. JS resolvers don’t cold-start. Lambda data sources are the escape hatch and pay Lambda costs; use them when the logic earns it, not by default.
  4. Subscriptions are a schema directive in AppSync. @aws_subscribe associates subscription fields with mutations; no WebSocket code in the application.
  5. Per-field authorization is a one-line schema edit. @aws_cognito_user_pools, @aws_iam, @aws_api_key, @aws_oidc, @aws_lambda: tag a field with the allowed identity and move on.
  6. Lambda + Apollo hands you the runtime. Same GraphQL spec, same clients, same schema, more code ownership; subscriptions become a separate WebSocket project and auth becomes middleware.
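  7. Cold starts affect the hand-rolled path more. AppSync’s managed resolvers don’t cold-start; a Lambda with Apollo does. Provisioned concurrency is the usual mitigation; SnapStart only helps on runtimes that support it, so check coverage for the Node.js runtime Apollo Server runs on before counting on it.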
  8. Caching is built into AppSync. Per-resolver TTLs on a managed cache; Lambda + Apollo builds its own out of Redis and DataLoader.
  9. For small teams, AppSync shortens the path substantially. The time saved on subscriptions and per-field auth alone buys back a feature or two per quarter.
  10. The hand-rolled path earns its keep when the team is already Apollo-heavy. Shared resolvers, custom directives, atomic zip deploys: genuine advantages, but only if that muscle already exists.

AppSync for this scenario as stated: three backends, real-time dashboard, per-field auth, four engineers. Lambda + Apollo when the team is Apollo-native and subscriptions are optional. Same protocol, different toll booths.

These posts are LLM-aided. Backbone, original writing, and structure by Craig. Research and editing by Craig + LLM. Proof-reading by Craig.