GitOps on EKS: Pull vs Push

August 07, 2028 · 14 min read

DevOps Engineer Pro · DOP-C02 · part of The Exam Room

The situation

A platform team operates three EKS clusters:

  • prod-eu-west-1: 80 nodes, ~1,200 pods across 60 application namespaces.
  • staging-eu-west-1: 30 nodes, full replica of prod topology.
  • dev-eu-west-1: 20 nodes, shared, teams spin up and tear down namespaces.

Deployments today:

  • CodePipeline stages per team, running kubectl apply -f against a kubeconfig stored in AWS Secrets Manager. The kubeconfig’s IAM identity has system:masters because nobody felt like scoping it down when there were three teams; there are now sixty.
  • Helm used where team-appropriate, but the helm upgrade happens inside the pipeline, not the cluster.
  • Drift is constant: engineers kubectl edit in dev, nobody notices, staging and prod slowly diverge from what’s in Git.

The asks:

  • The cluster follows the source of truth; it does not lead. The desired state lives in Git; the cluster reconciles toward it continuously.
  • No kubeconfig-as-a-secret. Pipelines should push code to Git, full stop; no pipeline should have cluster-admin.
  • Per-team scoping. Team A’s controllers cannot change Team B’s resources.
  • Automated drift detection and alerting. If someone hand-edits a resource, the system notices and either reverts or alerts, per policy.
  • Secrets handling. Application secrets have to land in the cluster without sitting in Git as plaintext.

What actually matters

GitOps makes two claims worth examining before picking a tool.

The first: the cluster pulls; the pipeline doesn’t push. A controller runs inside the cluster, watches a Git repository, and applies changes. The pipeline’s job ends at “commit to main”; the controller’s job begins there. Effects: the cluster never needs an inbound admin credential; the pipeline never needs a cluster credential; cluster RBAC is scoped to the controller and whatever it manages on a team’s behalf.

The second: drift is a first-class signal. If the controller observes that the cluster’s live state doesn’t match the repository’s declared state, that’s a well-formed event. Depending on configuration, the controller either reverts the cluster to match the repo (self-healing) or raises an alert (alerting-only, so humans can investigate hand-edits during incidents). Either way, drift becomes visible.

Multiple controllers implement these claims, with different flavours. The choice is between an application-centric model (a custom resource per deployable unit, a built-in UI, controller-level RBAC) and a component-centric model (separate controllers per source type, no default UI, RBAC enforced at the Kubernetes API server). Both clear the GitOps tests; they differ on what’s first-class in the mental model.

The interesting decisions are not which tool is “better” but which mental model matches the team’s existing skillset, where multi-tenancy lands, and how secrets enter the cluster.

What we’ll filter on

Ranking against:

  1. Multi-tenancy model: per-team scoping with clear RBAC boundaries.
  2. Drift handling: detection, reporting, optional self-heal.
  3. Secret ingestion: how application secrets enter the cluster without sitting in Git.
  4. Operational surface: UI, observability, on-call familiarity.
  5. EKS integration: IRSA, Pod Identity, ALB/NLB, Secrets Manager.

The GitOps landscape

1. Status quo: kubectl apply from CodePipeline. Cluster-admin kubeconfig in Secrets Manager, one per cluster. It works, but the pipeline has to be trusted with root, drift is invisible, and sixty teams share one blast radius. Rejected by the requirements.

2. ArgoCD. One installation made up of an API/UI server, a repo server, and an application controller (plus Dex for SSO). One Application per team per cluster (or one ApplicationSet that generates them). Web UI at argocd.internal; RBAC in argocd-rbac-cm maps OIDC groups to Argo projects; Argo projects constrain which destinations and sources a team can use. EKS integration via IRSA on the Argo controller’s service account; Secrets Manager integration via External Secrets Operator.

3. Flux. Multiple controllers deployed via the Flux bootstrap; one Kustomization per team per cluster, pointing at a path in a Git repository. No default UI; Prometheus metrics and Kubernetes events provide the operational picture. EKS integration identical (IRSA on each controller’s service account). Secret handling typically via Sealed Secrets, SOPS-encrypted Secrets in Git, or External Secrets Operator.

4. ArgoCD with the ApplicationSet controller. Same as ArgoCD above, with templated generation: a single ApplicationSet yields an Application per team from a list generator or Git generator. The scaling answer for multi-tenant platforms; the generators are what make sixty teams manageable.

5. Flux with the tenants pattern. A Kustomization per team where each team’s path in Git is the root of the team’s manifests; service-account impersonation and Kubernetes RBAC constrain what each team’s Flux reconciliation can touch. Sets up multi-tenancy by design but needs more YAML up front than ArgoCD’s projects.

6. CodePipeline + Helm (no GitOps). Modernise the existing pipeline: Helm releases managed in-pipeline, kubeconfig scoped per team. Solves the RBAC problem; doesn’t solve the drift or “cluster pulls” requirements. Bridges to GitOps but isn’t it.

Side by side

Option                          | Multi-tenancy       | Drift handling       | Secret ingestion          | Operational surface | EKS integration
kubectl apply from CodePipeline | ✗                   | ✗                    | Secrets Manager at deploy | Pipeline logs       | -
ArgoCD                          | ✓ (Projects)        | ✓ + self-heal option | External Secrets Operator | Web UI + Prometheus | IRSA / Pod Identity
Flux                            | ✓ (tenants pattern) | ✓ + self-heal option | ESO / SOPS / Sealed       | Prometheus + events | IRSA / Pod Identity
ArgoCD + ApplicationSet         | ✓ (templated)       | ✓                    | ESO                       | Web UI + Prometheus | IRSA / Pod Identity
Flux + tenants                  | ✓ (by design)       | ✓                    | ESO / SOPS / Sealed       | Prometheus          | IRSA / Pod Identity
CodePipeline + Helm             | ✓ (per team)        | ✗                    | Secrets Manager at deploy | Pipeline logs       | -

Both GitOps options clear the requirements. The pick comes down to team ergonomics: the web UI and Project model of ArgoCD, versus the composable controllers and Kubernetes-native CRDs of Flux.

Two shapes of the same GitOps loop

[Figure, two panels.
ArgoCD: one controller, application-centric. Git repo (platform): teams/*/*.yaml, applicationsets/, projects/. Components: argocd-application-controller (reconciles Applications), argocd-repo-server (clones repo, renders manifests), argocd-server (UI + API; argocd-rbac-cm, OIDC groups). EKS cluster (in-cluster): Applications (CRs), Projects (CRs), team-a-ns, team-b-ns, …, sync-status per Application. Web UI + audit: diff, sync, roll back per app; RBAC by Project × OIDC group; ApplicationSets for templating; Notifications (Slack, webhook).
Flux: multiple controllers, Kubernetes-native. Git repo (platform): clusters/prod/, tenants/team-a/, infrastructure/. Components: source-controller (GitRepository, OCIRepository), kustomize-controller (Kustomization), helm-controller (HelmRelease). EKS cluster (in-cluster): GitRepository (CR), Kustomization (CR per tenant), HelmRelease (CR), events on reconcile. Prometheus + events: gotk_reconcile_duration_seconds; notification-controller → Slack; image-automation updates tags; Weave GitOps UI optional.]
Both controllers pull from the same Git repository and apply to the cluster; ArgoCD presents one Application view, Flux composes separate controllers per source type.

The picks in depth

Multi-tenancy. In ArgoCD, the unit of isolation is the AppProject. A Project declares allowed source repositories, allowed destinations (cluster + namespace), allowed resource kinds, and the OIDC groups that may read or sync Applications in the Project. The sixty teams get sixty Projects; ApplicationSets generate the Applications from a Git generator (the teams/ directory in the platform repo). RBAC enforces that team-a-admin cannot sync team-b’s Applications even through the UI.
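A minimal sketch of one team's Project and the ApplicationSet that generates Applications from the teams/ directory. The repository URL, the team-a names, the whitelisted kinds, and the "-ns" namespace suffix are assumptions for illustration; the field names follow the argoproj.io/v1alpha1 CRDs.

```yaml
# Hypothetical AppProject for team-a; repo URL, names, and kinds are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: team-a
  namespace: argocd
spec:
  sourceRepos:
    - https://git.example.com/platform/platform-repo.git
  destinations:
    - server: https://kubernetes.default.svc   # the in-cluster API server
      namespace: team-a-ns
  namespaceResourceWhitelist:                   # only these namespaced kinds may be deployed
    - group: apps
      kind: Deployment
    - group: ""
      kind: Service
  roles:
    - name: admin
      groups:
        - team-a-admin                          # OIDC group from the identity provider
      policies:
        - p, proj:team-a:admin, applications, get, team-a/*, allow
        - p, proj:team-a:admin, applications, sync, team-a/*, allow
---
# One ApplicationSet yields an Application per directory under teams/.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: teams
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://git.example.com/platform/platform-repo.git
        revision: main
        directories:
          - path: teams/*
  template:
    metadata:
      name: '{{path.basename}}'
    spec:
      project: '{{path.basename}}'              # assumes Project name equals directory name
      source:
        repoURL: https://git.example.com/platform/platform-repo.git
        targetRevision: main
        path: '{{path}}'
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{path.basename}}-ns'
```

Adding a team then means adding a directory and a Project; the generator picks up the directory on its next pass.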

In Flux, the tenants pattern uses a Kustomization per team with service-account impersonation (spec.serviceAccountName: team-a-reconciler on the Kustomization). That service account has RoleBindings only in team-a-ns; Flux’s reconciliation runs as the team’s service account, so any attempt to create a resource outside team-a-ns fails at the Kubernetes API. The tenant boundary is enforced by Kubernetes RBAC itself, not by Flux.
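The tenant wiring might look roughly like this. The names, the path, and the choice of the built-in admin ClusterRole are assumptions, and a GitRepository named team-a is assumed to already exist in the same namespace.

```yaml
# Hypothetical tenant objects for team-a; names and paths are placeholders.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: team-a-reconciler
  namespace: team-a-ns
---
# Grants the reconciler rights only inside team-a-ns.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-reconciler
  namespace: team-a-ns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin                    # built-in role; the RoleBinding scopes it to one namespace
subjects:
  - kind: ServiceAccount
    name: team-a-reconciler
    namespace: team-a-ns
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: team-a
  namespace: team-a-ns
spec:
  interval: 5m
  path: ./tenants/team-a
  prune: true
  serviceAccountName: team-a-reconciler   # kustomize-controller impersonates this account
  sourceRef:
    kind: GitRepository
    name: team-a                 # assumed to live in team-a-ns
```

Because the apply runs as team-a-reconciler, a manifest that targets another namespace is rejected by the API server rather than by Flux.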

Both are valid; ArgoCD’s boundary is higher in the stack (controller-level RBAC) and Flux’s is at the API-server level. Flux’s model is arguably stricter (a bug in Flux’s reconciliation cannot escape a tenant’s namespace), but it’s also more YAML to stand up.

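Drift handling. Both tools detect drift by comparing live state to the manifests rendered from Git. In ArgoCD, each Application has a syncPolicy: automated with prune: true and selfHeal: true makes the controller revert hand-edits; with selfHeal unset, drift shows in the UI as “OutOfSync” and the controller leaves it alone. In Flux, kustomize-controller re-applies the rendered manifests on every spec.interval, so hand-edits to fields declared in Git are reverted by default. prune: true additionally deletes objects that were removed from Git, and force: true recreates objects whose immutable fields changed; neither is an alert-only switch. To leave drift in place, suspend the Kustomization or annotate the object with kustomize.toolkit.fluxcd.io/reconcile: disabled; for Helm releases, spec.driftDetection.mode on the HelmRelease can be set to warn. Corrections and warnings surface as Kubernetes events and Prometheus metrics.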

The team’s policy can vary per environment. Prod: selfHeal: false, so drift shows up as OutOfSync and a notification pages an operator who investigates; dev: selfHeal: true, because nobody is on call for a hand-edit in dev. ArgoCD sets this per Application; Flux scopes its equivalent per Kustomization or HelmRelease. Either way, changing the policy is a Git commit, not a console click.
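As a sketch, the dev copy of an ArgoCD Application could carry the self-healing policy below. The Application name, Project, repository URL, and namespace are hypothetical; the prod copy would differ only in the selfHeal line.

```yaml
# Hypothetical Application for the dev cluster.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payments-checkout
  namespace: argocd
spec:
  project: payments
  source:
    repoURL: https://git.example.com/platform/platform-repo.git
    targetRevision: main
    path: teams/payments/checkout
  destination:
    server: https://kubernetes.default.svc
    namespace: payments
  syncPolicy:
    automated:
      prune: true       # delete live objects that were removed from Git
      selfHeal: true    # dev: revert hand-edits; false in prod leaves drift as OutOfSync
```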

Secret ingestion. The common pattern with both tools is External Secrets Operator (ESO). ESO runs in the cluster, reads ExternalSecret custom resources that point at AWS Secrets Manager or Parameter Store, fetches the values, and creates Kubernetes Secret objects. The IAM permission to read from Secrets Manager is scoped via IRSA (or Pod Identity) on ESO’s service account; neither ArgoCD nor Flux directly handles AWS secret retrieval.
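A sketch of the IRSA variant: the service account carries the role annotation, and a ClusterSecretStore tells ESO to use it. The account ID, role name, and object names are placeholders, and the ESO apiVersion depends on the installed release (newer releases also serve v1).

```yaml
# Hypothetical names, account ID, and role.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-secrets
  namespace: external-secrets
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/eso-secrets-reader
---
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secrets-manager
spec:
  provider:
    aws:
      service: SecretsManager
      region: eu-west-1
      auth:
        jwt:
          serviceAccountRef:        # ESO assumes the role annotated on this account
            name: external-secrets
            namespace: external-secrets
```

The IAM role's policy, not the store, decides which secrets are readable, so per-team scoping can be done with one role and one SecretStore per namespace.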

The ExternalSecret custom resource is stored in Git, but it’s a pointer to a secret, not the secret itself. The plaintext stays out of Git: it lives in Secrets Manager and, once synced, in the cluster’s Secret object. For teams that prefer secrets in Git, SOPS-encrypted manifests are a Flux-native option (kustomize-controller decrypts them in the cluster, for example with an AWS KMS key it is authorised to use); Sealed Secrets is the tool-agnostic answer, where manifests are encrypted at commit time with the controller’s public key and the private key held in the cluster decrypts them. ESO is the most common default on EKS because it keeps the source of truth in AWS.
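An ExternalSecret committed to Git might look like the following. The store name, secret path, and keys are hypothetical, and the apiVersion depends on the installed ESO release.

```yaml
# Hypothetical pointer to a Secrets Manager entry; no secret value appears here.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: checkout-db
  namespace: payments
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets-manager
  target:
    name: checkout-db                    # Kubernetes Secret that ESO creates
  data:
    - secretKey: password                # key inside the Kubernetes Secret
      remoteRef:
        key: prod/payments/checkout-db   # Secrets Manager secret name
        property: password               # JSON field inside that secret
```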

Image updates. Flux ships image-reflector-controller and image-automation-controller that poll ECR (or another registry), find new image tags, and commit updated tags back to the Git repository. The effect: a CI pipeline that builds and pushes an image to ECR can stop there; Flux discovers the new tag and updates deployment.yaml in Git. ArgoCD has argocd-image-updater as a community add-on that does similar work. Either way, the CI pipeline no longer needs cluster access; it just needs to push to ECR.
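For the Flux side, the three objects involved could be sketched as follows. The registry, names, paths, and commit author are placeholders, a GitRepository named platform with write access is assumed, and the image.toolkit.fluxcd.io apiVersion depends on the installed Flux release.

```yaml
# Hypothetical image automation for the checkout service.
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: checkout
  namespace: flux-system
spec:
  image: 111122223333.dkr.ecr.eu-west-1.amazonaws.com/checkout
  interval: 5m
  provider: aws                  # authenticate to ECR with the controller's IAM role
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: checkout
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: checkout
  policy:
    semver:
      range: ">=1.0.0"           # select the highest matching semver tag
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageUpdateAutomation
metadata:
  name: payments
  namespace: flux-system
spec:
  interval: 30m
  sourceRef:
    kind: GitRepository
    name: platform
  git:
    checkout:
      ref:
        branch: main
    commit:
      author:
        name: fluxcdbot
        email: fluxcdbot@example.com
    push:
      branch: main
  update:
    path: ./teams/payments
    strategy: Setters
```

The automation only rewrites lines that carry a marker comment, for example an image line in deployment.yaml ending in # {"$imagepolicy": "flux-system:checkout"}.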

Observability. ArgoCD ships a web UI that covers 80% of the on-call questions: which Applications are OutOfSync, what changed, who synced last, current drift diff. Flux produces Prometheus metrics and Kubernetes events; the dashboards are the team’s job to build, though the upstream Grafana dashboards are a good starting point. Weave GitOps is an optional UI for Flux that covers similar ground to ArgoCD’s UI.

A worked deployment

Team payments deploys a new version of their checkout service. With ArgoCD:

  1. CI builds the image, tags it checkout:v1.42.0, pushes to ECR.
  2. CI commits a change to platform-repo/teams/payments/checkout/deployment.yaml updating the image tag. Pipeline ends.
  3. argocd-application-controller notices the Git commit at its next poll (timeout.reconciliation in argocd-cm, about 3 minutes by default) or immediately via the Git webhook to argocd-server.
  4. The controller renders the new manifests through argocd-repo-server and compares to live state. The only diff is the image tag.
  5. With automated sync enabled on this Application (a syncPolicy.automated block), the controller applies the change to the cluster. Kubernetes rolls the Deployment.
  6. The UI shows the Application as “Synced” and “Healthy” once the rollout completes.
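The steps above touch two pieces of configuration, sketched here: the one line CI changes in step 2, and the polling interval from step 3. The registry host, replica count, and labels are hypothetical.

```yaml
# teams/payments/checkout/deployment.yaml: CI only changes the image tag.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout
  namespace: payments
spec:
  replicas: 3
  selector:
    matchLabels:
      app: checkout
  template:
    metadata:
      labels:
        app: checkout
    spec:
      containers:
        - name: checkout
          image: 111122223333.dkr.ecr.eu-west-1.amazonaws.com/checkout:v1.42.0
---
# Git polling interval for ArgoCD, set in the argocd-cm ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/part-of: argocd
data:
  timeout.reconciliation: 180s
```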

With Flux, steps 1 and 2 are identical. Then:

  3. source-controller fetches the new Git commit (every GitRepository.spec.interval, typically 1 minute).
  4. kustomize-controller reconciles the Kustomization that points at teams/payments/, renders the manifests, and diffs against live state.
  5. Same diff, same apply, same rollout. Metrics on gotk_reconcile_duration_seconds record the time; events in the namespace record the apply.
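The Flux objects behind those steps could be sketched as below. The repository URL, the credentials Secret name, and the object names are assumptions.

```yaml
# Hypothetical source and reconciliation objects for team payments.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: platform
  namespace: flux-system
spec:
  interval: 1m                   # how often source-controller checks for new commits
  url: https://git.example.com/platform/platform-repo.git
  ref:
    branch: main
  secretRef:
    name: platform-repo-auth     # Git credentials
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: payments
  namespace: flux-system
spec:
  interval: 10m                  # re-apply cadence even when Git has not changed
  path: ./teams/payments
  prune: true
  wait: true                     # report Ready only after the rollout is healthy
  sourceRef:
    kind: GitRepository
    name: platform
```

A new revision on the GitRepository triggers the Kustomization immediately; the Kustomization's own interval only governs the periodic re-apply.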

Both cases: the pipeline does not have cluster credentials. The controller does. The cluster pulled.

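Drift scenario: an engineer runs kubectl edit on the Deployment to bump replicas. ArgoCD shows the Application as OutOfSync on the next reconcile; with selfHeal: true it reverts the edit; without it, the Application stays OutOfSync and a notification can page the on-call. Flux reverts the edit on the Kustomization’s next reconcile, provided replicas is declared in Git, and records the correction as an event; alert-only behaviour needs the Kustomization suspended or, for Helm releases, driftDetection.mode: warn. Drift becomes a signal rather than an invisible change.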

What’s worth remembering

  1. GitOps inverts the direction of control. The cluster pulls; the pipeline pushes to Git. The pipeline loses cluster credentials; the cluster gains Git credentials (usually read-only).
  2. ArgoCD and Flux solve the same problem with different models. ArgoCD is application-centric with a web UI; Flux is component-centric with Kubernetes-native CRDs and no default UI. Both support EKS equally well.
  3. Multi-tenancy in ArgoCD = AppProjects + OIDC groups. Projects declare what sources, destinations, and kinds a tenant can use; RBAC maps OIDC groups to Project permissions.
  4. Multi-tenancy in Flux = tenants pattern + service-account impersonation. Each tenant’s Kustomization reconciles as a namespace-scoped service account; Kubernetes RBAC enforces the boundary.
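  5. Drift handling is configurable per Application or Kustomization. syncPolicy.automated.selfHeal (ArgoCD) decides whether drift is reverted or merely reported. Flux reverts drift on each reconcile by default; suspending a Kustomization, or driftDetection.mode: warn on a HelmRelease, relaxes that. Prod policy typically differs from dev.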
  6. External Secrets Operator is the EKS default for secret ingestion. Keep plaintext in Secrets Manager or Parameter Store; reference them via ExternalSecret custom resources in Git; IRSA on the ESO service account authorises the AWS reads.
  7. Image tag updates are a separate controller. Flux’s image-automation-controller or ArgoCD’s argocd-image-updater discover new tags in ECR and commit back to Git. CI stops at docker push; the controller handles the rest.
  8. IRSA (or Pod Identity) is how both tools reach AWS. ArgoCD’s service account, Flux’s controllers, and ESO all authenticate to AWS via OIDC-federated IAM roles. Long-lived access keys in kubeconfig are gone.

Both tools deliver the same GitOps claims: the cluster pulls, drift becomes a signal, the pipeline stops needing cluster credentials. The choice is which mental model matches the team: applications as first-class objects with a UI, or composable controllers on Kubernetes primitives. Neither answer is wrong; the wrong answer is keeping the kubeconfig-in-Secrets-Manager pipeline when sixty teams now share its blast radius.

These posts are LLM-aided. Backbone, original writing, and structure by Craig. Research and editing by Craig + LLM. Proof-reading by Craig.