Zero Trust Networking in Practice

November 20, 2025 · 7 min read

The perimeter-based security model is dead. When your infrastructure spans three cloud providers, dozens of SaaS integrations, and a workforce that connects from coffee shops and home offices, there is no meaningful "inside" or "outside" to defend. Zero trust is not a product you buy. It is an architecture where every request is authenticated, authorized, and encrypted regardless of where it originates. Here is how we actually implemented it.

Moving Beyond VPNs

Our starting point was a classic VPN setup: engineers connected to a corporate VPN to access internal tools, and services communicated over a private network assumed to be trusted. The problems with this model compounded over time:

VPN was a single point of failure. When our VPN concentrator had issues, 200 engineers could not access anything.
Once inside the VPN, lateral movement was unrestricted. A compromised laptop could reach everything on the private network, production databases included.
Split tunneling complicated routing and caused performance issues for remote workers in different regions.
Onboarding new engineers required VPN client installation, certificate distribution, and firewall rule updates — a process that took 2-3 days.

We replaced the VPN with an identity-aware proxy (we use a combination of Cloudflare Access and a custom solution built on Envoy). Every internal application is now accessible over the public internet but protected by identity verification at the proxy layer. New engineers get access in minutes through our SSO provider, not days through IT tickets.
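To make the proxy's role concrete, here is a minimal sketch of the decision an identity-aware proxy makes on each request. It is not how Cloudflare Access or our Envoy layer is implemented; the `Identity` shape, the hostnames, and the group names are all hypothetical, and the identity is assumed to have been verified against the SSO provider already.

```python
from dataclasses import dataclass
import time


@dataclass(frozen=True)
class Identity:
    """Claims the proxy has already verified with the SSO provider (hypothetical shape)."""
    email: str
    groups: frozenset
    expires_at: float  # Unix timestamp at which the session expires


# Hypothetical mapping of internal hostnames to the groups allowed to reach them.
APP_POLICY = {
    "grafana.internal.example.com": frozenset({"engineering", "sre"}),
    "billing-admin.internal.example.com": frozenset({"finance-eng"}),
}


def authorize(host, identity, now=None):
    """Return True only if a verified, unexpired identity is in an allowed group."""
    now = time.time() if now is None else now
    if identity is None or identity.expires_at <= now:
        return False  # no session or expired session: force re-authentication
    allowed = APP_POLICY.get(host)
    if allowed is None:
        return False  # unknown application: deny by default
    return bool(identity.groups & allowed)
```

The point of the sketch is that the decision depends on who is asking and what they are asking for, never on which network the request arrived from.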

Service Identities with SPIFFE/SPIRE

Zero trust for humans is only half the problem. Service-to-service communication also needs identity. IP-based allow lists do not work in dynamic environments where pod IPs change constantly. We adopted SPIFFE (Secure Production Identity Framework for Everyone) as our workload identity standard and SPIRE as the implementation.

Every workload in our Kubernetes clusters receives a SPIFFE ID — a URI like spiffe://production.example.com/ns/payments/sa/api-server. SPIRE automatically issues and rotates X.509 certificates (SVIDs) that encode this identity. The certificates have a 1-hour TTL, so even if one is compromised, the window of exposure is small.
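Because a SPIFFE ID is an ordinary URI, it can be taken apart with standard tooling. The sketch below parses the `/ns/<namespace>/sa/<service-account>` layout used in this post; that layout is our naming convention rather than something SPIFFE mandates, and the function names are illustrative.

```python
from urllib.parse import urlsplit


def parse_spiffe_id(spiffe_id):
    """Split a SPIFFE ID into its trust domain and path segments."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    if parts.query or parts.fragment:
        raise ValueError("SPIFFE IDs must not carry a query or fragment")
    segments = [s for s in parts.path.split("/") if s]
    return parts.netloc, segments


def kubernetes_identity(spiffe_id):
    """Interpret the /ns/<namespace>/sa/<service-account> layout used in this post."""
    trust_domain, segments = parse_spiffe_id(spiffe_id)
    if len(segments) != 4 or segments[0] != "ns" or segments[2] != "sa":
        raise ValueError(f"unexpected path layout: {spiffe_id!r}")
    return {
        "trust_domain": trust_domain,
        "namespace": segments[1],
        "service_account": segments[3],
    }
```

For the example ID above, `kubernetes_identity` yields the trust domain `production.example.com`, the namespace `payments`, and the service account `api-server`.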

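The SPIRE agent runs as a DaemonSet on each cluster node and attests the workloads scheduled there, while the SPIRE server issues identities centrally. Registration entries on the server define which workloads get which identities: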

spire-server entry create \
  -spiffeID spiffe://production.example.com/ns/payments/sa/api-server \
  -parentID spiffe://production.example.com/node/k8s-worker \
  -selector k8s:ns:payments \
  -selector k8s:sa:api-server \
  -ttl 3600

The selectors ensure that only a pod running in the payments namespace with the api-server service account can receive that SPIFFE ID. A compromised pod in a different namespace cannot impersonate the payments service.
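The matching rule can be modeled in a few lines: a workload qualifies for an entry only if it carries every selector listed on that entry. This is a simplified model for illustration, not SPIRE's code; in a real deployment the workload's selectors are discovered by the agent's attestation plugins, and the extra pod label below is a made-up example.

```python
def entry_matches(entry_selectors, workload_selectors):
    """A workload qualifies only if it carries every selector on the entry."""
    return set(entry_selectors) <= set(workload_selectors)


# Selectors from the registration entry shown above.
PAYMENTS_ENTRY = ["k8s:ns:payments", "k8s:sa:api-server"]

# Selectors a hypothetical attested pod might present. Extra selectors are fine.
payments_pod = ["k8s:ns:payments", "k8s:sa:api-server", "k8s:pod-label:app:api-server"]

# A pod in another namespace that reuses the service account name.
imposter_pod = ["k8s:ns:marketing", "k8s:sa:api-server"]

payments_ok = entry_matches(PAYMENTS_ENTRY, payments_pod)
imposter_ok = entry_matches(PAYMENTS_ENTRY, imposter_pod)
```

Here `payments_ok` is true and `imposter_ok` is false: matching the service account name alone is not enough, because the namespace selector must match as well.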

Mutual TLS Everywhere

With SPIFFE identities in place, we enforce mutual TLS (mTLS) on all service-to-service communication. Both sides of every connection present certificates and verify each other's identity. Plain HTTP between services is not allowed — our network policies drop any unencrypted traffic on internal ports.

We handle mTLS at the service mesh layer using Linkerd, which injects a sidecar proxy that transparently handles certificate presentation and verification. Application code does not need to change at all. From the application's perspective, it is still making plain HTTP calls; the sidecar intercepts them and handles the encryption on the wire.

The result: even if an attacker gains access to the internal network, they cannot eavesdrop on service traffic or impersonate a service without a valid SVID. This is a fundamental shift from the old model where internal network access implied trust.
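For readers who want to see what the sidecar does on the application's behalf, here is a rough sketch of the two halves of mTLS using Python's standard `ssl` module: a server context that refuses clients without a certificate, and a helper that reads the SPIFFE ID from the peer certificate's URI SAN. This is not how Linkerd is implemented; the file path parameters are placeholders for wherever the SVID, key, and trust bundle are written.

```python
import ssl


def build_mtls_server_context(cert_file, key_file, trust_bundle_file):
    """Create a server-side TLS context that rejects clients without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)  # our own SVID and key
    ctx.load_verify_locations(cafile=trust_bundle_file)  # CAs we accept for clients
    ctx.verify_mode = ssl.CERT_REQUIRED  # this line is what makes the TLS mutual
    return ctx


def peer_spiffe_id(peer_cert):
    """Return the single spiffe:// URI SAN from an SSLSocket.getpeercert() dict."""
    sans = peer_cert.get("subjectAltName", ())
    uris = [value for kind, value in sans if kind == "URI"]
    spiffe_uris = [u for u in uris if u.startswith("spiffe://")]
    # An X.509-SVID carries exactly one URI SAN, so anything else is rejected.
    if len(uris) != 1 or len(spiffe_uris) != 1:
        raise ValueError("expected exactly one URI SAN holding a SPIFFE ID")
    return spiffe_uris[0]
```

After the handshake, the server would call `peer_spiffe_id(conn.getpeercert())` and hand the result to the authorization layer. Verifying the certificate chain proves the caller holds a valid SVID; deciding what that identity may do is a separate step.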

Policy-as-Code with OPA

Authentication (proving who you are) is only half of zero trust. Authorization (proving you are allowed to do this) is the other half. We use Open Policy Agent (OPA) to define and enforce authorization policies as code.

Policies are written in Rego and stored in a Git repository alongside the services they govern. Here is a simplified example that controls which services can access the payments API:

package authz

import rego.v1

# Deny unless one of the rules below matches.
default allow := false

# The order processor may create charges.
allow if {
    input.source.spiffe_id == "spiffe://production.example.com/ns/orders/sa/order-processor"
    input.destination.path == "/api/v1/charge"
    input.method == "POST"
}

# The admin dashboard may read transactions.
allow if {
    input.source.spiffe_id == "spiffe://production.example.com/ns/admin/sa/dashboard"
    input.destination.path == "/api/v1/transactions"
    input.method == "GET"
}

OPA runs as a sidecar alongside each service and evaluates every incoming request against the policy. The policies are distributed via a central OPA bundle server and refreshed every 30 seconds. Changes to authorization rules go through the same code review and CI process as application code.
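A service, or the proxy in front of it, can ask the sidecar for a decision through OPA's Data API: a POST to `/v1/data/authz/allow` with the request attributes under `input`. The sketch below assumes the sidecar listens on `127.0.0.1:8181` (OPA's default port); the helper names and the 0.5 second timeout are illustrative choices, not our production values.

```python
import json
import urllib.request

# Assumed address of the local OPA sidecar; 8181 is OPA's default API port.
OPA_URL = "http://127.0.0.1:8181/v1/data/authz/allow"


def build_opa_request(spiffe_id, method, path, url=OPA_URL):
    """Build the Data API request whose input matches the fields the policy reads."""
    document = {
        "input": {
            "source": {"spiffe_id": spiffe_id},
            "destination": {"path": path},
            "method": method,
        }
    }
    return urllib.request.Request(
        url,
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decision_from_response(body):
    """Fail closed: only an explicit `true` result allows the request."""
    parsed = json.loads(body)
    return isinstance(parsed, dict) and parsed.get("result") is True


def is_allowed(spiffe_id, method, path, timeout=0.5):
    """Ask the sidecar for a decision; any error is treated as a deny."""
    request = build_opa_request(spiffe_id, method, path)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return decision_from_response(resp.read())
    except (OSError, ValueError):
        return False
```

Treating a missing `result`, a malformed response, or an unreachable sidecar as a deny keeps the failure mode consistent with the policy's own default.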

This approach replaced our previous system of hand-maintained ACLs in config files that no one fully understood and everyone was afraid to change.

Network Segmentation in Layers

Zero trust does not mean network controls are irrelevant. Defense in depth means layering identity-based controls on top of network-based controls:

Cloud-level: VPC peering is restricted. Production and staging VPCs cannot communicate. Cross-account access requires explicit IAM role assumption.
Kubernetes-level: NetworkPolicies enforce namespace isolation. The payments namespace can only receive traffic from the orders and admin namespaces.
Service-level: mTLS with SPIFFE identities ensures only verified workloads can establish connections.
Application-level: OPA policies control which specific API endpoints each service can access.
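One way to picture the four layers is as a chain of independent checks in which a request must pass every one. The toy model below is purely illustrative: in reality each layer is enforced by a different system (cloud networking, the Kubernetes CNI, the mesh, OPA), and the request fields and the allowed endpoint set are made up for the example.

```python
# Hypothetical (method, path) pairs the application layer permits.
ALLOWED_ENDPOINTS = {("POST", "/api/v1/charge"), ("GET", "/api/v1/transactions")}

# Each layer is a name plus a predicate over a simplified request description.
LAYERS = [
    ("cloud", lambda r: r["source_env"] == r["dest_env"]),
    ("kubernetes", lambda r: r["source_ns"] in {"orders", "admin"}),
    ("service", lambda r: r.get("verified_spiffe_id") is not None),
    ("application", lambda r: (r["method"], r["path"]) in ALLOWED_ENDPOINTS),
]


def first_denying_layer(request, layers=LAYERS):
    """Return the name of the first layer that rejects the request, or None if all pass."""
    for name, check in layers:
        if not check(request):
            return name
    return None
```

A request from a staging workload stops at `cloud`, one from an unrelated namespace stops at `kubernetes`, and so on. A failure or misconfiguration in one layer still leaves the others in place.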

An attacker would need to bypass all four layers to move from a compromised service to a sensitive one. In our penetration testing exercises, this layered approach contained every simulated breach to the initially compromised service.

Trust is a vulnerability. Every assumption that "this network is safe" or "this service is internal" is an attack surface waiting to be exploited. Verify everything, every time.

Implementing zero trust is a journey, not a weekend project. We rolled it out over 8 months, starting with the most sensitive services (payments, authentication) and expanding outward. Start with service identity and mTLS — they give you the biggest security improvement for the least application-level disruption.