The conversation around the edge is usually framed as a latency story: 50ms instead of 250ms. That's true and that's nice, but it undersells what actually changed when edge runtimes matured. Edge compute lets us treat personalization, A/B testing, auth, and feature flags as part of the page, not something we apply after it loads.

The difference is architectural, not numerical. You stop shipping a generic page to every user and then rewriting it in the browser. You start shipping the right page to each user the first time. The consequences for user experience are out of proportion to the change in stack.

What we use it for in production

Personalized HTML at request time

Geo, locale, plan, experiment variants — all composed into the response itself at the edge. No client-side flicker, no "flash of wrong content" while the browser figures out who the user is.

A pricing page served from the edge can price itself in the user's currency before the first byte leaves Cloudflare. A marketing page can swap its hero copy based on where the user is coming from. This used to require server-side rendering in a specific region, or ugly client-side hydration. At the edge, it's a few lines of code.
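As a rough illustration of "a few lines of code", here is a minimal sketch of currency selection at request time. The country map, price table, and function names are hypothetical and not taken from any production setup. How the country code reaches the function is platform-specific (on Cloudflare Workers it is typically read from the request's geo metadata), so the sketch takes it as a plain argument.

```typescript
// Hypothetical data: neither the map nor the prices come from the article.
type Currency = "USD" | "EUR" | "GBP";

const CURRENCY_BY_COUNTRY: Record<string, Currency> = {
  US: "USD",
  DE: "EUR",
  FR: "EUR",
  GB: "GBP",
};

const MONTHLY_PRICE: Record<Currency, number> = { USD: 20, EUR: 19, GBP: 17 };

// Pick a currency from a two-letter country code, falling back to USD
// when the country is missing or not in the map.
function currencyFor(country: string | null): Currency {
  if (!country) return "USD";
  return CURRENCY_BY_COUNTRY[country.toUpperCase()] ?? "USD";
}

// Render the price fragment that gets composed into the response HTML,
// so the browser never sees a placeholder or the wrong currency.
function renderPricingHtml(country: string | null): string {
  const currency = currencyFor(country);
  return `<p class="price">${MONTHLY_PRICE[currency]} ${currency} / month</p>`;
}
```

In an edge handler, the result of renderPricingHtml would be written into the page before the response is returned, which is what removes the client-side flicker.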

Auth-aware caching

This is the one that changed how we build logged-in experiences. You can cache an entire page (layout, chrome, and static content included) but vary the cache key on a signed cookie that the edge verifies. Logged-in users get static-site speed. The origin stays idle. The content is still specific to each user.

Before edge caching, you had two choices: cache generic pages and hydrate per-user in the browser (slow, ugly), or skip caching for logged-in users (expensive, slow). Now you get both: cached and personalized, served in single-digit milliseconds.
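One way to picture the cache-key step is the sketch below. It assumes a cookie named "session" and a hypothetical verify function that checks the cookie's signature (for example with an HMAC) and returns the session, or null if the signature is invalid. The names and the key format are illustrative only.

```typescript
// Hypothetical session shape; a real one would carry whatever the page varies on.
interface Session {
  userId: string;
}

// Stand-in for real signature verification (e.g. an HMAC check).
// Returns null when the token is missing, malformed, or forged.
type Verify = (token: string) => Session | null;

// Parse a Cookie header such as "a=1; session=abc" into a lookup object.
function parseCookies(header: string): Record<string, string> {
  const cookies: Record<string, string> = {};
  for (const part of header.split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue;
    cookies[part.slice(0, eq).trim()] = part.slice(eq + 1).trim();
  }
  return cookies;
}

// Build the cache key for a request. Only a verified session changes the key,
// so a forged cookie falls back to the anonymous entry instead of reaching
// another user's cached page.
function cacheKeyForRequest(
  path: string,
  cookieHeader: string | null,
  verify: Verify,
): string {
  const token = cookieHeader ? parseCookies(cookieHeader)["session"] : undefined;
  const session = token ? verify(token) : null;
  return session ? `user:${session.userId}:${path}` : `anon:${path}`;
}
```

The important design choice is that the key is derived from the verified session rather than from the raw cookie value, so the cache can never be steered by unauthenticated input.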

AI gateway

Token rate limiting, prompt logging, provider failover, and response caching — all in one place, far from your origin. Every AI-powered request flows through an edge worker that enforces quota, routes to the cheapest available provider, and logs the interaction for observability.

This means your application code never talks directly to OpenAI or Anthropic. It talks to your gateway. When a provider goes down (and they do), your gateway fails over to another. When a user exceeds their quota, the gateway rejects cleanly before the call is made. This used to be infrastructure work. At the edge, it's a weekend project.
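The quota check and failover described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the Provider interface, the in-memory usage map (a real edge deployment would need a shared store), and the crude four-characters-per-token estimate. Providers are assumed to be passed in cheapest first.

```typescript
// Hypothetical provider abstraction; real SDK calls would live behind complete().
interface Provider {
  name: string;
  complete(prompt: string): Promise<string>;
}

type GatewayResult =
  | { ok: true; provider: string; text: string }
  | { ok: false; status: 429 | 502; reason: string };

async function handlePrompt(
  userId: string,
  prompt: string,
  usedTokens: Map<string, number>, // assumption: stands in for a shared quota store
  tokenLimit: number,
  providers: Provider[], // assumption: ordered cheapest first
): Promise<GatewayResult> {
  // Rough estimate only; real token counts depend on the provider's tokenizer.
  const estimated = Math.ceil(prompt.length / 4);
  const used = usedTokens.get(userId) ?? 0;

  // Reject before any provider is called, so over-quota requests cost nothing.
  if (used + estimated > tokenLimit) {
    return { ok: false, status: 429, reason: "quota exceeded" };
  }

  // Try each provider in order and fail over to the next one on error.
  for (const provider of providers) {
    try {
      const text = await provider.complete(prompt);
      usedTokens.set(userId, used + estimated);
      return { ok: true, provider: provider.name, text };
    } catch {
      // This provider failed; fall through to the next one.
    }
  }
  return { ok: false, status: 502, reason: "all providers failed" };
}
```

Prompt logging and response caching would hang off the same function, which is the appeal of having a single choke point for every AI call.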

What it isn't great for

Edge runtimes are constrained. They have short startup and CPU budgets, strict memory limits, and at best partial support for Node built-ins. Some things simply don't belong there:

  • Long-running work. Video encoding, large file processing, anything that takes more than a few seconds. Push it back to a regional function or a dedicated background worker.
  • Heavy libraries. If your dependency tree needs fs, the legacy Node crypto module (rather than Web Crypto), or 200MB of memory, it's not an edge workload.
  • Database access across regions. Your Postgres primary is in one region. Edge functions run in dozens. The round-trip can eat the latency win completely unless you keep a read replica in each region or use a database built to be queried from the edge, such as Neon.

The rule is simple: use the edge for decisions, not for work. Decide who the user is, what they should see, whether they're allowed to do a thing — all at the edge. Then, if real work needs doing, hand off to a regional function that has the muscle to do it.
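To make "decisions, not work" concrete, here is a small sketch in which the edge only classifies a request and never does the heavy lifting itself. The paths, the RequestFacts shape, and the regional origin URL are all hypothetical.

```typescript
// The three outcomes an edge handler can choose between in this sketch.
type EdgeDecision =
  | { kind: "reject"; status: 401 | 403 }
  | { kind: "respond"; body: string }
  | { kind: "forward"; target: string };

// Facts the edge has already established (e.g. from a verified cookie).
interface RequestFacts {
  path: string;
  userId: string | null;
  canExport: boolean;
}

// Hypothetical regional function that does the actual heavy work.
const REGIONAL_ORIGIN = "https://regional.example.com";

function decide(facts: RequestFacts): EdgeDecision {
  // Who is the user? Unknown users are turned away at the edge.
  if (facts.userId === null) return { kind: "reject", status: 401 };

  // Are they allowed to do the thing? If so, hand the work to a region.
  if (facts.path.startsWith("/export")) {
    if (!facts.canExport) return { kind: "reject", status: 403 };
    return { kind: "forward", target: REGIONAL_ORIGIN + facts.path };
  }

  // Cheap responses can be produced right here.
  return { kind: "respond", body: `Hello, ${facts.userId}` };
}
```

Nothing in decide takes more than a few comparisons, which is the point: the expensive export runs wherever it has the resources, and only for requests that have already been cleared.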

The mental model shift

Most teams still think of "the server" as a place. The edge makes it obvious that the server is really a gradient: edge node nearest the user → regional compute → primary database → slow external services. The less a response depends on anything further down that gradient, the closer to the user it should run.

This reorganizes how you think about every request. Authentication? Edge. Feature flag check? Edge. Personalization? Edge. Heavy business logic? Regional. Transaction? Primary database. Once the gradient is in your head, the architecture writes itself.

Where to start

If you haven't moved anything to the edge yet, start with your middleware. Auth redirects, locale detection, A/B assignment, feature flags — all perfect edge candidates, and all things that currently cost your origin on every request. Move them to a Next.js middleware or a Cloudflare Worker in an afternoon. You'll see the impact the same day.
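As a starting point for A/B assignment in middleware, the sketch below buckets a visitor deterministically, so the same visitor gets the same variant on every request without a lookup against the origin. The hash is an FNV-1a-style hash over UTF-16 code units, and the experiment and variant names are hypothetical. Where the visitor ID comes from (usually a cookie) is left to the surrounding middleware.

```typescript
// FNV-1a-style 32-bit hash over the string's UTF-16 code units.
// Chosen here only because it is tiny and dependency-free.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Assign a visitor to one of the variants. Including the experiment name in
// the hashed string keeps assignments independent across experiments.
function assignVariant(
  visitorId: string,
  experiment: string,
  variants: readonly string[],
): string {
  if (variants.length === 0) throw new Error("at least one variant is required");
  const bucket = fnv1a(`${experiment}:${visitorId}`) % variants.length;
  return variants[bucket];
}
```

A middleware would call assignVariant with the visitor's ID and rewrite or annotate the request accordingly. Because the function is pure, it behaves the same in a Next.js middleware, a Cloudflare Worker, or a unit test.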