API gateway shapes: back then, now, and beyond

September 16, 2025 | 9 min read | Joel Hans

For software engineers, the "then" mental model for an API gateway is that it's the inflexible monolith that sits between the 😈 big scary public internet and 👼 your production services.

That picture doesn't hold up anymore. The way you build and run services has inevitably changed, and you might:

  • Prototype with LLMs, write agents into your apps, or self-host models, all of which mean more endpoints to secure and monitor.
  • Run services everywhere (on laptops, in CI/CD previews, across clusters, and in multiple clouds), where a single load balancer can't cut it.
  • Handle "shift left"-ed responsibilities like security and observability early in the lifecycle, not bolt them on at the end.
  • Let services talk to each other across private networks, customer devices, and SaaS APIs in ways that sometimes feel delightful, sometimes terribly risky.

All these changes make gateways more important, not less, though maybe not in the ways you expect. Does the traditional API gateway still work for all of them?

By the end of this post, you'll learn about:

  1. What an API gateway used to be
  2. The main problems these gateways solve today
  3. A bunch of new gateway "shapes" for today's services
  4. Why you should try a gateway before you really need one

What's an API gateway?

An API gateway is the "front door" between the public internet and your network, which opens and shuts, and routes traffic to the appropriate upstream service, depending on who's knocking.

The need for API gateways evolved from a lineage of load balancers and reverse proxies. If you poked back into the networks of 20+ years ago, you had tools like HAProxy and Apache for load balancing and basic reverse proxying, but as apps modernized, they needed more Layer 7/HTTP logic.

Apache did have mod_rewrite for path-based routing to upstreams, and nginx followed up later with slightly nicer location blocks for that plus niceties like header manipulation. That got you part of the way, but didn't save you from implementing the same bundle of auth and rate limiting code in every service. That's why what could've been a single piece of infrastructure gets blown up into three or more.

The big breakthrough came with API gateways, which standardized and collapsed what was once many jobs into one. What if your API gateway, as the front door, could also perform the same auth and rate limit checks on every request, no matter its intended destination? What if you could cut out 80% of your boilerplate code and offload it to the front door? What if it could help you implement, say, usage-based pricing without building a massive piece of custom middleware?

To get us on the same page, here's the "shape" of an API gateway:

In this shape, API gateways help you complete a bunch of important jobs, like how you:

  • Put rate limits on top of your services (e.g. close the door if things get too crowded inside).
  • Handle authentication for multiple services without hand-rolling it even once (e.g. use one trusted combination of lock and key).
  • Route traffic from hostnames or paths to different upstream services (e.g. okay, I'm done stretching the metaphor).
  • Load balance between replicas of your services.

They help you move faster by offloading all those dull jobs away from your services so you're not shoe-horning auth into everything or coding up an army of Lua plugins to handle every bit of custom functionality, like signature verification. As you offload, your gateway also standardizes traffic behavior across APIs and apps, which also lets you ship faster and more securely.

Everyone should worry about the security of your network, and a well-defined API gateway makes it hard to completely mess that up with unexpected gotchas or footguns.
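To make the rate limiting job from the list above a little more concrete, here's a minimal sketch of a token bucket, which is one common way to "close the door if things get too crowded inside." It isn't how any particular gateway implements limits; the class names, capacity, and refill rate are all assumptions for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TokenBucket:
    """Allows bursts of up to `capacity` requests, refilled at `refill_per_sec`."""
    capacity: float
    refill_per_sec: float
    tokens: float = field(init=False)
    updated_at: Optional[float] = field(default=None, init=False)

    def __post_init__(self) -> None:
        # Start with a full bucket so the first burst is allowed.
        self.tokens = self.capacity

    def allow(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        if self.updated_at is not None:
            # Top up tokens for the time that has passed, capped at capacity.
            elapsed = max(0.0, now - self.updated_at)
            self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RateLimiter:
    """Keeps one bucket per client (for example, per API key)."""

    def __init__(self, capacity: float, refill_per_sec: float) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        bucket = self._buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.refill_per_sec)
            self._buckets[client_id] = bucket
        return bucket.allow(now)
```

Because the limiter lives at the gateway, every upstream service gets the same policy without carrying its own copy of this logic. The `now` parameter only exists so the behavior can be checked without waiting on a real clock.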

The many shapes of 2025's gateways

API gateways are still important 20 years later. But whether or not you were developing two decades ago, your day-to-day doesn't look anything like it did back then. 

Today, a gateway isn't a monolithic deployment pattern like API gateways of the past. Instead, it's more flexible and modular. It works no matter where you're developing new things right now or how your stack evolves in the future.

Across networks, clouds, etc...

I hope these eight shapes will give you inspiration to craft gateways tailor-made for whatever you're building today.

Shape: The 'agent-assisted' gateway

  • Where it works: Between any mix of services you're developing on localhost and the LLMs building on your behalf—for example, that might be building an API yourself and scaffolding a frontend with Lovable.
  • What it does: This gateway routes public traffic into your local stack without you port forwarding. Even better, it also lets you test real auth patterns (OAuth, API keys) and traffic transformation (rewrites, header manipulation) much earlier in your lifecycle than when you're ready to jump into prod-land.

The agent-assisted (or vibe-coding) gateway matters because you no longer have to wait for production to surface bugs in how you handle traffic. Local environments stop being toys and look more like reliable testbeds for how the thing you're building will behave when it's ready to go live.

Shape: The localhost gateway (en masse edition)

  • Where it works: Between the public internet and the local development environments for you and your peers.
  • What it does: This gateway creates many public endpoints (with access control) to dev environments—whether they’re laptops, VMs, or cloud dev boxes—and routes requests by hostname, path, or header. 

The localhost gateway matters because you get to control exactly who exposes local services, under which domains, and route only valid, authenticated requests. Everyone can share their WIPs without undermining networking security with random, unsecured URLs.
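As a rough illustration of "routes requests by hostname, path, or header," here's a small sketch of a first-match routing table. The hostnames, upstream labels, and the `X-Preview` header are hypothetical, and a real gateway would express this as configuration rather than application code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Route:
    upstream: str
    host: Optional[str] = None           # None matches any hostname
    path_prefix: str = "/"               # "/" matches any path
    headers: Dict[str, str] = field(default_factory=dict)

    def matches(self, host: str, path: str, headers: Dict[str, str]) -> bool:
        if self.host is not None and host.lower() != self.host.lower():
            return False
        if not path.startswith(self.path_prefix):
            return False
        # Header names are case-insensitive, so compare them lowercased.
        lowered = {name.lower(): value for name, value in headers.items()}
        return all(lowered.get(name.lower()) == value
                   for name, value in self.headers.items())


def pick_upstream(routes: List[Route], host: str, path: str,
                  headers: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Return the upstream of the first matching route, or None."""
    for route in routes:
        if route.matches(host, path, headers or {}):
            return route.upstream
    return None


# Hypothetical dev environments for a small team.
ROUTES = [
    Route(upstream="alice-laptop:3000", host="alice.dev.example.com"),
    Route(upstream="bob-devbox:8080", host="bob.dev.example.com", path_prefix="/api"),
    Route(upstream="shared-preview:9000", headers={"X-Preview": "beta"}),
]
```

Order matters in a first-match table like this one, so the most specific routes go first. A request that matches nothing gets `None`, which a gateway would typically turn into a 404 instead of forwarding it anywhere.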

Shape: The ephemeral workload gateway (read: CI jobs)

  • Where it works: Between your testing software or manual review processes and the deploy previews your CI/CD pipeline (GitHub Actions, GitLab CI, Jenkins, or an on-prem duct tape job) spits out.
  • What it does: This gateway exposes a staging version of your app or API on-demand to connect it to external test platforms or click through a deploy preview. It authenticates requests, logs everything, and cleans up when you’re done.

The ephemeral workload gateway matters because it gives you production-like environments for every branch without the pain of long-lived staging. Everyone on your team can safely poke at builds while also avoiding zombie previews that leak your roadmap to the public internet and the shared staging environment that becomes everyone's WIP junk drawer.
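One way to picture the "cleans up when you're done" part is a short-lived access token tied to a single CI build. The sketch below is an assumption about how that could work, not a description of any product: the token format, function names, and build IDs are made up, and it uses only an HMAC from Python's standard library.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional


def issue_preview_token(secret: bytes, build_id: str, ttl_seconds: int,
                        now: Optional[float] = None) -> str:
    """Create a token that is only valid for one build and expires on its own."""
    issued_at = time.time() if now is None else now
    expires_at = int(issued_at + ttl_seconds)
    payload = f"{build_id}:{expires_at}".encode()
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + signature


def verify_preview_token(secret: bytes, token: str, build_id: str,
                         now: Optional[float] = None) -> bool:
    """Check the signature, the build the token was issued for, and the expiry."""
    try:
        encoded, signature = token.split(".", 1)
        payload = base64.urlsafe_b64decode(encoded.encode())
        token_build, expires_raw = payload.decode().rsplit(":", 1)
        expires_at = int(expires_raw)
    except (ValueError, UnicodeError):
        return False  # malformed token
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return False
    current = time.time() if now is None else now
    return token_build == build_id and current < expires_at
```

Since the expiry is baked into the signed payload, a leaked preview link stops working on its own, even if nobody remembers to tear the preview down.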

Shape: The 'self-hosted alternative to PaaS' gateway

  • Where it works: Between your full-stack app (e.g. Next.js, MERN, etc) you're running on your own infrastructure and the aforementioned public internet.
  • What it does: This gateway gives you the fundamentals of a traditional API gateway—TLS, custom domains, routing-by-anything—but also layers in observability and auth when you don't have a stack that brings those to the table by default.

The self-hosted gateway matters because you get the feel of a managed platform without surrendering control—perfect if you're ready to migrate off Heroku or Vercel but still want that polish. Doubly so when it comes to really gnarly problems like geo-aware load balancing, DDoS protection, or wiring up a WAF.

Shape: The microservices gateway

  • Where it works: Between any number of Kubernetes clusters and the public internet in a complex spiderweb of interconnections.
  • What it does: This gateway handles routing and auth for not only traffic being ingressed into your clusters from the public internet (north-south, using API keys or JWT validation), but also cross-cluster communication (east-west, using mTLS) between services deployed in different environments. It also simplifies how you test and deploy services (canary or blue-green, anyone?) independently.

The microservices gateway matters because you don't actually need to go "full service mesh" to coordinate your microservices across clusters. Spin up and ship new services without rewriting all your routing, and also get a central place to debug once requests start boomeranging across clusters.
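For the canary deploys mentioned above, a gateway needs some rule for deciding which requests go to the new version. Here's a minimal sketch of one possible rule, hashing a stable key such as a user ID into a percentage bucket. The function name and the "stable"/"canary" labels are assumptions for illustration.

```python
import hashlib


def choose_version(request_key: str, canary_percent: int) -> str:
    """Send roughly `canary_percent`% of keys to the canary, deterministically."""
    digest = hashlib.sha256(request_key.encode()).digest()
    # Map the first four bytes of the hash onto a bucket from 0 to 99.
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "canary" if bucket < canary_percent else "stable"
```

Hashing a stable key rather than rolling a random number per request means the same user keeps landing on the same version, which makes bug reports from the canary much easier to reproduce. Raising `canary_percent` only moves more keys to the canary; nobody who was already on it flips back to stable.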

Shape: The webhook gateway

  • Where it works: Between third-party webhook providers and your production services.
  • What it does: This gateway is the more robust alternative to webhook testing, in that instead of routing webhooks from Stripe, Twilio, Slack, and beyond into localhost, you accept them on a single hostname like webhooks.example.com. The gateway validates the webhook is coming from a legitimate source and hasn't been tampered with, maybe applies some kind of transformation, and routes the data to the appropriate service.

The webhook gateway matters because it standardizes a class of requests that otherwise get re-implemented poorly in every new service. Instead of reinventing webhook validation for the thirteenth time, you centralize it and feel confident that your integrations are still secure as they pass through this "side door."
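Here's a minimal sketch of what "validates the webhook is coming from a legitimate source and hasn't been tampered with" can look like: an HMAC-SHA256 signature check, followed by routing to an upstream. Real providers each define their own header names and signing formats (often including a timestamp), so treat the paths, secrets, upstream names, and status codes here as hypothetical.

```python
import hashlib
import hmac
from typing import Optional, Tuple

# Hypothetical providers, shared secrets, and upstream names.
PROVIDERS = {
    "/stripe": {"secret": b"stripe-demo-secret", "upstream": "billing-service"},
    "/slack": {"secret": b"slack-demo-secret", "upstream": "chatops-service"},
}


def sign(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def handle_webhook(path: str, body: bytes, signature: str) -> Tuple[int, Optional[str]]:
    """Return (status code, upstream to forward to) for an incoming webhook."""
    provider = PROVIDERS.get(path)
    if provider is None:
        return 404, None
    expected = sign(provider["secret"], body)
    # Compare in constant time, and always against the raw, unparsed body.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return 401, None
    return 202, provider["upstream"]
```

The important detail is that verification runs on the exact bytes received. Parsing and re-serializing the JSON before checking the signature is a classic way to break validation.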

Shape: The database gateway (really!)

  • Where it works: Between your customers or external services and the database you need to (very securely) put on the internet.
  • What it does: This gateway strictly enforces auth before any request hits your database (OAuth, API keys, mTLS), throttles well-intentioned-but-poorly-automated mistakes, and logs usage per client. Bonus points for transforming queries to prevent runaway costs or data leaks.

The database gateway matters because it makes a task, which you might normally throw an SSH tunnel at and forget about, far more robust. It's one that's recently surprised even us, and we've seen gateways in all kinds of odd and interesting forms over the years. Support customer-facing queries, or even replicate your database across clouds on a private connection, without duct-taping credentials into middleware or learning the hard way that "just IP restrict it" isn't a foolproof strategy.

Shape: The AI gateway

  • Where it works: Between whatever you're building and the LLMs you want, whether those are local, self-hosted on GPUs in the cloud, or accessible via the APIs from OpenAI, Anthropic, and beyond.
  • What it does: This gateway gives you a single public URL to send all AI-bound requests. It authenticates traffic, enforces rate limits, caches responses, redacts PII, then routes requests to the "best" model for the job (fastest, cheapest, most reliable). It integrates with provider-specific APIs (OpenAI, Groq, DeepSeek, Ollama) while giving you one consistent policy layer.

The AI gateway matters because it turns a chaotic multi-model stack into something you can lean on. You get safe and controlled access while keeping a lid on costs and keeping an eye on usage across the board. Models and AI agents are already handling a massive portion of today's traffic, and we'll all soon be clamoring to find just the right way to route, secure, and manage it all.
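To sketch the "routes requests to the best model for the job" idea, here's a toy gateway that tries healthy backends from cheapest to most expensive, falls back when one fails, and caches responses. The backend names, prices, and `call` functions are stand-ins I've assumed for illustration; no real provider SDK is involved.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ModelBackend:
    name: str
    cost_per_1k_tokens: float       # hypothetical pricing, used only for ordering
    healthy: bool
    call: Callable[[str], str]      # stand-in for a provider-specific client


class AIGateway:
    def __init__(self, backends: List[ModelBackend]) -> None:
        self.backends = backends
        self.cache: Dict[str, str] = {}

    def complete(self, prompt: str) -> str:
        # Serve repeated prompts from the cache to avoid paying twice.
        if prompt in self.cache:
            return self.cache[prompt]
        candidates = sorted(
            (backend for backend in self.backends if backend.healthy),
            key=lambda backend: backend.cost_per_1k_tokens,
        )
        last_error = None
        for backend in candidates:
            try:
                answer = backend.call(prompt)
            except Exception as exc:  # fall through to the next-cheapest model
                last_error = exc
                continue
            self.cache[prompt] = answer
            return answer
        raise RuntimeError("no model backend available") from last_error
```

"Cheapest first" is just one policy. Swapping the sort key for observed latency or error rate gives you the "fastest" or "most reliable" variants without touching the code that calls the gateway.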

Got this far and thought... "I wish ngrok had an AI gateway?"

ngrok.ai

A cheatsheet for your future gateway shapes

To make this easier to reference (and easier for you to decide which gateway shape fits your stack), here’s a quick summary of each type and the features that define it.

Gateway type | Purpose | Key features / techniques
Vibe-coding gateway | Safely expose local development stacks and prototypes | Routes traffic into localhost, supports OAuth/API key auth, test real headers/timeouts early (so you catch bugs before prod)
Localhost (for teams) gateway | Manage many developer environments at once | Centralized auth (OAuth/SSO), wildcard domains, routing by path/host/header (no more uncopyable ngrok URLs in Slack)
Ephemeral workload gateway | Provide secure deploy previews and CI test environments | Short-lived tokens tied to CI builds, OAuth integration, request logging, safe third-party API access (preview without creating “zombie staging” servers)
Self-hosted PaaS gateway | Wrap apps on your own infra with platform-like polish | TLS termination, custom domains, DDoS protection, observability, SSO integration (replace Vercel polish without waking up to expired certs)
Microservices gateway | Coordinate traffic across services in clusters | Routing by path/version, JWT or mTLS service-to-service auth, independent deploys (so you can ship without rewiring routes [or waiting for ops to do it])
Webhook gateway | Centralize third-party callbacks | Signature validation (HMAC/JWT/vendor-specific), transformation, routing (standardize Stripe/Slack-style webhooks instead of rewriting validation)
Database gateway | Securely expose or replicate databases | Strict auth (OAuth/API key/mTLS), query filtering, rate limiting, usage logging (so devs can give customers query access without duct-taped creds)
AI gateway | Manage access to LLMs across vendors or self-hosted models | Authentication, org-wide rate limits, cost controls, routing by vendor/latency (stop preview apps from racking up GPT-4 bills overnight)

However you slice it, gateways aren't just a box you add at the very sharpest edge of your network anymore. The best way to understand that is to put one in front of something small today and see how it changes the way you build.

Try a gateway before you 'need' one

Gateways aren't a box you add at the very last edge of your network anymore, but a capability you should want to start adding wherever your services live and at whatever stage they're in. The faster you adopt that mindset, especially in an era of LLM-induced chaos, the less friction you’ll hit as your stack evolves.

I recommend trying one in small places:

  • Route traffic between a local API and remote LLM
  • Protect your homelab using OAuth and a list of trusted email accounts
  • Add ingress to your deploy previews

Each of these gives you a chance to see how gateways work as this flexible capability you can compose anywhere you need it. ngrok works in all these places and more—to get a sense of how you build them up with endpoints and our language for traffic management (Traffic Policy), check out our gallery of gateway examples.

If you skimmed, here's the link to the cheatsheet to bookmark it for later. And, if you have questions about any of this, there's my email or ngrok's Discord.

Joel Hans
Joel is ngrok's DevRel lead. Away from blog posts, videos, and demo apps, you'll find him mountain biking, writing fiction, or digging holes in his yard.