Jul 1, 2026

The new ngrok.ai

ngrok.ai has a new home.

When we first introduced the AI Gateway back in December, we saw users excited about centralized management and routing across hosted and local models. But we also saw something else. For many users, getting there required learning parts of the ngrok platform that weren’t directly related to making their first request. While concepts like endpoints and traffic policy were powerful, they added a layer of complexity that slowed people down.

We took that feedback seriously. We rebuilt the experience from the ground up for people building AI-powered applications, while keeping the centralized management and routing that made the gateway useful.

Starting today, app.ngrok.ai gives you a dedicated dashboard, API, and gateway URL for AI applications. Create an access key, point your SDK at https://gateway.ngrok.ai, and send requests to supported providers, your own provider accounts, or models running in your own environment.

With ngrok.ai, your app can call hosted models from providers like OpenAI, as well as models running on your laptop, a local GPU box, or a cloud machine, all through the same gateway URL. All requests go to one endpoint, and ngrok.ai handles access control, routing, and observability for you.

Make your first request

Sign in to app.ngrok.ai with your existing ngrok credentials or a new account, add credits, grab your access key, and point your SDK at the new gateway URL:

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://gateway.ngrok.ai/v1",
  apiKey: "ng-xxxxx-g1-xxxxx", // Your AI Gateway access key
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});

You don’t need to sign up for OpenAI or bring your own OpenAI key to make this request. Your ngrok.ai credits will cover any provider costs.

The new experience also includes a dedicated API at https://api.ngrok.ai for configuring providers, models, and access keys. You can grab an API key on your settings page.

Call a model running on your own machine

Hosted models are only one part of the story. Many teams also want to test local models, run private models on their own infrastructure, or use GPU capacity they already control.

ngrok.ai makes it easier to use different models without extra setup. You can call both hosted models and your own through one gateway, and switch between them just by changing the model name.

First, you’ll create a custom provider that exposes your model. You’ll need an API key, so grab one from the settings page, then call the ngrok.ai API:

curl -X POST https://api.ngrok.ai/providers \
  -H "Authorization: Bearer <AI_GATEWAY_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "providerId": "my-workstation",
    "name": "My Workstation",
    "baseUrl": "https://my-workstation.internal",
    "supportedApiSurfaces": [{ "format": "openai", "surface": "chat-completions" }],
    "models": [{ "modelId": "llama-3.3" }]
  }'
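If you'd rather make that call from code, here is a minimal TypeScript sketch of the same request. It reuses only the URL, headers, and body fields from the curl command above. The `CustomProvider` type and the `buildCreateProviderRequest` helper are our own illustrative names, not part of any ngrok SDK.

```typescript
// Shape of the request body, copied from the curl example.
interface CustomProvider {
  providerId: string;
  name: string;
  baseUrl: string;
  supportedApiSurfaces: { format: string; surface: string }[];
  models: { modelId: string }[];
}

// Hypothetical helper: builds the URL and fetch options for
// POST https://api.ngrok.ai/providers without sending anything.
function buildCreateProviderRequest(apiKey: string, provider: CustomProvider) {
  return {
    url: "https://api.ngrok.ai/providers",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(provider),
    },
  };
}

const createProvider = buildCreateProviderRequest("<AI_GATEWAY_API_KEY>", {
  providerId: "my-workstation",
  name: "My Workstation",
  baseUrl: "https://my-workstation.internal",
  supportedApiSurfaces: [{ format: "openai", surface: "chat-completions" }],
  models: [{ modelId: "llama-3.3" }],
});

// To send it: await fetch(createProvider.url, createProvider.init)
```

Building the request separately from sending it makes it easy to log or inspect what you are about to post before anything reaches the API.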

Then run the ngrok agent locally using that same URL:

ngrok http 11434 --url https://my-workstation.internal

Now call your local model through the same OpenAI SDK:

// Same client, only the model changes.
const response = await client.chat.completions.create({
  model: "my-workstation:llama-3.3",
  messages: [{ role: "user", content: "Hello!" }],
});

Here, ngrok.ai uses ngrok internal endpoints, so your model isn’t sitting out on the public internet. It’s only reachable through your gateway, which keeps things a lot more private and controlled.

When you make a request, the model string uses a simple providerId:modelId format. It tells ngrok.ai which provider to use and which model to call. So in the example above, my-workstation:llama-3.3 sends the request to your my-workstation provider and its llama-3.3 model.
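To make the format concrete, here is a small sketch that builds and splits a providerId:modelId string. Both helpers are hypothetical and only illustrate the naming scheme. Two choices are assumptions on our part rather than documented gateway behavior: splitting on the first colon, and treating a string with no colon (like "gpt-4o" earlier) as a model with no explicit provider.

```typescript
interface ModelRef {
  providerId: string | null; // null when the string names no provider
  modelId: string;
}

// Hypothetical helper: joins a provider and model into one model string.
function formatModelString(providerId: string, modelId: string): string {
  return `${providerId}:${modelId}`;
}

// Hypothetical helper: splits on the FIRST colon only, so anything after
// it is kept as the model ID (an assumption of this sketch).
function parseModelString(model: string): ModelRef {
  const i = model.indexOf(":");
  if (i === -1) {
    return { providerId: null, modelId: model };
  }
  return { providerId: model.slice(0, i), modelId: model.slice(i + 1) };
}

const localModel = parseModelString("my-workstation:llama-3.3");
// localModel.providerId is "my-workstation", localModel.modelId is "llama-3.3"
```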

The nice part is that this setup grows with you. You might start by tinkering with a model on your own machine, and later move that same workload to a cloud GPU when you need more power. Your app doesn’t have to change: just point the gateway at the new upstream and keep going.

Use local models with hosted model fallback

You can also configure an access key to reach more than one model.

For example, route requests to your own model by default, then fall back to OpenAI or Anthropic when that model is unavailable. Your application still sends one request to ngrok.ai, and the gateway can move on to the fallback model so the request still has a path to an answer.

This is useful when you want the cost, latency, or privacy benefits of running your own model, but still need a hosted provider as a backup path.
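With ngrok.ai, fallback happens inside the gateway, so your application does not need code like this. Still, the idea is easier to see written out. The sketch below is a conceptual illustration only, not the gateway's implementation: it tries models in order and returns the first reply. The names `Send`, `firstAvailable`, and `fakeSend` are hypothetical, and the transport is faked so the example stays self-contained.

```typescript
// Stand-in for whatever actually calls a model.
type Send = (model: string) => Promise<string>;

// Try each model in order and return the first successful reply.
async function firstAvailable(
  models: string[],
  send: Send
): Promise<{ model: string; reply: string }> {
  let lastError: unknown = new Error("no models configured");
  for (const model of models) {
    try {
      return { model, reply: await send(model) };
    } catch (err) {
      lastError = err; // this model is unavailable, try the next one
    }
  }
  throw lastError; // every model failed
}

// Fake transport for illustration: the local model is "down".
const fakeSend: Send = async (model) => {
  if (model === "my-workstation:llama-3.3") {
    throw new Error("upstream unavailable");
  }
  return `reply from ${model}`;
};

// firstAvailable(["my-workstation:llama-3.3", "gpt-4o"], fakeSend)
// resolves with the gpt-4o reply, because the first model throws.
```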

Bring your own provider keys

While ngrok.ai credits are a simple way to get started without signing up for each provider separately, you don’t have to use them.

If you already have provider keys, you’re welcome to keep using those. Just add your keys to a provider, keep the same gateway URL, and continue making requests from your application.

This is useful if you already have negotiated provider rates, existing billing relationships, or specific accounts where you want model usage to land.

Create separate keys for apps and teammates

AI access usually starts with one shared provider key. That works at first, but it gets messy as more apps, environments, and teammates need access. Keys get copied around, and it’s hard to know who’s using what.

With ngrok.ai, you don’t have to share provider keys. You create access keys instead, and decide exactly which models each one can use. Your provider credentials stay in one place.

That means you can give each app or teammate just what they need. If something goes wrong, you can revoke a key without affecting anything else.

You can also name keys for specific environments or people, so it’s easier to see what’s happening when something breaks.
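As a mental model, you can think of each access key as a name plus a list of models it is allowed to call. The sketch below shows that idea. The `AccessKeyPolicy` shape and `canUse` function are hypothetical and do not reflect how ngrok.ai stores or checks keys.

```typescript
// Hypothetical shape: the real access key object may look different.
interface AccessKeyPolicy {
  name: string; // e.g. an environment or a teammate
  allowedModels: string[];
  revoked: boolean;
}

// A key can call a model only if it is active and the model is on its list.
function canUse(key: AccessKeyPolicy, model: string): boolean {
  return !key.revoked && key.allowedModels.indexOf(model) !== -1;
}

const stagingKey: AccessKeyPolicy = {
  name: "staging-app",
  allowedModels: ["my-workstation:llama-3.3"],
  revoked: false,
};

const teammateKey: AccessKeyPolicy = {
  name: "teammate-laptop",
  allowedModels: ["gpt-4o", "my-workstation:llama-3.3"],
  revoked: true, // revoking this key leaves stagingKey untouched
};
```

Here `canUse(stagingKey, "gpt-4o")` is false because that model is not on the staging key's list, and every check against `teammateKey` is false because the key has been revoked.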

And since every request goes through ngrok.ai, you get a clear view of usage and spend in one place, broken down by key, provider, and model.

What this means if you used the early access gateway

If you built on the early access version of the ngrok AI Gateway, here’s what changes for you.

The biggest shift is that you no longer need to manage multiple endpoint URLs or learn the broader ngrok networking platform just to make a model request. The experience now lives in its own dashboard at app.ngrok.ai, with a dedicated gateway URL and API.

There are also two changes to be aware of:

Pricing. Every request now includes a small processing fee of $0.05 per million tokens, deducted from your credit balance. This rate is the same across all providers and models, including when you bring your own keys.

The old dashboard experience will be winding down. You’ll no longer be able to buy credits or create new gateways in the old dashboard. To continue, head to app.ngrok.ai and use the new experience. If you still have credits on the old gateway, reach out and we’ll help move your balance over.
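To get a feel for the processing fee, here is a quick back-of-the-envelope calculation. It assumes the fee applies to the total token count of your requests. The $0.05 per million tokens rate comes from the pricing note above, and the function name is our own.

```typescript
// Rate quoted in the pricing note: $0.05 per million tokens.
const FEE_USD_PER_MILLION_TOKENS = 0.05;

// Assumption: the fee scales linearly with the total number of tokens.
function processingFeeUsd(totalTokens: number): number {
  return (totalTokens / 1e6) * FEE_USD_PER_MILLION_TOKENS;
}

const feeForTenMillionTokens = processingFeeUsd(10e6);
console.log(feeForTenMillionTokens.toFixed(2)); // prints "0.50"
```

At that rate, 10 million tokens come to about $0.50 in processing fees.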

Try the new ngrok.ai today

To get started:

Sign in at app.ngrok.ai
Add credits on the Credits page
Create an access key on the Access Keys page, then copy it
Point your SDK at https://gateway.ngrok.ai/v1
Send your first request
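For anything beyond a quick test, you will probably want to keep the access key out of your source code. Here is a minimal sketch of the access key and gateway URL steps that reads the key from an environment variable. The variable name `NGROK_AI_ACCESS_KEY` and the `gatewayClientOptions` helper are assumptions for illustration, not names ngrok.ai defines.

```typescript
// Hypothetical helper: builds OpenAI SDK options for the ngrok.ai gateway.
// NGROK_AI_ACCESS_KEY is an assumed variable name; use whatever you prefer.
function gatewayClientOptions(env: Record<string, string | undefined>) {
  const apiKey = env["NGROK_AI_ACCESS_KEY"];
  if (!apiKey) {
    throw new Error("Set NGROK_AI_ACCESS_KEY to your ngrok.ai access key");
  }
  return { baseURL: "https://gateway.ngrok.ai/v1", apiKey };
}

// With the OpenAI SDK: new OpenAI(gatewayClientOptions(process.env))
```

Failing fast when the key is missing gives a clearer error than an authentication failure on your first request.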

If you want to dig deeper, take a look at the docs for more details on ngrok.ai and what it can do. The quickstart walks through everything step by step.

If you build something with ngrok.ai, hit something rough, or have ideas for how it could be better, reach out. We want to hear about it.
