The v1 API enforces a per-project sliding window. Defaults are generous for
typical builds and scale up automatically on paid plans; if you have a known
heavy workload, contact us before launch.
Defaults
| Plan | Requests / minute | Burst | Notes |
|---|---|---|---|
| Free | 30 | 60 | Tightest. Plenty for development. |
| Hobby | 60 | 120 | The “default project” budget. |
| Build | 240 | 480 | Sized for production sites. |
| Scale | 1,000 | 2,000 | Sized for high-traffic backends. |
| Enterprise | custom | custom | Negotiated per workload. |
The window is a sliding 60 seconds, not a calendar minute. The burst
column is the maximum number of requests we’ll let you make at once before
the window kicks in (it refills steadily as time passes). For most callers
the limits are invisible; for heavy callers the headers below tell you exactly
where you stand.
Read-only GETs and writes count equally toward the limit. An async POST /jobs
request (blog generation, image generation, etc.) counts once at submit; the
resulting job runs server-side and doesn’t add to your request budget.
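To stay under the ceiling proactively, a client can mirror the budget locally. The sketch below is a client-side approximation only: it assumes the burst column behaves like a bucket that refills steadily at the per-minute rate, which may not match the server's exact algorithm. `TokenBucket` and `tryAcquire` are hypothetical names, not part of any SDK.

```typescript
// Client-side approximation of the budget described above.
// Assumption: capacity = "Burst" column, refill rate = "Requests / minute" column.
class TokenBucket {
  private readonly perMinute: number;
  private readonly burst: number;
  private tokens: number;
  private last: number;

  constructor(perMinute: number, burst: number, now: number = Date.now()) {
    this.perMinute = perMinute;
    this.burst = burst;
    this.tokens = burst; // start with a full burst allowance
    this.last = now;
  }

  // Returns true and spends one token if a request may be sent at time `now` (ms).
  tryAcquire(now: number = Date.now()): boolean {
    const elapsedMs = Math.max(0, now - this.last);
    const refill = (elapsedMs * this.perMinute) / 60_000;
    this.tokens = Math.min(this.burst, this.tokens + refill);
    this.last = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

For a Hobby project this would be `new TokenBucket(60, 120)`; when `tryAcquire()` returns false, delay the call rather than sending it and collecting a 429.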
Every response — including 429 errors — carries the limit headers.
| Header | Meaning |
|---|---|
| X-RateLimit-Limit | Window ceiling for this project. |
| X-RateLimit-Remaining | Requests left in the current sliding window. |
| X-RateLimit-Reset | Unix seconds when the window will fully refill. |
| Retry-After | Seconds to wait. Only present on 429 responses. |
Example:
HTTP/2 200 OK
content-type: application/json
x-ratelimit-limit: 60
x-ratelimit-remaining: 47
x-ratelimit-reset: 1745007660
x-request-id: req_2Nh4PqRsTuVw
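A minimal sketch of reading these headers in a client. `RateLimitInfo` and `readRateLimit` are illustrative names; the only things taken from the documentation are the three header names and the fact that the reset value is in Unix seconds.

```typescript
interface RateLimitInfo {
  limit: number;     // X-RateLimit-Limit
  remaining: number; // X-RateLimit-Remaining
  resetAt: Date;     // X-RateLimit-Reset, converted from Unix seconds
}

// Returns null if any of the three headers is missing or not numeric.
function readRateLimit(headers: Headers): RateLimitInfo | null {
  const num = (name: string): number | null => {
    const raw = headers.get(name); // header lookup is case-insensitive
    if (raw === null || raw.trim() === "") return null;
    const n = Number(raw);
    return Number.isFinite(n) ? n : null;
  };
  const limit = num("x-ratelimit-limit");
  const remaining = num("x-ratelimit-remaining");
  const reset = num("x-ratelimit-reset");
  if (limit === null || remaining === null || reset === null) return null;
  return { limit, remaining, resetAt: new Date(reset * 1000) };
}
```

Calling `readRateLimit(res.headers)` after each request gives a caller enough information to slow down before `remaining` reaches zero.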
Hitting the ceiling
When the window is exhausted, the API returns 429 Too Many Requests with
code: "rate_limited" and a precise Retry-After:
{
  "type": "https://api.neuraldraft.io/errors/rate_limited",
  "title": "Too many requests",
  "status": 429,
  "code": "rate_limited",
  "detail": "Rate limit exceeded for project prj_2NgcaXxFqLPo. Retry after 14 seconds.",
  "instance": "req_2Nh4PqRsTuVw"
}
retry-after: 14
x-ratelimit-limit: 60
x-ratelimit-remaining: 0
x-ratelimit-reset: 1745007674
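The error body above can be validated before a client acts on it. This is a sketch under the assumption that every error body carries the same six fields shown in the example; `ApiProblem` and `parseProblem` are hypothetical names.

```typescript
interface ApiProblem {
  type: string;
  title: string;
  status: number;
  code: string;
  detail: string;
  instance: string;
}

// Returns the body as an ApiProblem, or null if any field is missing or mistyped.
function parseProblem(body: unknown): ApiProblem | null {
  if (typeof body !== "object" || body === null) return null;
  const p = body as Record<string, unknown>;
  if (typeof p["status"] !== "number") return null;
  const stringFields = ["type", "title", "code", "detail", "instance"];
  if (!stringFields.every((k) => typeof p[k] === "string")) return null;
  return p as unknown as ApiProblem;
}

// True only for the rate-limit error documented above.
function isRateLimited(problem: ApiProblem | null): boolean {
  return problem !== null && problem.status === 429 && problem.code === "rate_limited";
}
```

Checking `code` rather than matching on `detail` keeps the client independent of the human-readable message.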
Backoff strategy
The recipe is the same one we recommend for errors:
exponential backoff with full jitter, capped, and always honour
Retry-After.
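async function callWithLimitAware<T>(
  fn: () => Promise<Response>,
  parse: (r: Response) => Promise<T>
): Promise<T> {
  const maxAttempts = 5;
  const baseMs = 500; // illustrative starting backoff; tune for your workload
  const capMs = 30_000; // illustrative cap on the exponential term
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fn();
    if (res.status !== 429) return parse(res);
    if (attempt === maxAttempts - 1) break; // don't sleep before giving up
    // Retry-After is the minimum wait; fall back to 1 s if it is missing or malformed.
    const header = Number(res.headers.get("retry-after"));
    const retryAfterMs = (Number.isFinite(header) && header > 0 ? header : 1) * 1000;
    // Full jitter: a random extra delay in [0, min(cap, base * 2^attempt)).
    const jitterMs = Math.random() * Math.min(capMs, baseMs * 2 ** attempt);
    await new Promise((r) => setTimeout(r, retryAfterMs + jitterMs));
  }
  throw new Error(`Rate-limited ${maxAttempts}x in a row; giving up.`);
}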
When to use bulk endpoints
If you’re hitting the limit doing many single-key reads, switch to the bulk
endpoints — same rate-limit cost (one request) for a much larger payload:
| Avoid | Prefer |
|---|---|
| GET /v1/content/hero.headline?lang=en<br>GET /v1/content/hero.subhead?lang=en<br>GET /v1/content/cta.label?lang=en | GET /v1/content/bulk?keys=hero.headline,hero.subhead,cta.label&lang=en |
| PUT /v1/content/hero.headline (×N) | PUT /v1/content with a { updates: [...] } body |
| GET /v1/blog-posts/1<br>GET /v1/blog-posts/2 | GET /v1/blog-posts?ids=1,2,3,4 |
A static-site builder that fetches every key on next build should issue
exactly one bulk content call per locale.
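As a sketch of the one-call-per-locale pattern, the helper below builds the bulk content URL from a list of keys. The base URL is an assumption inferred from the error `type` URL shown earlier, and `bulkContentUrl` is a hypothetical helper, not an SDK function.

```typescript
// Assumed API origin; replace with the origin your project actually uses.
const API_BASE = "https://api.neuraldraft.io";

// Builds GET /v1/content/bulk?keys=a,b,c&lang=xx for one locale.
function bulkContentUrl(keys: string[], lang: string, base: string = API_BASE): string {
  if (keys.length === 0) throw new Error("at least one key is required");
  // Encode each key individually so the separating commas stay literal.
  const joined = keys.map((k) => encodeURIComponent(k)).join(",");
  return `${base}/v1/content/bulk?keys=${joined}&lang=${encodeURIComponent(lang)}`;
}
```

A build script would call this once per locale, for example `bulkContentUrl(allKeys, "en")`, instead of issuing one GET per key.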
Asking for higher limits
Higher limits are part of the Build, Scale and Enterprise plans, and we’ll
also raise a per-project ceiling for one-off events (a launch, a Black
Friday) without changing your tier. Email
info@neuraldraft.io with:
- Your project id (visible in the dashboard or
GET /v1/projects/me).
- Expected peak RPS and total daily volume.
- The window of high traffic, if known.
We’ll bump the limit and confirm within one business day.
Abuse, fairness, and quiet limits
In addition to the per-project window, we enforce a few global guardrails to
keep the platform fair for everyone:
- Per-IP soft cap for unauthenticated public endpoints (/v1/public/*): the
limit is generous and adapts to traffic; a single abusive IP can’t degrade
your project. If you proxy public traffic behind your own backend, set the
X-Forwarded-For header; we’ll honour it for attribution.
- AI cost ceiling per project per minute: separate from the request ceiling,
this prevents a runaway script from spending your monthly credits in an hour.
Hitting it returns 429 with code: "rate_limited" and a longer Retry-After. If
you need to do legitimately heavy AI work, talk to us in advance.
- Webhook delivery rate is capped per endpoint at 100 deliveries/sec to
avoid stampeding your receiver.
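For the proxy case in the first bullet, a backend can attach the caller's address when it forwards a request to /v1/public/*. This sketch follows the common convention of appending to any existing X-Forwarded-For chain; how the API chooses an address from a multi-entry chain is not stated here, so treat that part as an assumption. `withForwardedFor` is a hypothetical helper.

```typescript
// Returns the header to add to the outgoing request.
// `incoming` is the X-Forwarded-For value your backend received, if any.
function withForwardedFor(clientIp: string, incoming?: string | null): Record<string, string> {
  const prior = incoming ? incoming.trim() : "";
  const chain = prior !== "" ? `${prior}, ${clientIp}` : clientIp;
  return { "X-Forwarded-For": chain };
}
```

The result can be spread into the `headers` option of `fetch` alongside your other request headers.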
What 429 is not
- Not an account-wide “you’re paying too much” signal — that’s
insufficient_credits.
- Not a permanent block — keys auto-recover the moment the window resets.
- Not a sign your key is compromised, but your audit log is the right place
to look if 429s happen unexpectedly. If a key you don’t control is being
used, revoke it immediately.
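The distinction in the list above can be encoded in a client so that only the temporary condition is retried. A minimal sketch: the two `code` values come from this page, while treating every other code as non-retryable is an assumption made for illustration, and `decideRetry` is a hypothetical name.

```typescript
type RetryDecision = "retry" | "stop";

function decideRetry(code: string): RetryDecision {
  switch (code) {
    case "rate_limited":
      return "retry"; // temporary: recovers once the window resets
    case "insufficient_credits":
      return "stop"; // assumption: waiting alone will not resolve this
    default:
      return "stop"; // assumption: unknown codes are not retried blindly
  }
}
```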