API reference

Rate limits

A per-key throttle protects the platform from runaway agents and noisy neighbors. Headers on every response tell you exactly how much budget you have left.

The numbers

50 requests per second per agent key, counted across every endpoint under /api/agent/v1/*. Fixed 1-second window — the counter resets every second, not on a sliding clock.
Per-key cap. The bucket is keyed by agent key id, not user. Two keys on the same user have independent budgets, which makes it safe to dedicate a key to a noisy backfill.
The per-key policy daily call cap is independent: it is enforced per key, on top of the per-second throttle. See Per-key policy constraints.
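To make the fixed-window, per-key behavior concrete, here is a minimal in-memory sketch. It is an illustration only, not the platform's Redis-backed implementation; the class name FixedWindowCounter, the method tryConsume, and the choice to align windows to whole-second boundaries are all assumptions made for the example.

```typescript
// Illustrative fixed-window counter keyed by agent key id.
// Assumption: mirrors the documented behavior (50 calls per key per
// 1-second window). Not the platform's actual implementation.
class FixedWindowCounter {
  private windows = new Map<string, { windowStart: number; count: number }>();
  private limit: number;
  private windowMs: number;

  constructor(limit = 50, windowMs = 1000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  /** Returns true if the call fits in the current window for this key. */
  tryConsume(keyId: string, nowMs: number): boolean {
    // Assumption: windows are aligned to fixed boundaries, not started
    // by the first request (that would be closer to a sliding clock).
    const windowStart = Math.floor(nowMs / this.windowMs) * this.windowMs;
    const entry = this.windows.get(keyId);
    if (!entry || entry.windowStart !== windowStart) {
      // New window for this key: the counter starts over.
      this.windows.set(keyId, { windowStart, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false; // budget exhausted
    entry.count += 1;
    return true;
  }
}
```

Because the map is keyed by key id, a backfill running on one key cannot exhaust the budget of another key owned by the same user.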

Headers on every response

Every response — success or error — carries three headers so you don’t have to model the budget on your side.

Successful call, ~half budget used

HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 50
X-RateLimit-Remaining: 27
X-RateLimit-Reset: 1

X-RateLimit-Limit — total budget per second.
X-RateLimit-Remaining — calls left in the current window.
X-RateLimit-Reset — seconds until the bucket resets. Always 1 today, but treat it as authoritative for future changes.
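A client can read these three headers into a typed value rather than tracking the budget itself. The sketch below assumes a header source with a get(name) method returning a string or null, which is the shape of the standard fetch Headers object; the names HeaderSource, RateLimitInfo, and readRateLimit are hypothetical.

```typescript
// Minimal shape shared by fetch's Headers and simple test doubles.
interface HeaderSource {
  get(name: string): string | null;
}

interface RateLimitInfo {
  limit: number;        // X-RateLimit-Limit
  remaining: number;    // X-RateLimit-Remaining
  resetSeconds: number; // X-RateLimit-Reset
}

// Returns null if any header is missing or not numeric, so callers can
// tell "no information" apart from "zero budget left".
function readRateLimit(headers: HeaderSource): RateLimitInfo | null {
  const parse = (name: string): number | null => {
    const raw = headers.get(name);
    if (raw === null || raw.trim() === "") return null;
    const n = Number(raw);
    return Number.isFinite(n) ? n : null;
  };
  const limit = parse("X-RateLimit-Limit");
  const remaining = parse("X-RateLimit-Remaining");
  const resetSeconds = parse("X-RateLimit-Reset");
  if (limit === null || remaining === null || resetSeconds === null) return null;
  return { limit, remaining, resetSeconds };
}
```

With fetch, this could be called as readRateLimit(response.headers) after every request, including error responses.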

whoami echoes the same numbers in its response body under rateLimit, so an agent can check its budget with a lightweight call instead of spending a substantive one.

When you hit the limit

429 Too Many Requests

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 50
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1
Retry-After: 1

{
  "error": "rate_limit_exceeded",
  "message": "Too many requests on this agent key. Retry after the window resets.",
  "limit": 50,
  "resetSeconds": 1
}

We surface Retry-After in seconds. Honor it; immediate retries will just 429 again and burn your budget for the next window.
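One way to honor the hint is to turn a 429 response into a wait time before retrying. The helper below is a sketch: retryDelayMs is a hypothetical name, it handles only the seconds form of Retry-After described here (not the HTTP-date form the header can also take in general), and the fallback to the resetSeconds body field and then to 1 second is an assumption based on the sample response above.

```typescript
// Computes how long to wait after a 429, in milliseconds.
// retryAfter:   raw Retry-After header value, or null if absent.
// resetSeconds: optional "resetSeconds" field from the 429 JSON body.
function retryDelayMs(retryAfter: string | null, resetSeconds?: number): number {
  if (retryAfter !== null && retryAfter.trim() !== "") {
    const seconds = Number(retryAfter);
    if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;
  }
  if (
    typeof resetSeconds === "number" &&
    Number.isFinite(resetSeconds) &&
    resetSeconds >= 0
  ) {
    return resetSeconds * 1000;
  }
  // Assumption: fall back to the documented 1-second window.
  return 1000;
}
```

A caller would sleep for the returned duration before reissuing the request, instead of retrying immediately.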

Backoff pattern

Exponential with jitter

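async function withBackoff<T>(fn: () => Promise<T>): Promise<T> {
  const maxAttempts = 5;
  let delay = 250; // base delay in ms, doubled after each retryable failure
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (e) {
      const err = e as { status?: number };
      // Retry only on 429 and 5xx; rethrow everything else immediately.
      const retryable = err.status === 429 || (err.status ?? 0) >= 500;
      if (!retryable) throw e;
      // Last attempt: skip the sleep, since no retry follows it.
      if (attempt === maxAttempts - 1) break;
      const jitter = Math.random() * delay; // random extra wait, up to 100% of delay
      await new Promise((r) => setTimeout(r, delay + jitter));
      delay = Math.min(delay * 2, 10_000); // cap the base delay at 10 s
    }
  }
  throw new Error('giving up after 5 attempts');
}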
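Note that withBackoff only retries when the wrapped function throws an error carrying a numeric status. The standard fetch API resolves normally on 429 and 5xx responses, so a small adapter is needed to turn those into thrown errors. The sketch below shows one possible adapter; HttpError and ensureOk are hypothetical names, not part of any SDK.

```typescript
// Error type exposing the numeric status that withBackoff inspects.
class HttpError extends Error {
  status: number;

  constructor(status: number) {
    super(`HTTP ${status}`);
    this.name = "HttpError";
    this.status = status;
  }
}

// Passes successful responses through and throws on non-2xx ones.
// Accepts anything with { ok, status }, which includes fetch's Response.
function ensureOk<R extends { ok: boolean; status: number }>(res: R): R {
  if (!res.ok) throw new HttpError(res.status);
  return res;
}
```

With this in place, a call could be wrapped as withBackoff(async () => ensureOk(await fetch(url, init))), where url and init stand for your own request details.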

Patterns that hurt

Polling /audit in a hot loop. One call every 5–10 seconds is plenty. For real-time, subscribe to webhooks instead.
Calling /mandate per trade. Read it once at session start (or pull the ac://mandate/current MCP resource); the response is current to the most recent IPS update.
Validating one trade at a time. If you have N candidates, batch them through POST /policy/validate-batch — up to 25 per call, one round-trip, one rate-limit hit.
Probing every possible amount. If you’re walking down sizes to find what the policy allows, jump in larger steps. The walk in our drawdown-response recipe does ~5 probes total, not 50.
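For the batching advice above, the client-side work is mostly splitting candidates into groups of at most 25 before sending each group to POST /policy/validate-batch. The sketch below covers only that splitting step; the function name chunk is hypothetical, and the request body format is deliberately left out because it is not described here.

```typescript
// Splits items into consecutive batches of at most `size` elements.
// The default of 25 matches the documented per-call maximum.
function chunk<T>(items: readonly T[], size = 25): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Sixty candidates become three batches (25, 25, and 10), so three rate-limit hits instead of sixty.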

If the Redis backend is unavailable

The throttle uses Redis to count. If Redis is down, the guard fails open — your requests still succeed, but the X-RateLimit-Remaining header will report 50 on every call. We prefer a brief loss of enforcement to dropping legitimate traffic.

Last updated 2026-06-15
