API reference

Rate limits

A per-key throttle protects the platform from runaway agents and noisy neighbors. Headers on every response tell you exactly how much budget you have left.

The numbers

  • 50 requests per second per agent key, counted across every endpoint under /api/agent/v1/*. Fixed 1-second window — the counter resets every second, not on a sliding clock.
  • Per-key cap. The bucket is keyed by agent key id, not user. Two keys on the same user have independent budgets, which makes it safe to dedicate a key to a noisy backfill.
  • Daily call cap. Any daily call cap set in a key's policy is a separate limit, enforced per key on top of the per-second throttle. See Per-key policy constraints.
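
To make the fixed-window behavior concrete, here is a minimal in-memory sketch. It is not the platform's implementation (the service counts in Redis); FixedWindowLimiter and hit are hypothetical names used only to illustrate how a per-key counter resets on each whole second instead of sliding.

```typescript
// Hypothetical model of a fixed 1-second window keyed by agent key id.
// Illustration only: the names and the in-memory Map are assumptions.
interface WindowState {
  windowStart: number; // epoch second in which the current window began
  count: number; // requests counted in that window
}

class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windows = new Map<string, WindowState>();

  constructor(limit: number = 50) {
    this.limit = limit;
  }

  // Records one request for keyId at time nowMs (epoch milliseconds).
  hit(keyId: string, nowMs: number): { allowed: boolean; remaining: number } {
    const second = Math.floor(nowMs / 1000);
    let state = this.windows.get(keyId);
    if (!state || state.windowStart !== second) {
      // A new second started: the counter resets outright, it does not slide.
      state = { windowStart: second, count: 0 };
      this.windows.set(keyId, state);
    }
    if (state.count >= this.limit) {
      return { allowed: false, remaining: 0 };
    }
    state.count += 1;
    return { allowed: true, remaining: this.limit - state.count };
  }
}
```

Because each key id gets its own entry in the map, a backfill that exhausts one key leaves a second key on the same user untouched.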

Headers on every response

Every response — success or error — carries three headers so you don’t have to model the budget on your side.

Successful call, ~half budget used
HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 50
X-RateLimit-Remaining: 27
X-RateLimit-Reset: 1
  • X-RateLimit-Limit — total budget per second.
  • X-RateLimit-Remaining — calls left in the current window.
  • X-RateLimit-Reset — seconds until the bucket resets. Always 1 today, but treat it as authoritative for future changes.

whoami echoes the same numbers in its response body under rateLimit, so an agent can check its budget with a cheap call instead of spending a heavier one.
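
A small sketch of reading the three headers on the client side. The header names come from the example above; HeaderReader, RateLimitInfo, and readRateLimit are hypothetical names introduced for illustration. HeaderReader is a minimal interface that the Headers object returned by fetch already satisfies.

```typescript
// Minimal header lookup; the fetch API's Headers object satisfies this shape.
interface HeaderReader {
  get(name: string): string | null;
}

// Hypothetical client-side view of the three rate-limit headers.
interface RateLimitInfo {
  limit: number; // X-RateLimit-Limit
  remaining: number; // X-RateLimit-Remaining
  resetSeconds: number; // X-RateLimit-Reset
}

// Returns null when any of the three headers is missing or not numeric.
function readRateLimit(headers: HeaderReader): RateLimitInfo | null {
  const parse = (name: string): number | null => {
    const raw = headers.get(name);
    if (raw === null || raw.trim() === "") return null;
    const value = Number(raw);
    return Number.isFinite(value) ? value : null;
  };
  const limit = parse("X-RateLimit-Limit");
  const remaining = parse("X-RateLimit-Remaining");
  const resetSeconds = parse("X-RateLimit-Reset");
  if (limit === null || remaining === null || resetSeconds === null) return null;
  return { limit, remaining, resetSeconds };
}
```

With fetch, this could be called as readRateLimit(response.headers). Lookups on a fetch Headers object are case-insensitive; a hand-rolled reader would need to match the header names exactly.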

When you hit the limit

429 Too Many Requests
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 50
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1
Retry-After: 1

{
  "error": "rate_limit_exceeded",
  "message": "Too many requests on this agent key. Retry after the window resets.",
  "limit": 50,
  "resetSeconds": 1
}

We surface Retry-After in seconds. Honor it; immediate retries will just 429 again and burn your budget for the next window.
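
One way to honor Retry-After is to compute the wait from the response headers before retrying. The sketch below assumes the seconds form shown above and falls back to X-RateLimit-Reset, then to a default; retryDelayMs, HeaderReader, and the 1000 ms fallback are hypothetical choices, not part of the API.

```typescript
// Minimal header lookup; the fetch API's Headers object satisfies this shape.
interface HeaderReader {
  get(name: string): string | null;
}

// Returns how long to wait, in milliseconds, before retrying a 429.
// Only the seconds form of Retry-After is handled, since that is what this
// API documents; the HTTP-date form is deliberately not parsed here.
function retryDelayMs(headers: HeaderReader, fallbackMs: number = 1000): number {
  const candidates = [headers.get("Retry-After"), headers.get("X-RateLimit-Reset")];
  for (const raw of candidates) {
    if (raw === null || raw.trim() === "") continue;
    const seconds = Number(raw);
    if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;
  }
  // Neither header was usable, so use the caller's default.
  return fallbackMs;
}
```

A caller would sleep for the returned number of milliseconds and then retry, instead of retrying immediately.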

Backoff pattern

Exponential with jitter
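// Retries fn on 429 and 5xx with exponential backoff plus jitter.
// Any other error is rethrown immediately.
async function withBackoff<T>(fn: () => Promise<T>): Promise<T> {
  const maxAttempts = 5;
  let delay = 250; // base delay in ms, doubled after each retry, capped at 10 s
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (e) {
      const err = e as { status?: number };
      const retryable = err.status === 429 || (err.status ?? 0) >= 500;
      // On the last attempt, rethrow the original error instead of sleeping first.
      if (!retryable || attempt >= maxAttempts) throw e;
      // Jitter in [0, delay) spreads out clients that were throttled together.
      const jitter = Math.random() * delay;
      await new Promise((r) => setTimeout(r, delay + jitter));
      delay = Math.min(delay * 2, 10_000);
    }
  }
}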

Patterns that hurt

  • Polling /audit in a hot loop. One call every 5–10 seconds is plenty. For real-time, subscribe to webhooks instead.
  • Calling /mandate per trade. Read it once at session start (or pull the ac://mandate/current MCP resource); the response is current to the most recent IPS update.
  • Validating one trade at a time. If you have N candidates, batch them through POST /policy/validate-batch — up to 25 per call, one round-trip, one rate-limit hit.
  • Probing every possible amount. If you’re walking down sizes to find what the policy allows, jump in larger steps. The walk in our drawdown-response recipe does ~5 probes total, not 50.
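
As an illustration of the batching advice, the sketch below splits a candidate list into groups of at most 25, the documented maximum for POST /policy/validate-batch. The chunk helper is a hypothetical name, and the request itself is left out because the payload shape is not described here.

```typescript
// Splits items into consecutive groups of at most `size` elements.
// The default of 25 mirrors the documented validate-batch maximum.
function chunk<T>(items: readonly T[], size: number = 25): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}
```

With 60 candidates this yields groups of 25, 25, and 10, so three rate-limit hits instead of 60.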

If the Redis backend is unavailable

The throttle uses Redis to count. If Redis is down, the guard fails open — your requests still succeed, but the X-RateLimit-Remaining header will report 50 on every call. We prefer a brief loss of enforcement to dropping legitimate traffic.
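
The fail-open behavior can be pictured with a short sketch. This is not the platform's guard: Decision, guard, and the countInWindow callback are hypothetical, and the callback is synchronous here for brevity, whereas a real Redis call would be asynchronous.

```typescript
// Hypothetical result of a throttle check.
interface Decision {
  allowed: boolean;
  remaining: number;
}

// countInWindow is assumed to increment and return this key's count for the
// current window, and to throw when the backing store is unreachable.
function guard(
  keyId: string,
  countInWindow: (keyId: string) => number,
  limit: number = 50,
): Decision {
  try {
    const used = countInWindow(keyId);
    if (used > limit) return { allowed: false, remaining: 0 };
    return { allowed: true, remaining: limit - used };
  } catch {
    // Fail open: let the request through and report the full budget,
    // which is why Remaining reads 50 on every call during an outage.
    return { allowed: true, remaining: limit };
  }
}
```

A client that sees X-RateLimit-Remaining stuck at the full limit across many calls can therefore not rely on it for pacing during that period.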

Last updated 2026-06-15