Rate limits

Per-scope quotas, the X-RateLimit-* response header trio, Retry-After semantics on 429, and recommended back-off strategies.

Rate limits keep the API stable for everyone. Limits apply per token and per endpoint; cost-bearing endpoints are tighter than pure reads. Every authenticated response carries a trio of headers so clients can back off before ever receiving a 429.

Headers

Every authenticated response — success or failure — carries:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 47
X-RateLimit-Reset: 1714780000
  • X-RateLimit-Limit — max requests in the current window for this (token, endpoint) pair.
  • X-RateLimit-Remaining — requests left in the current window.
  • X-RateLimit-Reset — Unix-seconds timestamp when the window resets and the counter zeroes.
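
As an illustration of reading the trio, here is a minimal Python sketch that turns a response's headers into a small value object. The names RateLimit and parse_rate_limit are hypothetical and not part of any SDK; the sketch assumes the headers arrive as a plain string-to-string mapping.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimit:
    """Hypothetical container for the three rate-limit headers."""
    limit: int      # X-RateLimit-Limit
    remaining: int  # X-RateLimit-Remaining
    reset_at: int   # X-RateLimit-Reset, Unix seconds


def parse_rate_limit(headers: Mapping[str, str]) -> Optional[RateLimit]:
    """Return the header trio, or None if a header is missing or not an integer."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return RateLimit(
            limit=int(lowered["x-ratelimit-limit"]),
            remaining=int(lowered["x-ratelimit-remaining"]),
            reset_at=int(lowered["x-ratelimit-reset"]),
        )
    except (KeyError, ValueError):
        return None
```

Returning None rather than raising lets a caller treat an unauthenticated or unexpected response as "no rate-limit information" and fall back to its default behavior.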

429 responses

When you hit a limit, you get HTTP 429 with the standard error envelope plus a Retry-After header (in seconds):

HTTP/1.1 429 Too Many Requests
Retry-After: 8
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1714780008
Content-Type: application/json

{
  "error": "Rate limit exceeded. Retry after 8 seconds.",
  "code": "RATE_LIMITED"
}
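
A client can recognize this response and extract the wait time with very little code. The sketch below is one possible approach, not a prescribed one: the function name retry_delay and the one-second fallback used when Retry-After is absent or malformed are assumptions made for illustration.

```python
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str], fallback: int = 1) -> Optional[int]:
    """Seconds to wait before retrying, or None if the response is not a 429.

    `fallback` is an assumed default for a 429 that lacks a usable Retry-After.
    """
    if status != 429:
        return None
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        delay = int(lowered["retry-after"])
    except (KeyError, ValueError):
        return fallback
    # Never return a negative wait, even if the header is malformed.
    return max(delay, 0)
```

For the example response above, retry_delay(429, {"Retry-After": "8"}) returns 8.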

Per-endpoint quotas

Three buckets in current production. Exact numbers are visible per-endpoint in the OpenAPI spec; the shape:

  • Read endpoints (list events, get profile, billing status, etc.): 60/minute/token. Scoped to the token, not per-IP.
  • Write endpoints (create event, post role, update profile): 30/minute/token.
  • Cost-bearing endpoints (off-session charge, AI candidate compare, room URL issuance): 3/minute/token. Tighter because each call costs us money in upstream API calls (Stripe, Anthropic, Daily).
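
To make the quotas concrete, the following Python sketch converts each per-minute limit into the even spacing between requests that would stay within it. The bucket names are hypothetical labels rather than API identifiers, the figures are the ones listed above (the OpenAPI spec remains the source of exact per-endpoint numbers), and the calculation assumes a one-minute window.

```python
# Requests per minute per token, taken from the list above.
# The bucket names are hypothetical labels, not API identifiers.
BUCKET_LIMITS_PER_MINUTE = {
    "read": 60,
    "write": 30,
    "cost_bearing": 3,
}


def min_interval_seconds(bucket: str) -> float:
    """Even spacing between requests that uses exactly the bucket's per-minute quota."""
    return 60.0 / BUCKET_LIMITS_PER_MINUTE[bucket]
```

Under these assumptions a client would pace reads about 1 second apart, writes about 2 seconds apart, and cost-bearing calls about 20 seconds apart.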

Recommended back-off

Three strategies in increasing sophistication:

  1. Honor Retry-After. When you see a 429, sleep for the number of seconds given in Retry-After, then retry. Simplest correct implementation.
  2. Pre-emptive back-off on Remaining. When X-RateLimit-Remaining hits 0 or 1, sleep until X-RateLimit-Reset before the next request. Avoids ever seeing a 429.
  3. Bounded exponential. If you're polling and want to balance freshness against limit consumption, use min(2^n seconds, X-RateLimit-Reset - now) as the wait between requests, with n incrementing on each empty response.
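
The three strategies can be sketched as pure Python functions that each return a number of seconds to sleep. The function names are hypothetical, and the optional now parameter exists only so the calculations can be checked without depending on the real clock; the sketch also clamps negative waits to zero, which the text above does not specify.

```python
import time
from typing import Optional


def wait_after_429(retry_after: int) -> float:
    """Strategy 1: sleep for the Retry-After value (seconds) before retrying."""
    return float(max(retry_after, 0))


def preemptive_wait(remaining: int, reset_at: int, now: Optional[float] = None) -> float:
    """Strategy 2: when Remaining is 0 or 1, wait until the window resets."""
    now = time.time() if now is None else now
    if remaining > 1:
        return 0.0
    # Clamp to zero in case the reset time has already passed.
    return max(reset_at - now, 0.0)


def polling_wait(empty_responses: int, reset_at: int, now: Optional[float] = None) -> float:
    """Strategy 3: min(2^n seconds, X-RateLimit-Reset - now), n = consecutive empty responses."""
    now = time.time() if now is None else now
    until_reset = max(reset_at - now, 0.0)
    return float(min(2 ** empty_responses, until_reset))
```

For example, with 60 seconds left in the window, polling_wait yields 1, 2, 4, 8 seconds for the first four empty responses and never exceeds the time remaining until the reset. A caller would typically reset n to zero once a poll returns data.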

SDK behavior

The hand-written @jobbydev/sdk and the jobbydev Python SDK both surface result.rateLimit on every successful response, so callers can implement strategy 2 without parsing headers manually. Pass retryOn429: true to enable a single automatic retry on 429.

Related reading