Common Questions
What algorithm do you use for rate limits?
- Leaky bucket algorithm rate limiting (implemented as a token bucket, which is equivalent)
- This technique uses a “token bucket” system based on a maximum number of available request tokens for a given timeframe. Each request consumes one token, and tokens are continuously refilled at a fixed rate. When all tokens are consumed, further requests are blocked until new tokens become available (see the sketch after this list).
- Distributed across instances: the system coordinates token usage across all our servers to maintain the global limit.
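The sketch below illustrates the core token-bucket behaviour for a single process; the `TokenBucket` class and its parameter values are illustrative only, and the cross-instance coordination mentioned above is omitted here.

```python
import time
import threading


class TokenBucket:
    """Minimal token bucket: up to `capacity` tokens, refilled at `refill_rate` tokens/second."""

    def __init__(self, capacity: float, refill_rate: float):
        self.capacity = capacity          # maximum number of tokens the bucket can hold
        self.refill_rate = refill_rate    # tokens added per second
        self.tokens = capacity            # start with a full bucket
        self.last_refill = time.monotonic()
        self.lock = threading.Lock()

    def allow_request(self) -> bool:
        """Consume one token if available; return False when the caller is rate limited."""
        with self.lock:
            now = time.monotonic()
            # Refill continuously based on the time elapsed since the last check.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last_refill) * self.refill_rate)
            self.last_refill = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False


# Illustrative numbers: a 600 requests/minute budget refills at 10 tokens/second.
bucket = TokenBucket(capacity=600, refill_rate=10)
print(bucket.allow_request())  # True while tokens remain, False once the bucket is empty
```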
If I hit a rate limit, how long is the cooldown period?
Since token refilling is a continuous process, there’s no fixed cooldown period. As soon as a single token is refilled, you can make another request. The wait time depends on how many tokens you’ve used and the refill rate - it could be just milliseconds if you’re slightly over the limit, or longer if you’ve exhausted your entire token bucket.
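As a rough illustration of that calculation, here is a hypothetical helper that derives the wait from the refill rate; the function name and numbers are assumptions for illustration, not part of the API.

```python
def seconds_until_tokens(tokens_needed: float, tokens_available: float, refill_rate: float) -> float:
    """Time to wait until `tokens_needed` tokens are available, at `refill_rate` tokens/second."""
    deficit = max(0.0, tokens_needed - tokens_available)
    return deficit / refill_rate


# With 0 tokens left and a refill rate of 10 tokens/second:
print(seconds_until_tokens(1, 0, 10))  # 0.1 -> ~100 ms until the next single request
print(seconds_until_tokens(5, 0, 10))  # 0.5 -> ~500 ms to accumulate 5 tokens
```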
Rate limit example
If you have a 600 requests/minute limit:
- Tokens will refill at ~10 tokens per second (1 token every 100ms)
- If you have 0 tokens available, you will wait ~100ms for 1 new token to be refilled
- For continuous requests: you can make 1 request every 100ms sustainably
- If you want to make multiple requests quickly, you will need to wait longer to accumulate more tokens (e.g., 500ms to get 5 tokens); see the pacing sketch below
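One way to stay within that budget on the client side is to pace requests at the refill interval. This is a minimal sketch; `paced_calls` and `send_request` are hypothetical helpers, not part of our SDK.

```python
import time

REFILL_INTERVAL = 0.1  # 600 requests/minute -> one token refilled every 100 ms


def paced_calls(payloads, send_request):
    """Issue one call per refill interval so the token bucket never runs dry."""
    for payload in payloads:
        started = time.monotonic()
        send_request(payload)  # hypothetical helper that performs the actual API call
        elapsed = time.monotonic() - started
        if elapsed < REFILL_INTERVAL:
            time.sleep(REFILL_INTERVAL - elapsed)
```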
How long will an API query run before timing out?
- All public and private API endpoints have a default timeout of 30 seconds
- In the case of a timeout, the API will return a 503 error (see the client-side sketch below)
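A minimal sketch of handling this on the client, assuming Python's `requests` library and a hypothetical endpoint URL; matching the 30-second default keeps the client from waiting longer than the server will.

```python
import requests

API_URL = "https://api.example.com/v1/resource"  # hypothetical endpoint

try:
    # Match the server-side default so the client does not wait longer than the API will.
    response = requests.get(API_URL, timeout=30)
    if response.status_code == 503:
        # The API returns 503 when the request times out server-side; retry or back off here.
        print("Request timed out on the server (503); consider retrying with backoff.")
    else:
        response.raise_for_status()
except requests.exceptions.Timeout:
    print("No response within 30 seconds; the request timed out client-side.")
```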