Algorithm Comparison
Table of Contents
- Quick Matrix
- Sliding Window
- Fixed Window
- Token Bucket
- Leaky Bucket
- How to Choose
- Token Bucket vs Leaky Bucket
- Recommended Production Configurations
- Performance Dimensions
- FAQ
- Related Docs
Quick Matrix
Sliding Window
Sliding window counts events in the last windowMs milliseconds.
Use it when fairness matters more than raw throughput.
Good Fit
- Login, password reset, MFA, payment, refund, coupon, and other abuse-sensitive operations.
- APIs where a user should never exceed max requests in any rolling windowMs.
- Multi-tenant systems where fairness is more important than squeezing out the highest possible per-process throughput.
- Business lock scenarios such as "same user + same route" limits.
Characteristics
How It Works
For each key, the algorithm keeps request timestamps that are still inside the rolling time window.
Expired timestamps are ignored or removed before the new request is evaluated. This is why the algorithm avoids the classic fixed-window boundary problem.
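The timestamp bookkeeping described above can be sketched as a small in-memory limiter. This is a minimal illustration, not flex-rate-limit's actual implementation: the class name, the allow method, and the explicit now argument are assumptions made so the example is deterministic.

```typescript
// Hypothetical in-memory sliding-window log (illustrative names only).
class SlidingWindowLog {
  private readonly max: number;
  private readonly windowMs: number;
  private readonly hits = new Map<string, number[]>();

  constructor(max: number, windowMs: number) {
    this.max = max;
    this.windowMs = windowMs;
  }

  // Returns true if a request for `key` at time `now` (ms) is admitted.
  allow(key: string, now: number): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have left the rolling window (cutoff, now].
    const live = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = live.length < this.max;
    // Only admitted requests are recorded in this sketch.
    if (allowed) live.push(now);
    this.hits.set(key, live);
    return allowed;
  }
}
```

Because the window is measured back from each request rather than from a fixed boundary, a client cannot double its budget by straddling two buckets. The cost is that state grows with max, since one timestamp is kept per admitted request.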
Pros and Cons
Practical Notes
- Use a route-aware key such as user:${userId}:${route} when different endpoints need independent budgets.
- For Redis deployments, prefer a store path that supports atomic algorithm operations.
- For extremely high-cardinality traffic, benchmark against fixed-window, token-bucket, and leaky-bucket before choosing a global default.
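As a small illustration of the route-aware key from the first note, the helper below builds keys in that format. The function name and the sample user ID and routes are hypothetical.

```typescript
// Hypothetical helper that builds the route-aware key format shown above.
function routeKey(userId: string, route: string): string {
  return `user:${userId}:${route}`;
}

// The same user gets a separate budget per route because the keys differ.
const loginKey = routeKey("42", "POST /login");
const searchKey = routeKey("42", "GET /search");
```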
Fixed Window
Fixed window groups requests into calendar-like buckets.
Use it for very hot endpoints where approximate limits are acceptable.
Good Fit
- High-throughput endpoints where an approximate count is acceptable.
- Coarse operational protection, such as "up to 10,000 requests per minute per API key".
- Internal APIs where the cost of an occasional boundary burst is low.
- Metrics-oriented throttling where compact state is more valuable than exact rolling fairness.
Characteristics
How It Works
The algorithm maps a request time to a bucket:
All requests in the same bucket share one counter. When time moves into the next bucket, the counter starts over.
That boundary behavior is the main tradeoff.
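A minimal sketch of the bucket mapping and the boundary tradeoff, assuming the bucket index is simply the request time divided by windowMs and rounded down. The class and method names are illustrative, not the library's internals.

```typescript
// Hypothetical in-memory fixed-window counter (illustrative names only).
class FixedWindowCounter {
  private readonly max: number;
  private readonly windowMs: number;
  private readonly counters = new Map<string, { bucket: number; count: number }>();

  constructor(max: number, windowMs: number) {
    this.max = max;
    this.windowMs = windowMs;
  }

  // Returns true if a request for `key` at time `now` (ms) is admitted.
  allow(key: string, now: number): boolean {
    // Map the request time to a bucket index.
    const bucket = Math.floor(now / this.windowMs);
    let state = this.counters.get(key);
    if (state === undefined || state.bucket !== bucket) {
      // Time moved into a new bucket, so the counter starts over.
      state = { bucket, count: 0 };
      this.counters.set(key, state);
    }
    if (state.count >= this.max) return false;
    state.count += 1;
    return true;
  }
}
```

With max set to 2 and windowMs set to 1000, requests at 998 ms and 999 ms fill one bucket, and requests at 1000 ms and 1001 ms fill the next, so four requests are admitted within about 3 ms. That is the boundary burst a sliding window avoids.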
Pros and Cons
Practical Notes
- Use it when throughput and state cost matter more than strict fairness.
- Avoid it as the only protection for login, MFA, payment, or similar sensitive actions.
- If boundary bursts are unacceptable, use sliding-window instead.
Token Bucket
Token bucket refills quota over time and allows short bursts up to capacity.
Use it for API plans, user quotas, or endpoints where bursts are acceptable.
Good Fit
- API gateway plans where users can make short bursts but must follow a long-term average.
- Integrations that naturally send traffic in bursts, such as dashboards refreshing multiple widgets.
- Product quotas such as "100 requests per minute, with short bursts allowed".
- Mobile or edge clients that may retry or reconnect in small clusters.
Characteristics
How It Works
Each key owns a bucket. The bucket has a capacity and refills over time.
In flex-rate-limit, max is the default capacity unless capacity is specified, and refillRate can be used to tune refill behavior.
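The refill behavior can be sketched as follows. The option names max, windowMs, capacity, and refillRate come from the text above, but their exact semantics here are assumptions: this sketch treats refillRate as tokens added per windowMs and defaults it to max. The class itself is hypothetical and not the library's implementation.

```typescript
// Hypothetical options; the refillRate unit (tokens per windowMs) is an assumption.
interface TokenBucketOptions {
  max: number;
  windowMs: number;
  capacity?: number; // defaults to max, as described above
  refillRate?: number; // assumed default: max tokens per windowMs
}

class TokenBucket {
  private readonly capacity: number;
  private readonly tokensPerMs: number;
  private readonly buckets = new Map<string, { tokens: number; last: number }>();

  constructor(opts: TokenBucketOptions) {
    this.capacity = opts.capacity ?? opts.max;
    this.tokensPerMs = (opts.refillRate ?? opts.max) / opts.windowMs;
  }

  // Returns true if a request for `key` at time `now` (ms) is admitted.
  allow(key: string, now: number): boolean {
    // A new key starts with a full bucket.
    const b = this.buckets.get(key) ?? { tokens: this.capacity, last: now };
    // Refill for the elapsed time, never exceeding capacity.
    const elapsed = Math.max(0, now - b.last);
    b.tokens = Math.min(this.capacity, b.tokens + elapsed * this.tokensPerMs);
    b.last = now;
    // Each admitted request spends one token.
    const allowed = b.tokens >= 1;
    if (allowed) b.tokens -= 1;
    this.buckets.set(key, b);
    return allowed;
  }
}
```

State per key is two numbers regardless of max, which is why token bucket stays compact while still allowing a burst of up to capacity requests.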
Pros and Cons
Practical Notes
- Set capacity to the largest acceptable immediate burst.
- Set refillRate to the sustainable rate over windowMs.
- If a backend needs smooth arrival instead of burst tolerance, use leaky-bucket.
Leaky Bucket
Leaky bucket smooths traffic by draining at a steady rate.
Use it when the backend needs a steadier arrival rate.
Good Fit
- Protecting backend services that degrade when traffic arrives in spikes.
- Work queues, notification senders, webhook dispatchers, and background processing endpoints.
- APIs where a client should experience gradual admission rather than burst-friendly admission.
- Traffic shaping before a slower dependency.
Characteristics
How It Works
The bucket has a current water level. Requests add water. Time drains water at a stable rate.
This makes the effective output smoother than token bucket, especially when many clients send traffic at the same time.
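The water-level model above can be sketched as a meter-style check that rejects a request when it would overflow the bucket. Several things here are assumptions for illustration: the class name, the choice to reject rather than queue, and the unit of leakRate, which this sketch treats as units drained per windowMs.

```typescript
// Hypothetical in-memory leaky bucket used as a meter (illustrative names only).
class LeakyBucket {
  private readonly capacity: number;
  private readonly leakPerMs: number;
  private readonly levels = new Map<string, { level: number; last: number }>();

  // Assumption: leakRate is the amount drained per windowMs.
  constructor(capacity: number, leakRate: number, windowMs: number) {
    this.capacity = capacity;
    this.leakPerMs = leakRate / windowMs;
  }

  // Returns true if a request for `key` at time `now` (ms) is admitted.
  allow(key: string, now: number): boolean {
    // A new key starts with an empty bucket.
    const b = this.levels.get(key) ?? { level: 0, last: now };
    // Time drains water at a stable rate, never below empty.
    const elapsed = Math.max(0, now - b.last);
    b.level = Math.max(0, b.level - elapsed * this.leakPerMs);
    b.last = now;
    // A request adds one unit of water if it fits under capacity.
    const allowed = b.level + 1 <= this.capacity;
    if (allowed) b.level += 1;
    this.levels.set(key, b);
    return allowed;
  }
}
```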
Pros and Cons
Practical Notes
- Use leakRate to match the backend's sustainable processing rate.
- Keep capacity close to the largest queue depth you are comfortable admitting.
- Combine with a route-aware key when only specific routes need shaping.
How to Choose
Use this decision path when choosing a default:
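One way to encode such a decision path, based only on the guidance in the sections above, is the sketch below. The Traits fields, the function name, and the order of the checks are assumptions for illustration; only the four algorithm names and the sliding-window default come from this document.

```typescript
type Algorithm = "sliding-window" | "fixed-window" | "token-bucket" | "leaky-bucket";

// Hypothetical description of an endpoint's needs.
interface Traits {
  abuseSensitive: boolean; // login, MFA, payment, refund, and similar actions
  needsSmoothing: boolean; // backend degrades when traffic arrives in spikes
  allowsBursts: boolean; // legitimate short bursts should be admitted
  approximateOk: boolean; // hot endpoint where a boundary burst is acceptable
}

function chooseAlgorithm(t: Traits): Algorithm {
  // Strict fairness wins first: sensitive actions should not see boundary bursts.
  if (t.abuseSensitive) return "sliding-window";
  if (t.needsSmoothing) return "leaky-bucket";
  if (t.allowsBursts) return "token-bucket";
  if (t.approximateOk) return "fixed-window";
  // Fall back to the documented default.
  return "sliding-window";
}
```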
Token Bucket vs Leaky Bucket
Both algorithms keep compact state and both model a rate over time, but they optimize for different behavior.
Scenario Comparison
API plan:
Backend smoothing:
If the product promise is "100 requests per minute with short bursts", choose token bucket. If the engineering goal is "do not let this dependency receive a spike", choose leaky bucket.
Recommended Production Configurations
Login Protection
Why: login attempts need strict fairness and should not benefit from fixed-window boundaries.
API Gateway Plan
Why: plan users often send small bursts, but the average rate must remain bounded.
Queue or Webhook Protection
Why: downstream workers usually prefer smooth input over bursty input.
Very Hot Public Endpoint
Why: the endpoint is hot, the limit is coarse, and a boundary burst is acceptable.
Performance Dimensions
Performance depends on the storage backend, algorithm, key cardinality, network latency, and the shape of the request stream. Treat the matrix below as a decision aid, not as a substitute for local benchmark data.
For reproducible local numbers, run:
Then compare:
- Median and p95 latency, not only QPS.
- State growth after keys expire.
- Behavior under high-cardinality keys.
- Redis command count and network round trips.
- Whether middleware rollback options are enabled.
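For the latency comparison in the first item, median and p95 can be computed from raw samples with a nearest-rank percentile. The helper below is a generic sketch, not part of flex-rate-limit, and the function name is hypothetical.

```typescript
// Nearest-rank percentile over latency samples (for example, in milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...samples].sort((a, b) => a - b);
  // Nearest rank: the smallest value with at least p% of samples at or below it.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}
```

Reporting percentile(samples, 50) alongside percentile(samples, 95) shows tail behavior that a QPS figure alone hides.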
FAQ
What is the default algorithm?
sliding-window is the default because it is the strictest and easiest to reason about for user-facing safety.
When should I use fixed window?
Use it for high-throughput, coarse limits where boundary bursts are acceptable. Do not use it as the only protection for login, MFA, payment, or refund operations.
How do I choose between token bucket and leaky bucket?
Choose token-bucket when legitimate bursts should be allowed. Choose leaky-bucket when the backend needs traffic smoothing.
Can I switch algorithms later?
Yes. The public result shape remains the same, but the stored state semantics differ. For production systems, roll out algorithm changes by route or tenant and watch rejection rate, p95 latency, and key cardinality.
Do all algorithms work with all stores?
Yes, through the common store contract. Stores with algorithm-specific atomic methods can provide stronger distributed behavior and better performance for that algorithm.