Token Bucket Rate Limiter Using Spring Boot, Redis, and Bucket4j

Why Use Rate Limiting?

Problem with Unlimited Rates

  • Resource exhaustion
  • Degraded quality of service
  • Denial-of-service attacks

How Rate Limiting Helps

The Token Bucket Algorithm

  1. A bucket is created with a certain capacity (number of tokens).
  2. When a request comes in, the bucket is checked. If there is enough capacity, the request is allowed to proceed. Otherwise, the request is denied.
  3. When a request is allowed, the capacity is reduced.
  4. After a certain amount of time, the capacity is replenished.
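The four steps above can be sketched as a minimal in-memory class (illustrative only; the actual implementation in this article uses Bucket4j, and here refilling is modeled as an explicit `refill()` call rather than a background timer):

```java
// Minimal in-memory token bucket, following the four steps above.
public class TokenBucket {
    private final int capacity;
    private int tokens;

    public TokenBucket(int capacity) {
        this.capacity = capacity; // step 1: bucket created with a capacity
        this.tokens = capacity;
    }

    public synchronized boolean tryConsume() {
        if (tokens == 0) {        // step 2: no capacity left, deny the request
            return false;
        }
        tokens--;                 // step 3: an allowed request reduces capacity
        return true;
    }

    // step 4: invoked periodically (e.g. by a scheduler) to replenish capacity
    public synchronized void refill() {
        tokens = capacity;
    }
}
```

This works for a single process, but as the next section shows, a distributed system needs shared state.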

How to Implement Token Bucket in a Distributed System

Project Implementation

  1. A simple REST API.
  2. A Redis cache connected to the service, using the Redisson client.
  3. The Bucket4J library wrapped around the REST API.
  4. We’ll connect Bucket4J to the JCache interface which will use the Redisson client as the implementation in the background.

Install Dependencies

<!-- To build the Rest API -->

<!-- Redisson Starter = Spring Data Redis starter (excluding other clients) and Redisson client -->

<!-- Bucket4J starter = Bucket4J + JCache -->
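The three comments above correspond to three Maven dependencies. A sketch of the relevant pom.xml entries, assuming the Redisson and Bucket4j Spring Boot starters (verify the exact group/artifact coordinates and pin versions appropriate to your Spring Boot version):

```xml
<!-- To build the REST API -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>

<!-- Redisson starter: Spring Data Redis (excluding other clients) + Redisson client -->
<dependency>
    <groupId>org.redisson</groupId>
    <artifactId>redisson-spring-boot-starter</artifactId>
</dependency>

<!-- Bucket4j starter: Bucket4j + JCache -->
<dependency>
    <groupId>com.giffing.bucket4j.spring.boot.starter</groupId>
    <artifactId>bucket4j-spring-boot-starter</artifactId>
</dependency>
```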

Cache Configuration

  1. Create a connection to this server from our application.
  2. Set up JCache to use the Redisson client as the implementation.
@Configuration
public class RedisConfig {

    @Bean
    public Config config() {
        Config config = new Config();
        // assumes a single local Redis instance; adjust the address as needed
        config.useSingleServer().setAddress("redis://localhost:6379");
        return config;
    }

    @Bean
    public CacheManager cacheManager(Config config) {
        CacheManager cacheManager = Caching.getCachingProvider().getCacheManager();
        cacheManager.createCache("cache", RedissonConfiguration.fromConfig(config));
        return cacheManager;
    }

    @Bean
    ProxyManager<String> proxyManager(CacheManager cacheManager) {
        return new JCacheProxyManager<>(cacheManager.getCache("cache"));
    }
}
  1. Creates a configuration object that we can use to create a connection.
  2. Creates a cache manager using the configuration object. This will internally create a connection to the Redis instance and create a hash called “cache” on it.
  3. Creates a proxy manager that will be used to access the cache. Whatever our application tries to cache using the JCache API, it will be cached on the Redis instance inside the hash named “cache”.

Build the API

@RestController
public class RateLimitController {

    @GetMapping("/user/{id}") // example mapping path
    public String getInfo(@PathVariable("id") String id) {
        return "Hello " + id;
    }
}

Bucket4J Configuration

Create a Bucket

Refill refill = Refill.intervally(10, Duration.ofMinutes(1));
Bandwidth limit = Bandwidth.classic(10, refill);
Bucket bucket = Bucket4j.builder()
        .addLimit(limit)
        .build();
  • Refill: how many tokens are added back to the bucket, and how often.
  • Bandwidth: the capacity of the bucket combined with its refill policy; essentially, the allowed requests per refill period.
  • Bucket: an object configured with these two parameters. It also maintains a counter to track how many tokens remain in the bucket.

Create and Cache Buckets using ProxyManager

@Service
public class RateLimiter {

    // autowiring dependencies via constructor injection
    private final ProxyManager<String> buckets;

    public RateLimiter(ProxyManager<String> buckets) {
        this.buckets = buckets;
    }

    public Bucket resolveBucket(String key) {
        Supplier<BucketConfiguration> configSupplier = getConfigSupplierForUser(key);

        // does not always create a new bucket; returns the existing one if it
        // is already cached under this key
        return buckets.builder().build(key, configSupplier);
    }

    private Supplier<BucketConfiguration> getConfigSupplierForUser(String key) {
        Refill refill = Refill.intervally(20, Duration.ofMinutes(1));
        Bandwidth limit = Bandwidth.classic(20, refill);
        return () -> BucketConfiguration.builder()
                .addLimit(limit)
                .build();
    }
}

How to Consume Tokens and Set Up Rate Limiting

@GetMapping("/user/{id}")
public String getInfo(@PathVariable("id") String id) {
    // gets the bucket for the user
    Bucket bucket = rateLimiter.resolveBucket(id);

    // tries to consume a token from the bucket
    if (bucket.tryConsume(1)) {
        return "Hello " + id;
    }
    return "Rate limit exceeded";
}
  • Each user has a bucket associated with them, containing a number of tokens.
  • When a user attempts to take an action, we check the number of tokens in the bucket.
  • If the bucket is empty, the user has exceeded the rate limit, and the action is blocked.
  • Otherwise, we remove a token from the bucket (or more, for “expensive” actions) and allow the action.
  • Over time, we refill all buckets at a set rate, up to some maximum capacity.

A simple way to store this state in Redis:

  • Each user has two keys associated with them: the token bucket, and a timestamp of the last time the bucket was refilled.
  • When a user attempts to perform an action, we fetch the stored timestamp.
  • We calculate how many tokens the user should have been granted since that last request.
  • We can then proceed with the algorithm, using this new token count.
This approach has a race condition. Suppose a user makes two simultaneous requests from different clients, and that:

  • The user has enough tokens left for one action.
  • Not enough time has passed since the last action for more tokens to be granted.

The following interleaving is then possible:

  • Client 1 gets the stored timestamp and token count.
  • Client 2 gets the stored timestamp and token count.
  • Client 1 calculates that there is no need to add tokens, allows the action, and tells Redis to set the token count to 0.
  • Client 2 also calculates that there is no need to add tokens, allows the action, and tells Redis to set the token count to 0.

Both actions are allowed, even though the user only had tokens for one of them.
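The timestamp-plus-token-count scheme can be sketched as follows (an illustrative in-memory version; `Entry`, `allowAction`, and the millisecond granularity are assumptions, and being single-process it sidesteps the Redis round-trips where the race occurs):

```java
// Sketch of the two-key scheme: a token count plus a last-refill timestamp,
// with tokens granted lazily based on elapsed time.
public class LazyRefillLimiter {
    private final long capacity;
    private final double refillPerMilli;

    // per-user state: in Redis this would be two keys; here, a holder object
    public static class Entry {
        double tokens;
        long lastRefillMillis;
        public Entry(double tokens, long lastRefillMillis) {
            this.tokens = tokens;
            this.lastRefillMillis = lastRefillMillis;
        }
    }

    public LazyRefillLimiter(long capacity, long refillIntervalMillis) {
        this.capacity = capacity;
        this.refillPerMilli = (double) capacity / refillIntervalMillis;
    }

    public boolean allowAction(Entry entry, long nowMillis) {
        // grant the tokens the user should have accumulated since the last request
        long elapsed = nowMillis - entry.lastRefillMillis;
        entry.tokens = Math.min(capacity, entry.tokens + elapsed * refillPerMilli);
        entry.lastRefillMillis = nowMillis;

        if (entry.tokens < 1) {
            return false; // bucket empty: action blocked
        }
        entry.tokens -= 1; // consume one token and allow the action
        return true;
    }
}
```

In the distributed version, the read-compute-write sequence inside `allowAction` is exactly the window where two clients can interleave.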

A better approach — sorted sets

  • Each user has a sorted set associated with them. The keys and values are identical, and equal to the (microsecond) times when actions were attempted.
  • When a user attempts to perform an action, we first drop all elements of the set which occurred before one interval ago. This can be accomplished with Redis’s ZREMRANGEBYSCORE command.
  • We fetch all elements of the set, using ZRANGE(0, -1).
  • We add the current timestamp to the set, using ZADD.
  • We set a TTL equal to the rate-limiting interval on the set (to save space).
  • After all operations are completed, we count the number of fetched elements. If it exceeds the limit, we don’t allow the action.
  • We can also compare the largest fetched element to the current timestamp. If the two are too close together, we likewise don’t allow the action.
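A single-process simulation of this algorithm, using a TreeSet of microsecond timestamps in place of the Redis sorted set (class and method names are illustrative; the TTL step has no in-memory equivalent and is omitted):

```java
import java.util.TreeSet;

// Local simulation of the sorted-set algorithm. The Redis operations map
// onto TreeSet calls: ZREMRANGEBYSCORE -> headSet().clear(),
// ZRANGE 0 -1 -> size()/last(), ZADD -> add().
public class SlidingWindowLimiter {
    private final int limit;            // max actions per interval
    private final long intervalMicros;  // sliding-window length
    private final long minGapMicros;    // minimum spacing between actions

    private final TreeSet<Long> actions = new TreeSet<>();

    public SlidingWindowLimiter(int limit, long intervalMicros, long minGapMicros) {
        this.limit = limit;
        this.intervalMicros = intervalMicros;
        this.minGapMicros = minGapMicros;
    }

    public synchronized boolean tryAction(long nowMicros) {
        // drop all entries older than one interval (ZREMRANGEBYSCORE)
        actions.headSet(nowMicros - intervalMicros).clear();

        // inspect the remaining entries (ZRANGE 0 -1)
        boolean tooMany = actions.size() >= limit;
        boolean tooSoon = !actions.isEmpty()
                && nowMicros - actions.last() < minGapMicros;

        // record this attempt regardless (ZADD): as in the algorithm above,
        // even blocked attempts count against the window
        actions.add(nowMicros);

        return !(tooMany || tooSoon);
    }
}
```

In the real Redis version, all of these operations run in a single MULTI/EXEC transaction per user, which is what eliminates the race condition of the two-key approach.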


