Rate Limiting an API with Cassandra

TTL inserts for fun and profit

Posted on 16 July 2016

Often there are APIs in your system that are sensitive to the number of requests they receive per second. In our system it's the registration flow, where users can request an SMS to be sent to their cell phone. For an interesting (and fun) read on why you want to rate limit such an API, check out this wonderful article. In this post I'll show you how I implemented such a rate limit on users' phone numbers.


A simple approach to rate limiting would be to have the application server keep track of calls per user in memory. This naive approach has a few downsides though. First of all, you might not want to spend that much memory on a simple list. And more importantly, in a typical scalable architecture you never know if the next request to an API is going to hit the same server instance.
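To make the naive approach concrete, here is a minimal sketch of a per-instance, in-memory limiter. The class name InMemoryRateLimiter and its method tryAcquire are hypothetical and not part of our system; the sketch only illustrates that all state lives inside one server instance.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, single-instance limiter: keeps the timestamps of recent calls per key in memory.
final class InMemoryRateLimiter {
    private final int maxRequests;
    private final long windowMillis;
    private final Map<String, Deque<Long>> calls = new HashMap<>();

    InMemoryRateLimiter(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    // Returns true and records the call if the key is still under its limit.
    synchronized boolean tryAcquire(String key, long nowMillis) {
        Deque<Long> timestamps = calls.computeIfAbsent(key, k -> new ArrayDeque<>());
        // Drop calls that have fallen out of the time window.
        while (!timestamps.isEmpty() && nowMillis - timestamps.peekFirst() >= windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() >= maxRequests) {
            return false;
        }
        timestamps.addLast(nowMillis);
        return true;
    }
}
```

Every phone number that called recently keeps a queue of timestamps on the heap, and a second server instance would have its own, independent counts. Those are the two downsides described above.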

A more common approach is to handle this in some kind of shared session store. Redis or Memcached are commonly used here, but since we're already using Cassandra in our system I will show you how to do this in Cassandra.

Time To Live

The key functionality we use here is Time To Live (TTL) inserts. Cassandra (like many other solutions) has the ability to let you insert auto-expiring data. First we need somewhere to store this data, so let's define a table:

CREATE TABLE rate_limit (
	phone_number	TEXT,
	event_time	TIMESTAMP,

	PRIMARY KEY((phone_number), event_time)
);

We partition on phone_number (we will be filtering on it) and add a timestamp as a clustering column to make each row unique; otherwise every insert for the same phone number would overwrite the previous row. As you can see, we don't have to add anything special to our table.

So now we can insert data whenever someone uses the SMS functionality in our system:

INSERT INTO rate_limit (phone_number, event_time)
	VALUES ('000-000', dateof(NOW()))
	USING TTL 3600;

The USING TTL bit is all that's different from a regular insert. The value is the Time To Live in seconds (so 60 * 60 = one hour). Our system allows a maximum of 10 SMS requests per hour. You're of course free to adapt the number of requests and the timespan to your needs.

So how do we know how many requests have been done in the timespan? We just select and count on phone_number:

SELECT COUNT(*) FROM rate_limit WHERE phone_number = '000-000';

This is all you need on the database side to implement this. Because we just use a TTL we don't have to clean up after ourselves, and we also don't have to group anything into time slots. You always know how many times an SMS was sent to a certain phone number in the past hour.

Java Repository

Now that the data store part is done, let's implement a Spring repository that handles the counting for us:

import com.datastax.driver.core.querybuilder.Insert;
import com.datastax.driver.core.querybuilder.QueryBuilder;
import com.datastax.driver.core.querybuilder.Select;
import com.google.common.hash.Hashing;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.cassandra.core.CassandraTemplate;
import org.springframework.stereotype.Repository;

import java.nio.charset.StandardCharsets;
import java.util.Date;

@Repository
public class RateLimitRepository {
    private static final String TABLE = "rate_limit";
    //Column names as defined in the rate_limit table; phone_number holds the hash
    private static final String FIELD_PHONE = "phone_number";
    private static final String FIELD_TIMESTAMP = "event_time";

    private static final int DEFAULT_TTL = 3600; //Seconds
    private final CassandraTemplate cassandraTemplate;

    @Autowired
    public RateLimitRepository(CassandraTemplate cassandraTemplate) {
        this.cassandraTemplate = cassandraTemplate;
    }

    public void add(String phoneNumber) {
        Insert insert = QueryBuilder.insertInto(TABLE)
                .value(FIELD_PHONE, hash(phoneNumber))
                .value(FIELD_TIMESTAMP, new Date());

        //Let Cassandra expire the row after DEFAULT_TTL seconds
        insert.using(QueryBuilder.ttl(DEFAULT_TTL));

        cassandraTemplate.execute(insert);
    }

    public int count(String phoneNumber) {
        Select select = QueryBuilder.select().countAll().from(TABLE);

        select.where(QueryBuilder.eq(FIELD_PHONE, hash(phoneNumber)));

        //COUNT(*) is a bigint in Cassandra, so read it as a Long
        return cassandraTemplate.queryForObject(select, Long.class).intValue();
    }

    //We don't want to store privacy sensitive info unless we really need to
    private String hash(String phoneNumber) {
        return Hashing.sha256()
                .hashString(phoneNumber, StandardCharsets.UTF_8)
                .toString();
    }
}

This is the complete repository. The cassandraTemplate gets injected for us. As you can also see, we hash the phone numbers. This is because we don't want to store any privacy-sensitive data unless we have to. The phone number of a customer is only stored in one Cassandra table; the other tables use hashes.
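If you don't have Guava on the classpath, the same kind of hash can be produced with the JDK alone. The following is a sketch under that assumption, not the code we run; the class name PhoneHasher and the method sha256Hex are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: SHA-256 of a phone number as a lowercase hex string, JDK only.
final class PhoneHasher {
    static String sha256Hex(String phoneNumber) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] bytes = digest.digest(phoneNumber.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(bytes.length * 2);
            for (byte b : bytes) {
                // Two lowercase hex digits per byte.
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

The result is a 64-character hex string, so the column type in the table can stay TEXT.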

The only public methods in the repository are the add() and count() methods. Add does the insert, count returns the current count for that phone-number.
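To show how a caller might combine the two methods, here is a minimal sketch of the check that runs before an SMS is sent. The RateLimitStore interface, the SmsRateLimiter class and the InMemoryStore stand-in are hypothetical names introduced for illustration; in a real application the store would be backed by the repository. The limit of 10 is the one mentioned earlier in this post.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction over the repository's two public methods.
interface RateLimitStore {
    void add(String phoneNumber);
    int count(String phoneNumber);
}

// Hypothetical caller: allows an SMS only while the phone number is under the limit.
final class SmsRateLimiter {
    private static final int MAX_REQUESTS_PER_HOUR = 10;

    private final RateLimitStore store;

    SmsRateLimiter(RateLimitStore store) {
        this.store = store;
    }

    // Returns true and records the request if another SMS may be sent.
    boolean tryRegisterSms(String phoneNumber) {
        if (store.count(phoneNumber) >= MAX_REQUESTS_PER_HOUR) {
            return false;
        }
        store.add(phoneNumber);
        return true;
    }
}

// In-memory stand-in for trying the logic without Cassandra; entries never expire here.
final class InMemoryStore implements RateLimitStore {
    private final Map<String, Integer> counts = new HashMap<>();

    @Override
    public void add(String phoneNumber) {
        counts.merge(phoneNumber, 1, Integer::sum);
    }

    @Override
    public int count(String phoneNumber) {
        return counts.getOrDefault(phoneNumber, 0);
    }
}
```

Note that count-then-add is not atomic: two concurrent requests for the same number can both pass the check, so the limit may be overshot by a request or two. For an SMS limit that is usually acceptable.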


As demonstrated, Cassandra is a great tool to use for a rate-limiting back end. It's also one of many tools; for very high volumes you might want to use a pure in-memory store like Redis. And like any functionality built on top of Cassandra, a consistency level has to be picked. In our case a default consistency of LOCAL_QUORUM is fine, but in high-traffic scenarios where staleness is not much of a factor you can relax this to ONE for reads and ONE or ANY for writes (ANY is only valid for writes).