Spring Cassandra Integration Testing

With Achilles JUnit and Test Containers

Posted on 02 April 2017

In the project I’m currently working on we use Spring Boot based microservices backed by Cassandra for storage. Over the last year of this project we’ve been through a few iterations of how we handle the interfacing with Cassandra. This also had a substantial impact on how we handle integration testing. In this post I’m going to show you our current approach (with Achilles JUnit) as well as a more generic approach with Test Containers.

Introduction

You might already have experience using, for example, Spring Data JPA to 'talk' to a relational database. In these cases my favorite approach to unit and integration testing the repositories is to spin up an in-memory H2 database set to mimic whatever SQL dialect we are using. While not perfect (there are numerous features of, for example, Oracle that H2 does not support) this works quite well.

With a service backed by Cassandra this is not as straightforward. Spinning up an in-memory Cassandra yourself is possible (it’s a Java application after all) but it keeps global state that is hard to deal with, especially with parallel tests. Fortunately there are a few libraries that can handle this for you.

The Application

The application is a simple Spring Boot REST service. You can run ExampleApplication from your IDE or build (mvn clean package) and run it from the command line: java -jar target/cassandra-it.jar. Assuming you have a Cassandra installation running on localhost it should start the service and listen on port 8080.

Note
You can spin up a Docker container running Cassandra with docker run -p 9042:9042 --name cass -d cassandra:3

The application exposes a single route: /counter/{key}. When you POST to this route it increments a counter with the key 'key':

$ curl -i -X POST http://localhost:8080/counter/foo
HTTP/1.1 200
X-Application-Context: application
Content-Length: 0
Date: Sun, 02 Apr 2017 08:17:44 GMT

You can GET the current value too:

$ curl -i http://localhost:8080/counter/foo
HTTP/1.1 200
X-Application-Context: application
Content-Type: application/json;charset=UTF-8
Transfer-Encoding: chunked
Date: Sun, 02 Apr 2017 08:18:31 GMT

{"key":"foo","value":2}

These counters are stored in Cassandra. On start-up the application recreates the 'counter' table for you.

Repository / Integration tests

The logic of storage and retrieval is handled in a Spring @Repository component: CounterRepository. It uses the convenient object mapper of the DataStax driver for retrieval and a prepared update statement for updates.

To test this functionality I have created both a Repository test (just tests the Cassandra interface) and an Integration test (tests top to bottom from Controller to database). I’ve implemented the test logic as two abstract test classes, each of which is extended by both test setups. Let’s take a look at the Repository test first:

public abstract class CounterRepositoryTest {
    protected CounterRepository repository;

    public abstract Session getSession();

    @Before
    public void setup() {
        repository = new CounterRepository(getSession());
    }

    @Test
    public void getAndIncrement() {
        assertThat(repository.get("foo")).isEmpty();

        repository.increment("foo");
        repository.increment("foo");
        repository.increment("bar");

        assertThat(repository.get("foo").get().getValue())
            .isEqualTo(2L);
        assertThat(repository.get("bar").get().getValue())
            .isEqualTo(1L);
    }
}

Nothing crazy here: it checks that no counter exists yet for 'foo' (the table should be empty), increments a couple of counters and then checks that they hold the expected values.

There is also an integration test that spins up a Spring Web Application Context and tests the same functionality through the controller:

public abstract class CounterIntegrationTest {
    @Autowired
    private WebApplicationContext wac;

    @Autowired
    private Session session;

    private Mapper<Counter> mapper;
    private MockMvc mockMvc;

    @Before
    public void setup() throws Exception {
        mockMvc = MockMvcBuilders
                .webAppContextSetup(wac)
                .build();

        mapper = new MappingManager(session)
            .mapper(Counter.class);
    }

    @Test
    public void getAndIncrement() throws Exception {
        assertThat(countRows(session)).isZero();

        increment("foo");
        increment("bar");
        increment("foo");

        assertThat(countRows(session)).isEqualTo(2);

        get("foo", 2);
        get("bar", 1);
    }

    private void increment(String key)
        throws Exception {
        mockMvc.perform(
            MockMvcRequestBuilders.post("/counter/" + key))
                .andDo(print())
                .andExpect(status().isOk());
    }

    private void get(String key, int expected)
        throws Exception {
        mockMvc.perform(
            MockMvcRequestBuilders.get("/counter/" + key))
                .andDo(print())
                .andExpect(status().isOk())
                .andExpect(jsonPath("$.value",
                    is(expected)));
    }

    public static long countRows(Session session) {
        ResultSet set = session.execute(
            "SELECT COUNT(*) from cassandrait.counter");

        return set.one().getLong(0);
    }
}

Integration Testing methods

CassandraUnit

As a dev, whenever you try to learn something new you probably start by googling your problem: "cassandra unit testing" for example. Unfortunately this will probably lead you to CassandraUnit. This is also the first library we started using. While it does what it claims to do (spin up an in-memory Cassandra), its implementation leaves a lot to be desired.

The biggest problem is that it creates and destroys a Cassandra instance for every single test. While it is in fact the most 'correct' approach (you don’t want to have residual data interfering with tests) and the approach you use when you use something like H2, it doesn’t work at all with Cassandra. Cassandra instances take a long time to spin up; it’s starting a 'real' production Cassandra which takes around 20 seconds on my machine. Closing the instance takes a similar amount of time.

So even with us combining tests (which is by itself something I’d really prefer not to do) build times exploded to 20 minutes or more.
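To put rough numbers on this, here is a back-of-the-envelope sketch of how the lifecycle overhead scales with the number of Cassandra instances a build starts. The 20 second start and stop times are the ones mentioned above; the suite size of 30 is a made-up figure purely for illustration, not a measurement of our project.

```java
// Back-of-the-envelope estimate of the time spent only on starting and
// stopping Cassandra. The suite size used in main() is hypothetical.
class StartupOverhead {
    /** Seconds spent on lifecycle when `instances` Cassandra instances are started and stopped. */
    static long overheadSeconds(int instances, int startSeconds, int stopSeconds) {
        return (long) instances * (startSeconds + stopSeconds);
    }

    public static void main(String[] args) {
        int tests = 30; // hypothetical number of tests, each with its own instance

        long perTest = overheadSeconds(tests, 20, 20); // one instance per test
        long perRun = overheadSeconds(1, 20, 20);      // one instance for the whole run

        System.out.println("per test: " + perTest / 60 + " min, per run: " + perRun + " s");
    }
}
```

With these assumed numbers, about 30 start/stop cycles already add up to 20 minutes of pure waiting, while a single shared instance costs 40 seconds no matter how many tests use it.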

Note
I’m not up to date with the latest developments of CassandraUnit so please let me know if this is fixed!

Achilles JUnit

Achilles is a Cassandra object mapper (which we don’t use, we use the DataStax object mapper) and the team also implemented their own Cassandra integration test library. This has one huge benefit over CassandraUnit: it works on the premise that you keep a single Cassandra instance alive for the entire run and simply clean up keyspaces and tables between tests. Moving from CassandraUnit to Achilles JUnit got our build times back to roughly 5 minutes (most of which is building and transferring Docker images).
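The idea of one shared instance that gets cleaned instead of restarted can be sketched without any Cassandra-specific code. The class below is not how Achilles is implemented; SharedServer and its methods are hypothetical names, and a plain map stands in for the database.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a server that is expensive to start.
// This only illustrates the lifecycle idea, NOT the Achilles implementation.
class SharedServer {
    private static SharedServer instance; // kept alive for the whole test run
    static int startCount = 0;            // how often the expensive start-up ran

    private final Map<String, Long> data = new HashMap<>();

    private SharedServer() {
        startCount++; // imagine the slow start-up happening here
    }

    /** Returns the running server, starting it only on first use. */
    static synchronized SharedServer acquire() {
        if (instance == null) {
            instance = new SharedServer();
        }
        return instance;
    }

    void increment(String key) {
        data.merge(key, 1L, Long::sum);
    }

    int rowCount() {
        return data.size();
    }

    /** Cheap cleanup between tests, comparable to truncating tables. */
    void truncate() {
        data.clear();
    }

    public static void main(String[] args) {
        SharedServer first = SharedServer.acquire();
        first.increment("foo");
        first.truncate(); // the next test starts with empty tables

        SharedServer second = SharedServer.acquire();
        System.out.println(first == second);   // the same instance is reused
        System.out.println(second.rowCount()); // no residual data
        System.out.println(startCount);        // number of start-ups
    }
}
```

The trade-off is that isolation now depends on the cleanup step being complete: anything a test leaves behind that truncate() does not cover will leak into the next test.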

So let’s see how I run the repository test first:

public class CounterRepositoryTestEmbedded extends CounterRepositoryTest {
    @Rule
    public AchillesTestResource<AbstractManagerFactory> resource =
        embeddedCassandra();

    public static AchillesTestResource<AbstractManagerFactory> embeddedCassandra() {
        return AchillesTestResourceBuilder
                .forJunit()
                .withScript("schema.cql")
                .createAndUseKeyspace("cassandrait")
                .build((cluster, statementsCache) -> null);
    }

    @Override
    public Session getSession() {
        return resource.getNativeSession();
    }
}

It’s really as simple as that. You create a @Rule that spins up an embedded Cassandra if one isn’t active yet.

The integration test is a bit more complex because it actually spins up a complete Spring context. First we need to provide an additional @Configuration class:

@Configuration
public class TestConfigurationEmbedded {
    @Bean
    @Primary
    public Session createSession() {
        final Cluster cluster = CassandraEmbeddedServerBuilder
                .builder()
                .cleanDataFilesAtStartup(true)
                .withKeyspaceName("cassandrait")
                .withScript("schema.cql")
                .withClusterName(CLUSTER_NAME)
                .buildNativeCluster();

        return cluster.connect("cassandrait");
    }
}
Warning
By default it creates schemas in target/cassandra_embedded and does not delete the data and keyspaces afterwards. Make sure that you take this into account.

This overrides the normal session that connects to localhost. It also takes care of creating our keyspace and executing a schema creation script that sets up our database. We can then use this in our integration test:

@ActiveProfiles("test")
@RunWith(SpringJUnit4ClassRunner.class)
@WebAppConfiguration
@SpringBootTest(classes = {ExampleApplication.class, TestConfigurationEmbedded.class})
@EnableConfigurationProperties
public class CounterIntegrationTestEmbedded extends CounterIntegrationTest {
}

And yes, the body of the class is in fact completely empty. All the necessary configuration is done through the annotations. If we now run the tests in /embedded it should spin up a single instance, execute all the tests and close the Cassandra instance.

Test Containers

Note
You need to have Docker installed and the daemon running to run the tests below!

Another neat approach is a relatively new library called Test Containers. It lets you spin up Docker containers for tests, which then also get neatly destroyed afterwards. Let’s start with our Repository test, a separate implementation in CounterRepositoryTestContainers:

public class CounterRepositoryTestContainers extends CounterRepositoryTest {
    @ClassRule
    public static GenericContainer cassandra =
            new GenericContainer("cassandra:3")
                    .withExposedPorts(9042);

    @Override
    public Session getSession() {
        return ApplicationConfiguration.createSession(
            cassandra.getContainerIpAddress(),
            cassandra.getMappedPort(9042));
    }
}

The main difference here is the rule itself: the GenericContainer @ClassRule can be used to start any Docker container and have it expose a port. Because it is a @ClassRule, the container is started once per test class. This is very convenient for databases that can’t be run embedded (Postgres, MySQL) or other tasks that you would prefer to have isolated.

For our integration test we can do something similar:

@ActiveProfiles("test")
@RunWith(SpringJUnit4ClassRunner.class)
@WebAppConfiguration
@SpringBootTest(classes = {ExampleApplication.class})
@ContextConfiguration(initializers = CounterIntegrationTestContainers.Initializer.class)
@EnableConfigurationProperties
public class CounterIntegrationTestContainers extends CounterIntegrationTest {
    @ClassRule
    public static GenericContainer cassandra =
            new GenericContainer("cassandra:3")
                    .withExposedPorts(9042);

    public static class Initializer implements ApplicationContextInitializer<ConfigurableApplicationContext> {
        @Override
        public void initialize(ConfigurableApplicationContext configurableApplicationContext) {
            EnvironmentTestUtils.addEnvironment(
                "testcontainers",
                configurableApplicationContext.getEnvironment(),
                "cassandra.host=" + cassandra.getContainerIpAddress(),
                "cassandra.port=" + cassandra.getMappedPort(9042)
            );
        }
    }
}

As you can see the approach is a bit different here. We define an Initializer that overrides the standard Cassandra host and port with those from our container (9042 is the port inside the container; Docker maps it to a random high port on the host). While this initializer is defined inside this integration test for convenience, you could easily define it once and reuse it between tests. Because the application uses the standard Cassandra driver to connect, we don’t have to specify a separate test configuration.
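What the initializer achieves is essentially a precedence rule: values registered by the test win over the application’s defaults. The sketch below mimics that with plain maps. It is not Spring’s actual property resolution, and both the default values and the mapped port 32768 are assumptions made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of "test properties override application defaults".
// This is NOT Spring's Environment implementation, only the precedence idea.
class PropertyOverride {
    /** Looks a key up in the overrides first and falls back to the defaults. */
    static String resolve(String key, Map<String, String> overrides, Map<String, String> defaults) {
        return overrides.containsKey(key) ? overrides.get(key) : defaults.get(key);
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("cassandra.host", "localhost"); // assumed application default
        defaults.put("cassandra.port", "9042");      // assumed application default

        Map<String, String> fromContainer = new LinkedHashMap<>();
        fromContainer.put("cassandra.port", "32768"); // hypothetical mapped host port

        // The host falls back to the default, the port comes from the container.
        System.out.println(resolve("cassandra.host", fromContainer, defaults));
        System.out.println(resolve("cassandra.port", fromContainer, defaults));
    }
}
```

Since the mapped port is only known once the container is running, it has to be read at context start-up rather than hard-coded in a properties file, which is why the real test uses an initializer.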

If we now run our tests in /testcontainers we can see it spins up a Docker container:

11:04:38.616 [main] INFO  org.testcontainers.DockerClientFactory - Connected to docker:
  Server Version: 17.03.0-ce
  API Version: 1.26
  Operating System: Alpine Linux v3.5
  Total Memory: 1998 MB
11:04:40.659 [main] INFO  🐳 [cassandra:3] - Creating container for image: cassandra:3
Note
By default Test Containers has its log output set to DEBUG, producing a lot of logs. I fixed this with a test logback config.

A big benefit of this is that these containers are completely isolated. Once the tests are 'done' the containers get destroyed together with any test data in them. Also it’s rather convenient that you spin up a 'normal' instance and can use the same configuration to connect to it.

The downside of this approach is that it’s quite a bit slower. Some speedup could be gained from using a custom Cassandra Docker image specifically tuned for testing, but it would still spin up and destroy an entire container for each test class. Still, it’s a huge improvement over CassandraUnit, which spun up an instance for every test. Having Docker installed and reachable is also a requirement.

Conclusion

At this time I still consider Achilles JUnit the most convenient approach for us for integration testing our Cassandra-backed services. It does not require Docker to be installed, was the fastest of the approaches we tried and is easy to set up. Still, Test Containers is a close second; it’s convenient and incredibly flexible.

So that’s it for this post. I hope you enjoyed reading this post as much as I enjoyed writing it! If you have any questions or feedback feel free to raise an issue in this repository!