Why Cassandra

NoSQL database

Apache Cassandra is a high-performance, scalable, and highly available database. Proven fault tolerance and easy scalability make it a strong fit for mission-critical data. Data replication is seamless, and it provides low-latency responses while surviving outages.

Perfect choice for

Web analytics, when you want to store click streams, and IoT, when you want to store measurements as time series. It is also a good choice for popular machine learning solutions such as fraud detection and personalization.
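To make the time-series case concrete, here is a minimal sketch in plain Python of how measurements are often grouped: by sensor and day, with rows kept in timestamp order inside each group. It is an in-memory stand-in, not Cassandra code. The names `partitions`, `day_bucket`, `write_measurement`, and `read_day`, and the choice of one bucket per day, are assumptions made for illustration.

```python
import bisect
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical stand-in for a time-series table:
# key = (sensor_id, day), value = rows kept sorted by timestamp.
partitions = defaultdict(list)


def day_bucket(ts: datetime) -> str:
    """Derive the day bucket (assumed granularity) from a timestamp."""
    return ts.strftime("%Y-%m-%d")


def write_measurement(sensor_id: str, ts: datetime, value: float) -> None:
    """Append a measurement to its (sensor, day) group, keeping time order."""
    key = (sensor_id, day_bucket(ts))
    bisect.insort(partitions[key], (ts, value))


def read_day(sensor_id: str, day: str) -> list:
    """Read one sensor's measurements for one day: a single-group lookup."""
    return list(partitions.get((sensor_id, day), []))


write_measurement("sensor-1", datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc), 21.5)
write_measurement("sensor-1", datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc), 19.0)
write_measurement("sensor-1", datetime(2024, 1, 2, 8, 0, tzinfo=timezone.utc), 18.0)
```

Bucketing by day keeps each group bounded in size, and a read for one sensor and one day touches exactly one group. The right bucket size depends on how fast measurements arrive.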

Anti-patterns

Cassandra's data model is hard to change once it is in place, so do not use it for early prototyping. Do not use it as a substitute for an RDBMS: relational databases have been around for a long time and serve the purpose well if you need strong referential integrity. A frequently seen anti-pattern is queue-like storage. Cassandra is a poor fit for it because deletion is only logical: deleted data stays in storage, piles up over time, and makes reads slow.
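The queue problem can be illustrated with a toy model in plain Python. This is not how Cassandra's storage engine is implemented; it only assumes that a deleted entry stays in place as a marker that later reads must skip. The class `TombstoneQueue` and its methods are hypothetical names.

```python
class TombstoneQueue:
    """Toy queue where deletes are logical: entries are flagged, not removed."""

    def __init__(self):
        # Each cell is [payload, deleted_flag], kept in insertion order.
        self.cells = []

    def enqueue(self, payload):
        self.cells.append([payload, False])

    def dequeue(self):
        """Return (payload, cells_scanned) for the oldest live entry."""
        scanned = 0
        for cell in self.cells:
            scanned += 1
            if not cell[1]:
                cell[1] = True  # logical delete: the cell stays in the list
                return cell[0], scanned
        return None, scanned


queue = TombstoneQueue()
for i in range(5):
    queue.enqueue(f"job-{i}")

# Each dequeue has to step over every previously deleted entry first.
scan_counts = [queue.dequeue()[1] for _ in range(5)]
```

In this model the work per read grows with the number of entries already consumed, which is the slowdown described above.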

Pro Tips

Don't use Cassandra if you want flexible, dynamic queries. Use Elasticsearch or Solr bundled with Cassandra if your data needs to be searched in many ways. Avoid building a relational data model inside Cassandra: it has no JOINs, and GROUP BY is limited to partition key and clustering columns.
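A common alternative to a relational model is to shape the stored data around the query and duplicate what the query needs at write time. The sketch below uses plain Python dictionaries as stand-ins for tables. The names `customers`, `orders_by_customer`, `place_order`, and `orders_for` are hypothetical and only illustrate the idea.

```python
from collections import defaultdict

# Hypothetical lookup data, standing in for a customers table.
customers = {"c1": {"name": "Ada"}}

# Hypothetical query-shaped table: everything needed to list a customer's
# orders lives under the customer id, so no join is needed at read time.
orders_by_customer = defaultdict(list)


def place_order(customer_id: str, order_id: str, total: float) -> None:
    """Copy the customer name into the order row when writing."""
    customer = customers[customer_id]
    orders_by_customer[customer_id].append(
        {"order_id": order_id, "total": total, "customer_name": customer["name"]}
    )


def orders_for(customer_id: str) -> list:
    """Answer the query with a single lookup by customer id."""
    return list(orders_by_customer.get(customer_id, []))


place_order("c1", "o-100", 42.0)
place_order("c1", "o-101", 7.5)
```

The trade-off is that writes do more work and duplicated values, such as the customer name, must be kept in sync by the application.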