Problem: You experience outages and performance degradation in production that make your cluster slow and unresponsive, which directly impacts your business. You often encounter query timeouts or failures. The cluster is unstable, and performance is unreliable. Or you might have attempted a version upgrade or some other operation and got stuck along the way. Your business cannot afford for the database to be offline. The problem needs to be tackled and fixed immediately.

Solution: Fear not. Apache Cassandra is designed to be always on and to scale linearly. There are trade-offs, of course, and we are always upfront with you about them. However, if Apache Cassandra suits your use case, proper handling and configuration will give you the scalability and availability your business needs.

We call these engagements firefighting missions: fixing urgent problems in production, where every second the database is offline directly affects the business.

A firefighting mission usually starts with you granting us access to your system. We then jump in, identify the problem, and fix it as soon as possible. This step unblocks you for the time being and provides a short-term solution or workaround. Outages and performance degradation in production clusters are usually just the manifestation of some underlying problem: a suboptimal data model, the wrong compaction strategy, improperly configured data distribution and cluster topology, or something similar. While working on the short-term solution, we write down all the red flags related to the problem and suggest improvements that will increase the stability, performance, and scalability of the system in the long run.

This way, you get unblocked and fully operational as soon as possible, and your business can continue to use its data. We also advise on and write down what needs to change to prevent the problem from happening again.

