Wednesday, 25 July 2018

Cassandra - The fellowship of The Ring

In the previous article, I wrote about the Cassandra node. Although the node plays an important role by itself, properties like high availability, fault tolerance, resiliency and scalability are only achieved when multiple nodes work together in a cluster. Everything starts with the node, but a single node does not suffice. If that node crashes or is restarted, we are offline. Reasons for that to happen might include patching,...

Optimizing Cassandra Performance: Sometimes Two Writes Are Better Than One

SignalFx is a modern monitoring service that ingests, stores and performs real-time streaming analytics on high-volume, high-resolution metric data from companies all over the world. Providing real-time streaming analytics means that we ingest tens of billions of points of time series data per day, and we give our customers the capability to send data at one-second resolution. All of this data ends up in Cassandra,...

Understanding Cassandra tombstones

We recently deployed to production a distributed system that uses Cassandra as its persistent storage. Not long after, we noticed many warnings about tombstones in the Cassandra logs:
WARN  [SharedPool-Worker-2] 2017-01-20 16:14:45,153 ReadCommand.java:508 - Read 5000 live rows and 4771 tombstone cells for query SELECT * FROM warehouse.locations WHERE token(address) >= token(D3-DJ-21-B-02) LIMIT 5000 (see tombstone_warn_threshold)
We...