Tuesday, 24 July 2018

How Is Data Written in Cassandra?

A data write in Cassandra passes through the following stages:

Logging data in the commit log
Every write is immediately appended to the commit log file. The commit log is shared among tables.
Writing data to the memtable
After the write is recorded in the commit log, the data is indexed and written to an in-memory structure called a memtable. The memtable stores writes in sorted order until it reaches a configurable limit and is then flushed: each time the memtable is full, its data is written to disk as an SSTable. All writes are automatically partitioned and replicated throughout the cluster.
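The first two stages, plus the flush, can be sketched as a toy model. This is only an illustration of the flow described above, not Cassandra's implementation: the class name ToyTable, the memtable_limit parameter, and the use of a Python list for the commit log are all assumptions made for the example, and the toy sorts at flush time instead of keeping the memtable sorted.

```python
class ToyTable:
    """Toy model of the write path for one table (not Cassandra's real code)."""

    def __init__(self, name, commit_log, memtable_limit=3):
        self.name = name
        self.commit_log = commit_log          # shared, append-only list standing in for the commit log file
        self.memtable_limit = memtable_limit  # hypothetical flush threshold, counted in keys
        self.memtable = {}                    # in-memory writes for this table only
        self.sstables = []                    # flushed, immutable segments, oldest first

    def write(self, key, value):
        # Stage 1: record the write in the shared commit log.
        self.commit_log.append((self.name, key, value))
        # Stage 2: apply the write to this table's memtable.
        self.memtable[key] = value
        # Flush once the (hypothetical) limit is reached.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Stage 3: write the memtable out in sorted order as an immutable
        # segment (a tuple here), then start a fresh memtable.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}


commit_log = []                      # one log shared by every table
users = ToyTable("users", commit_log, memtable_limit=2)
users.write("b", 1)
users.write("a", 2)                  # reaches the limit and triggers a flush
users.write("c", 3)                  # lands in the new, empty memtable

print(users.sstables)                # [(('a', 2), ('b', 1))]
print(users.memtable)                # {'c': 3}
print(len(commit_log))               # 3
```

Note that the flushed segment is never modified afterwards; later writes only ever go to the commit log and the current memtable.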
Storing data on disk in SSTables
SSTables are files stored on disk. They are immutable: once the memtable has been flushed, an SSTable is not written to again. Memtables and SSTables are maintained per table. Each SSTable consists of the following files:
  • Data.db
  • Index.db
  • Filter.db
  • CompressionInfo.db
  • Statistics.db
  • Digest.crc32, Digest.adler32, Digest.sha1
  • CRC.db
  • Summary.db
  • TOC.txt
  • SI_.*.db
A typical file path looks like /data/testkeyspace/cf3-25dg09277b7378ab89882728ad88056/la-1-big-Data.db, where testkeyspace is the keyspace, cf3 is the table name, and the hexadecimal string 25dg09277b7378ab89882728ad88056 is appended to the table name to represent the unique table ID.
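As a small illustration, the example path above can be split into its parts with a few lines of Python. The function name parse_sstable_path is hypothetical, and the reading of the file name as version, generation, format, and component (la, 1, big, Data.db) is an assumption about the naming scheme that the post itself does not spell out.

```python
from pathlib import PurePosixPath


def parse_sstable_path(path):
    """Split an SSTable file path into keyspace, table, table ID and file parts."""
    p = PurePosixPath(path)
    keyspace = p.parent.parent.name
    # The table directory is "<table name>-<table ID>"; split on the last hyphen.
    table, table_id = p.parent.name.rsplit("-", 1)
    # Assumed file name layout: <version>-<generation>-<format>-<component>.
    version, generation, fmt, component = p.name.split("-", 3)
    return {
        "keyspace": keyspace,
        "table": table,
        "table_id": table_id,
        "version": version,
        "generation": int(generation),
        "format": fmt,
        "component": component,
    }


info = parse_sstable_path(
    "/data/testkeyspace/cf3-25dg09277b7378ab89882728ad88056/la-1-big-Data.db"
)
print(info["keyspace"], info["table"], info["component"])  # testkeyspace cf3 Data.db
```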
