Production Recommendations
The cassandra.yaml
and jvm.options
files have a number of notes and
recommendations for production usage. This page expands on some of this
information.
Tokens
Using more than one token-range per node (referred to as vnodes) allows for flexible expansion and more streaming peers when bootstrapping new nodes into the cluster. Limiting the negative impact of streaming (I/O and CPU overhead) enables incremental cluster expansion.
As a tradeoff, more tokens will lead to sharing data with more peers, resulting in decreased availability. To learn more, Cassandra Availability in Virtual Nodes, Joseph Lynch and Josh Snyder is recommended reading.
The number of tokens can be changed using the following setting:
num_tokens: 16
Here are the most common token counts with a brief explanation of when and why you would use each one.
Token Count | Description |
---|---|
1 |
Maximum availablility, maximum cluster size, fewest peers, but inflexible expansion. Must always double size of cluster to expand and remain balanced. |
4 |
A healthy mix of elasticity and availability. Recommended for clusters which will eventually reach over 30 nodes. Requires adding approximately 20% more nodes to remain balanced. Shrinking a cluster may result in cluster imbalance. |
16 |
Best for heavily elastic clusters which expand and shrink regularly, but may have issues availability with larger clusters. Not recommended for clusters over 50 nodes. |
In addition to setting the token count, it’s extremely important that
allocate_tokens_for_local_replication_factor
be set as well to an appropriate number of replicates, to ensure
even token allocation.
Read Ahead
Read ahead is an operating system feature that attempts to keep as much data loaded in the page cache as possible. The goal is to decrease latency by using additional throughput on reads where the latency penalty is high due to seek times on spinning disks. By leveraging read ahead, the OS can pull additional data into memory without the cost of additional seeks. This works well when the available RAM is greater than the size of the hot dataset, but can be problematic when the hot dataset is much larger than available RAM. The benefit of read ahead decreases as the size of your hot dataset gets bigger in proportion to available memory.
With small partitions (usually tables with a single partition key, but not limited to this case) and solid state drives (SSDs), read ahead can increase disk usage without any of the latency benefits, and in some cases can result in up to a 5x latency and throughput performance penalty. Read heavy, key/value tables with small (under 1KB) rows are especially prone to this problem.
The recommended read ahead settings are:
Hardware | Initial Recommendation |
---|---|
Spinning Disks |
64KB |
SSD |
4KB |
Read ahead can be adjusted on Linux systems by using the blockdev
tool.
For example, we can set the read ahead of /dev/sda1\
to 4KB by doing the following:
$ blockdev --setra 8 /dev/sda1
|
Since each system is different, use the above recommendations as a starting point and tuning based on your SLA and throughput requirements. To understand how read ahead impacts disk resource usage, we recommend carefully reading through the Diving Deep, using external tools section.
Compression
Compressed data is stored by compressing fixed size byte buffers and writing the data to disk. The buffer size is determined by the`chunk_length_in_kb` element in the compression map of the schema settings. The default setting is 16KB starting with Cassandra 4.0.
Since the entire compressed buffer must be read off disk, using a compression chunk length that is too large can lead to significant overhead when reading small records. Combined with the default read ahead setting, the result can be massive read amplification for certain workloads.
LZ4Compressor is the default and recommended compression algorithm. There is additional information on compression in The Last Pickle blogpost on compression performance.
Compaction
There are different compaction strategies available for different workloads. We recommend reading about the different strategies to understand which is the best for your environment. Different tables may (and frequently do) use different compaction strategies on the same cluster.
Encryption
It is significantly better to set up peer-to-peer encryption and client server encryption when setting up your production cluster. Setting it up once the cluster is already serving production traffic is challenging to get right. If you plan to use network encryption eventually (in any form), we recommend setting it up now. Changing these configurations down the line is not impossible, but mistakes can result in downtime or data loss.
Ensure Keyspaces are Created with NetworkTopologyStrategy
Production clusters should never use SimpleStrategy
. Production keyspaces should use the NetworkTopologyStrategy
(NTS).
For example:
CREATE KEYSPACE mykeyspace WITH replication = {
'class': 'NetworkTopologyStrategy',
'datacenter1': 3
};
NetworkTopologyStrategy
allows Cassandra to take advantage of multiple racks and data centers.
Configure Racks and Snitch
Correctly configuring or changing racks after a cluster has been provisioned is an unsupported process.
Migrating from a single rack to multiple racks is also unsupported and can result in data loss.
Using GossipingPropertyFileSnitch
is the most flexible solution for on premise or mixed cloud environments.Ec2Snitch
is reliable for AWS EC2 only environments.