Hardware Considerations

Hardware Selection

Choose the right balance of the following:

  • Persistent storage type
  • Memory
  • CPU
  • Number of nodes
  • Network

Persistent Storage

  • Avoid:

    • SAN storage
    • NAS devices
    • NFS

Persistent Storage

  • SSDs are absolutely the preferred storage for every workload
  • They provide extremely low-latency response times for random reads while supplying ample sequential write performance for compaction operations
  • They have no moving parts to fail

Persistent Storage

  • HDDs are by far the cheapest storage available
  • They still show up in new systems for a few reasons:

    • Hardware vendors mark SSDs up by 100-2000%
    • 7200 RPM SATA is around $0.05/GB, while a Samsung 1TB MLC SSD is around $0.50/GB
    • High capacity is still easier to come by

Persistent Storage

  • When do HDDs make sense?

    • Usually never

  • Exceptions might be:

    • Mostly-write workloads using DateTieredCompactionStrategy (i.e., limited seeking)

  • If you have to use HDDs, get as much RAM as possible to increase caching

Memory

  • For both dedicated hardware and virtual environments:

    • Production: 16GB to 64GB; the minimum is 8GB.
    • Development in non-load-testing environments: no less than 4GB.

  • More memory means

    • Better read performance due to caching
    • Memtables hold more recently written data

CPU

  • Cassandra is highly concurrent and uses as many CPU cores as available
  • Production environments:

    • For dedicated hardware, 16-core processors are the current price-performance sweet spot.
    • For virtual environments, 4- to 8-core processors.

  • Development in non-load-testing environments:

    • For dedicated hardware, 2-core processors.
    • For virtual environments, 2-core processors.

Network

  • Bind your OS interfaces to separate Network Interface Cards (NICs).
  • Recommended bandwidth is 1000 Mbit/s (gigabit) or greater.
  • Thrift and native protocol clients connect via the rpc_address.
  • Cassandra's internal storage (internode) protocol uses the listen_address.
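
As a rough illustration (the addresses below are placeholders, not recommendations), a dual-NIC node might split the two kinds of traffic in cassandra.yaml like this:

    listen_address: 10.0.0.5    # internode storage traffic on the first NIC
    rpc_address: 10.0.1.5       # Thrift/native client traffic on the second NIC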

Cloud

Provisioning

  • There are easy one-click deployment patterns for Cassandra and DSE development and test deployments in most of the major public clouds.
  • For production, use OpsCenter Lifecycle Manager (LCM) in 6.0+ for provisioning and configuration management. You still have to provision the instances yourself; OpsCenter handles installation and configuration.
  • If you don't have access to OpsCenter, there are Chef recipes and Ansible playbooks for DSE available.

Selecting Your Instance Type

Disks

  • Ephemeral is fastest
  • The world is moving toward elastic / non-ephemeral volumes
  • If an ephemeral volume fails, terminate the instance and get a fresh one; a tell-tale sign of failure is bit rot / corrupted SSTables
  • AWS

    • GP2 or IO2 EBS volumes if you can't afford ephemerals
    • Get big volumes: I/O latency improves as volume size increases, up to 4 TB. Get at least 3 TB on the GP2 volumes.

  • Google

    • Their elastic volumes consistently deliver the same tight latencies
    • Some customers have run tests indicating that pd-ssd is actually faster than local-ssd

  • Azure

    • Today, mostly ephemeral storage is used
    • Premium storage is rolling out pretty fast

CPU Considerations

  • Hyperthreads

    • A hyperthread is not a real CPU core most of the time; tune accordingly

  • CPU steal / noisy neighbors

    • If you see CPU steal, terminate the instance and get a new one; it means you have a noisy neighbor
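
As a minimal, Linux-only sketch (the 5-second sample window is an arbitrary choice), steal time can be read directly from /proc/stat:

    # Estimate CPU steal over a short window by diffing /proc/stat counters.
    import time

    def cpu_times():
        # First line: "cpu user nice system idle iowait irq softirq steal ..."
        with open("/proc/stat") as f:
            return [int(x) for x in f.readline().split()[1:]]

    before = cpu_times()
    time.sleep(5)                                 # sample window
    after = cpu_times()

    deltas = [a - b for a, b in zip(after, before)]
    total = sum(deltas)
    steal_pct = 100.0 * deltas[7] / total if total else 0.0   # field 8 is steal
    print(f"CPU steal over sample window: {steal_pct:.1f}%")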

Networking Considerations

Multi-DC options include:

  • AWS

    • Multi-region clusters can't communicate via internal IPs; you have to figure out the networking yourself or use a limited number of security group rules

  • Azure

    • VNet-to-VNet: slow, good for small workloads
    • Azure has options for fatter pipes, one of which can even reach your physical DC for hybrid deployments
    • Use public IPs
    • Do not use VPN gateways
    • ExpressRoute might be viable

  • Google

    • Flat network. No config necessary, it’s great

  • Enhanced networking

    • An AWS feature you need to enable on some instance types to get full network performance

Security

  • AWS

    • Volume encryption is available for EBS

  • Azure

    • Our templates have some issues; you need to set up network security groups (NSGs)

  • Google

    • Largely secure by default. Everything goes through their multifactor auth.

Storage

SAN

  • Just say no!
  • Not recommended for on-premises deployments
  • A difficult and pricey architecture to use with distributed databases

  • SAN return on investment (ROI) does not scale along with that of Cassandra, in terms of capital expenses and engineering resources
  • SAN generally introduces a bottleneck and a single point of failure, because Cassandra's I/O frequently surpasses the array controller's ability to keep up in distributed architectures
  • External storage, even when used with a high-speed network and SSDs, adds latency to all operations
  • Heap pressure is increased because pending I/O operations take longer
  • When the SAN transport shares the network with internal and external Cassandra traffic, it can saturate the network and lead to network availability problems

A few Cassandra-specific performance issues

  • Atrocious read performance
  • Potential write performance issues
  • System instability (nodes appear to go offline and/or are "flapping")
  • Client side failures on read and/or write operations
  • Flushwriters are blocked
  • Compactions are backed up
  • Nodes won’t start
  • Repair/Streaming won’t complete

Simply put, shared storage cannot keep up with the amount of disk I/O that Cassandra places on a system. You will be happier and better off not using shared storage with Cassandra.
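
If you suspect shared storage is behind these symptoms, two standard places to look (assuming you have nodetool access on the node) are the thread pool and compaction statistics:

    nodetool tpstats            # blocked/pending flush writer tasks mean flushes can't keep up
    nodetool compactionstats    # a growing number of pending tasks means compaction is backed up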

Await

  • These are metrics collected at a customer site where shared storage was used while running Cassandra
  • The table provides some exceptional metrics on poor disk performance caused by the use of shared storage
  • As you can see, there is a 28-second, almost 29-second await
  • Cassandra actually considered this node "down" because it was unresponsive during the high-await periods
  • The load was actually minimal compared to what cassandra-stress produces
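
As a rough sketch of how await can be checked on a node (the device name "sda" is a placeholder; iostat -x reports the same metric), the following derives milliseconds per completed I/O from /proc/diskstats over a short sample window:

    # Approximate average await (ms per completed I/O) for one block device.
    import time

    def disk_counters(device):
        with open("/proc/diskstats") as f:
            for line in f:
                p = line.split()
                if p[2] == device:
                    # completed I/Os and milliseconds spent on them (reads + writes)
                    return int(p[3]) + int(p[7]), int(p[6]) + int(p[10])
        raise ValueError(f"device {device!r} not found")

    ios1, ms1 = disk_counters("sda")
    time.sleep(5)                                 # sample window
    ios2, ms2 = disk_counters("sda")

    done = ios2 - ios1
    await_ms = (ms2 - ms1) / done if done else 0.0
    print(f"average await: {await_ms:.1f} ms")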

NAS

  • Storing SSTables on a network attached storage (NAS) device is not recommended
  • Network-related bottlenecks result in high levels of I/O wait time on both reads and writes
  • These are due to:

    • Router latency
    • The Network Interface Card (NIC) in the node
    • The NIC in the NAS device

NFS

  • NFS has exhibited inconsistent behavior with its ability to delete and move files
  • This configuration is not supported by Cassandra and is not recommended

Hadoop Style Nodes = Fat Nodes

  • An expensive way to make your Cassandra clusters inefficient
  • Hadoop-style nodes are biased heavily toward large, slow storage and low memory
  • They are optimized for slow analytic workloads, not fast transactions