Adding/Removing Nodes

Bootstrapping

Adding nodes

Why would we want to add more nodes to our cluster?

bootstrap\cluster

You might want to consider adding a new node if you have:

  • Reached data capacity problem

    • Your data has outgrown the node’s hardware capacity

  • Reached traffic capacity

    • Your application needs more rapid response with less latency

  • To increase operational headroom

    • Need more resources for node repair, compaction, and other resource intensive operations

Adding Nodes: Best Practices

Single-token Nodes

bootstrap\Group 2

  • Double the size of a cluster (single-token nodes)

Can minimize latency impact to a production load, where token recalculation and data movement can affect performance.

Hot spots are minimized during data movement to new nodes.

For single-token clusters, adding one node at a time isn’t a great idea.

This may cause cluster to become unbalanced.

Don’t leave the cluster unbalanced.

Adding a single node creates a much more complex token generation scheme.

A token inserted between two others will leave cluster unbalanced.

Have to regenerate all tokens and move data to rebalance ring.

Vnodes

  • For vnode clusters, we can increment the size of the cluster if more nodes are needed

bootstrap\new nodes

For node clusters, we can increment the size of the cluster if more nodes are needed

Token ranges are distributed and token range assignment is automatic

Several token ranges will be split on each node, therefore each node will stream some data

  • Adding a single node at a time will:

    • Result in more data movement
    • Will have a gradual impact on cluster performance
    • Will take longer to grow cluster

  • Adding multiple nodes at the same time will:

    • Shorten the overall time to resolve the cluster
    • Will have a significant performance impact until streaming quiesces

If you can take the performance hit by starting all nodes at once, things should complete more quickly. However, this is a luxury - it is probably usually better to start nodes patiently - one at a time.

Bootstrapping

Bootstrapping nodes

  • Bootstrapping is the process of a new node joining a cluster:

    • The joining node node contacts a seed node
    • The seed node communicates cluster info and the joining node tells tokens
    • Cluster nodes prepare to stream necessary SSTables
    • Cluster nodes stream SSTables to the joining node (can be time consuming)
    • Existing cluster nodes continue to satisfy writes, but also forward write to joining node
    • When streaming is complete, joining node changes to normal state and handles read/write requests

Bootstrapping nodes

  • To bootstrap a node:

    • Set up the node’s configuration files (cassandra.yaml, etc.)
    • Start up the node normally

Four main parameters of a node for bootstrapping

  • These are configured in the cassandra.yaml file:

    • cluster_name
    • rpc_address
    • listen_address
    • - seeds

What if bootstrap fails?

Two scenarios

  • Bootstrapping node could not even connect to cluster

    • Fairly easy to deal with
    • Something fundamental, like couldn’t find cluster
    • Examine the log file to understand what’s going on
    • Change config and try again

  • Streaming portion fails

    • Node exists in cluster in joining state
    • Try deleting data directories and rebooting
    • Worst case, remove the node from the cluster and try again

Nodetool Cleanup

bootstrap\broom

  • Perform cleanup after a bootstrap on the OTHER nodes
  • You don’t have to do this
  • Reads all SSTables to make sure there is no token out of range for that particular node
  • If the SSTable is not out of range, cleanup just does a copy
  • If you don’t run clean up, will get picked up through compaction over time

Since we’ve been considering adding new nodes to our cluster, we should probably talk about nodetool cleanup in the same breathe

When a node is added to a Cassandra cluster or an existing node is moved to a new position on the token ring, other systems still retain copies of data they are not responsible for

Nodetool cleanup removes data that does not belong on this node. We basically are recovering disk space.

Cleanup is intensive because it has to examine large portions of the data on disk

What does a cleanup operation look like?

bootstrap\cleanup

Cleanup is basically a compaction!

Clean keyspace by writing new SSTables by skipping the keys that no longer belong on the node

Every SStable is rewritten, therefore largest SSTable dictates the amount of space needed for operation

All SSTables are cleaned one table at a time, and if an SSTable is already clean, cleanup operation skips it

How do we run a cleanup operation

  • The nodetool cleanup command cleans up all data in a keyspace and table(s) that are specified
bin/nodetool [options] cleanup -- <keyspace> (<table>)
  • Use flags to specify:

    • -h [host] | [IP address]
    • -p port
    • -pw password
    • -u username

  • nodetool cleanup command will clean all keyspaces if no keyspace is specified

Removing a Node

Removing a node from the cluster

Two very different scenarios:

remove node\node cracked

  • You’re going to reduce capacity, need to decommission (due to some sort of operational requirement)
  • The node is offline and will never come back online

What has to happen to remove a node?

remove node\dicommission cluster

  • Other nodes need to pickup the removed-node’s data
  • The cluster needs to know the node is gone

Three options for dealing with the data

  • Redistribute the data from the node that is going away

    • nodetool decommission

  • Redistribute the data from replicas

    • nodetool removenode

  • Don’t redistribute the data, just make the node go away

    • nodetool assassinate

Decommissioning a node

  • Do this if you want to decrease the size of the cluster
  • The node must still be active
  • Decommissioning will transfer the data from the decommissioned node to other active nodes in the cluster

    • With VNodes, the rebalance happens automatically
    • With Single-token nodes, you will need to manually rebalance the token ranges on the remaining nodes

Decommissioning a node

  • After running the nodetool decommission command:

    • The node is offline
    • The JVM process is still running (use dse casssandra-stop to kill the process)
    • The data is not deleted from the decommissioned node
    • If you want to add the node back to the cluster, delete the data first!

      • Not deleting the data may cause data resurrection issues

Decommissioning a node

nodetool [options] decommission

Flags:

  • -h [host]|[IP address]
  • -p port
  • -pw password
  • -u username

Monitor progress with nodetool netstats and nodetool status

Removing a node

  • Do this if the node is offline and never coming back
  • Obviously, you run this command from a different node than the one you are removing
  • nodetool removenode will:

    • Make the remaining nodes in the cluster aware that the node is gone
    • Copy data from online nodes to the appropriate replicas to satisfy the replication factor

Removing a node

nodetool [options] removenode <host id>
  • Flags:

    • -h [host] | [IP address]
    • -p port
    • -pw password
    • -u username

  • Additional arguments:

    • status
    • force

Assassinating a node

  • Do this as a last resort if the node is offline and never coming back
  • nodetool assassinate will:

    • Make the remaining nodes in the cluster aware that the node is gone
    • NOT copy any data

  • You should use nodetool repair on the remaining nodes to fix the data replication

Assassinate a node

nodetool [options] assassinate <ip_address>
  • Flags:

    • -h [host] | [IP address]
    • -p port
    • -pw password
    • -u username

Replace Downed Nodes

The pros of replacing a downed node

replace node\swap node

  • You don’t have to move the data twice
  • Backup for a node will work for a replaced node, because same tokens are used to bring replaced node into cluster

You can replace a downed node

And there are cool benefits to doing it

You don’t have to move the data twice

Backup for a node will work for a replaced node, same tokens are used to bring replaced node into cluster

These are important operations considerations

Replacing a downed node using nodetool

replace node\downed  node

  • First, find the ip address of the down node using nodetool status.
  • Configure a new node for the cluster normally with one additional step:

    • In cassandra-env.sh add a replace_address JVM option
    • This option should use the IP address of the node you are replacing

  • In cassandra-env.sh add this line near the bottom:
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=<IP_ADDRESS_OF_DEAD_NODE>"
  • This line tells the new node to use the same token as the old node
  • Once you have configured the node, bootstrap the node into the cluster
  • Use nodetool removenode to remove the dead node
  • Monitor the bootstrapping process using nodetool netstats

What if the node was also a seed node?

  • The process is similar except:

    • You will need to add a different seed in the node’s cassandra.yaml
    • Also, modify the other nodes' cassandra.yaml file to include the new seed
    • Seed nodes do not auto-bootstrap, so you will need to run nodetool repair on the new seed node
    • Remove the old seed node using nodetool removenode
    • Finally, you may want to run nodetool cleanup on the other nodes of the cluster

Exercise 4—​Add and Remove a Node to/from the Cluster