cassandra-chef-cookbook: Cassandra fails to restart after setting the cluster name

Hello there,

I’ve set up a small Cassandra Cluster using your cookbook. I’m using the default datastax installation method to install Cassandra 2. Unfortunately, I ran into a problem when setting a custom cluster name:

node.set[:cassandra][:cluster_name] = 'my-little-cassandra-cluster'
node.set[:cassandra][:start_native_transport] = true
...
include_recipe 'cassandra::default'

Here’s my environment:

  • Chef: 11.10.4
  • Vagrant: 1.6.3 (and the current OpsWorks stack)
  • Berkshelf: 3.1.4
  • OS: Ubuntu 14.04

When deploying Cassandra on my node, the installation works as expected. But when restarting Cassandra to activate the changes (or with node[:cassandra][:notify_restart] enabled), Cassandra dies after the restart. The logs contain the following stack trace:

ERROR [main] 2014-08-08 11:12:56,619 CassandraDaemon.java (line 265) Fatal exception during initialization
org.apache.cassandra.exceptions.ConfigurationException: Saved cluster name Test Cluster != configured name hockey-telemetry-cassandra
        at org.apache.cassandra.db.SystemKeyspace.checkHealth(SystemKeyspace.java:564)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:261)
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)

As far as I can tell, the problem is caused by Cassandra being started automatically after the .deb package is installed. By the time the config is deployed, the cluster name is already persisted as “Test Cluster”, and changing it results in the error above.

I’ve found a way to circumvent this problem, but it feels like a hack to me:

# Work around Cassandra starting with the wrong cluster name
execute 'cqlsh' do
  command %(
    # Find the address Cassandra is listening on for the RPC port
    listen_address=$(lsof -iTCP:#{node[:cassandra][:rpc_port]} -sTCP:LISTEN -n -F n | grep -oE "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}")
    # Overwrite the persisted cluster name and flush it to disk
    cqlsh $listen_address -e "update system.local set cluster_name='#{node[:cassandra][:cluster_name]}' where key='local';"
    nodetool flush
  )

  not_if %(cqlsh #{node[:ipaddress]} -e 'select cluster_name from system.local' | grep -q '#{node[:cassandra][:cluster_name]}$')

  action :run
end

Is there any way you can prevent Cassandra from starting before deploying the config file?
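For what it’s worth, Debian provides a generic mechanism for exactly this: while a `/usr/sbin/policy-rc.d` script that exits 101 is in place, `invoke-rc.d` refuses to start services from package maintainer scripts. A hedged sketch of how the cookbook might use it (the resource ordering and the `dsc20` package name here are illustrative assumptions, not actual cookbook code):

```ruby
# Sketch only: block the .deb postinst from auto-starting Cassandra by
# installing a policy-rc.d that forbids service starts (exit code 101).
file '/usr/sbin/policy-rc.d' do
  content "#!/bin/sh\nexit 101\n"
  mode '0755'
end

package 'dsc20'  # the postinst's invoke-rc.d call is now denied

file '/usr/sbin/policy-rc.d' do
  action :delete  # re-enable service starts once the package is installed
end

# ...then deploy cassandra.yaml with the real cluster_name and start the
# service, so "Test Cluster" is never persisted in the first place.
```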

Cheers, pfleidi

About this issue

  • Original URL
  • State: closed
  • Created 10 years ago
  • Reactions: 1
  • Comments: 15 (7 by maintainers)

Most upvoted comments

I’m seeing this issue show up again when trying to install Cassandra 2.2.0 via the datastax package dsc22.

The cqlsh command doesn’t fail; however, without clearing out the data directory, Cassandra will not restart and throws a cluster-name mismatch error.

I’m using the following attributes:

node.set['cassandra']['version'] = '2.2.0'
node.set['cassandra']['package_name'] = 'dsc22'

@michaelklishin @pfleidi Just deleting the system keyspace directory to change cluster_name is also a valid solution; it is mentioned in the DSC Debian install guide as well:

http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installDeb_t.html

6. Because the Debian packages start the Cassandra service automatically, you
must stop the server and clear the data:

Doing this removes the default cluster_name (Test Cluster) from the system
table. All nodes must use the same cluster name.

$ sudo service cassandra stop
$ sudo rm -rf /var/lib/cassandra/data/system/*

The distribution of Cassandra is ready for configuration.

As the C* service will still be running with the default configuration after package installation, removing the system keyspace does not (or at least should not) cause any impact.
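In cookbook terms, those documented steps could be sketched roughly as below. The data path assumes the default Debian layout, the marker file is a hypothetical idempotence guard, and a `service[cassandra]` resource is assumed to exist elsewhere in the recipe:

```ruby
# Rough sketch of the DataStax-documented stop-and-clear steps as a Chef
# resource. The marker file under /var/lib/cassandra is a made-up guard so
# the system keyspace is only wiped once, right after first install.
execute 'clear-default-system-keyspace' do
  command <<-EOS
    service cassandra stop
    rm -rf /var/lib/cassandra/data/system/*
    touch /var/lib/cassandra/.system-keyspace-cleared
  EOS
  not_if { ::File.exist?('/var/lib/cassandra/.system-keyspace-cleared') }
  notifies :start, 'service[cassandra]'  # assumes a service resource exists
end
```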

However, I think the C* package install resource could simply notify a cqlsh execute resource to update the cluster_name, followed by a C* service restart, like:

  yum_package node['cassandra']['package_name'] do
    version  "#{node['cassandra']['version']}-#{node['cassandra']['release']}"
    allow_downgrade true
    options  node['cassandra']['yum']['options']
    notifies :run, 'execute[set_cluster_name]', :immediately
  end

  execute 'set_cluster_name' do
    cwd      '/tmp'
    command  "cqlsh -e \"update system.local set cluster_name='#{node['cassandra']['cluster_name']}' where key='local';\""
    notifies :restart, 'service[cassandra]'
    action   :nothing
  end

I am not entirely sure what other configuration gets stored in the system keyspace, which makes me hesitant to recommend deleting it outright.