
Posted to user@cassandra.apache.org by Alain RODRIGUEZ <ar...@gmail.com> on 2014/10/23 11:18:08 UTC

Operating on large cluster

Hi,

I was wondering how you guys handle a large cluster (50+ machines).

I mean, sometimes you need to change the configuration (cassandra.yaml)
or send a command to one, some or all nodes (cleanup, upgradesstables,
setstreamthroughput or whatever).

So far we have been using custom scripts for repairs and other routine
maintenance, and cssh for specific one-shot actions on the cluster. But I
guess this doesn't really scale; I guess we could use pssh instead. For
configuration changes we use Capistrano, which might scale properly.
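
To make the pssh idea concrete, here is a minimal Python sketch of a parallel ssh fan-out. It is only an illustration, not something from our setup: the names `ssh_run`, `run_on_nodes` and `failed_hosts` are made up for this example, and it assumes key-based ssh access to every node through the system `ssh` client.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_run(host, command, timeout=600):
    """Run `command` on `host` through the system ssh client.

    Assumes key-based authentication; BatchMode makes ssh fail
    instead of prompting for a password.
    """
    proc = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, command],
        capture_output=True, text=True, timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


def run_on_nodes(hosts, command, runner=ssh_run, max_parallel=5):
    """Run `command` on every host, at most `max_parallel` at a time.

    Returns a dict mapping host -> (returncode, stdout, stderr).
    """
    def task(host):
        try:
            return host, runner(host, command)
        except Exception as exc:  # e.g. a timeout or a missing ssh binary
            return host, (255, "", str(exc))

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return dict(pool.map(task, hosts))


def failed_hosts(results):
    """Hosts whose command exited with a non-zero status."""
    return sorted(h for h, (rc, _out, _err) in results.items() if rc != 0)
```

Something like `run_on_nodes(hosts, "nodetool cleanup", max_parallel=2)` would then limit how many nodes run the command at the same time, and `failed_hosts(...)` lists the nodes to look at afterwards. The `runner` argument is there so the ssh call can be swapped for a stub when testing.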

So I would like to know: what methods do operators use on large clusters
out there? Have some of you built open source "cluster management"
interfaces or scripts that could make things easier while operating
large Cassandra clusters?

Alain

Re: Operating on large cluster

Posted by Otis Gospodnetic <ot...@gmail.com>.

Hi Alain,

We use Puppet and are introducing Ansible at Sematext. Not for Cassandra,
but for other similar tech.

Otis
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: Operating on large cluster

Posted by Roni Balthazar <ro...@gmail.com>.

Hi,

We use Puppet to manage our Cassandra configuration. (http://puppetlabs.com)

You can use Cluster SSH to send commands to the servers as well.

Another good choice is Saltstack.

Regards,

Roni


Re: Operating on large cluster

Posted by Ranjib Dey <de...@gmail.com>.

We use Chef for configuration management and Blender for on-demand jobs.

https://github.com/opscode/chef
https://github.com/PagerDuty/blender

Re: Operating on large cluster

Posted by Jens Rantil <je...@tink.se>.

Hi,


While I am nowhere close to 50+ machines, I've been using Saltstack for both configuration management and remote execution. It has worked great for me and supposedly scales to 1000+ machines.




Cheers,

Jens




Re: Operating on large cluster

Posted by Eric Plowe <er...@gmail.com>.

I am a big fan of perl-ssh-tools (https://github.com/tobert/perl-ssh-tools)
to let me manage my nodes and SVN to store configs.

~Eric Plowe



Re: Operating on large cluster

Posted by Michael Shuler <mi...@pbandjelly.org>.

On 10/23/2014 04:18 AM, Alain RODRIGUEZ wrote:
> I was wondering about how do you guys handle a large cluster (50+ machines).

Configuration management tools are awesome, until they aren't. Having
used or played with all the popular ones, and having been bitten by
failures of those tools on large clusters, my long-time preference has
been using a VCS to check configs and scripts in/out and parallel ssh
(whichever one you like). Simple is good. If you don't deeply understand
the config management system you have chosen, the unexpected may (will?)
eventually happen. To all the servers at once.

Even when you are careful, we are human. No tool can prevent *all* 
mistakes. Test everything in a staging environment, first!
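
As a rough illustration of "staging first, and never all the servers at once", here is a small Python sketch. It is not tied to any particular tool; `batches`, `staged_rollout` and the `apply` callback are hypothetical names invented for this example. The idea is simply to apply a change to staging hosts, then to production in small batches, and to stop as soon as a batch reports a failure.

```python
def batches(hosts, size):
    """Split the host list into consecutive batches of at most `size` hosts."""
    return [hosts[i:i + size] for i in range(0, len(hosts), size)]


def staged_rollout(staging, production, apply, batch_size=3):
    """Apply a change to staging first, then to production in small batches.

    `apply(host)` must return True on success. The rollout stops after the
    first batch that contains a failure, so a bad change cannot reach every
    server at once. Returns (succeeded, failed, untouched) host lists.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Staging is treated as the first batch of the plan.
    plan = [list(staging)] + batches(list(production), batch_size)
    done, failed = [], []
    for index, batch in enumerate(plan):
        for host in batch:
            (done if apply(host) else failed).append(host)
        if failed:
            # Everything in later batches is left alone.
            untouched = [h for later in plan[index + 1:] for h in later]
            return done, failed, untouched
    return done, failed, []
```

Here `apply` could wrap whatever actually pushes the change (a parallel ssh call, a config checkout from the VCS, and so on). A failure in staging leaves all of production untouched, and a failure in production is contained to one batch.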

-- 
Kind regards,
Michael

PS. even staging doesn't prevent fallibility.. :)
https://twitter.com/mshuler/status/520667739615395840