Posted to user@cassandra.apache.org by Dan Kinder <dk...@turnitin.com> on 2015/05/21 19:32:26 UTC

Multiple cassandra instances per physical node

Hi, I'd just like some clarity and advice regarding running multiple
cassandra instances on a single large machine (big JBOD array, plenty of
CPU/RAM).

First, I am aware this was not Cassandra's original design, and doing this
seems to unreasonably go against the "commodity hardware" intentions of
Cassandra's design. In general it seems to be recommended against (at least
as far as I've heard from @Rob Coli and others).

However, maybe the meaning of "commodity" is changing... my hardware/ops team
argues that due to cooling, power, and other datacenter costs, having
slightly larger nodes (>=32G RAM, >=24 CPU, >=8 disks JBOD) is actually a
better price point. I am not a hardware guy, so if this is not actually true
I'd love to hear why; otherwise I pretty much need to take them at their
word.

Cassandra seems to have improved to the point that JBOD works fairly well,
but scaling up a single instance, especially with regard to memory/GC, seems
to be reaching its limit. One Cassandra instance can only scale up so much.

So my question is: suppose I take a 12-disk JBOD and run 2 Cassandra nodes
(each with 5 data disks, 1 commit log disk) and either give each its own
container & IP or change the listen ports. Will this work? What are the
risks? Will/should Cassandra support this better in the future?
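
For illustration, the layout described above could be sketched roughly as
follows. This is only a sketch under assumptions: the mount points
(/mnt/disk00 ... /mnt/disk11) and the 10.0.0.x addresses are hypothetical, and
plan_instances is an illustrative helper, not part of Cassandra. The
dictionary keys are cassandra.yaml setting names.

```python
from typing import Dict, List


def plan_instances(disks: List[str], instances: int) -> List[Dict[str, object]]:
    """Split disk mount points evenly between Cassandra instances.

    The first disk of each share is reserved for the commit log and the
    rest hold data, mirroring "5 data disks, 1 commit log disk".
    """
    if instances <= 0 or len(disks) % instances != 0:
        raise ValueError("disks must divide evenly between instances")
    per_instance = len(disks) // instances
    if per_instance < 2:
        raise ValueError("each instance needs a commit log disk and a data disk")

    plans = []
    for i in range(instances):
        share = disks[i * per_instance:(i + 1) * per_instance]
        plans.append({
            # Keys are cassandra.yaml setting names.
            "commitlog_directory": f"{share[0]}/commitlog",
            "data_file_directories": [f"{d}/data" for d in share[1:]],
            # Hypothetical dedicated address for this instance.
            "listen_address": f"10.0.0.{11 + i}",
        })
    return plans


if __name__ == "__main__":
    mounts = [f"/mnt/disk{n:02d}" for n in range(12)]  # hypothetical mount points
    for plan in plan_instances(mounts, 2):
        print(plan)
```

No physical disk ends up shared between the two instances in this split, which
is the property the question is after.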

Re: Multiple cassandra instances per physical node

Posted by Horký, Jiří <ho...@avast.com>.
Hi,
we do operate multiple instances (possibly of different versions) of
Cassandra on rather thick nodes. The only problem we have encountered so far
was sharing the same physical data disk among multiple instances; it proved
not to be the best idea. Sharing commitlog disks has caused no trouble so far.
Other than that, it works without any problems. We manage the instances with
a set of helper scripts (which change the environment variables, so that
"nodetool" and the like operate on the right instance) and Puppet templates.

Jiri Horky
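
The helper scripts mentioned above are not shown in the thread. A minimal
sketch of the idea, assuming each instance has its own config directory and
JMX port, might look like the following. The instance names, paths and ports
are hypothetical; CASSANDRA_CONF is the variable the stock launch scripts read
to locate the config directory, and -p is nodetool's JMX port option.

```python
import os
from typing import Dict, List, Tuple

# Hypothetical per-instance settings; names, paths and ports are illustrative.
INSTANCES: Dict[str, Dict[str, str]] = {
    "node-a": {"conf": "/etc/cassandra/node-a", "jmx_port": "7199"},
    "node-b": {"conf": "/etc/cassandra/node-b", "jmx_port": "7299"},
}


def nodetool_invocation(instance: str, *args: str) -> Tuple[List[str], Dict[str, str]]:
    """Return the (command, environment) pair for running nodetool on one instance."""
    settings = INSTANCES[instance]
    env = dict(os.environ)
    # Point the launch scripts at this instance's config directory.
    env["CASSANDRA_CONF"] = settings["conf"]
    # Select the JMX port of the target instance.
    command = ["nodetool", "-p", settings["jmx_port"], *args]
    return command, env


if __name__ == "__main__":
    cmd, env = nodetool_invocation("node-b", "status")
    # Pass cmd and env to subprocess.run(cmd, env=env) to actually execute it.
    print(" ".join(cmd))
```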

RE: Multiple cassandra instances per physical node

Posted by SE...@homedepot.com.
We run 2 nodes (from 2 different rings) on the same physical host. One belongs to a ring that uses the random partitioner; the other belongs to a byte-ordered ring that supports some alphabetic range queries. Each instance has its own binary install, data directory, and ports. One limitation: a single install of the OpsCenter agent can connect to only one of the rings. We haven’t tried two OpsCenter agent installs yet.


Sean Durity
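
When two instances share one IP and differ only by ports, as described above,
it is easy to miss a port that both still claim. A rough sketch of a check
follows. The port values are hypothetical; storage_port,
native_transport_port and rpc_port are cassandra.yaml settings, while the JMX
port is normally set in cassandra-env.sh.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical port assignments for two instances on the same IP.
INSTANCE_PORTS: Dict[str, Dict[str, int]] = {
    "random-ring": {"storage_port": 7000, "native_transport_port": 9042,
                    "rpc_port": 9160, "jmx_port": 7199},
    "byteordered-ring": {"storage_port": 7100, "native_transport_port": 9142,
                         "rpc_port": 9260, "jmx_port": 7199},
}


def find_port_conflicts(
    instances: Dict[str, Dict[str, int]]
) -> Dict[int, List[Tuple[str, str]]]:
    """Return {port: [(instance, setting), ...]} for ports claimed more than once."""
    claims: Dict[int, List[Tuple[str, str]]] = defaultdict(list)
    for name, ports in instances.items():
        for setting, port in ports.items():
            claims[port].append((name, setting))
    return {port: users for port, users in claims.items() if len(users) > 1}


if __name__ == "__main__":
    for port, users in find_port_conflicts(INSTANCE_PORTS).items():
        print(f"port {port} is claimed by {users}")
```

In this made-up example the JMX port was left at the same value for both
instances, so the check reports it.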

Re: Multiple cassandra instances per physical node

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
Yep, that would be one way to handle it.

On Thu, May 21, 2015 at 2:07 PM Dan Kinder <dk...@turnitin.com> wrote:

> @James Rothering yeah I was thinking of container in a broad sense: either
> full virtual machines, docker containers, straight LXC, or whatever else
> would allow the Cassandra nodes to have their own IPs and bind to default
> ports.
>
> @Jonathan Haddad thanks for the blog post. To ensure the same host does
> not replicate its own data, would I basically need the nodes on a single
> host to be labeled as one rack? (Assuming I use vnodes)

Re: Multiple cassandra instances per physical node

Posted by Dan Kinder <dk...@turnitin.com>.
@James Rothering yeah I was thinking of container in a broad sense: either
full virtual machines, docker containers, straight LXC, or whatever else
would allow the Cassandra nodes to have their own IPs and bind to default
ports.

@Jonathan Haddad thanks for the blog post. To ensure the same host does not
replicate its own data, would I basically need the nodes on a single host
to be labeled as one rack? (Assuming I use vnodes)

-- 
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkinder@turnitin.com
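
The rack labeling asked about above could be sketched as follows, assuming
GossipingPropertyFileSnitch and NetworkTopologyStrategy are in use (the thread
does not say which snitch or strategy is configured). With that snitch, each
node reads "dc" and "rack" from cassandra-rackdc.properties. The datacenter,
host and instance names below are hypothetical.

```python
from typing import Dict, List


def rackdc_properties(datacenter: str, physical_host: str) -> str:
    """Render cassandra-rackdc.properties text using the physical host as the rack."""
    return f"dc={datacenter}\nrack={physical_host}\n"


def render_all(datacenter: str, placement: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each instance to its properties text; placement is host -> instances."""
    rendered = {}
    for host, instances in placement.items():
        for instance in instances:
            # Every instance on the same machine gets the same rack label.
            rendered[instance] = rackdc_properties(datacenter, host)
    return rendered


if __name__ == "__main__":
    # Hypothetical layout: two physical hosts, two Cassandra instances each.
    layout = {"host1": ["host1-a", "host1-b"], "host2": ["host2-a", "host2-b"]}
    for instance, text in render_all("dc1", layout).items():
        print(f"# {instance}\n{text}")
```

NetworkTopologyStrategy tries to place replicas on distinct racks, so this
only helps when there are at least as many racks (physical hosts) as replicas
in the datacenter.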

Re: Multiple cassandra instances per physical node

Posted by Sebastian Estevez <se...@datastax.com>.
JBOD --> just a bunch of disks, no RAID.

All the best,


Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.estevez@datastax.com

Re: Multiple cassandra instances per physical node

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
You could use Docker, but it's not required. You could use LXC if you
wanted.

Shameless self promo:
http://rustyrazorblade.com/2013/08/advanced-devops-with-vagrant-and-lxc/


Re: Multiple cassandra instances per physical node

Posted by James Rothering <jr...@codojo.me>.
Hmmm ... Not familiar with JBOD. Is that just RAID-0?

Also ... wrt the container talk, is that a Docker container you're talking
about?



Re: Multiple cassandra instances per physical node

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
If you run it in a container with dedicated IPs it'll work just fine. Just
be sure you aren't using the same machine to replicate its own data.
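
One way to sanity-check that, sketched under assumptions: given the replica
addresses for a key (for example as reported by "nodetool getendpoints") and a
hand-maintained map from node address to physical machine, flag any machine
that holds more than one replica. The addresses and host names here are
hypothetical.

```python
from typing import Dict, List


def replicas_sharing_a_host(
    replicas: List[str], host_of: Dict[str, str]
) -> Dict[str, List[str]]:
    """Return {physical_host: [node, ...]} for hosts holding more than one replica."""
    by_host: Dict[str, List[str]] = {}
    for node in replicas:
        by_host.setdefault(host_of[node], []).append(node)
    return {host: nodes for host, nodes in by_host.items() if len(nodes) > 1}


if __name__ == "__main__":
    # Hypothetical mapping of Cassandra node IPs to physical machines.
    host_of = {
        "10.0.0.11": "hostA", "10.0.0.12": "hostA",
        "10.0.0.21": "hostB", "10.0.0.31": "hostC",
    }
    print(replicas_sharing_a_host(["10.0.0.11", "10.0.0.12", "10.0.0.21"], host_of))
    print(replicas_sharing_a_host(["10.0.0.11", "10.0.0.21", "10.0.0.31"], host_of))
```

An empty result means every replica of that key lives on a different machine.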

Re: Multiple cassandra instances per physical node

Posted by Manoj Khangaonkar <kh...@gmail.com>.
+1.

I agree we need to be able to run multiple server instances on one physical
machine. This is especially necessary in development and test environments
where one is experimenting and needs a cluster, but does not have access to
multiple physical machines.

If you google, you can find a few blogs that talk about how to do this.

But it is less than ideal. We need to be able to do it by changing ports in
cassandra.yaml (the way it is done easily with Hadoop, Apache Kafka, Redis,
and many other distributed systems).


regards



On Thu, May 21, 2015 at 10:32 AM, Dan Kinder <dk...@turnitin.com> wrote:

> Hi, I'd just like some clarity and advice regarding running multiple
> cassandra instances on a single large machine (big JBOD array, plenty of
> CPU/RAM).
>
> First, I am aware this was not Cassandra's original design, and doing this
> seems to unreasonably go against the "commodity hardware" intentions of
> Cassandra's design. In general it seems to be recommended against (at least
> as far as I've heard from @Rob Coli and others).
>
> However maybe this term "commodity" is changing... my hardware/ops team
> argues that due to cooling, power, and other datacenter costs, having
> slightly larger nodes (>=32G RAM, >=24 CPU, >=8 disks JBOD) is actually a
> better price point. Now, I am not a hardware guy, so if this is not
> actually true I'd love to hear why, otherwise I pretty much need to take
> them at their word.
>
> Now, Cassandra features seemed to have improved such that JBOD works
> fairly well, but especially with memory/GC this seems to be reaching its
> limit. One Cassandra instance can only scale up so much.
>
> So my question is: suppose I take a 12 disk JBOD and run 2 Cassandra nodes
> (each with 5 data disks, 1 commit log disk) and either give each its own
> container & IP or change the listen ports. Will this work? What are the
> risks? Will/should Cassandra support this better in the future?
>



-- 
http://khangaonkar.blogspot.com/

Re: Multiple cassandra instances per physical node

Posted by Ben Bromhead <be...@instaclustr.com>.
@Sean -> You can manually change the ports used by the DataStax agent using
the address.yaml file in the agent install directory.

+1 on using racks to separate it out, but it will increase operational
complexity somewhat.

On 26 May 2015 at 08:11, Nate McCall <na...@thelastpickle.com> wrote:

>
>> If you're running multiple nodes on a single server, vnodes give you no
>> control over which instance has which key (whereas you can assign initial
>> tokens).  Therefore you could have two of your three replicas on the same
>> physical server which, if it goes down, you can't read or write at quorum.
>>
>
> Yep. You *will* have overlapping ranges on each physical server so long as
> Vnodes > 'number of nodes in the cluster'.
>
>
>
>>
>> However, can't you use the topology snitch to put both nodes in the same
>> rack?  Won't that prevent the issue and still allow you to maintain quorum
>> if a single server goes down?  If I have a 20-node cluster with 2 nodes on
>> each physical server, can I use 10 racks to properly segment my partitions?
>>
>
> That's a good point, yes. I'd still personally prefer the operational
> simplicity of simply spacing out token assignments though, but YMMV.
>
>
>
> --
> -----------------
> Nate McCall
> Austin, TX
> @zznate
>
> Co-Founder & Sr. Technical Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>



-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692

Re: Multiple cassandra instances per physical node

Posted by Nate McCall <na...@thelastpickle.com>.
>
>
> If you're running multiple nodes on a single server, vnodes give you no
> control over which instance has which key (whereas you can assign initial
> tokens).  Therefore you could have two of your three replicas on the same
> physical server which, if it goes down, you can't read or write at quorum.
>

Yep. You *will* have overlapping ranges on each physical server so long as
Vnodes > 'number of nodes in the cluster'.



>
> However, can't you use the topology snitch to put both nodes in the same
> rack?  Won't that prevent the issue and still allow you to maintain quorum
> if a single server goes down?  If I have a 20-node cluster with 2 nodes on
> each physical server, can I use 10 racks to properly segment my partitions?
>

That's a good point, yes. Personally I'd still prefer the operational
simplicity of spacing out token assignments, but YMMV.
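
A rough sketch of what "spacing out token assignments" could look like,
assuming Murmur3Partitioner (token range -2^63 to 2^63-1), one token per
instance (set through initial_token, no vnodes), and a simple clockwise
replica walk with no rack awareness. The function names and the
interleaving scheme are illustrative assumptions, not something prescribed
in this thread.

```python
# Sketch: evenly spaced single tokens, ordered so that instances sharing a
# physical server are never ring neighbours. Assumes Murmur3Partitioner.

MIN_TOKEN = -2**63
RING_SIZE = 2**64


def interleaved_tokens(servers, instances_per_server):
    """Return (token, server, slot) tuples in ring order.

    Ring position k goes to server k % servers, so two instances on the same
    server are always `servers` positions apart on the ring.
    """
    total = servers * instances_per_server
    step = RING_SIZE // total
    assignment = []
    for pos in range(total):
        token = MIN_TOKEN + pos * step
        assignment.append((token, pos % servers, pos // servers))
    return assignment


def max_replicas_per_server(assignment, rf):
    """Worst case, over all ranges, of replicas held by one physical server.

    Replicas are the owner of a range plus the next rf - 1 instances
    clockwise, which is the no-rack-awareness placement assumed here.
    """
    n = len(assignment)
    worst = 0
    for start in range(n):
        owners = [assignment[(start + k) % n][1] for k in range(rf)]
        worst = max(worst, max(owners.count(s) for s in set(owners)))
    return worst


if __name__ == "__main__":
    ring = interleaved_tokens(servers=10, instances_per_server=2)
    for token, server, slot in ring:
        print(f"server {server} instance {slot}: initial_token = {token}")
    print("max replicas on one server at RF=3:", max_replicas_per_server(ring, 3))
```

With this ordering no range should have two replicas on the same physical
server as long as the number of servers is at least RF, so losing one box
still leaves a quorum.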



-- 
-----------------
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

Re: Multiple cassandra instances per physical node

Posted by Jake Luciani <ja...@gmail.com>.
>
>  If I have a 20-node cluster with 2 nodes on each physical server, can I
> use 10 racks to properly segment my partitions?
>
>
Yes.


>
>
> On Sun, May 24, 2015 at 5:38 PM, Jonathan Haddad <jo...@jonhaddad.com>
> wrote:
>
>> What impact would vnodes have on strong consistency?  I think the problem
>> you're describing exists with or without them.
>>
>> On Sat, May 23, 2015 at 2:30 PM Nate McCall <na...@thelastpickle.com>
>> wrote:
>>
>>>
>>>> So my question is: suppose I take a 12 disk JBOD and run 2 Cassandra
>>>> nodes (each with 5 data disks, 1 commit log disk) and either give each its
>>>> own container & IP or change the listen ports. Will this work? What are the
>>>> risks? Will/should Cassandra support this better in the future?
>>>>
>>>
>>> Don't use vnodes if any operations need strong consistency (reading or
>>> writing at quorum). Otherwise, at RF=3, if you lose a single node you will
>>> have only one replica left for some portion of the ring.
>>>
>>>
>>>
>>> --
>>> -----------------
>>> Nate McCall
>>> Austin, TX
>>> @zznate
>>>
>>> Co-Founder & Sr. Technical Consultant
>>> Apache Cassandra Consulting
>>> http://www.thelastpickle.com
>>>
>>
>
>
> --
> *Ken Hancock *| System Architect, Advanced Advertising
> SeaChange International
> 50 Nagog Park
> Acton, Massachusetts 01720
> ken.hancock@schange.com | www.schange.com | NASDAQ:SEAC
> <http://www.schange.com/en-US/Company/InvestorRelations.aspx>
> Office: +1 (978) 889-3329 | Google Talk: ken.hancock@schange.com
> Skype: hancockks | Yahoo IM: hancockks
> LinkedIn: <http://www.linkedin.com/in/kenhancock>
>
> This e-mail and any attachments may contain
> information which is SeaChange International confidential. The information
> enclosed is intended only for the addressees herein and may not be copied
> or forwarded without permission from SeaChange International.
>



-- 
http://twitter.com/tjake

Re: Multiple cassandra instances per physical node

Posted by Ken Hancock <ke...@schange.com>.
I had the exact same question, but I think this is what Nate was thinking:

If you're running multiple nodes on a single server, vnodes give you no
control over which instance has which key (whereas you can assign initial
tokens).  Therefore you could have two of your three replicas on the same
physical server, and if it goes down you can't read or write at quorum.

However, can't you use the topology snitch to put both nodes in the same
rack?  Won't that prevent the issue and still allow you to maintain quorum
if a single server goes down?  If I have a 20-node cluster with 2 nodes on
each physical server, can I use 10 racks to properly segment my partitions?
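
To make the two placements concrete, here is a small simulation sketch. It
is a deliberate simplification and not Cassandra's actual code: vnode
tokens are drawn at random, "simple" placement takes the next RF distinct
instances clockwise, and "rack-aware" placement additionally skips any
instance whose rack is already used, with one rack per physical server.
All function names are hypothetical.

```python
# Sketch: count token ranges whose replicas share a physical server, with and
# without rack awareness. Simplified model, not Cassandra's real algorithm.
import random


def build_ring(servers, instances_per_server, vnodes, seed=0):
    """Return a sorted list of (token, instance_name, server) with random tokens."""
    rng = random.Random(seed)
    ring = []
    for s in range(servers):
        for i in range(instances_per_server):
            for _ in range(vnodes):
                ring.append((rng.randrange(-2**63, 2**63), f"s{s}-i{i}", s))
    ring.sort()
    return ring


def replicas_simple(ring, start, rf):
    """Walk clockwise from ring[start], taking the first rf distinct instances."""
    chosen, seen = [], set()
    for step in range(len(ring)):
        _, inst, server = ring[(start + step) % len(ring)]
        if inst not in seen:
            seen.add(inst)
            chosen.append((inst, server))
            if len(chosen) == rf:
                break
    return chosen


def replicas_rack_aware(ring, start, rf):
    """Same walk, but skip instances whose rack (= physical server) is taken."""
    chosen, racks = [], set()
    for step in range(len(ring)):
        _, inst, server = ring[(start + step) % len(ring)]
        if server not in racks:
            racks.add(server)
            chosen.append((inst, server))
            if len(chosen) == rf:
                break
    return chosen


def colocated_ranges(ring, rf, pick):
    """Count ranges where fewer than rf distinct servers hold the replicas."""
    count = 0
    for start in range(len(ring)):
        servers = {server for _, server in pick(ring, start, rf)}
        if len(servers) < rf:
            count += 1
    return count


if __name__ == "__main__":
    ring = build_ring(servers=10, instances_per_server=2, vnodes=16)
    print("ranges sharing a server, simple:", colocated_ranges(ring, 3, replicas_simple))
    print("ranges sharing a server, rack-aware:", colocated_ranges(ring, 3, replicas_rack_aware))
```

In this model the rack-aware count is zero whenever there are at least RF
racks, which matches the intuition behind using 10 racks for 10 physical
servers.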



On Sun, May 24, 2015 at 5:38 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:

> What impact would vnodes have on strong consistency?  I think the problem
> you're describing exists with or without them.
>
> On Sat, May 23, 2015 at 2:30 PM Nate McCall <na...@thelastpickle.com>
> wrote:
>
>>
>>> So my question is: suppose I take a 12 disk JBOD and run 2 Cassandra
>>> nodes (each with 5 data disks, 1 commit log disk) and either give each its
>>> own container & IP or change the listen ports. Will this work? What are the
>>> risks? Will/should Cassandra support this better in the future?
>>>
>>
>> Don't use vnodes if any operations need strong consistency (reading or
>> writing at quorum). Otherwise, at RF=3, if you lose a single node you will
>> have only one replica left for some portion of the ring.
>>
>>
>>
>> --
>> -----------------
>> Nate McCall
>> Austin, TX
>> @zznate
>>
>> Co-Founder & Sr. Technical Consultant
>> Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>


-- 
*Ken Hancock *| System Architect, Advanced Advertising
SeaChange International
50 Nagog Park
Acton, Massachusetts 01720
ken.hancock@schange.com | www.schange.com | NASDAQ:SEAC
<http://www.schange.com/en-US/Company/InvestorRelations.aspx>
Office: +1 (978) 889-3329 | Google Talk: ken.hancock@schange.com
Skype: hancockks | Yahoo IM: hancockks
LinkedIn: <http://www.linkedin.com/in/kenhancock>

This e-mail and any attachments may contain
information which is SeaChange International confidential. The information
enclosed is intended only for the addressees herein and may not be copied
or forwarded without permission from SeaChange International.

Re: Multiple cassandra instances per physical node

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
What impact would vnodes have on strong consistency?  I think the problem
you're describing exists with or without them.

On Sat, May 23, 2015 at 2:30 PM Nate McCall <na...@thelastpickle.com> wrote:

>
>> So my question is: suppose I take a 12 disk JBOD and run 2 Cassandra
>> nodes (each with 5 data disks, 1 commit log disk) and either give each its
>> own container & IP or change the listen ports. Will this work? What are the
>> risks? Will/should Cassandra support this better in the future?
>>
>
> Don't use vnodes if any operations need strong consistency (reading or
> writing at quorum). Otherwise, at RF=3, if you lose a single node you will
> have only one replica left for some portion of the ring.
>
>
>
> --
> -----------------
> Nate McCall
> Austin, TX
> @zznate
>
> Co-Founder & Sr. Technical Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>

Re: Multiple cassandra instances per physical node

Posted by Nate McCall <na...@thelastpickle.com>.
>
>
> So my question is: suppose I take a 12 disk JBOD and run 2 Cassandra nodes
> (each with 5 data disks, 1 commit log disk) and either give each its own
> container & IP or change the listen ports. Will this work? What are the
> risks? Will/should Cassandra support this better in the future?
>

Don't use vnodes if any operations need strong consistency (reading or
writing at quorum). Otherwise, at RF=3, if you lose a single node you will
have only one replica left for some portion of the ring.



-- 
-----------------
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

Re: Multiple cassandra instances per physical node

Posted by Carlos Rolo <ro...@pythian.com>.
Hi,

I also advise against multiple instances on the same hardware. If you have
really big boxes, why not virtualize?

Another option is to experiment with CCM, although CCM has some limitations
(e.g. JNA is disabled).

If you follow up on this, I would like to hear how it went.

On 21/05/2015 19:33, "Dan Kinder" <dk...@turnitin.com> wrote:

> Hi, I'd just like some clarity and advice regarding running multiple
> cassandra instances on a single large machine (big JBOD array, plenty of
> CPU/RAM).
>
> First, I am aware this was not Cassandra's original design, and doing this
> seems to unreasonably go against the "commodity hardware" intentions of
> Cassandra's design. In general it seems to be recommended against (at least
> as far as I've heard from @Rob Coli and others).
>
> However maybe this term "commodity" is changing... my hardware/ops team
> argues that due to cooling, power, and other datacenter costs, having
> slightly larger nodes (>=32G RAM, >=24 CPU, >=8 disks JBOD) is actually a
> better price point. Now, I am not a hardware guy, so if this is not
> actually true I'd love to hear why, otherwise I pretty much need to take
> them at their word.
>
> Now, Cassandra features seemed to have improved such that JBOD works
> fairly well, but especially with memory/GC this seems to be reaching its
> limit. One Cassandra instance can only scale up so much.
>
> So my question is: suppose I take a 12 disk JBOD and run 2 Cassandra nodes
> (each with 5 data disks, 1 commit log disk) and either give each its own
> container & IP or change the listen ports. Will this work? What are the
> risks? Will/should Cassandra support this better in the future?
>

-- 


--