Posted to user@cassandra.apache.org by Mimi Aluminium <mi...@gmail.com> on 2011/02/16 17:43:25 UTC

cluster size, several cluster on one node for multi-tenancy

Hi,
We are interested in a multi-tenancy environment that may consist of up to
hundreds of data centers. The current design requires cross-rack and cross-DC
replication. Specifically, each tenant's CFs will be replicated 6 times: across
three racks, with 2 copies inside each rack, and the racks located in at least
two different DCs. Other replication policies will be considered in the future.
The application will decide where (which racks and DCs) to place each tenant's
replicas, and one rack may hold more than one tenant.

Separating each tenant into a different keyspace, as was suggested in a
previous mail thread on this subject, seems to be a good approach (assuming
the memtable problem can be solved somehow). But we have a concern about the
cluster size. Here are my questions:
1) Given the above, should I define one Cassandra cluster that holds all the
DCs? That does not sound reasonable given hundreds of DCs with tens of servers
in each. Where is the bottleneck here: keep-alive messages, gossip, request
routing? What is the largest number of servers a cluster can bear?
2) Now assuming I create the per-tenant keyspace only for the servers in the
three racks where the replicas are held, does such a definition reduce the
message traffic among the other servers? Does Cassandra optimize message
transfer in that case?
3) Another possible solution is to create a separate cluster for each tenant.
But that can lead to a situation where one server has to participate in two or
more Cassandra clusters. Can we run more than one cluster in parallel? Does
that mean two Cassandra daemons/instances on one server? What would the
overhead be? Do you have a link that explains how to deal with this?

Can you please help me decide which of these solutions can work, or suggest
something else?
Thanks a lot,
Mimi
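The placement policy described above (three racks, two copies per rack, racks
spanning at least two DCs) can be sketched as a toy model. All rack and DC
names and the round-robin choice below are illustrative; this is not
Cassandra's replica-placement code.

```python
# Toy model of the placement policy: choose three racks round-robin across
# DCs (so the choice spans at least two of them), then put two replica
# copies in each chosen rack.

def place_replicas(racks_by_dc, n_racks=3, copies_per_rack=2, min_dcs=2):
    dcs = sorted(racks_by_dc)
    chosen = []
    i = 0
    while len(chosen) < n_racks:
        dc = dcs[i % len(dcs)]
        free = [r for r in racks_by_dc[dc] if (dc, r) not in chosen]
        if free:
            chosen.append((dc, free[0]))
        i += 1
    assert len({dc for dc, _ in chosen}) >= min_dcs, "need racks in >= 2 DCs"
    return {f"{dc}/{rack}": copies_per_rack for dc, rack in chosen}

layout = place_replicas({"DC1": ["rack1", "rack2"], "DC2": ["rack3", "rack4"]})
# Three racks, two copies each -> six replicas spanning both DCs.
```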

Re: cluster size, several cluster on one node for multi-tenancy

Posted by Mimi Aluminium <mi...@gmail.com>.
Nick,
Assuming I have a tenant with only one CF, and I am using a NetworkAware
replication strategy where the keys of this CF are replicated 3 times, each
copy in a different DC (DC1, DC2, DC3):
Now let's assume the cluster holds 5 DCs. As far as I understand, only the
servers in the three DCs that hold a copy will build this CF's memtable. The
servers in the other 2 DCs (DC4, DC5) won't have any trace of this CF or this
keyspace. Am I correct?

I have an additional, more basic question:
Is there a way to define two clusters on the same node? Is it done by
configuration in the storage-conf file, or does it mean an additional
Cassandra daemon?
Thanks a lot,
Miriam
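The scenario in the question — three of five DCs holding a copy — can be
sketched with a toy placement function in the spirit of Cassandra's
NetworkTopologyStrategy. The node names, hash, and ring layout are all
invented for illustration; this is not Cassandra's actual token ring.

```python
# Replicas are placed only in the DCs named in the keyspace's strategy
# options, so nodes in the remaining DCs never store anything for that
# keyspace (or build memtables for its CFs).
import hashlib

nodes = {f"DC{d}": [f"dc{d}-node{n}" for n in (1, 2, 3)] for d in (1, 2, 3, 4, 5)}
strategy_options = {"DC1": 1, "DC2": 1, "DC3": 1}  # one copy per listed DC

def replicas_for(row_key):
    h = int(hashlib.md5(row_key.encode()).hexdigest(), 16)
    placed = []
    for dc in sorted(strategy_options):
        ring = nodes[dc]
        for i in range(strategy_options[dc]):
            placed.append(ring[(h + i) % len(ring)])
    return placed

owners = {node for key in ("row1", "row2", "row3") for node in replicas_for(key)}
# owners contains only dc1-/dc2-/dc3- nodes; DC4 and DC5 stay empty.
```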

On Fri, Feb 18, 2011 at 12:08 PM, Nick Telford <ni...@gmail.com> wrote:

> Large numbers of keyspaces/column-families are not a good idea, as each
> column-family memtable requires its own memory. If you have 1000 tenants in
> the same cluster, each with only 1 CF, then regardless of the cluster size
> *every* node will require 1 memtable per tenant CF - 1000 memtables.
>
> This limitation is the primary reason for workarounds (such as "virtual
> keyspaces") to enable multi-tenant setups.
>
> You might have more luck partitioning tenants into different clusters, but
> then you end up with potential hot-spots (where more active tenants generate
> more load on a specific cluster).
>
> Regards,
> Nick
>
>
> On 18 February 2011 09:55, Mimi Aluminium <mi...@gmail.com> wrote:
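Nick's memtable arithmetic can be made concrete with a back-of-the-envelope
calculation. The 64 MB per-memtable flush threshold below is an assumed
example figure, not a measured one.

```python
# Every node in the cluster carries one memtable per column family, no
# matter how the data is partitioned across nodes.
tenants = 1000
cfs_per_tenant = 1
memtable_threshold_mb = 64  # assumed per-memtable flush threshold

memtables_per_node = tenants * cfs_per_tenant
worst_case_heap_mb = memtables_per_node * memtable_threshold_mb
# 1000 memtables x 64 MB = 64000 MB (~62.5 GB) of heap in the worst case,
# which is why a keyspace-per-tenant design stops scaling well before that.
```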

Re: cluster size, several cluster on one node for multi-tenancy

Posted by Mimi Aluminium <mi...@gmail.com>.
Thanks a lot for your suggestions.
I will check the virtual keyspace solution. By the way, I am currently using a
Thrift client with Pycassa and am not familiar with Hector - does this mean
we'll need to move to the Hector client?

I thought of using a keyspace per tenant, but I don't understand how to define
the whole cluster. Assuming the tenants are distributed (replicated) across
hundreds of DCs, each consisting of tens of racks and servers, can I define a
single Cassandra cluster for all the servers? That does not seem reasonable,
which is why I thought of separating the clusters. How would you solve it?
Thanks,
Miriam



On Thu, Feb 17, 2011 at 10:30 PM, Nate McCall <na...@datastax.com> wrote:


Re: cluster size, several cluster on one node for multi-tenancy

Posted by Nate McCall <na...@datastax.com>.
Hector's virtual keyspaces would work well for what you describe. Ed Anuff,
who added this feature to Hector, showed me a working multi-tenancy app the
other day, and it worked quite well.

On Thu, Feb 17, 2011 at 1:44 PM, Norman Maurer <no...@apache.org> wrote:
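Hector's virtual keyspaces work roughly by transparently prefixing each row
key with the virtual keyspace name, so many tenants share one real keyspace
(and one set of memtables). Hector is a Java client; the sketch below
re-creates the idea in Python, with a plain dict standing in for a column
family and the separator character chosen arbitrarily.

```python
# Minimal sketch of the virtual-keyspace idea: one shared store, row keys
# namespaced per tenant so identical logical keys never collide.

class VirtualKeyspace:
    SEP = "!"  # illustrative separator

    def __init__(self, store, tenant):
        self.store, self.tenant = store, tenant

    def _key(self, row_key):
        return self.tenant + self.SEP + row_key

    def insert(self, row_key, columns):
        self.store[self._key(row_key)] = dict(columns)

    def get(self, row_key):
        return self.store[self._key(row_key)]

cf = {}  # one shared "column family"
a = VirtualKeyspace(cf, "tenant_a")
b = VirtualKeyspace(cf, "tenant_b")
a.insert("user1", {"name": "alice"})
b.insert("user1", {"name": "bob"})
# Same logical row key, no collision: the stored keys differ by prefix.
```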

Re: cluster size, several cluster on one node for multi-tenancy

Posted by Norman Maurer <no...@apache.org>.
Maybe you could make use of "Virtual Keyspaces".

See this wiki for the idea:
https://github.com/rantav/hector/wiki/Virtual-Keyspaces

Bye,
Norman

2011/2/17 Frank LoVecchio <fr...@isidorey.com>:

Re: cluster size, several cluster on one node for multi-tenancy

Posted by Frank LoVecchio <fr...@isidorey.com>.
Why not just create some sort of ACL on the client side and use one
Keyspace?  It's a lot less management.

On Thu, Feb 17, 2011 at 12:34 PM, Mimi Aluminium <mi...@gmail.com> wrote:



-- 
Frank LoVecchio
Senior Software Engineer | Isidorey, LLC
Google Voice +1.720.295.9179
isidorey.com | facebook.com/franklovecchio | franklovecchio.com
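Frank's suggestion - one keyspace plus a client-side ACL - can be sketched as
a thin wrapper that refuses to touch rows outside the calling tenant's
namespace. The class name and key-prefix scheme are illustrative, not an
existing library API.

```python
# Client-side tenant isolation over a single shared keyspace: every row key
# must start with the tenant's own prefix, enforced before any read/write.

class AclClient:
    def __init__(self, store, tenant):
        self.store, self.tenant = store, tenant

    def _check(self, row_key):
        if not row_key.startswith(self.tenant + ":"):
            raise PermissionError(f"{self.tenant} may not access {row_key}")

    def insert(self, row_key, columns):
        self._check(row_key)
        self.store[row_key] = dict(columns)

    def get(self, row_key):
        self._check(row_key)
        return self.store[row_key]

store = {}
client = AclClient(store, "tenant_a")
client.insert("tenant_a:user1", {"name": "alice"})
# client.get("tenant_b:user1") would raise PermissionError.
```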

Re: cluster size, several cluster on one node for multi-tenancy

Posted by Mimi Aluminium <mi...@gmail.com>.
Hi,
I really need your help in this matter.
I will try to simplify my problem and ask specific questions.

I am thinking of solving the multi-tenancy problem by providing a separate
cluster for each tenant. Does that sound reasonable?
I could end up with one node belonging to several clusters.
Does Cassandra support several clusters per node? Does it mean several
Cassandra daemons on each node? Do you recommend doing that? What is the
overhead? Is there a link that explains how to do that?

Thanks a lot,
Mimi


On Wed, Feb 16, 2011 at 6:43 PM, Mimi Aluminium <mi...@gmail.com> wrote:

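On the recurring question above: a single Cassandra daemon belongs to exactly
one cluster, so two clusters sharing a machine means two daemons (two JVMs,
hence roughly double the memory overhead). Each process needs its own
configuration with a distinct cluster name, ports, and directories - roughly,
in 0.7-style cassandra.yaml terms (all values below are illustrative):

```yaml
# instance-2/cassandra.yaml -- a second daemon on the same host (sketch)
cluster_name: 'TenantB'            # must differ from the first instance
listen_address: 192.168.0.10       # or a second IP; ports below must not clash
storage_port: 7001                 # first instance uses the default 7000
rpc_port: 9161                     # first instance uses the default 9160
data_file_directories:
    - /var/lib/cassandra2/data
commitlog_directory: /var/lib/cassandra2/commitlog
saved_caches_directory: /var/lib/cassandra2/saved_caches
# The JMX port (set in conf/cassandra-env.sh) must also differ per instance.
```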
