Posted to user@cassandra.apache.org by Guofeng Zhang <gu...@gmail.com> on 2011/08/25 07:08:05 UTC

For multi-tenant, is it good to have a key space for each tenant?

I wonder whether it is good practice to create a keyspace for each tenant.
Any advice is appreciated.

Thanks

Re: For multi-tenant, is it good to have a key space for each tenant?

Posted by Ryan Lowe <ry...@gmail.com>.
I've been doing multi-tenancy with Cassandra for a while, and from what I
have found, it is better to keep the number of keyspaces down. That said,
I now use composite keys for multi-tenancy, and it works great:

Column Family: User
Key:  [AccountId]/[UserId]

This is especially handy if you use the order-preserving partitioner,
which supports range queries over row keys. For example, if I want all of
the users in account 14, I can do this range query:

get User["14/":"14/~"];
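
To see why that range covers exactly account 14, here is a minimal Python
sketch that models row keys as strings kept in sorted order, roughly the way
an order-preserving partitioner would keep them. It is an illustration only,
not Cassandra client code: make_key, range_slice and the sample user IDs are
made up, and it assumes user IDs only contain ASCII characters that sort
below "~".

```python
from bisect import bisect_left, bisect_right


def make_key(account_id, user_id):
    """Build a composite row key of the form [AccountId]/[UserId]."""
    return f"{account_id}/{user_id}"


def range_slice(sorted_keys, start, end):
    """Return every key k with start <= k <= end from an already sorted list."""
    lo = bisect_left(sorted_keys, start)
    hi = bisect_right(sorted_keys, end)
    return sorted_keys[lo:hi]


# Hypothetical rows spread over several accounts, including account 140,
# whose keys share the textual prefix "14" but not the prefix "14/".
keys = sorted(
    make_key(account, user)
    for account, user in [
        (14, "alice"),
        (14, "bob"),
        (140, "carol"),
        (15, "erin"),
        (2, "dave"),
    ]
)

# "/" sorts below "0", and "~" sorts above ASCII letters and digits, so the
# range "14/" .. "14/~" brackets the keys of account 14 and nothing else.
users_in_14 = range_slice(keys, "14/", "14/~")
print(users_in_14)
```

Note that account 140 is excluded only because of the "/" separator; a range
on the bare prefix "14" would pick it up as well.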

But I am no great expert... just someone who is trying and loving cassandra!

Ryan

On Thu, Aug 25, 2011 at 1:20 AM, Himanshi Sharma <hi...@tcs.com> wrote:

>
> I am working on something similar. As far as I know, creating a
> keyspace for each tenant would impose a lot of memory constraints.
>
> A shared keyspace with shared column families would be a better
> approach, with each row in a CF referenced by using tenant_id as the
> row key. Again, it depends on the type of application.
>
> This is just a suggestion; I'm not completely sure. :)
>
>
> Himanshi Sharma
>
> =====-----=====-----=====
> Notice: The information contained in this e-mail
> message and/or attachments to it may contain
> confidential or privileged information. If you are
> not the intended recipient, any dissemination, use,
> review, distribution, printing or copying of the
> information contained in this e-mail message
> and/or attachments to it are strictly prohibited. If
> you have received this communication in error,
> please notify us by reply e-mail or telephone and
> immediately and permanently delete the message
> and any attachments. Thank you
>
>
>

Re: For multi-tenant, is it good to have a key space for each tenant?

Posted by Edward Capriolo <ed...@gmail.com>.
On Fri, Oct 7, 2011 at 10:47 AM, David McNelis
<dm...@agentisenergy.com> wrote:

> So at the end of the day, it's going to be limited by available memory,
> then? Going by this line:
>
> Do note that a minimum of 1MB per memtable is used by the per-memtable
> arena allocator <https://issues.apache.org/jira/browse/CASSANDRA-2252> also
> introduced in 1.0, which is worth keeping in mind if you are looking at
> going from thousands to tens of thousands of ColumnFamilies.
>
> Then you'd be looking at a requirement of 1 GB of memory for every 1024
> column families you had on your cluster, regardless of the number of
> keyspaces they lived in. So in order to have tens of thousands of CFs,
> one would need tens of GB of RAM on each node just to handle that
> overhead... at least as of 1.0.
>
> --
> *David McNelis*
> Lead Software Engineer
> Agentis Energy
> www.agentisenergy.com
> o: 630.359.6395
> c: 219.384.5143
>
> *A Smart Grid technology company focused on helping consumers of energy
> control an often under-managed resource.*
>
>
>
You do not want to use multiple keyspaces, because a Cassandra client needs
to make an RPC call to change keyspace. There would be no effective way to
keep a connection pool warm for each keyspace. Imagine you have 1000
keyspaces and 1000 servers. That is a lot of connections.
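
As a rough illustration of the scale involved, the sketch below counts the
connections one client would hold under two hypothetical pooling strategies:
a warm pool per (server, keyspace) pair, versus one pool per server that
switches keyspace with an RPC when needed. The function name and the
one-connection-per-pool default are assumptions made for the example, not
taken from any real client library.

```python
def warm_connections(num_servers, num_keyspaces, per_keyspace_pools, conns_per_pool=1):
    """Count the connections a single client keeps open.

    per_keyspace_pools=True  -> one pool per (server, keyspace) pair
    per_keyspace_pools=False -> one pool per server, keyspace changed on demand
    """
    pools_per_server = num_keyspaces if per_keyspace_pools else 1
    return num_servers * pools_per_server * conns_per_pool


dedicated = warm_connections(1000, 1000, per_keyspace_pools=True)
shared = warm_connections(1000, 1000, per_keyspace_pools=False)

print(f"pool per keyspace: {dedicated:,} connections per client")
print(f"pool per server:   {shared:,} connections per client")
```

The second strategy keeps the connection count manageable, but every switch
between tenants then costs the extra round trip described above.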

This shortcoming of having to make an RPC to change keyspace was my
motivation for suggesting:

https://issues.apache.org/jira/browse/CASSANDRA-3130

Re: For multi-tenant, is it good to have a key space for each tenant?

Posted by David McNelis <dm...@agentisenergy.com>.
So at the end of the day, it's going to be limited by available memory,
then? Going by this line:

Do note that a minimum of 1MB per memtable is used by the per-memtable arena
allocator <https://issues.apache.org/jira/browse/CASSANDRA-2252> also
introduced in 1.0, which is worth keeping in mind if you are looking at
going from thousands to tens of thousands of ColumnFamilies.

Then you'd be looking at a requirement of 1 GB of memory for every 1024
column families you had on your cluster, regardless of the number of
keyspaces they lived in. So in order to have tens of thousands of CFs,
one would need tens of GB of RAM on each node just to handle that
overhead... at least as of 1.0.
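
The arithmetic above can be sketched as follows. This is a
back-of-the-envelope lower bound only: it assumes one live memtable per
column family on each node and takes the 1MB minimum from the quoted blog
post; the function name is made up for the example.

```python
MB_PER_MEMTABLE = 1  # minimum arena allocation per memtable, per the quoted line


def min_arena_overhead_gb(num_column_families, mb_per_memtable=MB_PER_MEMTABLE):
    """Lower bound on per-node memtable arena memory, in GB (1 GB = 1024 MB).

    Assumes one live memtable per column family on the node.
    """
    return num_column_families * mb_per_memtable / 1024


for n in (1024, 10_000, 50_000):
    print(f"{n:>6} CFs -> at least {min_arena_overhead_gb(n):.1f} GB per node")
```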

On Fri, Oct 7, 2011 at 9:40 AM, Jonathan Ellis <jb...@gmail.com> wrote:

> On Fri, Oct 7, 2011 at 9:36 AM, David McNelis
> <dm...@agentisenergy.com> wrote:
> > In some documentation I've read it says that
> > keyspaces take up the majority of the resources
>
> This has never been the case.
>
> > in a couple of older
> > threads they talked about getting the number of column families down.
>
> This was good advice pre-0.8.
>
> I covered the state of 0.8 and 1.0 here:
>
> http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>




Re: For multi-tenant, is it good to have a key space for each tenant?

Posted by Jonathan Ellis <jb...@gmail.com>.
On Fri, Oct 7, 2011 at 9:36 AM, David McNelis
<dm...@agentisenergy.com> wrote:
> In some documentation I've read it says that
> keyspaces take up the majority of the resources

This has never been the case.

> in a couple of older
> threads they talked about getting the number of column families down.

This was good advice pre-0.8.

I covered the state of 0.8 and 1.0 here:
http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: For multi-tenant, is it good to have a key space for each tenant?

Posted by David McNelis <dm...@agentisenergy.com>.
Reviving this thread...

Say you want to enable potentially thousands of tenants, each with their own
set of column families. In this situation, a keyspace for each tenant
wouldn't work, it would seem. What issues would we likely see if we were to
build out thousands of column families (where each column family name
contains a unique identifier for the tenant)? Some documentation I've read
says that keyspaces take up the majority of the resources, and in a couple of
older threads people talked about getting the number of column families down.

At the end of the day, the best part of Cassandra is not having to
shard... but in this instance, it seems like multiple clusters would make
the most sense, where you'd need to find the keyspace/column family sweet
spot for your application.

Ideally, I'd think you'd want to be able to have a keyspace for each tenant,
then as many column families as that tenant required, and be able to do it
on a single cluster.

Has anyone come up with a rule of thumb regarding the number of keyspaces /
column families?

On Thu, Aug 25, 2011 at 1:13 PM, Nate McCall <na...@datastax.com> wrote:

> We have a 'virtual keyspaces' feature baked into the Hector client
> that might be of interest:
> https://github.com/rantav/hector/wiki/Virtual-Keyspaces
>




Re: For multi-tenant, is it good to have a key space for each tenant?

Posted by Nate McCall <na...@datastax.com>.
We have a 'virtual keyspaces' feature baked into the Hector client
that might be of interest:
https://github.com/rantav/hector/wiki/Virtual-Keyspaces
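
The general idea behind this kind of client-side virtual keyspace, as far as
I understand it, is that every row key is transparently prefixed with a
per-tenant value, so many tenants can share one physical keyspace and column
family. The toy Python model below only illustrates that idea; it is not
Hector's API (Hector is a Java client), and the class name, method names and
prefixes are all made up for the example.

```python
class VirtualKeyspace:
    """Toy model: one shared store, every row key prefixed per tenant."""

    def __init__(self, shared_store, tenant_prefix):
        self._store = shared_store    # dict standing in for a shared column family
        self._prefix = tenant_prefix  # hypothetical per-tenant key prefix

    def _physical_key(self, key):
        # The caller never sees the prefix; it is added on every read and write.
        return f"{self._prefix}{key}"

    def insert(self, key, column, value):
        self._store.setdefault(self._physical_key(key), {})[column] = value

    def get(self, key, column):
        row = self._store.get(self._physical_key(key), {})
        return row.get(column)


shared = {}
tenant_a = VirtualKeyspace(shared, "tenantA:")
tenant_b = VirtualKeyspace(shared, "tenantB:")

# Both tenants use the same logical key without colliding.
tenant_a.insert("user1", "name", "Alice")
tenant_b.insert("user1", "name", "Bob")

print(tenant_a.get("user1", "name"))
print(tenant_b.get("user1", "name"))
print(sorted(shared))
```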

On Thu, Aug 25, 2011 at 8:23 AM, Terje Marthinussen
<tm...@gmail.com> wrote:
>
> It depends, of course, a lot on how many tenants you have.
> Hopefully the new off-heap memtables in 1.0 may help as well, as Java GC on large heaps is becoming a much bigger issue than memory cost.
> Regards,
> Terje

Re: For multi-tenant, is it good to have a key space for each tenant?

Posted by Terje Marthinussen <tm...@gmail.com>.
It depends, of course, a lot on how many tenants you have.

Hopefully the new off-heap memtables in 1.0 may help as well, as Java GC on large heaps is becoming a much bigger issue than memory cost.

Regards,
Terje

On 25 Aug 2011, at 14:20, Himanshi Sharma <hi...@tcs.com> wrote:

> 
> I am working on something similar. As far as I know, creating a keyspace for each tenant would impose a lot of memory constraints.
>
> A shared keyspace with shared column families would be a better approach, with each row in a CF referenced by using tenant_id as the row key.
> Again, it depends on the type of application.
>
> This is just a suggestion; I'm not completely sure. :)
> 
> 
> Himanshi Sharma 

Re: For multi-tenant, is it good to have a key space for each tenant?

Posted by Himanshi Sharma <hi...@tcs.com>.
I am working on something similar. As far as I know, creating a
keyspace for each tenant would impose a lot of memory constraints.

A shared keyspace with shared column families would be a better
approach, with each row in a CF referenced by using tenant_id as the
row key. Again, it depends on the type of application.

This is just a suggestion; I'm not completely sure. :)


Himanshi Sharma




