Posted to user@cassandra.apache.org by Dwight Smith <Dw...@alcatel-lucent.com> on 2010/07/08 17:45:15 UTC

Use of multiple Keyspaces

Hi

 

I am new to Cassandra and am preparing a data model for use in a
production environment, and need to decide if using multiple keyspaces
has any benefit.  

 

There are basically two types of data.  The first is a large number
(~1750K) of entries that are written, read very rarely, and then removed
after several seconds to several days.  The keys are MD5 hashes
generated from the content being written.  The second type is ~60K
entries that are written and then accessed with get_range_slices; based
on the time indicated in the content, an action is performed and the
specific entry is then deleted from Cassandra.  There are three columns
for the second type:

- TimeToScheduleAction: time to action key (MD5 of action information)
- ScheduledActionToTime: action key to time
- ActionToScheduledAction: action key to action information
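
To make the second data type concrete, here is a minimal sketch of the three mappings described above. It uses plain Python dictionaries as stand-ins rather than a Cassandra client; the function names (md5_key, schedule, pop_due) and the integer timestamps are assumptions made for illustration only and are not part of the actual application.

```python
import hashlib

def md5_key(content: bytes) -> str:
    """Return the hex MD5 digest of the content, used here as the key."""
    return hashlib.md5(content).hexdigest()

# Hypothetical in-memory stand-ins for the three mappings named in the post.
time_to_schedule_action = {}     # time -> set of action keys
scheduled_action_to_time = {}    # action key -> time
action_to_scheduled_action = {}  # action key -> action information

def schedule(action_info: bytes, when: int) -> str:
    """Record an action under all three mappings and return its key."""
    key = md5_key(action_info)
    time_to_schedule_action.setdefault(when, set()).add(key)
    scheduled_action_to_time[key] = when
    action_to_scheduled_action[key] = action_info
    return key

def pop_due(now: int) -> list:
    """Return every action whose time is <= now and delete its entries."""
    due = []
    for when in sorted(t for t in time_to_schedule_action if t <= now):
        # Sort the keys so the result order is deterministic.
        for key in sorted(time_to_schedule_action.pop(when)):
            due.append(action_to_scheduled_action.pop(key))
            del scheduled_action_to_time[key]
    return due
```

The point of the sketch is that every scheduled action is written three times and deleted three times, so the delete volume for this data type is a multiple of the entry count.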

 

Currently these are members of two separate keyspaces.  Separate
keyspaces were chosen because the data volumes are significantly
different and, as I understand it, the memtables depend on the data
volume if KeysCached is not zero.  My expectation is that separate
keyspaces would speed up memtable access for both, and that compaction
would benefit as well.
 
Comments please
 
Thanks much
 
Dwight 

 


					
-------------------------------------------------------------------------------------------------------------------
CONFIDENTIALITY NOTICE: This e-mail and any files attached may contain confidential and proprietary information of Alcatel-Lucent and/or its affiliated entities. Access by the intended recipient only is authorized. Any liability arising from any party acting, or refraining from acting, on any information contained in this e-mail is hereby excluded. If you are not the intended recipient, please notify the sender immediately, destroy the original transmission and its attachments and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Copyright in this e-mail and any attachments belongs to Alcatel-Lucent and/or its affiliated entities.
					

Re: Use of multiple Keyspaces

Posted by Benjamin Black <b...@b3k.us>.
as rcoli just reminded me, i should be more clear that it is 1
_active_ memtable per CF, but there may be several pending flush.

space from deletions is only reclaimed after GCGraceSeconds has
elapsed AND a major compaction is run.  default for the former is 10
days.  the latter is not automatic.
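
a rough sketch of that rule as stated above, in plain Python rather than anything from Cassandra itself. the function name reclaim_time and the idea of passing in a list of major compaction times are assumptions for illustration; only the 10 day default comes from the text.

```python
GC_GRACE_SECONDS_DEFAULT = 10 * 24 * 60 * 60  # 10 days, the default cited above

def reclaim_time(deleted_at, compaction_times,
                 gc_grace_seconds=GC_GRACE_SECONDS_DEFAULT):
    """Return the time at which space for a deletion would be reclaimed.

    Models the rule as described in this thread: the grace period must
    have elapsed AND a major compaction must run at or after that point.
    Returns None if no listed compaction satisfies both conditions.
    """
    eligible_from = deleted_at + gc_grace_seconds
    later = [t for t in compaction_times if t >= eligible_from]
    return min(later) if later else None
```

with the default, a row deleted at time 0 and a major compaction on day 5 reclaims nothing; the space only goes away at the first major compaction on or after day 10.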

On Thu, Jul 8, 2010 at 11:32 AM, Dwight Smith
<Dw...@alcatel-lucent.com> wrote:
> Thanks - I found on Wiki that the memtables and sstables are on a per CF
> basis.
>
> Sorry about the mail client formatting - I have no choice - corporate
> controlled:)
>
> Now I am concerned about the deletions - what areas should I investigate
> to understand the concerns you raise?
>
> Thanks again
>
> -----Original Message-----
> From: Benjamin Black [mailto:b@b3k.us]
> Sent: Thursday, July 08, 2010 11:28 AM
> To: user@cassandra.apache.org
> Subject: Re: Use of multiple Keyspaces
>
> (and I'm sure someone will correct me if I am wrong on that)
>
> On Thu, Jul 8, 2010 at 11:24 AM, Benjamin Black <b...@b3k.us> wrote:
>> There is a memtable per CF, regardless of how many keyspaces you have.

RE: Use of multiple Keyspaces

Posted by Dwight Smith <Dw...@alcatel-lucent.com>.
Thanks - I found on Wiki that the memtables and sstables are on a per CF
basis. 

Sorry about the mail client formatting - I have no choice - corporate
controlled:)

Now I am concerned about the deletions - what areas should I investigate
to understand the concerns you raise?

Thanks again

-----Original Message-----
From: Benjamin Black [mailto:b@b3k.us] 
Sent: Thursday, July 08, 2010 11:28 AM
To: user@cassandra.apache.org
Subject: Re: Use of multiple Keyspaces

(and I'm sure someone will correct me if I am wrong on that)

On Thu, Jul 8, 2010 at 11:24 AM, Benjamin Black <b...@b3k.us> wrote:
> There is a memtable per CF, regardless of how many keyspaces you have.

					
					

Re: Use of multiple Keyspaces

Posted by Benjamin Black <b...@b3k.us>.
(and I'm sure someone will correct me if I am wrong on that)

On Thu, Jul 8, 2010 at 11:24 AM, Benjamin Black <b...@b3k.us> wrote:
> There is a memtable per CF, regardless of how many keyspaces you have.

Re: Use of multiple Keyspaces

Posted by Benjamin Black <b...@b3k.us>.
There is a memtable per CF, regardless of how many keyspaces you have.
I'd pay more attention to the delete/compaction side of things if you
are going to be doing that many deletions.
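
A toy model of the "memtable per CF" point, assuming nothing about Cassandra's real internals: each (keyspace, column family) pair gets its own store, so splitting column families across keyspaces does not change how many memtables exist. The class ColumnFamilyStore, the function store_for, and the entry-count flush threshold are all hypothetical names and simplifications for illustration.

```python
class ColumnFamilyStore:
    """Toy stand-in: one active memtable plus a queue awaiting flush."""

    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold
        self.active = {}          # the single active memtable
        self.pending_flush = []   # full memtables waiting to be flushed

    def write(self, key, value):
        self.active[key] = value
        # When the active memtable is full, queue it and start a new one.
        if len(self.active) >= self.flush_threshold:
            self.pending_flush.append(self.active)
            self.active = {}

stores = {}

def store_for(keyspace, cf, flush_threshold=2):
    """Return the store for a (keyspace, cf) pair, creating it if needed."""
    ident = (keyspace, cf)
    if ident not in stores:
        stores[ident] = ColumnFamilyStore(flush_threshold)
    return stores[ident]
```

In this model, writes to one column family never touch another column family's memtable, whether or not the two share a keyspace.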

Also, your mail client's formatting is broken.


b

On Thu, Jul 8, 2010 at 8:45 AM, Dwight Smith
<Dw...@alcatel-lucent.com> wrote:
> Hi
>
>
>
> I am new to Cassandra and am preparing a data model for use in a production
> environment, and need to decide if using multiple keyspaces has any
> benefit.
>
>
>
> There are basically two types of data; the first,  large numbers ( ~1750K)
> of entries which are written, very few reads, and then removed after several
> seconds to several days. The keys are MD5 generated from the content being
> written.  The second type, ~ 60K, entries written, accessed with
> get_range_slices, then based on the time indicated in the content, perform
> an action, then delete the specific entry from Cassandra.  There are three
> columns for the second type, time to action Key ( MD5 of action information
> ) – column TimeToScheduleAction, action key to time – column
> ScheduledActionToTime, and finally action key to action information -
> ActionToScheduledAction.
>
>
>
> Currently these are members of two separate keyspaces.  Separate keyspaces
> were chosen since the data volume was significantly different, and as I
> understand, the memtables are dependent upon the data volume, if KeysCached
> is not zero. Separate keyspaces would speed up the memtable access for
> both.  In addition, it seems the compaction would benefit.
>
>
>
> Comments please
>
>
>
> Thanks much
>
>
>
> Dwight