Posted to user@cassandra.apache.org by Todd Burruss <bb...@real.com> on 2010/07/13 01:33:38 UTC

GCGraceSeconds per ColumnFamily/Keyspace

I have two CFs in my keyspace. For one, I care about allowing a good amount of time for tombstones to propagate (large GCGraceSeconds), but for the other I couldn't care less, and in fact I want its tombstones gone ASAP so I don't iterate over them. Has any thought been given to making this setting per Keyspace or per ColumnFamily?

My scenario is that I add columns containing logging or activity data to rows in one CF, UserData, but we only want to keep, say, 5000 columns per user. So I also store the user's ID in another CF, PruneCollection, and periodically iterate over it, using the IDs found there to "prune" the columns in UserData, and then immediately delete each ID from PruneCollection. If the code is adding, say, 50 IDs per second to PruneCollection, then the number of deleted keys starts to build up, forcing my iterator to skip over large numbers of deleted keys. With a small GCGraceSeconds these keys are removed nicely, but I can't do that because it also affects the tombstones in UserData, which need to be propagated.

Thoughts?
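
For illustration, the buildup described above can be sketched with a toy in-memory model. This is only a rough sketch under stated assumptions, not Cassandra code: ToyColumnFamily, simulate, and the once-per-second "compaction" are hypothetical stand-ins, and the only thing modeled is that a tombstone stays visible to a scan until it is older than the grace period of its own CF.

```python
from dataclasses import dataclass, field


@dataclass
class ToyColumnFamily:
    """Hypothetical in-memory stand-in for a CF; not a Cassandra API."""
    name: str
    gc_grace_seconds: int  # assumed to be configurable per CF, as proposed
    live: set = field(default_factory=set)
    tombstones: dict = field(default_factory=dict)  # key -> deletion time

    def insert(self, key):
        self.tombstones.pop(key, None)
        self.live.add(key)

    def delete(self, key, now):
        self.live.discard(key)
        self.tombstones[key] = now

    def compact(self, now):
        # Keep only tombstones still inside this CF's grace period.
        self.tombstones = {
            k: t for k, t in self.tombstones.items()
            if now - t < self.gc_grace_seconds
        }

    def scan(self):
        # One full iteration: live keys, plus how many tombstones were skipped.
        return sorted(self.live), len(self.tombstones)


def simulate(gc_grace_seconds, seconds=100, ids_per_second=50):
    """Return the tombstones skipped by the last scan of PruneCollection."""
    prune = ToyColumnFamily("PruneCollection", gc_grace_seconds)
    skipped = 0
    for now in range(seconds):
        for i in range(ids_per_second):
            prune.insert(f"user-{now}-{i}")
        prune.compact(now)  # assumption: compaction runs once per tick
        live, skipped = prune.scan()
        for key in live:
            # Pruning of the matching UserData row would happen here.
            prune.delete(key, now)
    return skipped


if __name__ == "__main__":
    for gcgs in (864000, 10, 0):
        print(gcgs, simulate(gcgs))
```

In this toy model, a grace period of 864000 seconds (ten days) leaves every earlier deletion in the way of the iterator, while a small value for PruneCollection alone keeps the skipped count bounded, without touching whatever grace period UserData would use.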

Re: GCGraceSeconds per ColumnFamily/Keyspace

Posted by Jonathan Ellis <jb...@gmail.com>.

GCGS per CF sounds totally reasonable to me.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com