Posted to user@cassandra.apache.org by Michael Theroux <mt...@yahoo.com> on 2013/09/13 00:47:58 UTC

Issue with leveled compaction and data migration

Hello,

We've been undergoing a migration on Cassandra 1.1.9 where we are combining two column families.  We are incrementally moving data from one column family into another, where the columns in a row in the source column family are being appended to columns in a row in the target column family.  Both column families are using leveled compaction, and both column families have over 100 million rows.  

However, our bloom filters on the target column family grow dramatically (more than double) after converting less than 1/4 of the data.  I assume this is because new changes are not being compacted with older changes, although I thought leveled compaction would mitigate this for me. Any advice on what we can do to control our bloom filter growth during this migration?

Appreciate the help,
Thanks,
-Mike

Re: Issue with leveled compaction and data migration

Posted by Mike <mt...@yahoo.com>.
Thanks for the response Rob,

And yes, the relevel helped the bloom filter issue quite a bit, although it took a couple of days to complete on a single node (so if anyone tries this, be prepared).

-Mike

On Sep 23, 2013, at 6:34 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Fri, Sep 13, 2013 at 4:27 AM, Michael Theroux <mt...@yahoo.com> wrote:
>> Another question on [the topic of row fragmentation when old rows get a large append to their "end" resulting in larger-than-expected bloom filters].
>> 
>> Would forcing the table to relevel help this situation?  I believe the process to do this on 1.1.X would be to stop cassandra, remove .json file, and restart cassandra.  Is this true?
> 
> I believe forcing a re-level would help, because each row would appear in fewer sstables and therefore fewer bloom filters.
> 
> Yes, that is the process to re-level on Cassandra 1.1.x.
>  
> =Rob

Re: Issue with leveled compaction and data migration

Posted by Robert Coli <rc...@eventbrite.com>.
On Fri, Sep 13, 2013 at 4:27 AM, Michael Theroux <mt...@yahoo.com> wrote:

> Another question on [the topic of row fragmentation when old rows get a
> large append to their "end" resulting in larger-than-expected bloom
> filters].
>
> Would forcing the table to relevel help this situation?  I believe the
> process to do this on 1.1.X would be to stop cassandra, remove .json file,
> and restart cassandra.  Is this true?
>

I believe forcing a re-level would help, because each row would appear in
fewer sstables and therefore fewer bloom filters.
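
As a rough illustration of that reasoning (a toy model, not Cassandra's actual code): each sstable's bloom filter holds one entry per row key in that sstable, so a row fragmented across several sstables is counted once per fragment. The 10 bits per key and the function names below are assumptions made up for this sketch.

```python
from typing import List, Set

# Hypothetical filter density; the real value depends on the configured
# false-positive rate and the Cassandra version.
BITS_PER_KEY = 10


def total_filter_entries(sstables: List[Set[int]]) -> int:
    """Count bloom filter entries: one per row key per sstable containing it."""
    return sum(len(keys) for keys in sstables)


def estimated_filter_bytes(sstables: List[Set[int]],
                           bits_per_key: int = BITS_PER_KEY) -> int:
    """Rough total filter size in bytes under the assumed bits-per-key."""
    return total_filter_entries(sstables) * bits_per_key // 8


# Two sstables holding 2000 distinct rows, each row in exactly one sstable.
compacted = [set(range(0, 1000)), set(range(1000, 2000))]

# Appends to half of those rows land in new sstables, so those rows
# now appear twice until compaction merges the fragments.
fragmented = compacted + [set(range(0, 500)), set(range(1000, 1500))]
```

In this model the number of distinct rows is unchanged, yet the filter entry count grows with every extra sstable a row appears in; merging the fragments brings it back down.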

Yes, that is the process to re-level on Cassandra 1.1.x.
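
A minimal sketch of the "remove .json file" step, to be run only while Cassandra is stopped on that node. It assumes the leveled manifest is the only `*.json` file in the column family's data directory; the directory layout, the helper name, and the choice to move the file aside instead of deleting it are assumptions of this sketch, not something stated in the thread.

```python
import shutil
from pathlib import Path
from typing import List


def set_aside_manifests(cf_dir: Path, backup_dir: Path) -> List[Path]:
    """Move every *.json file in cf_dir into backup_dir and return the new paths.

    Hypothetical helper: verify that cf_dir really is the column family's
    data directory and that Cassandra is stopped before calling it.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for manifest in sorted(cf_dir.glob("*.json")):
        target = backup_dir / manifest.name
        shutil.move(str(manifest), str(target))
        moved.append(target)
    return moved
```

Moving the file aside keeps a copy in case the node needs to be put back the way it was. After restarting Cassandra, expect the relevel to take a long time on a large column family.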

=Rob

Re: Issue with leveled compaction and data migration

Posted by Michael Theroux <mt...@yahoo.com>.
Another question on this topic.

Would forcing the table to relevel help this situation?  I believe the process to do this on 1.1.X would be to stop cassandra, remove .json file, and restart cassandra.  Is this true?

Any help would be appreciated,
Thanks,
-Mike

On Sep 12, 2013, at 6:47 PM, Michael Theroux wrote:

> Hello,
> 
> We've been undergoing a migration on Cassandra 1.1.9 where we are combining two column families.  We are incrementally moving data from one column family into another, where the columns in a row in the source column family are being appended to columns in a row in the target column family.  Both column families are using leveled compaction, and both column families have over 100 million rows.  
> 
> However, our bloom filters on the target column family grow dramatically (more than double) after converting less than 1/4 of the data.  I assume this is because new changes are not being compacted with older changes, although I thought leveled compaction would mitigate this for me. Any advice on what we can do to control our bloom filter growth during this migration?
> 
> Appreciate the help,
> Thanks,
> -Mike