Posted to dev@hbase.apache.org by Vidhyashankar Venkataraman <vi...@yahoo-inc.com> on 2011/01/11 01:33:15 UTC

No minor compactions on a table built only on bulk loads

I have a table (on the 0.90 release candidate) whose writes come solely through incremental bulk loads. It has been running for a while, and I noticed that the Storefiles were neither minor compacting nor splitting. I eyeballed the code and observed that the (minor) compaction check is made when a region is deployed and on every commit (after a flush?), but never during a bulk load.

Can you let me know whether this is deliberate or an existing bug?

Thank you
Vidhya

RE: No minor compactions on a table built only on bulk loads

Posted by Jonathan Gray <jg...@fb.com>.
It's not really a bug.

I think the assumption is that if you are at the level of doing your own bulk loads, you should also manage when you want to compact and split. In the cases where I've done this, I usually knew the points at which I wanted to trigger major compactions.

At some level, bulk loads are about avoiding compactions, so I don't think it would make sense for compactions to be triggered automatically on a table that is written solely through bulk loads.

JG
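
[Editor's sketch, not part of the original message.] One way to read the advice above is that the bulk-loading client keeps its own count of loaded files and decides when to ask for a major compaction. The Java below is a minimal, self-contained illustration of that bookkeeping. The names BulkLoadCompactionSketch, CompactionTrigger, CompactionScheduler and the threshold value are all hypothetical and do not come from HBase; in a real client the trigger could plausibly delegate to HBaseAdmin.majorCompact(tableName), which is an assumption about how you would wire it up rather than something stated in this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class BulkLoadCompactionSketch {

    /**
     * Hypothetical hook for requesting a major compaction.
     * Assumption: a real client might implement this by calling
     * HBaseAdmin.majorCompact(tableName).
     */
    interface CompactionTrigger {
        void majorCompact(String tableName);
    }

    /** Counts bulk-loaded files for one table and requests a compaction at a threshold. */
    static final class CompactionScheduler {
        private final String tableName;
        private final int maxFilesBeforeCompaction; // illustrative threshold, chosen by the operator
        private final CompactionTrigger trigger;
        private int filesSinceLastCompaction = 0;

        CompactionScheduler(String tableName, int maxFilesBeforeCompaction,
                            CompactionTrigger trigger) {
            this.tableName = tableName;
            this.maxFilesBeforeCompaction = maxFilesBeforeCompaction;
            this.trigger = trigger;
        }

        /** Call after each successful bulk load; returns true if a compaction was requested. */
        boolean onBulkLoad(int filesLoaded) {
            filesSinceLastCompaction += filesLoaded;
            if (filesSinceLastCompaction < maxFilesBeforeCompaction) {
                return false;
            }
            trigger.majorCompact(tableName);
            filesSinceLastCompaction = 0; // start counting again after the request
            return true;
        }
    }

    public static void main(String[] args) {
        // Record requests in a list instead of talking to a cluster.
        List<String> requests = new ArrayList<>();
        CompactionScheduler scheduler = new CompactionScheduler("mytable", 3, requests::add);

        scheduler.onBulkLoad(2); // below the threshold, nothing requested
        scheduler.onBulkLoad(1); // threshold reached, compaction requested
        System.out.println("compactions requested for: " + requests);
    }
}
```

The count-based threshold is only one possible policy; a time-based or size-based rule would slot into onBulkLoad in the same way.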

> -----Original Message-----
> From: Vidhyashankar Venkataraman [mailto:vidhyash@yahoo-inc.com]
> Sent: Monday, January 10, 2011 4:33 PM
> To: hbase-user@hadoop.apache.org
> Cc: dev@hbase.apache.org
> Subject: No minor compactions on a table built only on bulk loads
> 
> I have a table (in 0.90 candidate) whose writes are solely through bulk
> incremental loads. I had been running over a period of time and noticed that
> the Storefiles were not minor compacting or splitting. I eyeballed the code
> and observed that the (minor) compaction check is made at the time of
> deploying the region and during every commit (after a flush?) but never
> while bulk loading.
> 
> Can you let me know if this was deliberate/an existing bug?
> 
> Thank you
> Vidhya
