Posted to issues@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2011/04/11 19:09:05 UTC

[jira] [Commented] (HBASE-3656) Merging flush; merge a flush with one of the existing store files (the smallest?) so we skip creating a new store file on each flush

    [ https://issues.apache.org/jira/browse/HBASE-3656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018439#comment-13018439 ] 

Jean-Daniel Cryans commented on HBASE-3656:
-------------------------------------------

I was talking on IRC with aph__ (yeah, that's his handle); he was running imports on a small 3-node cluster with the heap set at 1.5GB and saw abysmal performance. We told him to get some real hardware because tuning his current setup would be a massive waste of time, but I was thinking that HBase could still do a much better job out of the box. One thing that would make it better is somehow automatically controlling region size/number so you don't run into the situation where you are basically filling up 1000 regions at the same rate and flushing tiny files (which then triggers compactions _ad vitam aeternam_).
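
To put a rough number on "tiny files", here is a back-of-the-envelope sketch. It assumes that 40% of the heap is available to memstores (my assumption, not something aph__ reported) and that the 1000 regions all sit on one region server and fill evenly. The class and method names are made up for illustration.

```java
/** Rough estimate only; the 40% memstore share and even fill rate are assumptions. */
public class FlushSizeEstimate {

    /** Average memstore size per region (in MB) at the point where flushes are forced. */
    static double avgFlushMb(double heapMb, double memstoreFraction, int activeRegions) {
        return heapMb * memstoreFraction / activeRegions;
    }

    public static void main(String[] args) {
        // 1.5GB heap, assumed 40% memstore share, 1000 regions written at the same rate.
        double mb = avgFlushMb(1536, 0.4, 1000);
        System.out.println("Approx. MB per region at forced flush: " + mb);
    }
}
```

Under those assumptions each region holds roughly 0.6MB when memory pressure forces a flush, so every flush writes a file of about that size.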

The other thing that would help is this jira since you wouldn't be compacting so much in the first place.
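
For readers skimming the thread, the idea in this issue can be sketched as a toy model. This is not HBase code: `StoreFile` and `flush` below are hypothetical stand-ins, files are plain sorted maps, and timestamps, versions and deletes are ignored. It only shows the shape of the proposal: merge the flushed snapshot into the smallest existing file, and fall back to a plain flush on close, as the issue description suggests.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a merging flush; none of these types are real HBase classes. */
public class MergingFlushSketch {

    /** Hypothetical stand-in for an immutable, sorted store file. */
    static final class StoreFile {
        final TreeMap<String, String> cells;
        StoreFile(Map<String, String> cells) { this.cells = new TreeMap<>(cells); }
        int size() { return cells.size(); }
    }

    /**
     * Flushes a memstore snapshot into the store.
     * On close, or when no file exists yet, it writes a new file like a plain flush.
     * Otherwise it rewrites the smallest file with the snapshot merged in,
     * so the number of store files does not grow.
     */
    static void flush(List<StoreFile> storeFiles, Map<String, String> snapshot, boolean closing) {
        if (closing || storeFiles.isEmpty()) {
            storeFiles.add(new StoreFile(snapshot));
            return;
        }
        StoreFile smallest = storeFiles.stream()
                .min(Comparator.comparingInt(StoreFile::size))
                .get(); // safe: the list is non-empty here
        TreeMap<String, String> merged = new TreeMap<>(smallest.cells);
        merged.putAll(snapshot); // memstore edits are newer, so they win on key collisions
        storeFiles.remove(smallest);
        storeFiles.add(new StoreFile(merged));
    }

    public static void main(String[] args) {
        List<StoreFile> files = new ArrayList<>();
        flush(files, Map.of("a", "1"), false);           // no files yet: plain flush
        flush(files, Map.of("a", "2", "b", "3"), false); // merged into the only file
        System.out.println("store files: " + files.size());
    }
}
```

The cost is visible even in the toy: every flush rewrites an existing file, which is the extra work that made flushing slower in the earlier attempt described in the issue.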

> Merging flush; merge a flush with one of the existing store files (the smallest?) so we skip creating a new store file on each flush
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-3656
>                 URL: https://issues.apache.org/jira/browse/HBASE-3656
>             Project: HBase
>          Issue Type: Task
>            Reporter: stack
>
> This behavior is described in the BT paper.  Years ago I had a go at it but at the time it slowed flushing significantly -- and IIRC we had no barriers on writes when the memory pressure was high -- so it brought on OOMEs... so punted on it.  It's time to consider this feature again.
> Would we always do it?  Maybe not if it's a close?  If a close we want stuff to run quickly so we should skip the merge.  But any other time, we should do it?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira