Posted to common-dev@hadoop.apache.org by "stack (JIRA)" <ji...@apache.org> on 2007/12/11 18:13:43 UTC

[jira] Created: (HADOOP-2405) [hbase] Merge region tool exposed in shell and/or in UI

[hbase] Merge region tool exposed in shell and/or in UI
-------------------------------------------------------

                 Key: HADOOP-2405
                 URL: https://issues.apache.org/jira/browse/HADOOP-2405
             Project: Hadoop
          Issue Type: New Feature
          Components: contrib/hbase
            Reporter: stack
            Priority: Minor


hbase has support for merging regions.  Expose a merge trigger in the shell or in the UI (only adjacent regions can be merged, so perhaps this only makes sense in the regionserver UI).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


RE: [jira] Commented: (HADOOP-2405) [hbase] Merge region tool exposed in shell and/or in UI

Posted by Jim Kellerman <ji...@powerset.com>.
Google does dynamic splitting and merging of regions to deal with hot spots.
They had to be careful that they did not oscillate between splitting and merging when the
load pattern changed.

Right now, manual merges are ok because we only do splits when regions grow and the only reason
to merge is if many rows are deleted.

When we get to doing more sophisticated load balancing, we will want the capability of both
splitting and merging based on load.
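
A minimal sketch of one way to avoid the split/merge oscillation described above, assuming a smoothed load figure per region and a gap between the split and merge thresholds. The function name, the thresholds, and the 0..1 load scale are all hypothetical; nothing here is taken from the hbase code or from Google's implementation.

```python
def decide(samples, split_above=0.80, merge_below=0.20, smoothing=0.3):
    """Return the action chosen after each load sample: 'split', 'merge' or 'none'.

    All parameters are illustrative assumptions:
      split_above  - smoothed load above which a region would be split
      merge_below  - smoothed load below which a region would be merged
      smoothing    - weight given to the newest sample in the moving average
    The gap between the two thresholds is what damps oscillation.
    """
    actions = []
    smoothed = None
    for load in samples:
        if smoothed is None:
            smoothed = load  # the first sample seeds the average
        else:
            smoothed = smoothing * load + (1 - smoothing) * smoothed
        if smoothed > split_above:
            actions.append("split")
        elif smoothed < merge_below:
            actions.append("merge")
        else:
            actions.append("none")
    return actions
```

With these assumed thresholds, a load that swings between 0.3 and 0.7 stays inside the dead band and triggers neither operation, while a sustained high or low load still does.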

> -----Original Message-----
> From: Bryan Duxbury (JIRA) [mailto:jira@apache.org]
> Sent: Monday, January 07, 2008 1:10 PM
> To: hadoop-dev@lucene.apache.org
> Subject: [jira] Commented: (HADOOP-2405) [hbase] Merge region
> tool exposed in shell and/or in UI
>
>
>     [ https://issues.apache.org/jira/browse/HADOOP-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556694#action_12556694 ]
>
> Bryan Duxbury commented on HADOOP-2405:
> ---------------------------------------
>
> So, you envision the merge operation to not only require
> manual triggering but to require manual targeting? Shouldn't
> the point of merging regions be to maintain the equilibrium
> of size of regions? Under what circumstances will you have to
> manually intervene to keep regions appropriately sized?
>
> It seems like this should really only happen after a
> substantial number of deletions has occurred, right? That
> would cause a compacted region to shrink below a healthy
> size, and if it could be merged with a neighbor, it would be
> nice. This logic should be built in and automatic, otherwise
> it would require constant monitoring of region sizes by an
> administrator.
>
> Other than this sort of automatic merging, when would you
> want to manually merge two regions? Doesn't that expose a
> somewhat dangerous amount of functionality to the end user?
>

[jira] Commented: (HADOOP-2405) [hbase] Merge region tool exposed in shell and/or in UI

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556694#action_12556694 ] 

Bryan Duxbury commented on HADOOP-2405:
---------------------------------------

So, you envision the merge operation to not only require manual triggering but to require manual targeting? Shouldn't the point of merging regions be to maintain the equilibrium of size of regions? Under what circumstances will you have to manually intervene to keep regions appropriately sized?

It seems like this should really only happen after a substantial number of deletions has occurred, right? That would cause a compacted region to shrink below a healthy size, and if it could be merged with a neighbor, it would be nice. This logic should be built in and automatic, otherwise it would require constant monitoring of region sizes by an administrator. 

Other than this sort of automatic merging, when would you want to manually merge two regions? Doesn't that expose a somewhat dangerous amount of functionality to the end user?



[jira] Commented: (HADOOP-2405) [hbase] Merge region tool exposed in shell and/or in UI

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556712#action_12556712 ] 

stack commented on HADOOP-2405:
-------------------------------

Automated balancing of region sizes using merge as well as split would be cool (we only have split working at the moment).  We should go there next, after we've made it so an admin can run merge against an online table.



[jira] Commented: (HADOOP-2405) [hbase] Merge region tool exposed in shell and/or in UI

Posted by "Edward Yoon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556463#action_12556463 ] 

Edward Yoon commented on HADOOP-2405:
-------------------------------------

This sounds like a good idea, stack. Can you supply examples that would illustrate how you think this should work?



[jira] Commented: (HADOOP-2405) [hbase] Merge region tool exposed in shell and/or in UI

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556464#action_12556464 ] 

Billy Pearson commented on HADOOP-2405:
---------------------------------------

I was thinking about this too: use HADOOP-1958 and let the master handle the merges. I would think this would be an easier task once we have HADOOP-1958 and HADOOP-1700 in place.



[jira] Commented: (HADOOP-2405) [hbase] Merge region tool exposed in shell and/or in UI

Posted by "Billy Pearson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556738#action_12556738 ] 

Billy Pearson commented on HADOOP-2405:
---------------------------------------

Maybe add a size setting in the config, expressed as a % of the max region size or something like that, and have a command that selects all regions below that size and merges each one into the smaller of the two regions before and after it. It would be nice to have this automated; maybe we could tie it in to the split or compaction threads that run on the region servers.

I like your idea, stack. If we could do that, the table would not have to be taken offline to merge the regions, and everything would be cleaned up on the next compaction.
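
The selection rule suggested above could be sketched roughly as follows. This is only an illustration: `plan_merges`, `min_fraction`, and the plain list of sizes are hypothetical stand-ins, not hbase configuration keys or APIs, and the sizes are assumed to be listed in row-key order so that list neighbours are adjacent regions.

```python
def plan_merges(region_sizes, max_size, min_fraction=0.25):
    """Pick undersized regions and pair each with its smaller adjacent neighbour.

    region_sizes: sizes of a table's regions, in row-key order (assumed input).
    max_size:     the configured maximum region size.
    min_fraction: hypothetical setting; regions below max_size * min_fraction
                  are candidates for merging.
    Returns a list of (small_region_index, neighbour_index) pairs.
    """
    threshold = max_size * min_fraction
    plans = []
    used = set()  # a region takes part in at most one merge per pass
    for i, size in enumerate(region_sizes):
        if size >= threshold or i in used:
            continue
        neighbours = [j for j in (i - 1, i + 1)
                      if 0 <= j < len(region_sizes) and j not in used]
        if not neighbours:
            continue
        # Merge into the smaller of the regions before and after this one.
        target = min(neighbours, key=lambda j: region_sizes[j])
        plans.append((i, target))
        used.update((i, target))
    return plans
```

For sizes [100, 10, 60, 200] with a max size of 256, the threshold is 64, so the second region is selected and paired with the third, which is the smaller of its two neighbours.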




[jira] Commented: (HADOOP-2405) [hbase] Merge region tool exposed in shell and/or in UI

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12556688#action_12556688 ] 

stack commented on HADOOP-2405:
-------------------------------

Here is an (ugly) suggestion: In the UI where we list regions over on the regionserver, there would be a checkbox where regions meet.  Clicking on the checkbox and submitting would run the HRegion.merge code.

In the shell, it'd be some kind of alter table command: alter table x merge regionA regionB

But currently regions have to be offline for merges to run, so this makes things a little awkward.  The shell at least manages the offlining when doing a truncate. Maybe the shell and UI should have a means of viewing/dealing with tables that have been offlined?  Or should we change the merge code so it can run against online tables?

Billy: I don't think HADOOP-1700 is a prerequisite for adding this feature.  We could do something like we do currently when we split, where daughter regions are made with references to the parent: One daughter references the top half of the parent region and the other, the bottom half.  Eventually the references are let go as compactions start to run in the children.   The new merged region could be made up of mapfiles that reference the two input regions -- a sort of reverse of the split operation.  Doing this, the merges should be as fast as splits and could be done with the table online.
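
A toy model of the reference-based merge proposed above, assuming regions are described only by a name, a key range, and a list of files. The `Region`, `Reference`, `merge_by_reference`, and `compact` names are hypothetical and do not correspond to hbase classes; real region boundaries (for example, empty start and end keys) are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Reference:
    """Pointer to the rows of another region's files in [start, end)."""
    source: str
    start: str
    end: str


@dataclass
class Region:
    name: str
    start: str
    end: str
    files: list = field(default_factory=list)


def merge_by_reference(a, b):
    """Create a merged region whose files only reference the two inputs.

    No data is copied, which is why this should be about as cheap as a split.
    """
    if a.end != b.start:
        raise ValueError("regions are not adjacent")
    refs = [Reference(a.name, a.start, a.end),
            Reference(b.name, b.start, b.end)]
    return Region(f"{a.name}+{b.name}", a.start, b.end, refs)


def compact(region):
    """Stand-in for a compaction: rewrite references into a file the region owns."""
    region.files = [f"{region.name}.mapfile"]
    return region
```

In this model the input regions can be released once `compact` has run on the merged region, mirroring how a split parent is let go after its daughters compact.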
