Posted to issues@hbase.apache.org by "Mubarak Seyed (Created) (JIRA)" <ji...@apache.org> on 2012/03/02 00:28:57 UTC

[jira] [Created] (HBASE-5504) Online Merge

Online Merge
------------

                 Key: HBASE-5504
                 URL: https://issues.apache.org/jira/browse/HBASE-5504
             Project: HBase
          Issue Type: New Feature
          Components: client, master, shell, zookeeper
    Affects Versions: 0.94.0
            Reporter: Mubarak Seyed
             Fix For: 0.96.0


As discussed, please refer to the discussion at [HBASE-4991|https://issues.apache.org/jira/browse/HBASE-4991]

Design suggestion from Stack:

{quote}
I suggest a design below. It has some prerequisites: some general functionality that this feature (and others) could use. The prereqs, if you think them good, could be done outside of this JIRA.

Here's a suggested rough outline of how I think this feature should run. The feature I'm describing below is merge and deleteRegion, for I see them as in essence the same thing.

(C) Client, (M) Master, (RS) Region server, (ZK) ZooKeeper

1. Client calls merge or deleteRegion API. API is a range of rows. (C)
2. Master gets call. (M)
3. Master obtains a write lock on the table so it can't be disabled from under us. The write lock will also disable splitting. This is one of the prereqs, I think. It's HBASE-5494. (Or we could just do something simpler where we have a flag up in zk that splitRegion checks, but that's less useful I think; OR we do the dynamic configs issue and set splits to off via a config change.) There'd be a timer for how long we wait on the table lock. (M -> ZK)
4. If we get the lock, write intent to merge a range up into zk. Also hoist into zk whether it's a pure merge or a merge that drops the region data (a deleteRegion call). (M)
5. Return to the client either our failed attempt at locking the table or an id of some sort identifying this running operation; the client can use it to query status. (M -> C)
6. Turn off the balancer. TODO/prereq: Do it in a way that is persisted. The balancer switch is currently in memory only, so if the master crashes, the new master will come up in balancing mode. (If we had dynamic config, we could hoist up to zk a config that disables the balancer rather than have a balancer-specific flag/znode; OR if a write lock is outstanding on a table, then the balancer does not balance regions in the locked table. This latter might be the easiest to do.) (M)
7. Write into zk that we just turned off the balancer (if it was on). (M -> ZK)
8. Get regions that are involved in the span (M)
9. Hoist the list up into zk. (M -> ZK)
10. Create region to span the range. (M)
11. Write that we did this up into zk. (M -> ZK)
12. Close regions in parallel. Confirm close in parallel. (M -> RS)
13. Write up into zk that the regions are closed (this might not be necessary since we can ask if a region is open). (M -> ZK)
14. If a merge and not a deleteRegion, move files under the new region. Might multithread this (moves should go pretty fast). If a deleteRegion, we skip this step. (M)
15. On completion, mark zk (though this may not be necessary since it's easy to look in the fs to see the state of the move). (M -> ZK)
16. Edit .META. (M)
17. Confirm edits went in. (M)
18. Move old regions to an hbase trash folder. TODO: There is no trash folder under /hbase currently. We should add one. (M)
19. Enable balancer (if it was off) (M)
20. Unlock table (M)
{quote}
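
The outline above records progress in zk after most steps. A minimal sketch of that idea in Python follows, assuming a plain dict stands in for the znode; the step names and the `run_merge` helper are hypothetical and are not HBase APIs.

```python
from typing import Callable, Dict, List

# Hypothetical step names loosely following the numbered outline above.
STEPS: List[str] = [
    "lock_table",         # step 3
    "write_intent",       # step 4
    "disable_balancer",   # steps 6-7
    "list_regions",       # steps 8-9
    "create_region",      # steps 10-11
    "close_regions",      # steps 12-13
    "move_files",         # steps 14-15 (skipped for a deleteRegion)
    "edit_meta",          # steps 16-17
    "trash_old_regions",  # step 18
    "enable_balancer",    # step 19
    "unlock_table",       # step 20
]


def run_merge(znode: Dict[str, list],
              actions: Dict[str, Callable[[], None]],
              delete_region: bool = False) -> List[str]:
    """Run the steps in order, recording each completed step in `znode`.

    `znode` is a plain dict standing in for state kept in ZooKeeper.
    Steps already recorded as done are not run again, so a second call
    continues after the last recorded step.
    """
    done = znode.setdefault("done", [])
    executed: List[str] = []
    for step in STEPS:
        if step in done:
            continue
        if step == "move_files" and delete_region:
            done.append(step)  # a deleteRegion drops the data, nothing to move
            continue
        actions[step]()
        done.append(step)
        executed.append(step)
    return executed
```

Recording each step only after its action returns means a crash in the middle of a step leaves that step unrecorded, so each action would need to be safe to run twice.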

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5504) Online Merge

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220601#comment-13220601 ] 

Zhihong Yu commented on HBASE-5504:
-----------------------------------

w.r.t. the comment @ 02/Mar/12 01:18,
Is it possible that there would be more than one master-coordinated task for the same table running at the same time?

If so, what should the state store?

I think we should remove the noob label for this JIRA.
                
> Online Merge
> ------------
>
>                 Key: HBASE-5504
>                 URL: https://issues.apache.org/jira/browse/HBASE-5504
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: client, master, shell, zookeeper
>    Affects Versions: 0.94.0
>            Reporter: Mubarak Seyed
>              Labels: noob
>             Fix For: 0.96.0
>
>

[jira] [Commented] (HBASE-5504) Online Merge

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220689#comment-13220689 ] 

stack commented on HBASE-5504:
------------------------------

bq. Since region splitting is disabled after step 3, how do we deal with the case where the start of row range lies in the middle of a region?

Good question. How about this: if a merge, we include the hosting region in the merge. If a delete region, we throw an exception saying you need to specify region edges.
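
A rough sketch of that rule in Python, assuming regions are (start key, end key) pairs sorted by start key with an empty end key meaning "end of table". The `regions_for_range` name and the use of `ValueError` are made up for illustration.

```python
from typing import List, Tuple

Region = Tuple[bytes, bytes]  # (start_key, end_key); b"" as end_key means end of table


def regions_for_range(regions: List[Region], start: bytes, end: bytes,
                      delete: bool) -> List[Region]:
    """Return the regions touched by the row range [start, end).

    `regions` must be sorted by start key and cover the table without gaps.
    For a merge, a region that only partly overlaps the range is included whole.
    For a delete, the range must line up with region edges, else ValueError.
    """
    def overlaps(region: Region) -> bool:
        r_start, r_end = region
        starts_before_range_ends = end == b"" or r_start < end
        ends_after_range_starts = r_end == b"" or r_end > start
        return starts_before_range_ends and ends_after_range_starts

    hit = [r for r in regions if overlaps(r)]
    if delete and hit:
        if hit[0][0] != start or hit[-1][1] != end:
            raise ValueError("deleteRegion needs a range that matches region edges")
    return hit
```

With regions split at d, m and t, a merge of rows f to p would pick up the whole d-m and m-t regions, while a delete of the same range would be rejected.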

bq. For step 14, we still need to move data for delete region request because we should have chosen one of the neighbor regions to cover the hole in .META.

I think this comes from a misunderstanding that you might have, Ted. You can't alter region edges once created. For example, the directory in hdfs is the hash of the region name, which includes at least the startkey and should one day include the endkey... If you change the delimiting keys, you have to make a new region. Were you thinking we could change the delimiting keys on an existing region?
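
To illustrate why new keys imply a new region, here is a small Python sketch. The "table,startkey,id" name layout and the choice of MD5 are assumptions made for the example only; `region_dir_name` is a hypothetical helper.

```python
import hashlib


def region_dir_name(table: str, start_key: bytes, region_id: int) -> str:
    """Derive a directory name by hashing a region name built from its start key.

    The "<table>,<start key>,<id>" layout and the use of MD5 are assumptions
    made for this illustration.
    """
    region_name = table.encode() + b"," + start_key + b"," + str(region_id).encode()
    return hashlib.md5(region_name).hexdigest()
```

Because the start key feeds the hash, the same table and id with a different start key land in a different directory, so existing files cannot simply be relabeled in place.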

On 'What if the master crashes anywhere between step 6 and step 19?', it's what Mubarak says; the new master comes up and, after initializing, tries to pick up the merge/delete from where the previous master left off... Or simpler, it could just undo it all?
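
The two options (pick up where the old master left off, or undo it all) can be sketched as a planning function in Python. The abbreviated step names and `recovery_plan` are hypothetical, and the list of completed steps is assumed to have been read back from zk.

```python
from typing import List, Tuple

# Hypothetical, abbreviated step names; the real operation has more steps.
STEPS = ["write_intent", "disable_balancer", "create_region",
         "close_regions", "move_files", "edit_meta"]


def recovery_plan(done: List[str], resume: bool) -> List[Tuple[str, str]]:
    """Plan what a new master does given the steps recorded as done.

    resume=True  -> run the steps that have not been recorded yet, in order.
    resume=False -> undo the recorded steps, most recent first.
    """
    if resume:
        return [("run", step) for step in STEPS if step not in done]
    return [("undo", step) for step in reversed(done)]
```

Undoing is only simpler if every step has a cheap inverse; steps such as a file move or a .META. edit would need their own rollback logic.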

bq. How do we get around if merge/delete-range get stuck (it should not but if it happens???)

I think we need to make the operation cancelable? In shell/api, there'd be a cancel operation on table! (Since you need a write lock to do one of these operations, this would mean only one operation at a time per table... so maybe there's no need for a lock transaction id, because only one is happening at a time?)

Later we could do something better where, if an operation does not complete, the master operation runner would do the cancel... but that could come later?
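
A small Python sketch of cancel-by-table, assuming an in-process registry; `TableOperations` and its methods are hypothetical names, and a real version would keep this state in zk alongside the table lock.

```python
import threading
from typing import Dict


class TableOperations:
    """Tracks at most one running operation per table (hypothetical helper)."""

    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._cancel_flags: Dict[str, threading.Event] = {}

    def start(self, table: str) -> threading.Event:
        """Register an operation on `table`; fail if one is already running."""
        with self._guard:
            if table in self._cancel_flags:
                raise RuntimeError("operation already running on " + table)
            flag = threading.Event()
            self._cancel_flags[table] = flag
            return flag

    def cancel(self, table: str) -> bool:
        """Ask the running operation on `table` to stop. No operation id is needed."""
        with self._guard:
            flag = self._cancel_flags.get(table)
        if flag is None:
            return False
        flag.set()
        return True

    def finish(self, table: str) -> None:
        """Forget the operation so the table can take another one."""
        with self._guard:
            self._cancel_flags.pop(table, None)
```

The running operation would check its flag between steps and stop (or roll back) once it is set.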




                

[jira] [Commented] (HBASE-5504) Online Merge

Posted by "Mubarak Seyed (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220622#comment-13220622 ] 

Mubarak Seyed commented on HBASE-5504:
--------------------------------------

[HBASE-5494|https://issues.apache.org/jira/browse/HBASE-5494] says only one table operation at a time.

@Stack
What happens if the task (which holds the lock) gets stuck?
                

[jira] [Commented] (HBASE-5504) Online Merge

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220743#comment-13220743 ] 

Zhihong Yu commented on HBASE-5504:
-----------------------------------

bq. You can't alter region edges once created.
Understood.
I meant that the data for the neighbor region we choose should be copied. The neighbor region would have a new delimiting key.
                

[jira] [Commented] (HBASE-5504) Online Merge

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220558#comment-13220558 ] 

Zhihong Yu commented on HBASE-5504:
-----------------------------------

Since region splitting is disabled after step 3, how do we deal with the case where the start of row range lies in the middle of a region?
For step 6:
bq. if a write lock is outstanding on a table, then the balancer does not balance regions in the locked table
I like the above approach. We can create a sub-task, to be done after HBASE-5494 goes in.
For step 14, we still need to move data for a delete region request because we should have chosen one of the neighbor regions to cover the hole in .META.

What if the master crashes anywhere between step 6 and step 19? How would the new master deal with the outstanding table lock?
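
The step 6 approach quoted above (the balancer leaves locked tables alone) could look roughly like the following Python sketch. `balanceable_regions` and its inputs are hypothetical; how the set of locked tables is obtained is left open.

```python
from typing import Dict, List, Set


def balanceable_regions(regions_by_table: Dict[str, List[str]],
                        locked_tables: Set[str]) -> List[str]:
    """Return regions the balancer may move, skipping write-locked tables."""
    return [region
            for table, regions in sorted(regions_by_table.items())
            if table not in locked_tables
            for region in regions]
```

Since the filter depends only on the lock, no separate balancer flag has to be persisted or restored after a master failover.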
                

[jira] [Commented] (HBASE-5504) Online Merge

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220734#comment-13220734 ] 

Lars Hofhansl commented on HBASE-5504:
--------------------------------------

bq. 18. Move old regions to hbase trash folder TODO: There is no trash folder under /hbase currently. We should add one. (M)

That would be awesome for a variety of other reasons, for example snapshots.
                

[jira] [Commented] (HBASE-5504) Online Merge

Posted by "Mubarak Seyed (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220572#comment-13220572 ] 

Mubarak Seyed commented on HBASE-5504:
--------------------------------------

bq. What if the master crashes anywhere between step 6 and step 19? How would the new master deal with the outstanding table lock?
I think the new active master (after failover) should resume executing the task (provided state is maintained in the table lock znode).
                

        

[jira] [Commented] (HBASE-5504) Online Merge

Posted by "Mubarak Seyed (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220615#comment-13220615 ] 

Mubarak Seyed commented on HBASE-5504:
--------------------------------------

bq. Is it possible that there would be more than one master-coordinated task for the same table running at the same time?
If we store the lock as a single znode, /hbase/lock/<table_name>, then there can't be. How about

{code}
 /hbase/lock/<table_name>/lockNode1
                         /lockNode2
{code}

lockNode1 would be for task1 and hold a write (W) lock; lockNode2 would be for task2 and hold a read (R) lock.
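
For illustration, the usual ZooKeeper shared-lock rule could decide which child lock node is granted: a writer needs the lowest sequence number, and a reader only needs no writer ahead of it. The sketch below assumes each child under /hbase/lock/<table_name> carries a sequence number and a mode. The `LockNode` class and `isGranted` method are hypothetical names, and no ZooKeeper client calls are made.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the shared-lock rule for children of /hbase/lock/<table_name>. */
class TableLockSketch {
    /** One child znode under the table's lock node: a sequence number and a mode. */
    static final class LockNode {
        final long seq;
        final boolean write;

        LockNode(long seq, boolean write) {
            this.seq = seq;
            this.write = write;
        }
    }

    /**
     * A writer holds the lock only if no other node has a lower sequence number.
     * A reader holds the lock if no writer has a lower sequence number.
     */
    static boolean isGranted(LockNode me, List<LockNode> children) {
        for (LockNode other : children) {
            if (other.seq < me.seq && (me.write || other.write)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        LockNode task1 = new LockNode(1, true);    // lockNode1: W
        LockNode task2 = new LockNode(2, false);   // lockNode2: R
        List<LockNode> children = Arrays.asList(task1, task2);
        System.out.println("task1 granted: " + isGranted(task1, children));
        System.out.println("task2 granted: " + isGranted(task2, children));
    }
}
```

Under this rule two readers can proceed together, while a writer excludes every task queued behind it, so task2 would wait until task1 removes lockNode1.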
                

        

[jira] [Commented] (HBASE-5504) Online Merge

Posted by "Eric Newton (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257523#comment-13257523 ] 

Eric Newton commented on HBASE-5504:
------------------------------------

We implemented delayed file cleanup because that is what was described in the BigTable paper.  A tablet will have files in multiple directories, including directories under different tables, as is the case just after a table copy.  The file information is stored in the metadata table.  Splits, bulk import and table copying all create references to files, which remain shared.  Data will exist in the file, but outside the tablet's range, which is why we chop them during merge.  The tablet server keeps the list up-to-date in the metadata table as files are bulk loaded and compacted.  We also keep the list of files that are candidates for deletion in the metadata table, so we aren't asking the NameNode for information about files during GC.  Bulk import, compaction, and gc must all be coordinated such that the file isn't inadvertently deleted while still being imported into other tablets.
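
As a rough illustration of the cleanup rule described above, a file is removed only if it is a deletion candidate and no tablet still references it. The sketch is an assumption-laden stand-in, not the actual implementation: `FileGcSketch`, `deletable`, and the in-memory collections that replace the metadata table are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of delayed file cleanup driven by metadata-table entries. */
class FileGcSketch {
    /**
     * Returns the candidate files that no tablet references any more.
     * Both arguments stand in for rows read from the metadata table.
     */
    static Set<String> deletable(Set<String> candidates, Map<String, Set<String>> filesByTablet) {
        Set<String> result = new TreeSet<>(candidates);
        for (Set<String> files : filesByTablet.values()) {
            result.removeAll(files);   // a file still referenced by any tablet must survive
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> filesByTablet = new HashMap<>();
        filesByTablet.put("tabletA", new HashSet<>(Arrays.asList("f1", "f2")));
        filesByTablet.put("tabletB", new HashSet<>(Arrays.asList("f2")));   // f2 is shared after a split
        Set<String> candidates = new HashSet<>(Arrays.asList("f2", "f3"));
        System.out.println("safe to delete: " + deletable(candidates, filesByTablet));
    }
}
```

Because both inputs come from the metadata table, the check never has to ask the NameNode which files exist.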

                

        

[jira] [Updated] (HBASE-5504) Online Merge

Posted by "Mubarak Seyed (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mubarak Seyed updated HBASE-5504:
---------------------------------

    Issue Type: Brainstorming  (was: New Feature)

The design will evolve.
                

        

[jira] [Commented] (HBASE-5504) Online Merge

Posted by "Mubarak Seyed (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220596#comment-13220596 ] 

Mubarak Seyed commented on HBASE-5504:
--------------------------------------

What do we do if a merge/delete-range gets stuck (it should not, but if it does)? Can we provide a tool, something like an admin operation, to fail or delete it? In the worst case, the user could delete the lock znode in ZK. Can we make use of [HBASE-5459|https://issues.apache.org/jira/browse/HBASE-5459]?
                

        

[jira] [Updated] (HBASE-5504) Online Merge

Posted by "Mubarak Seyed (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mubarak Seyed updated HBASE-5504:
---------------------------------

    Labels:   (was: noob)
    

        

[jira] [Commented] (HBASE-5504) Online Merge

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221179#comment-13221179 ] 

stack commented on HBASE-5504:
------------------------------

@Ted that takes me to a comment I made.  I still am without understanding.
                

        

[jira] [Commented] (HBASE-5504) Online Merge

Posted by "Mubarak Seyed (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220661#comment-13220661 ] 

Mubarak Seyed commented on HBASE-5504:
--------------------------------------

bq. Since region splitting is disabled after step 3, how do we deal with the case where the start of row range lies in the middle of a region?
I think we discussed that the start/end keys should match region boundaries (of a single region or of multiple regions), didn't we? Are we planning to support arbitrary start/end keys?
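
If the range must line up with region boundaries, the master could reject a request up front with a check along these lines. This is a minimal sketch, assuming the table's regions are sorted, contiguous, and use an empty key for the open start and end of the table. `Region` and `alignsWithRegions` are hypothetical names, and keys are plain strings here rather than byte arrays.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: does a requested row range line up with region boundaries? */
class RangeBoundarySketch {
    /** A region covering [startKey, endKey); an empty key means the table's open end. */
    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /**
     * True if some region starts at 'start', some region ends at 'end', and the
     * first does not come after the second in the sorted, contiguous region list.
     */
    static boolean alignsWithRegions(String start, String end, List<Region> regions) {
        int first = -1;
        int last = -1;
        for (int i = 0; i < regions.size(); i++) {
            if (regions.get(i).startKey.equals(start)) {
                first = i;
            }
            if (regions.get(i).endKey.equals(end)) {
                last = i;
            }
        }
        return first >= 0 && last >= first;
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(
            new Region("", "g"), new Region("g", "n"), new Region("n", ""));
        System.out.println("g..end aligned: " + alignsWithRegions("g", "", regions));
        System.out.println("h..n aligned:   " + alignsWithRegions("h", "n", regions));
    }
}
```

A range that starts mid-region, such as `h` to `n` above, would be refused rather than triggering a split while splitting is disabled.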
                
> Online Merge
> ------------
>
>                 Key: HBASE-5504
>                 URL: https://issues.apache.org/jira/browse/HBASE-5504
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: client, master, shell, zookeeper
>    Affects Versions: 0.94.0
>            Reporter: Mubarak Seyed
>             Fix For: 0.96.0
>
>
> As discussed, please refer the discussion at [HBASE-4991|https://issues.apache.org/jira/browse/HBASE-4991]
> Design suggestion from Stack:
> {quote}
> I suggest a design below. It has some prerequisites, some general function that this feature could use (and others). The prereqs if you think them good, could be done outside of this JIRA.
> Here's a suggested rough outline of how I think this feature should run. The feature I'm describing below is merge and deleteRegion for I see them as in essence the same thing.
> (C) Client, (M) Master, RS (Region server), ZK (ZooKeeper)
> 1. Client calls merge or deleteRegion API. API is a range of rows. (C)
> 2. Master gets call. (M)
> 3. Master obtains a write lock on table so it can't be disabled from under us. The write lock will also disable splitting. This is one of the prereqs I think. Its HBASE-5494 (Or we could just do something simpler where we have a flag up in zk that splitRegion checks but thats less useful I think; OR we do the dynamic configs issue and set splits to off via a config. change). There'd be a timer for how long we wait on the table lock. (M -> ZK)
> 4. If we get the lock, write intent to merge a range up into zk. It also hoists into zk if its a pure merge or a merge that drops the region data (a deleteRegion call) (M)
> 5. Return to the client either our failed attempt at locking the table or an id of some sort used identifying this running operation; can use it querying status. (M -> C)
> 6. Turn off balancer. TODO/prereq: Do it in a way that is persisted. Balancer switch currently in memory only so if master crashes, new master will come up in balancing mode # (If we had dynamic config. could hoist up to zk a config. that disables the balancer rather than have a balancer-specific flag/znode OR if a write lock outstanding on a table, then the balancer does not balance regions in the locked table - this latter might be the easiest to do) (M)
> 7. Write into zk that just turned off the balancer (If it was on) (M -> ZK)
> 8. Get regions that are involved in the span (M)
> 9. Hoist the list up into zk. (M -> ZK)
> 10. Create region to span the range. (M)
> 11. Write that we did this up into zk. (M -> ZK)
> 12. Close regions in parallel. Confirm close in parallel. (M -> RS)
> 13. Write up into zk regions closed (This might not be necessary since can ask if region is open). (M -> ZK)
> 14. If a merge and not a delete region, move files under new region. Might multithread this (moves should go pretty fast). If a deleteregion, we skip this step. (M)
> 15. On completion mark zk (though may not be necessary since its easy to look in fs to see state of move). (M -> ZK)
> 16. Edit .META. (M)
> 17. Confirm edits went in. (M)
> 18. Move old regions to hbase trash folder TODO: There is no trash folder under /hbase currently. We should add one. (M)
> 19. Enable balancer (if it was off) (M)
> 20. Unlock table (M)
> {quote}
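
The outline above writes progress up into ZooKeeper after most steps, presumably so that a replacement master can pick up where a crashed one stopped. A minimal sketch of that idea, in Python for brevity, with a plain dict standing in for ZooKeeper. The function `run_steps`, the znode path and the step names are all hypothetical and not HBase APIs:

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_steps(journal: Dict[str, int], op_id: str, steps: List[Step]) -> None:
    """Run steps in order, persisting the index of the next step after each one.

    `journal` stands in for ZooKeeper: in a real system each write would be a
    znode update that survives a master crash.
    """
    key = f"/merge/{op_id}/next-step"  # hypothetical znode path
    start = journal.get(key, 0)        # resume point; 0 for a new operation
    for index in range(start, len(steps)):
        name, action = steps[index]
        action()                       # should be idempotent: may re-run after a crash
        journal[key] = index + 1       # record completion before moving on


# Usage: a master that dies in the third of four steps, then a new master resumes.
log: List[str] = []
zk: Dict[str, int] = {}
crash = {"armed": True}


def close_regions() -> None:
    if crash["armed"]:
        crash["armed"] = False
        raise RuntimeError("master crashed")
    log.append("close-regions")


steps: List[Step] = [
    ("lock-table", lambda: log.append("lock-table")),
    ("balancer-off", lambda: log.append("balancer-off")),
    ("close-regions", close_regions),
    ("edit-meta", lambda: log.append("edit-meta")),
]

try:
    run_steps(zk, "op-1", steps)
except RuntimeError:
    pass                               # first master is gone
run_steps(zk, "op-1", steps)           # new master picks up at close-regions
```

The second call skips the two steps already recorded as done. A step that crashes after doing its work but before the journal write would run again, which is why each step needs to be safe to repeat.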

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5504) Online Merge

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221082#comment-13221082 ] 

Zhihong Yu commented on HBASE-5504:
-----------------------------------

Cycling old bits:
https://issues.apache.org/jira/browse/HBASE-4991?focusedCommentId=13195136&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13195136
                


        

[jira] [Commented] (HBASE-5504) Online Merge

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221023#comment-13221023 ] 

stack commented on HBASE-5504:
------------------------------

bq. I meant that data for the neighbor region we choose should be copied. The neighbor region would have new delimiting key.

Sorry, I'm not following, Ted.  You need to bring me along.  Thanks.
                

        

[jira] [Commented] (HBASE-5504) Online Merge

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247307#comment-13247307 ] 

stack commented on HBASE-5504:
------------------------------

Hey Eric.  Thanks for the help. Looking at FATE, it didn't look like we could pull it over easily (maybe I should look again), but for sure we need to emulate something like it.  First up would be a table read/write lock, with actions like split (or bulk import) taking out a read lock before progressing.  Can I have a pointer to your file GCer, and to why cleanup is delayed?  So you'd keep up in meta the list of files for a range, and this would be the region/tablet's "list" even though the files might not sit physically under a particular region/tablet (What about compactions?  Would a compaction have to update the list in the meta table when it moved the compacted file into place?).  On point 1., isn't it possible a file might still over-span a range?

Good stuff.
                

        

[jira] [Commented] (HBASE-5504) Online Merge

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220669#comment-13220669 ] 

Zhihong Yu commented on HBASE-5504:
-----------------------------------

So there is validation at step 2 for the region boundary check. That's fine.
In the first version, we can disable region splitting.

                

        

[jira] [Commented] (HBASE-5504) Online Merge

Posted by "Eric Newton (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246885#comment-13246885 ] 

Eric Newton commented on HBASE-5504:
------------------------------------

Hi,

I work on Accumulo, and I implemented the merge/deleteRange feature.

Our primary use-case was a time-based row, in which data was being deleted as it aged.  Over time, this created splits that were empty and resulted in poor balancing. The same goes for deleteRange.  It is now very efficient to tell Accumulo: delete all rows that fall before "20111231".

I know that stack and Keith Turner have commented on other tickets and referenced the FATE architecture.  It really simplified the many steps in merge (see the outline in this ticket).  FATE is not very big and would be easy to borrow/emulate.

We also added the concept of read and write locks on tables.  Merge grabs a write lock on a table, and so does bulk import.  This reduces the number of assumptions each of the processes has to understand.
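
As a rough illustration of the table read/write lock idea, here is a minimal in-process sketch in Python. It is an assumption-laden toy: a real table lock for this feature would live in ZooKeeper so that it survives a master restart, and the class and method names below are hypothetical, not Accumulo or HBase APIs. The timeout mirrors the "timer for how long we wait on the table lock" in the outline.

```python
import threading


class TableLock:
    """In-process read/write lock with a wait timeout (illustrative only)."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self, timeout: float) -> bool:
        """Take a shared lock; returns False if a writer held it for the whole wait."""
        with self._cond:
            ok = self._cond.wait_for(lambda: not self._writer, timeout)
            if ok:
                self._readers += 1
            return ok

    def release_read(self) -> None:
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def acquire_write(self, timeout: float) -> bool:
        """Take the exclusive lock; returns False if the wait timed out."""
        with self._cond:
            ok = self._cond.wait_for(
                lambda: not self._writer and self._readers == 0, timeout)
            if ok:
                self._writer = True
            return ok

    def release_write(self) -> None:
        with self._cond:
            self._writer = False
            self._cond.notify_all()


# Usage: some operation holds a read lock, so a merge cannot get the write lock.
lock = TableLock()
reader_has_lock = lock.acquire_read(timeout=0.05)
merge_got_lock = lock.acquire_write(timeout=0.05)   # times out behind the reader
lock.release_read()
merge_retry = lock.acquire_write(timeout=0.05)      # succeeds once the reader is done
```

This sketch makes no attempt at fairness: a steady stream of readers could starve a waiting writer, which a production lock would need to address.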

Accumulo has a file garbage collector, so files can survive many splits. But this makes merge more complex: after a split, a file can still contain data from a section of the row space that the tablet no longer uses, and that data might have been deleted. If such a file is re-used in a merge, the merge would inadvertently bring that data back. We had at least two implementation options:

 1. keep range information with files in the metadata tablet
 2. "chop" files, or compact a file down to the tablet's range before the merge

We chose option 2.  This might appeal to the HBase community, too, since it avoids double-indirection and file-deletion issues.  But I regret not using option 1 because it would have made merge as fast as split.
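
To make the two options concrete, here is a small Python model of the problem. It is a simplified sketch, not Accumulo code: a dict stands in for a file, ranges are half-open row ranges, and the helper names are hypothetical.

```python
from typing import Dict, Tuple

Range = Tuple[str, str]          # half-open row range [start, end)
File = Dict[str, str]            # row -> value, standing in for a store file


def visible(file: File, rng: Range) -> File:
    """Rows of `file` that fall inside `rng`."""
    start, end = rng
    return {row: val for row, val in file.items() if start <= row < end}


def chop(file: File, rng: Range) -> File:
    """Option 2: rewrite the file so it holds only the tablet's range."""
    return visible(file, rng)


# One file written before a split at "m"; after the split both tablets
# reference it. Assume the right tablet's rows were later deleted, so only
# the left tablet [a, m) still references the file.
shared: File = {"a": "1", "f": "2", "p": "3", "t": "4"}
left: Range = ("a", "m")
merged: Range = ("a", "z")

# Naive merge: the merged tablet reads the whole file, and the deleted
# rows "p" and "t" come back.
naive = visible(shared, merged)

# Option 1: keep the range the file was valid for in the metadata and
# filter by it at read time; the file itself is untouched.
with_range = visible(visible(shared, left), merged)

# Option 2: chop the file down to the left tablet's range, then merge.
chopped = visible(chop(shared, left), merged)
```

Both options give the same visible rows. The difference is cost: option 1 is only a metadata change, while option 2 has to rewrite file data before the merge can proceed.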

Our merge works on a range, though we have tools that let a user merge based on tablet size.  These still work on only one range at a time, and they require the tablet to go online to perform the chop operation. If a user decides to change their tablet size from 256M to 1G, and their table size is 1P, they will wait a very long time.  In the future we will recognize this case and chop the tablets in parallel and take the table offline to re-write the metadata table.
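
For a sense of scale, the numbers in that example work out as follows. This is back-of-the-envelope arithmetic under two assumptions that are not in the original comment: binary units, and tablets that are all exactly full.

```python
MIB = 2**20
GIB = 2**30
PIB = 2**50

table_size = 1 * PIB
old_tablet = 256 * MIB
new_tablet = 1 * GIB

tablets_before = table_size // old_tablet      # 4,194,304 tablets at 256M
tablets_after = table_size // new_tablet       # 1,048,576 tablets at 1G
tablets_per_merge = new_tablet // old_tablet   # each merge folds 4 tablets into 1
merges_needed = tablets_after                  # one merge range per resulting tablet
```

With one range processed at a time, that is on the order of a million sequential merges, each preceded by a chop.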

I know we found a lot of places in Accumulo where we assumed that tablets would only split.  It took a lot of testing to discover these assumptions.  Fortunately merge/split are isomorphic and can be used to test each other.  DeleteRange, while very similar to merge, does not have this property and is harder to test at scale.
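
One way to read "can be used to test each other" is as a round-trip check: split a tablet, merge the two halves, and expect the original boundaries back. A minimal Python sketch of that check, modelling a table as nothing more than a sorted list of split points. The function names and the sentinel end key are hypothetical:

```python
import bisect
from typing import List


def split(points: List[str], row: str) -> List[str]:
    """Return the split points with `row` added (no-op if already present)."""
    if row in points:
        return list(points)
    out = list(points)
    bisect.insort(out, row)
    return out


def merge(points: List[str], start: str, end: str) -> List[str]:
    """Return the split points with every point strictly inside (start, end) removed."""
    return [p for p in points if not (start < p < end)]


def round_trip(points: List[str], row: str) -> List[str]:
    """Split at `row` (which must not already be a split point), then merge back."""
    after = split(points, row)
    i = after.index(row)
    start = after[i - 1] if i > 0 else ""                    # "" = start of table
    end = after[i + 1] if i + 1 < len(after) else "\uffff"   # sentinel end of table
    return merge(after, start, end)


original = ["g", "n", "t"]
results = [round_trip(original, row) for row in ["c", "k", "p", "x"]]
```

Every entry in `results` equals `original`. A deleteRange has no inverse operation to pair it with in this way, which fits the remark that it is harder to test.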

                