You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "Mubarak Seyed (Created) (JIRA)" <ji...@apache.org> on 2012/02/28 21:37:46 UTC

[jira] [Created] (HBASE-5487) Generic framework for Master-coordinated tasks

Generic framework for Master-coordinated tasks
----------------------------------------------

                 Key: HBASE-5487
                 URL: https://issues.apache.org/jira/browse/HBASE-5487
             Project: HBase
          Issue Type: New Feature
          Components: master, regionserver, zookeeper
    Affects Versions: 0.94.0
            Reporter: Mubarak Seyed


Need a framework to execute master-coordinated tasks in a fault-tolerant manner. 

Master-coordinated tasks such as online-scheme change and delete-range (deleting region(s) based on start/end key) can make use of this framework.

The advantages of framework are
1. Eliminate repeated code in Master, ZooKeeper tracker and Region-server for master-coordinated tasks
2. Ability to abstract the common functions across Master -> ZK and RS -> ZK
3. Easy to plugin new master-coordinated tasks without adding code to core components


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5487) Generic framework for Master-coordinated tasks

Posted by "Keith Turner (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245596#comment-13245596 ] 

Keith Turner commented on HBASE-5487:
-------------------------------------

The description of FATE given in this ticket is pretty good.  The following resources may provide a little more info.

http://people.apache.org/~kturner/accumulo14_15.pdf

http://mail-archives.apache.org/mod_mbox/incubator-accumulo-dev/201202.mbox/%3CCAGUtCHpcHTDue-C_2RyDkZm0diW=Zojd7-BzCGsZQDTiDznZqg@mail.gmail.com%3E
                
> Generic framework for Master-coordinated tasks
> ----------------------------------------------
>
>                 Key: HBASE-5487
>                 URL: https://issues.apache.org/jira/browse/HBASE-5487
>             Project: HBase
>          Issue Type: New Feature
>          Components: master, regionserver, zookeeper
>    Affects Versions: 0.94.0
>            Reporter: Mubarak Seyed
>              Labels: noob
>
> Need a framework to execute master-coordinated tasks in a fault-tolerant manner. 
> Master-coordinated tasks such as online-scheme change and delete-range (deleting region(s) based on start/end key) can make use of this framework.
> The advantages of framework are
> 1. Eliminate repeated code in Master, ZooKeeper tracker and Region-server for master-coordinated tasks
> 2. Ability to abstract the common functions across Master -> ZK and RS -> ZK
> 3. Easy to plugin new master-coordinated tasks without adding code to core components

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5487) Generic framework for Master-coordinated tasks

Posted by "Keith Turner (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250769#comment-13250769 ] 

Keith Turner commented on HBASE-5487:
-------------------------------------

Accumulo 1.3 cannot not survive running our random walk test w/ the agitator (a perl script that kills accumulo processes, it not as devious as Todd's gremlins).  

  Random walk + Agitation + Accumulo 1.3 == foobar

Attempting the above would leave Accumulo in an inconsistent state (like corrupted metadata table) or test clients would die with unexpected exceptions.

My point is that while developing FATE it was nice to have Random Walk + Agitation to really beat up the FATE framework and the FATE table operations.  We also wrote some new random walk test for 1.4 that were even meaner.

                
> Generic framework for Master-coordinated tasks
> ----------------------------------------------
>
>                 Key: HBASE-5487
>                 URL: https://issues.apache.org/jira/browse/HBASE-5487
>             Project: HBase
>          Issue Type: New Feature
>          Components: master, regionserver, zookeeper
>    Affects Versions: 0.94.0
>            Reporter: Mubarak Seyed
>              Labels: noob
>
> Need a framework to execute master-coordinated tasks in a fault-tolerant manner. 
> Master-coordinated tasks such as online-scheme change and delete-range (deleting region(s) based on start/end key) can make use of this framework.
> The advantages of framework are
> 1. Eliminate repeated code in Master, ZooKeeper tracker and Region-server for master-coordinated tasks
> 2. Ability to abstract the common functions across Master -> ZK and RS -> ZK
> 3. Easy to plugin new master-coordinated tasks without adding code to core components

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5487) Generic framework for Master-coordinated tasks

Posted by "Mubarak Seyed (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245829#comment-13245829 ] 

Mubarak Seyed commented on HBASE-5487:
--------------------------------------

@Enis
I am not working on this right now. Thanks.
                
> Generic framework for Master-coordinated tasks
> ----------------------------------------------
>
>                 Key: HBASE-5487
>                 URL: https://issues.apache.org/jira/browse/HBASE-5487
>             Project: HBase
>          Issue Type: New Feature
>          Components: master, regionserver, zookeeper
>    Affects Versions: 0.94.0
>            Reporter: Mubarak Seyed
>              Labels: noob
>
> Need a framework to execute master-coordinated tasks in a fault-tolerant manner. 
> Master-coordinated tasks such as online-scheme change and delete-range (deleting region(s) based on start/end key) can make use of this framework.
> The advantages of framework are
> 1. Eliminate repeated code in Master, ZooKeeper tracker and Region-server for master-coordinated tasks
> 2. Ability to abstract the common functions across Master -> ZK and RS -> ZK
> 3. Easy to plugin new master-coordinated tasks without adding code to core components

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5487) Generic framework for Master-coordinated tasks

Posted by "Keith Turner (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250770#comment-13250770 ] 

Keith Turner commented on HBASE-5487:
-------------------------------------

To add context to my above comment, Accumulo 1.3 does not have FATE it was introduced in 1.4.
                
> Generic framework for Master-coordinated tasks
> ----------------------------------------------
>
>                 Key: HBASE-5487
>                 URL: https://issues.apache.org/jira/browse/HBASE-5487
>             Project: HBase
>          Issue Type: New Feature
>          Components: master, regionserver, zookeeper
>    Affects Versions: 0.94.0
>            Reporter: Mubarak Seyed
>              Labels: noob
>
> Need a framework to execute master-coordinated tasks in a fault-tolerant manner. 
> Master-coordinated tasks such as online-scheme change and delete-range (deleting region(s) based on start/end key) can make use of this framework.
> The advantages of framework are
> 1. Eliminate repeated code in Master, ZooKeeper tracker and Region-server for master-coordinated tasks
> 2. Ability to abstract the common functions across Master -> ZK and RS -> ZK
> 3. Easy to plugin new master-coordinated tasks without adding code to core components

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5487) Generic framework for Master-coordinated tasks

Posted by "Keith Turner (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245545#comment-13245545 ] 

Keith Turner commented on HBASE-5487:
-------------------------------------

I am an Accumulo developer.  CompactRange is our operation to force a range of tablets(regions) to major compact all of their files into one file.  The TableRangeOp will merge a range of tablets into one tablet.  TableRangeOp can also delete a range of row from a table efficiently.  It inserts splits points at the rows you want to delete, drops the tablets, and then merges whats left.
                
> Generic framework for Master-coordinated tasks
> ----------------------------------------------
>
>                 Key: HBASE-5487
>                 URL: https://issues.apache.org/jira/browse/HBASE-5487
>             Project: HBase
>          Issue Type: New Feature
>          Components: master, regionserver, zookeeper
>    Affects Versions: 0.94.0
>            Reporter: Mubarak Seyed
>              Labels: noob
>
> Need a framework to execute master-coordinated tasks in a fault-tolerant manner. 
> Master-coordinated tasks such as online-scheme change and delete-range (deleting region(s) based on start/end key) can make use of this framework.
> The advantages of framework are
> 1. Eliminate repeated code in Master, ZooKeeper tracker and Region-server for master-coordinated tasks
> 2. Ability to abstract the common functions across Master -> ZK and RS -> ZK
> 3. Easy to plugin new master-coordinated tasks without adding code to core components

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5487) Generic framework for Master-coordinated tasks

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218785#comment-13218785 ] 

stack commented on HBASE-5487:
------------------------------

I took a look at FATE over in accumulo.  Its some nice generic primitives for running a suite of idempotent operations (even if operation only part completes, if its run again, it should clean up and continue).  There is a notion of locking on a table (so can stop it transiting I suppose; there are read/write locks), a stack for operations (ops are pushed and popped off the stack), operations can respond done, failed, or even w/ a new set of operations to do first (This basic can be used to step through a number of tasks one after the other).  All is persisted up in zk run by the master; if master dies, a new master can pick up the half-done task and finish it.  Clients can watch zk to see if task is done.  There ain't too much to the fate package; there is fate class itself, an admin, a 'store' interface of which there is a zk implementation.  We should for sure take inspiration at least from the work already done.

Here are the ops they do via fate:

{code}
          fate.seedTransaction(opid, new TraceRepo<Master>(new CreateTable(c.user, tableName, timeType, options)), autoCleanup);
          fate.seedTransaction(opid, new TraceRepo<Master>(new RenameTable(tableId, oldTableName, newTableName)), autoCleanup);
          fate.seedTransaction(opid, new TraceRepo<Master>(new CloneTable(c.user, srcTableId, tableName, propertiesToSet, propertiesToExclude)), autoCleanup);
          fate.seedTransaction(opid, new TraceRepo<Master>(new DeleteTable(tableId)), autoCleanup);
          fate.seedTransaction(opid, new TraceRepo<Master>(new ChangeTableState(tableId, TableOperation.ONLINE)), autoCleanup);
          fate.seedTransaction(opid, new TraceRepo<Master>(new ChangeTableState(tableId, TableOperation.OFFLINE)), autoCleanup);
          fate.seedTransaction(opid, new TraceRepo<Master>(new TableRangeOp(MergeInfo.Operation.MERGE, tableId, startRow, endRow)), autoCleanup);
          fate.seedTransaction(opid, new TraceRepo<Master>(new TableRangeOp(MergeInfo.Operation.DELETE, tableId, startRow, endRow)), autoCleanup);
          fate.seedTransaction(opid, new TraceRepo<Master>(new BulkImport(tableId, dir, failDir, setTime)), autoCleanup);
          fate.seedTransaction(opid, new TraceRepo<Master>(new CompactRange(tableId, startRow, endRow)), autoCleanup);{code}
{code}

CompactRange is their term for merge.  It takes a key range span, figures the tablets involved and runs the compact/merge.  We want that and then something to do the remove or regions too?




                
> Generic framework for Master-coordinated tasks
> ----------------------------------------------
>
>                 Key: HBASE-5487
>                 URL: https://issues.apache.org/jira/browse/HBASE-5487
>             Project: HBase
>          Issue Type: New Feature
>          Components: master, regionserver, zookeeper
>    Affects Versions: 0.94.0
>            Reporter: Mubarak Seyed
>              Labels: noob
>
> Need a framework to execute master-coordinated tasks in a fault-tolerant manner. 
> Master-coordinated tasks such as online-scheme change and delete-range (deleting region(s) based on start/end key) can make use of this framework.
> The advantages of framework are
> 1. Eliminate repeated code in Master, ZooKeeper tracker and Region-server for master-coordinated tasks
> 2. Ability to abstract the common functions across Master -> ZK and RS -> ZK
> 3. Easy to plugin new master-coordinated tasks without adding code to core components

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5487) Generic framework for Master-coordinated tasks

Posted by "Enis Soztutar (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245672#comment-13245672 ] 

Enis Soztutar commented on HBASE-5487:
--------------------------------------

Keith thanks a lot for the refs, we badly need something like FATE. 
Mubarak, or anybody else plans to work on this?
                
> Generic framework for Master-coordinated tasks
> ----------------------------------------------
>
>                 Key: HBASE-5487
>                 URL: https://issues.apache.org/jira/browse/HBASE-5487
>             Project: HBase
>          Issue Type: New Feature
>          Components: master, regionserver, zookeeper
>    Affects Versions: 0.94.0
>            Reporter: Mubarak Seyed
>              Labels: noob
>
> Need a framework to execute master-coordinated tasks in a fault-tolerant manner. 
> Master-coordinated tasks such as online-scheme change and delete-range (deleting region(s) based on start/end key) can make use of this framework.
> The advantages of framework are
> 1. Eliminate repeated code in Master, ZooKeeper tracker and Region-server for master-coordinated tasks
> 2. Ability to abstract the common functions across Master -> ZK and RS -> ZK
> 3. Easy to plugin new master-coordinated tasks without adding code to core components

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira