Posted to dev@hbase.apache.org by Jonathan Gray <jg...@apache.org> on 2010/08/03 14:02:49 UTC

Re: Review Request: HBASE-2697 The master rewrite

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/484/
-----------------------------------------------------------

(Updated 2010-08-03 05:02:49.124462)


Review request for hbase, stack and Karthik Ranganathan.


Summary (updated)
-------

This is basically the master rewrite.  It's massive, I apologize.  I can start a cluster, create a table, insert a row, and get the row.  Restarting should work too.

I completely ripped out BaseScanner/MetaScanner as well as the RegionServerOperationQueue stuff.  I'm sold on removing the former.  Fully replacing the latter still needs a bit of work, though I think getting rid of it is the right direction.  The RSOQ stuff has been replaced with EventHandlers.  The BaseScanner is no longer necessary because we no longer use META to determine when things are unassigned.
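
To make the handler idea concrete, here is a minimal sketch of the pattern, assuming a plain java.util.concurrent pool rather than the branch's own executor classes.  The names EventHandlerSketch, Handler and EventType are made up for illustration and are not the classes in this patch.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class EventHandlerSketch {
    /** Hypothetical event types; the branch defines its own. */
    enum EventType { REGION_OPENED, REGION_CLOSED }

    /** One unit of master work: a single event, processed on a pool thread. */
    abstract static class Handler implements Runnable {
        final EventType type;
        Handler(EventType type) { this.type = type; }
        abstract void process();
        @Override public void run() { process(); }
    }

    /** Submits two handlers to a single-threaded pool and returns what they logged. */
    static List<String> runDemo() {
        List<String> log = new CopyOnWriteArrayList<>();
        ExecutorService pool = Executors.newSingleThreadExecutor();
        pool.execute(new Handler(EventType.REGION_OPENED) {
            @Override void process() { log.add(type + " regionA"); }
        });
        pool.execute(new Handler(EventType.REGION_CLOSED) {
            @Override void process() { log.add(type + " regionB"); }
        });
        pool.shutdown();
        try {
            if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("handlers did not finish");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

The point of the shape is that each event becomes its own small object submitted to a pool, instead of an entry in one shared operation queue.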

All the table operations (enable/disable, create/delete, addcolumn/del/modify) have been reimplemented in handlers.

All META/ROOT server access is done through the new CatalogTracker.  This uses ZK and ROOT/META to actively track the current locations of the catalog regions.  A bit more work needs to be done on defining our retry semantics for different operations.

The actual read/write operations on ROOT/META are done with new classes RootEditor, MetaEditor, and MetaReader.  A lot of this code was scattered around HMaster, RegionManager, ServerManager, HRegion, etc...

FileSystemManager has some new methods, gathered from where they were scattered around the code.

This is also the final straw for the old ZKWrapper: it is removed, and all ZK access now goes through the new ZK classes.

Open/close now goes via RPC and ZK; none of these messages piggyback on heartbeats anymore (and with splits not working right now, the split notification is gone too, so it seems heartbeats will carry no messages at all).

RegionManager was completely removed and replaced with a shiny, brand new AssignmentManager.  It's already managed to get a bit hectic but is pretty straightforward.  It is the heart of zk-based region assignment and most logic is in here.  It contains regionsInTransition, the region plans, etc...
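
A rough sketch of the kind of state this holds, assuming plain strings for region and server names.  The State values and the method names are illustrative guesses, not the actual AssignmentManager API in the branch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class AssignmentSketch {
    /** Hypothetical transition states for a region that is being moved or opened. */
    enum State { OFFLINE, PENDING_OPEN }

    final Map<String, State> regionsInTransition = new ConcurrentHashMap<>();
    final Map<String, String> regionPlans = new ConcurrentHashMap<>();   // region -> planned server
    final Map<String, String> assignments = new ConcurrentHashMap<>();   // region -> current server

    /** Records a plan and marks the region as in transition. */
    void assign(String region, String server) {
        regionPlans.put(region, server);
        regionsInTransition.put(region, State.PENDING_OPEN);
    }

    /** Called when a server reports the region open; ignores reports that do not match the plan. */
    boolean regionOpened(String region, String server) {
        if (!server.equals(regionPlans.get(region))) {
            return false;
        }
        regionsInTransition.remove(region);
        regionPlans.remove(region);
        assignments.put(region, server);
        return true;
    }

    public static void main(String[] args) {
        AssignmentSketch am = new AssignmentSketch();
        am.assign("regionA", "rs1");
        System.out.println("in transition: " + am.regionsInTransition);
        am.regionOpened("regionA", "rs1");
        System.out.println("assigned: " + am.assignments);
    }
}
```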

I added a new /disabled node in ZK to keep track of tables that are disabled.  I think we need this to be able to reliably do disables across failures.

Among other changes....


Here's my rough list of things left to do (covers most but not all TODOs in the branch)


* make final decisions on root/meta timeouts.  almost everyone is coordinating
  access through CatalogTracker which should make it easy to standardize.
  if there are operations that should just retry indefinitely, they need to
  resubmit themselves to their executor service.

* move splits to RS side, integrate new patch from stack on trunk
  might need a new CREATED unassigned now, or new rpc, but get rid of sending
  split notification on heartbeat?
  how to handle splits concurrent with disable?

* review master startup order

* figure out what to do with client table admin ops (flush, split, compact)

* MasterAddressManager move to extending ZooKeeperNodeTracker

* on region open (and wherever split children notify master) should check
  if the table is disabled and, if so, close the regions... maybe.

* regionserver exit needs to be reimplemented in servermanager
  also rs expiration

* add priorities and pool size to handlers (bring in from flush patch)

* in RootEditor there is a race condition between delete and watch?

* migrate TestZKBased* tests to use new handlers

* make sync calls for enable/disable (check and verify methods?)

* integrate load balancing

* finish TODOs on new failover path and remove old code in joinExistingCluster()

* finish TODOs on timeout monitor in assignmentmanager

* review filesystemmanager calls

* figure out how to handle the very rare but possible race condition where two
  RSs update META and the later write squashes the valid one because one RS
  had a long gc pause

* synchronize all access to the boolean in ActiveMasterManager
  (now this is probably just move it to extend ZKNodeTracker)

* there are some races with master wanting to connect for rpc
  to regionserver and the rs starting its rpc server, need to address

* migrate TestMasterTransitions or make new?

* fix or remove last couple master tests that used RSOQ

* write new tests!!!
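
For the first item in the list above (operations that should just retry indefinitely), the resubmit idea could look roughly like this.  It is only a sketch: RetryingHandler and triesUntilSuccess are hypothetical names, the pool is a plain java.util.concurrent executor, and a real version would probably want a delay between attempts.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

class ResubmitSketch {
    /** Runs one attempt; on failure it queues itself again instead of looping on the pool thread. */
    static final class RetryingHandler implements Runnable {
        private final ExecutorService pool;
        private final BooleanSupplier attempt;
        private final CountDownLatch done;
        final AtomicInteger tries = new AtomicInteger();

        RetryingHandler(ExecutorService pool, BooleanSupplier attempt, CountDownLatch done) {
            this.pool = pool;
            this.attempt = attempt;
            this.done = done;
        }

        @Override public void run() {
            tries.incrementAndGet();
            if (attempt.getAsBoolean()) {
                done.countDown();
            } else {
                pool.execute(this); // back of the queue, so other handlers get a turn
            }
        }
    }

    /** Returns how many tries were needed when the operation fails `failures` times first. */
    static int triesUntilSuccess(int failures) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch done = new CountDownLatch(1);
        AtomicInteger remaining = new AtomicInteger(failures);
        RetryingHandler handler =
            new RetryingHandler(pool, () -> remaining.getAndDecrement() <= 0, done);
        try {
            pool.execute(handler);
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("handler never succeeded");
            }
            return handler.tries.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("tries: " + triesUntilSuccess(2));
    }
}
```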


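For the META race item in the list above, one possible approach (not something this branch does) is a fencing check: every location write carries an epoch handed out at assignment time, and a write is dropped if its epoch is not newer than what is stored.  The sketch below uses an in-memory map as a stand-in for META; MetaFencingSketch, Location and the epoch field are all assumptions for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;

class MetaFencingSketch {
    /** A region location tagged with the epoch of the assignment that produced it. */
    static final class Location {
        final String server;
        final long epoch;
        Location(String server, long epoch) {
            this.server = server;
            this.epoch = epoch;
        }
    }

    // In-memory stand-in for the META table: region name -> location.
    private final ConcurrentHashMap<String, Location> meta = new ConcurrentHashMap<>();

    /** Applies the write only if its epoch is newer than the stored one; returns true if applied. */
    boolean update(String region, String server, long epoch) {
        Location proposed = new Location(server, epoch);
        Location result = meta.merge(region, proposed,
            (current, incoming) -> incoming.epoch > current.epoch ? incoming : current);
        return result == proposed;
    }

    String serverFor(String region) {
        Location location = meta.get(region);
        return location == null ? null : location.server;
    }

    public static void main(String[] args) {
        MetaFencingSketch meta = new MetaFencingSketch();
        // rs1 was assigned at epoch 1 but stalls in a long gc pause before writing.
        // The region is reassigned to rs2 at epoch 2, and rs2 writes first.
        System.out.println("rs2 write applied: " + meta.update("regionA", "rs2", 2));
        // rs1 wakes up and writes with its stale epoch; the write is dropped.
        System.out.println("rs1 write applied: " + meta.update("regionA", "rs1", 1));
        System.out.println("location: " + meta.serverFor("regionA"));
    }
}
```
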
This addresses bug HBASE-2697.
    http://issues.apache.org/jira/browse/HBASE-2697


Diffs
-----

  branches/0.90_master_rewrite/BRANCH_TODO.txt 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/Abortable.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HServerInfo.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/InvalidFamilyException.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ServerController.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/RootEditor.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseEventHandler.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseExecutorService.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/RegionTransitionData.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AddColumn.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/BaseScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ChangeTableState.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ColumnOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/DeleteColumn.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/FileSystemManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterController.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterStatus.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaRegion.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyColumn.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyTableMeta.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionClose.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionOpen.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionStatusChange.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationListener.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationQueue.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RetryableMetaOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RootScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableDelete.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DisableTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/EnableTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterCloseRegionHandler.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterOpenRegionHandler.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ModifyTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableDeleteFamilyHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableModifyFamilyHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/MasterAddressManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RSZookeeperUpdater.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerController.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/MetaNodeTracker.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKTableDisable.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperListener.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestMaster.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestRestartCluster.java 979909 

Diff: http://review.cloudera.org/r/484/diff


Testing
-------

simple cluster tests passing, need to run (and write) more tests


Thanks,

Jonathan


Re: Review Request: HBASE-2697 The master rewrite

Posted by st...@duboce.net.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/484/#review676
-----------------------------------------------------------


I sat w/ Jon today.  We went over it.  Loads of issues still, but they're outweighed by mountains of improvements.  I'm good w/ committing this massive patch on branch.

- stack


On 2010-08-04 10:37:33, Jonathan Gray wrote:
> 
> (Updated 2010-08-04 10:37:33)
> 
> 
> Review request for hbase, stack and Karthik Ranganathan.
> 
> 
> This addresses bug HBASE-2697.
>     http://issues.apache.org/jira/browse/HBASE-2697
> 
> 
> Diffs
> -----
> 
>   branches/0.90_master_rewrite/BRANCH_TODO.txt 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/Abortable.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HServerInfo.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/InvalidFamilyException.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ServerController.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/RootEditor.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HConnection.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseEventHandler.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseExecutorService.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/RegionTransitionData.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AddColumn.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/BaseScanner.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ChangeTableState.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ColumnOperation.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/DeleteColumn.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/FileSystemManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterController.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterStatus.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaRegion.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaScanner.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyColumn.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyTableMeta.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionClose.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionOpen.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionStatusChange.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperation.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationListener.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationQueue.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RetryableMetaOperation.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RootScanner.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableDelete.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableOperation.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DisableTableHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/EnableTableHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterCloseRegionHandler.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterOpenRegionHandler.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ModifyTableHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableDeleteFamilyHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableModifyFamilyHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/MasterAddressManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RSZookeeperUpdater.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerController.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ClusterStatusTracker.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/MetaNodeTracker.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKTableDisable.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperListener.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestMaster.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestRestartCluster.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestZKBasedCloseRegion.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestZKBasedReopenRegion.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/regionserver/TestMasterAddressManager.java 979909 
> 
> Diff: http://review.cloudera.org/r/484/diff
> 
> 
> Testing
> -------
> 
> simple cluster tests passing, need to run (and write) more tests
> 
> 
> Thanks,
> 
> Jonathan
> 
>


Re: Review Request: HBASE-2697 The master rewrite

Posted by st...@duboce.net.

> On 2010-08-11 16:45:48, Ted Yu wrote:
> > branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java, line 123
> > <http://review.cloudera.org/r/484/diff/6/?file=5360#file5360line123>
> >
> >     Please change description to reflect what the code does - throwing NotAllMetaRegionsOnlineException

@Ted I took a look.  It says that it throws NotAllMetaRegionsOnlineException if root is not available before the timeout expires, so I think the doc is 'correct'.  But I think this method is a little odd: why would you have a timeout waiting on root in the first place?  I'll look into it....
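
For reference, the behavior under discussion (wait for a location, give up with an exception once the timeout expires) has roughly this shape.  This is a sketch under assumptions, not the CatalogTracker code: LocationTrackerSketch and NotOnlineException are made-up names, and the exception here is unchecked only to keep the example short.

```java
class LocationTrackerSketch {
    /** Hypothetical stand-in for NotAllMetaRegionsOnlineException. */
    static class NotOnlineException extends RuntimeException {
        NotOnlineException(String message) { super(message); }
    }

    private String location; // null until the region location is known

    /** Publishes the location and wakes up any waiters. */
    synchronized void setLocation(String server) {
        location = server;
        notifyAll();
    }

    /** Blocks until the location is known, or throws once timeoutMs has elapsed. */
    synchronized String waitForLocation(long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (location == null) {
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                throw new NotOnlineException("location not available within " + timeoutMs + " ms");
            }
            try {
                wait(remainingMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new NotOnlineException("interrupted while waiting for location");
            }
        }
        return location;
    }

    public static void main(String[] args) {
        LocationTrackerSketch tracker = new LocationTrackerSketch();
        try {
            tracker.waitForLocation(50);
        } catch (NotOnlineException e) {
            System.out.println("timed out: " + e.getMessage());
        }
        tracker.setLocation("rs1:60020");
        System.out.println("found: " + tracker.waitForLocation(50));
    }
}
```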


- stack


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/484/#review850
-----------------------------------------------------------


On 2010-08-05 00:19:37, Jonathan Gray wrote:
> 
> (Updated 2010-08-05 00:19:37)
> 
> 
> Review request for hbase, stack and Karthik Ranganathan.
> 
> 
> Summary
> -------
> 
> This is basically the master rewrite.  It's massive, I apologize.  I can start a cluster, create a table, insert a row, and get the row.  Restarting should work too.
> 
> I completely ripped out BaseScanner/MetaScanner as well as the RegionServerOperationQueue stuff.  I'm sold on removing the former.  Fully replacing the latter still needs a bit of work, though I think getting rid of it is the right direction.  The RSOQ stuff has been replaced with EventHandlers.  The BaseScanner is no longer necessary because we no longer use META to determine when things are unassigned.
> 
> All the table operations (enable/disable, create/delete, addcolumn/del/modify) have been reimplemented in handlers.
> 
> All META/ROOT server access is done through the new CatalogTracker.  This uses ZK and ROOT/META to actively track the current locations of the catalog regions.  A bit more work needs to be done on defining our retry semantics for different operations.
> 
> The actual read/write operations on ROOT/META are done with new classes RootEditor, MetaEditor, and MetaReader.  A lot of this code was scattered around HMaster, RegionManager, ServerManager, HRegion, etc...
> 
> FileSystemManager has some new methods, gathered from where they were scattered around the code.
> 
> This is also the final straw for the old ZKWrapper: it is removed, and all ZK access now goes through the new ZK classes.
> 
> Open/close now goes via RPC and ZK; none of these messages piggyback on heartbeats anymore (and with splits not working right now, the split notification is gone too, so it seems heartbeats will carry no messages at all).
> 
> RegionManager was completely removed and replaced with a shiny, brand new AssignmentManager.  It's already managed to get a bit hectic but is pretty straightforward.  It is the heart of zk-based region assignment and most logic is in here.  It contains regionsInTransition, the region plans, etc...
> 
> I added a new /disabled node in ZK to keep track of tables that are disabled.  I think we need this to be able to reliably do disables across failures.
> 
> Among other changes....
> 
> 
> Here's my rough list of things left to do (covers most but not all TODOs in the branch)
> 
> 
> * make final decisions on root/meta timeouts.  almost everyone is coordinating
>   access through CatalogTracker which should make it easy to standardize.
>   if there are operations that should just retry indefinitely, they need to
>   resubmit themselves to their executor service.
> 
> * move splits to RS side, integrate new patch from stack on trunk
>   might need a new CREATED unassigned now, or new rpc, but get rid of sending
>   split notification on heartbeat?
>   how to handle splits concurrent with disable?
> 
> * review master startup order
> 
> * figure what to do with client table admin ops (flush, split, compact)
> 
> * MasterAddressManager move to extending ZooKeeperNodeTracker
> 
> * on region open (and wherever split children notify master) should check if
>   the table is disabled and should close the regions... maybe.
> 
> * regionserver exit needs to be reimplemented in servermanager
>   also rs expiration
> 
> * add priorities and pool size to handlers (bring in from flush patch)
> 
> * in RootEditor there is a race condition between delete and watch?
> 
> * migrate TestZKBased* tests to use new handlers
> 
> * make sync calls for enable/disable (check and verify methods?)
> 
> * integrate load balancing
> 
> * finish TODOs on new failover path and remove old code in joinExistingCluster()
> 
> * finish TODOs on timeout monitor in assignmentmanager
> 
> * review filesystemmanager calls
> 
> * figure how to handle the very rare but possible race condition where two
>   RSs will update META and the later one can squash the valid one if there was
>   a long gc pause
> 
> * synchronize all access to the boolean in ActiveMasterManager
>   (now this is probably just move it to extend ZKNodeTracker)
> 
> * there are some races with master wanting to connect for rpc
>   to regionserver and the rs starting its rpc server, need to address
> 
> * migrate TestMasterTransitions or make new?
> 
> * fix or remove last couple master tests that used RSOQ
> 
> * write new tests!!!
> 
> 
> This addresses bug HBASE-2697.
>     http://issues.apache.org/jira/browse/HBASE-2697
> 
> 
> Diffs
> -----
> 
>   branches/0.90_master_rewrite/BRANCH_TODO.txt 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/Abortable.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HServerInfo.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/InvalidFamilyOperationException.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ServerController.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/RootLocationEditor.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HConnection.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseEventHandler.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseExecutorService.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/RegionTransitionData.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AddColumn.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/BaseScanner.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ChangeTableState.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ColumnOperation.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/DeleteColumn.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/FileSystemManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterController.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterStatus.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaRegion.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaScanner.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyColumn.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyTableMeta.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionClose.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionOpen.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionStatusChange.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperation.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationListener.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationQueue.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RetryableMetaOperation.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RootScanner.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableDelete.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableOperation.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DisableTableHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/EnableTableHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterCloseRegionHandler.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterOpenRegionHandler.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ModifyTableHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableDeleteFamilyHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableModifyFamilyHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/MasterAddressManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RSZookeeperUpdater.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerController.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseMetaHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRootHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenMetaHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRootHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ClusterStatusTracker.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/MetaNodeTracker.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKTableDisable.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperListener.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestMaster.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestRestartCluster.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestZKBasedCloseRegion.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestZKBasedReopenRegion.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/regionserver/TestMasterAddressManager.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/util/TestBytes.java 979909 
> 
> Diff: http://review.cloudera.org/r/484/diff
> 
> 
> Testing
> -------
> 
> simple cluster tests passing, need to run (and write) more tests
> 
> 
> Thanks,
> 
> Jonathan
> 
>


Re: Review Request: HBASE-2697 The master rewrite

Posted by Ted Yu <te...@yahoo.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/484/#review850
-----------------------------------------------------------



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
<http://review.cloudera.org/r/484/#comment2851>

    Please change description to reflect what the code does - throwing NotAllMetaRegionsOnlineException


- Ted




Re: Review Request: HBASE-2697 The master rewrite

Posted by Jonathan Gray <jg...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/484/
-----------------------------------------------------------

(Updated 2010-08-05 00:19:37.074497)


Review request for hbase, stack and Karthik Ranganathan.


Changes
-------

Notes from 8/4, with what I did tonight for each (which is most of what is different in this update)
---

* in CatalogTracker need to settle on one getRoot method and one getMeta method
  that waits, using the default wait-for-catalogs timeout.
  
  We should get rid of the 'refresh' boolean that I have in there and
  should always ping the server to ensure it is serving the region before we
  return it.  If we do eventually drop root and put the meta locations into zk
  we would no longer need this, so will not always have to pay this tax.
  
  >>  This is done.  You pass the default timeout in the constructor.  The two methods are now:

      waitForRootServerConnectionDefault()
      waitForMetaServerConnectionDefault()
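
The wait-with-default-timeout behaviour in the note above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in the patch: the class CatalogWaitSketch, the setLocation() method and the isServing predicate (standing in for the RPC ping) are all hypothetical names. Only the ideas of a constructor-supplied default timeout and of always checking the server before returning come from the note.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.Predicate;

/** Hypothetical stand-in for a catalog location tracker; not the real CatalogTracker. */
class CatalogWaitSketch {
    private final long defaultTimeoutMs;
    /** Stand-in for the RPC ping that checks the server is really serving the region. */
    private final Predicate<String> isServing;
    /** Last known server address; null while the region is unassigned. */
    private String location;

    CatalogWaitSketch(long defaultTimeoutMs, Predicate<String> isServing) {
        this.defaultTimeoutMs = defaultTimeoutMs;
        this.isServing = isServing;
    }

    /** Called when the location changes (presumably from a ZK watcher callback). */
    synchronized void setLocation(String serverAddress) {
        this.location = serverAddress;
        notifyAll();
    }

    /** Waits up to the default timeout for a location whose server answers the ping. */
    synchronized String waitForServerConnectionDefault()
            throws InterruptedException, TimeoutException {
        long deadline = System.currentTimeMillis() + defaultTimeoutMs;
        while (true) {
            // Always verify before returning; there is no 'refresh' flag to skip this.
            if (location != null && isServing.test(location)) {
                return location;
            }
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                throw new TimeoutException("catalog region not available in time");
            }
            // Wake up on a location change, or re-ping after a short pause.
            wait(Math.min(remaining, 50L));
        }
    }
}
```

The ping runs while the lock is held, which keeps the sketch short; real code would probably release the lock around the remote call.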


* ROOT changes

  RootEditor -> RootLocationEditor, delete -> unset

  Change the way we unset the root location.  Set the data to null rather than
  deleting the node.  Requires changes to RootLocationEditor and RootRegionTracker.

  >>  Thought there was a race condition here, but there is not.  In fact, we do
  not even need to set the watch in the delete method.  It is already properly
  being handled by RootRegionTracker.
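
To make the "set the data to null rather than deleting the node" idea concrete, here is a small in-memory sketch. It does not use the ZooKeeper client at all: the map stands in for the znode tree, and the class name, method names and node path are assumptions made for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for the ZK node that holds the root region location. */
class RootLocationNodeSketch {
    /** Assumed node path, for illustration only. */
    static final String ROOT_NODE = "/hbase/root-region-server";
    private final Map<String, byte[]> nodes = new HashMap<>();

    void setRootLocation(String serverAddress) {
        nodes.put(ROOT_NODE, serverAddress.getBytes(StandardCharsets.UTF_8));
    }

    /** Unset by nulling the data; the node itself is left in place. */
    void unsetRootLocation() {
        nodes.put(ROOT_NODE, null);
    }

    boolean nodeExists() {
        return nodes.containsKey(ROOT_NODE);
    }

    /** Returns the server address, or null when the location is unset. */
    String getRootLocation() {
        byte[] data = nodes.get(ROOT_NODE);
        return data == null ? null : new String(data, StandardCharsets.UTF_8);
    }
}
```

With the node kept in place, a tracker only has to react to data changes on one node instead of handling a delete followed by a later create.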


* In AssignmentManager.processFailure() need to insert RegionState into RIT map

  >>  This is done.  This needs tests but I think failover is all in place now.


* On RS-side, make separate OpenRootHandler and OpenMetaHandler

  >>  Added four new handlers for open/close of root/meta and associated
      executors


* Add priorities to Opened/Closed handlers on Master

  >>  Added ROOT, META, USER priorities


* In RegionTransitionData, store the actual byte[] regionName rather than
  the encoded name

  >>  Done.  We should also get in the practice of naming variables encodedName
      if that is what they hold.


* Executor services need to be using a priority queue

  >>  Done.  I think all stuff to set pool size and add priorities is in.
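
As a rough illustration of an executor service with a fixed pool size and a priority queue, using the ROOT, META, USER priorities mentioned above: this is a sketch built only on java.util.concurrent, and the Handler class and demoOrder() method are hypothetical, not the handler classes in the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PriorityExecutorSketch {
    /** Declaration order is the run order: ROOT before META before USER. */
    enum Priority { ROOT, META, USER }

    /** A handler that the queue can order by priority. */
    static final class Handler implements Runnable, Comparable<Handler> {
        private final Priority priority;
        private final Runnable body;

        Handler(Priority priority, Runnable body) {
            this.priority = priority;
            this.body = body;
        }

        @Override public void run() { body.run(); }

        @Override public int compareTo(Handler other) {
            return priority.compareTo(other.priority);
        }
    }

    /** Fixed-size pool backed by a priority queue.  Use execute(), not submit():
     *  submit() wraps tasks in a FutureTask, which is not Comparable. */
    static ThreadPoolExecutor newExecutor(int poolSize) {
        return new ThreadPoolExecutor(poolSize, poolSize, 60L, TimeUnit.SECONDS,
                new PriorityBlockingQueue<Runnable>());
    }

    /** Queues USER, META and ROOT handlers behind a blocked worker; returns the run order. */
    static List<String> demoOrder() throws InterruptedException {
        List<String> order = Collections.synchronizedList(new ArrayList<String>());
        CountDownLatch gate = new CountDownLatch(1);
        ThreadPoolExecutor pool = newExecutor(1);
        // Occupy the single worker so the next three handlers pile up in the queue.
        pool.execute(new Handler(Priority.ROOT, () -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
        pool.execute(new Handler(Priority.USER, () -> order.add("user")));
        pool.execute(new Handler(Priority.META, () -> order.add("meta")));
        pool.execute(new Handler(Priority.ROOT, () -> order.add("root")));
        gate.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return order;
    }
}
```

Handlers of equal priority are not guaranteed to run in submission order with PriorityBlockingQueue, so a sequence number would be needed as a tie-breaker if that ordering matters.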


* In EventType, completely remove differentiating between Master and RS.
  This means for a given EventType, it will map to the same handler whether it
  is on RS or Master.

  >>  Done.

  Also in EventType, remove fromByte() and use an ordinal() method
  
  >>  Done.  Can we remove even having the (int) values for the enums now?
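
On the question of dropping the explicit (int) values, a minimal sketch of an enum encoded purely by ordinal might look like the following. The constant names and the fromOrdinal()/toWire() helpers are hypothetical and are not taken from the patch.

```java
class EventTypeSketch {
    /** Hypothetical event names; the constants carry no explicit int or byte values. */
    enum EventType {
        REGION_CLOSING, REGION_CLOSED, REGION_OPENING, REGION_OPENED;

        /** Cached once, since values() clones its array on every call. */
        private static final EventType[] VALUES = values();

        /** Decodes the ordinal written by toWire(). */
        static EventType fromOrdinal(int ordinal) {
            if (ordinal < 0 || ordinal >= VALUES.length) {
                throw new IllegalArgumentException("unknown event ordinal: " + ordinal);
            }
            return VALUES[ordinal];
        }

        /** Encodes the event as its ordinal. */
        int toWire() {
            return ordinal();
        }
    }
}
```

One trade-off to keep in mind: ordinals follow declaration order, so reordering constants or inserting one in the middle changes the encoding.  That matters if processes running different versions exchange these values.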


Summary
-------

This is basically the master rewrite.  It's massive, I apologize.  I can start a cluster, create a table, insert a row, and get the row.  Restarting should work too.

I completely ripped out BaseScanner/MetaScanner as well as the RegionServerOperationQueue (RSOQ) stuff.  I'm sold on removing the former.  There's still a bit of work to fully replace the latter, though I think getting rid of it is the right direction.  The RSOQ stuff has been replaced with EventHandlers.  The BaseScanner is no longer necessary because we no longer use META to determine when regions are unassigned.

All the table operations (enable/disable, create/delete, addcolumn/del/modify) have been reimplemented in handlers.

All META/ROOT server access is done through the new CatalogTracker.  This uses ZK and ROOT/META to actively track the current locations of the catalog regions.  A bit more work needs to be done on defining our retry semantics for different operations.

The actual read/write operations on ROOT/META are done with new classes RootEditor, MetaEditor, and MetaReader.  A lot of this code was scattered around HMaster, RegionManager, ServerManager, HRegion, etc...

There are some new methods in FileSystemManager, collected from where they were scattered around.

This is also the final straw for the old ZKWrapper, which is removed.  All ZK access now goes through the new ZK code.

Open/close now goes via RPC and ZK, no longer piggybacking any of these messages on heartbeats (and with splits not working right now, even the split message is gone, so it seems nothing will be left on heartbeats).

RegionManager was completely removed and replaced with a shiny, brand new AssignmentManager.  It's already managed to get a bit hectic but is pretty straightforward.  It is the heart of zk-based region assignment and most logic is in here.  It contains regionsInTransition, the region plans, etc...

I added a new /disabled node in ZK to keep track of tables that are disabled.  I think we need this to be able to reliably do disables across failures.

Among other changes....


Here's my rough list of things left to do (covers most but not all TODOs in the branch)


* make final decisions on root/meta timeouts.  almost everyone is coordinating
  access through CatalogTracker which should make it easy to standardize.
  if there are operations that should just retry indefinitely, they need to
  resubmit themselves to their executor service.

* move splits to RS side, integrate new patch from stack on trunk
  might need a new CREATED unassigned now, or new rpc, but get rid of sending
  split notification on heartbeat?
  how to handle splits concurrent with disable?

* review master startup order

* figure out what to do with client table admin ops (flush, split, compact)

* MasterAddressManager move to extending ZooKeeperNodeTracker

* on region open (and wherever split children notify master) should check if
  the table is disabled and should close the regions... maybe.

* regionserver exit needs to be reimplemented in servermanager
  also rs expiration

* add priorities and pool size to handlers (bring in from flush patch)

* in RootEditor there is a race condition between delete and watch?

* migrate TestZKBased* tests to use new handlers

* make sync calls for enable/disable (check and verify methods?)

* integrate load balancing

* finish TODOs on new failover path and remove old code in joinExistingCluster()

* finish TODOs on timeout monitor in assignmentmanager

* review filesystemmanager calls

* figure out how to handle the very rare but possible race condition where two
  RSs will update META and the later one can squash the valid one if there was
  a long gc pause

* synchronize all access to the boolean in ActiveMasterManager
  (now this is probably just a matter of moving it to extend ZKNodeTracker)

* there are some races with master wanting to connect for rpc
  to regionserver and the rs starting its rpc server, need to address

* migrate TestMasterTransitions or make new?

* fix or remove last couple master tests that used RSOQ

* write new tests!!!
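
The first TODO above (operations that should retry indefinitely resubmit
themselves to their executor service) might be sketched like this.
ResubmittingHandler is a hypothetical name, the operation is modelled as a
BooleanSupplier that returns true on success, and there is no backoff between
attempts; none of this is taken from the branch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical handler that re-queues itself until its operation succeeds. */
final class ResubmittingHandler implements Runnable {
  private final ExecutorService executor;
  private final BooleanSupplier operation; // returns true once the work is done
  private final CountDownLatch done = new CountDownLatch(1);
  private final AtomicInteger attempts = new AtomicInteger();

  ResubmittingHandler(ExecutorService executor, BooleanSupplier operation) {
    this.executor = executor;
    this.operation = operation;
  }

  @Override
  public void run() {
    attempts.incrementAndGet();
    boolean succeeded;
    try {
      succeeded = operation.getAsBoolean();
    } catch (RuntimeException e) {
      succeeded = false; // treat an exception like any other failed attempt
    }
    if (succeeded) {
      done.countDown();
    } else {
      // Go to the back of the queue instead of looping on the worker thread,
      // so other queued handlers get a turn between attempts.  This throws
      // RejectedExecutionException if the executor has been shut down.
      executor.execute(this);
    }
  }

  /** Waits for success; returns false on timeout or interruption. */
  boolean awaitSuccess(long timeout, TimeUnit unit) {
    try {
      return done.await(timeout, unit);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  int attempts() {
    return attempts.get();
  }
}
```

The point of resubmitting rather than looping inside run() is that a handler
stuck waiting on ROOT or META does not pin a worker thread.  A real version
would probably want a delay or a cap on how fast it re-queues itself.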


This addresses bug HBASE-2697.
    http://issues.apache.org/jira/browse/HBASE-2697


Diffs (updated)
-----

  branches/0.90_master_rewrite/BRANCH_TODO.txt 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/Abortable.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HServerInfo.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/InvalidFamilyOperationException.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ServerController.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/RootLocationEditor.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HConnection.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseEventHandler.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseExecutorService.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/RegionTransitionData.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AddColumn.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/BaseScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ChangeTableState.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ColumnOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/DeleteColumn.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/FileSystemManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterController.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterStatus.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaRegion.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyColumn.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyTableMeta.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionClose.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionOpen.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionStatusChange.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationListener.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationQueue.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RetryableMetaOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RootScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableDelete.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DisableTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/EnableTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterCloseRegionHandler.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterOpenRegionHandler.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ModifyTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableDeleteFamilyHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableModifyFamilyHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/MasterAddressManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RSZookeeperUpdater.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerController.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseMetaHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRootHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenMetaHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRootHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ClusterStatusTracker.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/MetaNodeTracker.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKTableDisable.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperListener.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestMaster.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestRestartCluster.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestZKBasedCloseRegion.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestZKBasedReopenRegion.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/regionserver/TestMasterAddressManager.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/util/TestBytes.java 979909 

Diff: http://review.cloudera.org/r/484/diff


Testing
-------

simple cluster tests passing, need to run (and write) more tests


Thanks,

Jonathan


Re: Review Request: HBASE-2697 The master rewrite

Posted by Jonathan Gray <jg...@apache.org>.

(Updated 2010-08-04 10:37:33.110269)


Review request for hbase, stack and Karthik Ranganathan.


Changes
-------

Not much new; just started work on the new client-side admin functions.


This addresses bug HBASE-2697.
    http://issues.apache.org/jira/browse/HBASE-2697


Diffs (updated)
-----

  branches/0.90_master_rewrite/BRANCH_TODO.txt 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/Abortable.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HServerInfo.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/InvalidFamilyException.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ServerController.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/RootEditor.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HConnection.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseEventHandler.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseExecutorService.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/RegionTransitionData.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AddColumn.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/BaseScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ChangeTableState.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ColumnOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/DeleteColumn.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/FileSystemManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterController.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterStatus.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaRegion.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyColumn.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyTableMeta.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionClose.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionOpen.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionStatusChange.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationListener.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationQueue.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RetryableMetaOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RootScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableDelete.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DisableTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/EnableTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterCloseRegionHandler.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterOpenRegionHandler.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ModifyTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableDeleteFamilyHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableModifyFamilyHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/MasterAddressManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RSZookeeperUpdater.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerController.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ClusterStatusTracker.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/MetaNodeTracker.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKTableDisable.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperListener.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestMaster.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestRestartCluster.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestZKBasedCloseRegion.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestZKBasedReopenRegion.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/regionserver/TestMasterAddressManager.java 979909 

Diff: http://review.cloudera.org/r/484/diff


Testing
-------

simple cluster tests passing, need to run (and write) more tests


Thanks,

Jonathan


Re: Review Request: HBASE-2697 The master rewrite

Posted by Jonathan Gray <jg...@apache.org>.

(Updated 2010-08-03 21:55:46.272720)


Review request for hbase, stack and Karthik Ranganathan.


Changes
-------

Changes:

* added three new rpc methods to RS: flushRegion, splitRegion, compactRegion
  still needs to be changed in the client

* implemented remaining TODOs on new failover path

* remove old failover code

* full renovation of HMaster.modifyTable() back to just updating table htd

* implemented remaining TODOs on timeout monitor


Summary
-------

This is basically the master rewrite.  It's massive, I apologize.  I can start a cluster, create a table, insert a row, and get the row.  Restarting should work too.

I completely ripped out BaseScanner/MetaScanner as well as the RegionServerOperationQueue stuff.  I'm sold on the former but there's a bit of work to fully replace the latter though I think it's a desirable direction to get rid of it.  The RSOQ stuff has been replaced with EventHandlers.  The BaseScanner is no longer necessary because we are not utilizing META to determine when things are unassigned.

All the table operations (enable/disable, create/delete, addcolumn/del/modify) have been reimplemented in handlers.

All META/ROOT server access is done through the new CatalogManager.  This uses ZK and ROOT/META to keep active track of the current locations of the catalog regions.  A bit more work needs to be done on defining our retry semantics for different operations.

The actual read/write operations on ROOT/META are done with new classes RootEditor, MetaEditor, and MetaReader.  A lot of this code was scattered around HMaster, RegionManager, ServerManager, HRegion, etc...

Some new methods in FileSystemManager, collected from being scattered around.

This also does final straw to old ZKWrapper and removes it.  All ZK stuff is going through the new stuff.

Open/close is now going via RPC and ZK, no longer piggybacking any of these messages on heartbeats (and with splits not working right now, even that is gone, so there will be nothing it seems)

RegionManager was completely removed and replaced with a shiny, brand new AssignmentManager.  It's already managed to get a bit hectic but is pretty straightforward.  It is the heart of zk-based region assignment and most logic is in here.  It contains regionsInTransition, the region plans, etc...

I added a new /disabled node in ZK to keep track of tables that are disabled.  I think we need this to be able to reliably do disables across failures.

Among other changes....


Here's my rough list of things left to do (covers most but not all TODOs in the branch)


* make final decisions on root/meta timeouts.  almost everyone is coordinating
  access through CatalogTracker which should make it easy to standardize.
  if there are operations that should just retry indefinitely, they need to
  resubmit themselves to their executor service.

* move splits to RS side, integrate new patch from stack on trunk
  might need a new CREATED unassigned now, or new rpc, but get rid of sending
  split notification on heartbeat?
  how to handle splits concurrent with disable?

* review master startup order

* figure what to do with client table admin ops (flush, split, compact)

* MasterAddressManager move to extending ZooKeeperNodeTracker

* on region open (and wherever split children notify master) should check if
  the table is disabled and should close the regions... maybe.

* regionserver exit needs to be reimplemented in servermanager
  also rs expiration

* add priorities and pool size to handlers (bring in from flush patch)

* in RootEditor there is a race condition between delete and watch?

* migrate TestZKBased* tests to use new handlers

* make sync calls for enable/disable (check and verify methods?)

* integrate load balancing

* finish TODOs on new failover path and remove old code in joinExistingCluster()

* finish TODOs on timeout monitor in assignmentmanager

* review filesystemmanager calls

* figure how to handle the very rare but possible race condition where two
  RSs will update META and the later one can squash the valid one if there was
  a long gc pause

* synchronize all access to the boolean in ActiveMasterManager
  (now this probably just means moving it to extend ZooKeeperNodeTracker)

* there are some races with master wanting to connect for rpc
  to regionserver and the rs starting its rpc server, need to address

* migrate TestMasterTransitions or make new?

* fix or remove last couple master tests that used RSOQ

* write new tests!!!
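
For the first item above (operations that should retry indefinitely resubmitting themselves to their executor service), a minimal sketch of what that could look like follows.  ResubmittingHandler and ResubmitDemo are hypothetical names and the attempt callback is a placeholder; none of this is the EventHandler API in the branch.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical handler: rather than looping inside run(), a failed attempt
// puts itself back on the executor so other queued work can run in between.
class ResubmittingHandler implements Runnable {
    private final ExecutorService executor;
    private final Callable<Boolean> attempt;   // returns true once the operation succeeded
    private final CountDownLatch done = new CountDownLatch(1);
    final AtomicInteger attempts = new AtomicInteger();

    ResubmittingHandler(ExecutorService executor, Callable<Boolean> attempt) {
        this.executor = executor;
        this.attempt = attempt;
    }

    @Override
    public void run() {
        attempts.incrementAndGet();
        boolean ok;
        try {
            ok = attempt.call();
        } catch (Exception e) {
            ok = false;              // this sketch treats every failure as retryable
        }
        if (ok) {
            done.countDown();
        } else {
            executor.execute(this);  // resubmit instead of blocking the worker thread
        }
    }

    // Waits for success; returns false on timeout or interruption.
    boolean await(long timeout, TimeUnit unit) {
        try {
            return done.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}

class ResubmitDemo {
    // Runs an operation that fails twice and then succeeds; returns the attempt count.
    static int run() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        AtomicInteger calls = new AtomicInteger();
        ResubmittingHandler handler =
            new ResubmittingHandler(pool, () -> calls.incrementAndGet() >= 3);
        pool.execute(handler);
        boolean finished = handler.await(5, TimeUnit.SECONDS);
        pool.shutdownNow();
        return finished ? handler.attempts.get() : -1;
    }
}
```

A real handler would probably want a delay or backoff between attempts and a way to stop retrying on shutdown; both are left out here.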


This addresses bug HBASE-2697.
    http://issues.apache.org/jira/browse/HBASE-2697


Diffs (updated)
-----

  branches/0.90_master_rewrite/BRANCH_TODO.txt 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/Abortable.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HServerInfo.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/InvalidFamilyException.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ServerController.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/RootEditor.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseEventHandler.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseExecutorService.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/RegionTransitionData.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AddColumn.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/BaseScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ChangeTableState.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ColumnOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/DeleteColumn.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/FileSystemManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterController.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterStatus.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaRegion.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyColumn.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyTableMeta.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionClose.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionOpen.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionStatusChange.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationListener.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationQueue.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RetryableMetaOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RootScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableDelete.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DisableTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/EnableTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterCloseRegionHandler.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterOpenRegionHandler.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ModifyTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableDeleteFamilyHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableModifyFamilyHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/MasterAddressManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RSZookeeperUpdater.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerController.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ClusterStatusTracker.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/MetaNodeTracker.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKTableDisable.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperListener.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestMaster.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestRestartCluster.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestZKBasedCloseRegion.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestZKBasedReopenRegion.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/regionserver/TestMasterAddressManager.java 979909 

Diff: http://review.cloudera.org/r/484/diff


Testing
-------

simple cluster tests passing, need to run (and write) more tests


Thanks,

Jonathan


Re: Review Request: HBASE-2697 The master rewrite

Posted by st...@duboce.net.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/484/#review658
-----------------------------------------------------------


Here's a bit more on the TODO and the comments on patch.


branches/0.90_master_rewrite/BRANCH_TODO.txt
<http://review.cloudera.org/r/484/#comment2443>

    First some comments on your summary:
    
    "I completely ripped out BaseScanner/MetaScanner as well as the RegionServerOperationQueue stuff.  I'm sold on the former ... The BaseScanner is no longer necessary because we are not utilizing META to determine when things are unassigned.
    "
    
    How are splits to work then? The RS does insert into META? But if it doesn't do all inserts -- fails between offlining of parent and addition of daughters to meta -- we'll need the fixup for when a split is not fully reported.  This used to be done in basescanner.
    
    "might need a new CREATED unassigned now, or new rpc, but get rid of sending split notification on heartbeat?"
    
    Who'd make the above CREATED?
    
    "figure what to do with client table admin ops (flush, split, compact)"
    
    The operations are in place, you are just asking about how to send the messages?
    
    
    



branches/0.90_master_rewrite/BRANCH_TODO.txt
<http://review.cloudera.org/r/484/#comment2444>

    Why would they be updating the same row?



branches/0.90_master_rewrite/BRANCH_TODO.txt
<http://review.cloudera.org/r/484/#comment2446>

    You renamed these classes or the ZK changes are done?



branches/0.90_master_rewrite/BRANCH_TODO.txt
<http://review.cloudera.org/r/484/#comment2447>

    I like just reusing is meta idea for now



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java
<http://review.cloudera.org/r/484/#comment2448>

    No benefit to be got making this a NavigableMap/SortedMap?


- stack


On 2010-08-03 17:13:11, Jonathan Gray wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://review.cloudera.org/r/484/
> -----------------------------------------------------------
> 
> (Updated 2010-08-03 17:13:11)
> 
> 
> Review request for hbase, stack and Karthik Ranganathan.
> 
> 
> Summary
> -------
> 
> This is basically the master rewrite.  It's massive, I apologize.  I can start a cluster, create a table, insert a row, and get the row.  Restarting should work too.
> 
> I completely ripped out BaseScanner/MetaScanner as well as the RegionServerOperationQueue stuff.  I'm sold on the former but there's a bit of work to fully replace the latter though I think it's a desirable direction to get rid of it.  The RSOQ stuff has been replaced with EventHandlers.  The BaseScanner is no longer necessary because we are not utilizing META to determine when things are unassigned.
> 
> All the table operations (enable/disable, create/delete, addcolumn/del/modify) have been reimplemented in handlers.
> 
> All META/ROOT server access is done through the new CatalogTracker.  This uses ZK and ROOT/META to actively track the current locations of the catalog regions.  A bit more work needs to be done on defining our retry semantics for different operations.
> 
> The actual read/write operations on ROOT/META are done with new classes RootEditor, MetaEditor, and MetaReader.  A lot of this code was scattered around HMaster, RegionManager, ServerManager, HRegion, etc...
> 
> FileSystemManager gains some new methods, gathered from where they were scattered around the code.
> 
> This is also the final straw for the old ZKWrapper, which is removed.  All ZK access now goes through the new ZK code.
> 
> Open/close now goes via RPC and ZK, no longer piggybacking any of these messages on heartbeats (and with splits not working right now, even the split notification is gone, so it seems nothing will be sent on heartbeats)
> 
> RegionManager was completely removed and replaced with a shiny, brand new AssignmentManager.  It's already managed to get a bit hectic but is pretty straightforward.  It is the heart of zk-based region assignment and most logic is in here.  It contains regionsInTransition, the region plans, etc...
> 
> I added a new /disabled node in ZK to keep track of tables that are disabled.  I think we need this to be able to reliably do disables across failures.
> 
> Among other changes....
> 
> 
> Here's my rough list of things left to do (covers most but not all TODOs in the branch)
> 
> 
> * make final decisions on root/meta timeouts.  almost everyone is coordinating
>   access through CatalogTracker which should make it easy to standardize.
>   if there are operations that should just retry indefinitely, they need to
>   resubmit themselves to their executor service.
> 
> * move splits to RS side, integrate new patch from stack on trunk
>   might need a new CREATED unassigned now, or new rpc, but get rid of sending
>   split notification on heartbeat?
>   how to handle splits concurrent with disable?
> 
> * review master startup order
> 
> * figure what to do with client table admin ops (flush, split, compact)
> 
> * MasterAddressManager move to extending ZooKeeperNodeTracker
> 
> * on region open (and wherever split children notify master) should check if
>   the table is disabled and should close the regions... maybe.
> 
> * regionserver exit needs to be reimplemented in servermanager
>   also rs expiration
> 
> * add priorities and pool size to handlers (bring in from flush patch)
> 
> * in RootEditor there is a race condition between delete and watch?
> 
> * migrate TestZKBased* tests to use new handlers
> 
> * make sync calls for enable/disable (check and verify methods?)
> 
> * integrate load balancing
> 
> * finish TODOs on new failover path and remove old code in joinExistingCluster()
> 
> * finish TODOs on timeout monitor in assignmentmanager
> 
> * review filesystemmanager calls
> 
> * figure how to handle the very rare but possible race condition where two
>   RSs will update META and the later one can squash the valid one if there was
>   a long gc pause
> 
> * synchronize all access to the boolean in ActiveMasterManager
>   (now this probably just means moving it to extend ZooKeeperNodeTracker)
> 
> * there are some races with master wanting to connect for rpc
>   to regionserver and the rs starting its rpc server, need to address
> 
> * migrate TestMasterTransitions or make new?
> 
> * fix or remove last couple master tests that used RSOQ
> 
> * write new tests!!!
> 
> 
> This addresses bug HBASE-2697.
>     http://issues.apache.org/jira/browse/HBASE-2697
> 
> 
> Diffs
> -----
> 
>   branches/0.90_master_rewrite/BRANCH_TODO.txt 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/Abortable.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HServerInfo.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/InvalidFamilyException.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ServerController.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/RootEditor.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseEventHandler.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseExecutorService.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/RegionTransitionData.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AddColumn.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/BaseScanner.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ChangeTableState.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ColumnOperation.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/DeleteColumn.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/FileSystemManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterController.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterStatus.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaRegion.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaScanner.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyColumn.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyTableMeta.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionClose.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionOpen.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionStatusChange.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperation.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationListener.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationQueue.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RetryableMetaOperation.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RootScanner.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableDelete.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableOperation.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DisableTableHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/EnableTableHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterCloseRegionHandler.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterOpenRegionHandler.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ModifyTableHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableDeleteFamilyHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableModifyFamilyHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/MasterAddressManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RSZookeeperUpdater.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerController.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ClusterStatusTracker.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/MetaNodeTracker.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKTableDisable.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperListener.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestMaster.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestRestartCluster.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestZKBasedCloseRegion.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestZKBasedReopenRegion.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/regionserver/TestMasterAddressManager.java 979909 
> 
> Diff: http://review.cloudera.org/r/484/diff
> 
> 
> Testing
> -------
> 
> simple cluster tests passing, need to run (and write) more tests
> 
> 
> Thanks,
> 
> Jonathan
> 
>


Re: Review Request: HBASE-2697 The master rewrite

Posted by Jonathan Gray <jg...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/484/
-----------------------------------------------------------

(Updated 2010-08-03 17:13:11.559426)


Review request for hbase, stack and Karthik Ranganathan.


Changes
-------

Changes:

* MasterAddressManager stripped down and now extending ZooKeeperNodeTracker

* cleaned up synchronization a bit in AssignmentManager.assign()

* added priorities and configurable thread pool sizes to EventHandlers

* added priorities ROOT/META/USER to OpenRegionHandler

* fixed bug in CloseRegionHandler

* migrated TestZKBasedCloseRegion to new stuff and it works (but needs cluster
  shutdown to be finished)

* migrated TestZKBasedReopenRegion but not working.  for some reason the master
  does not get the ZK notifications for the close of the region.
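
As an illustration of the priority change listed above, here is a minimal sketch of a fixed-size pool that drains its queue in ROOT, then META, then USER order.  It is built on the JDK's ThreadPoolExecutor and PriorityBlockingQueue; OpenPriority, PrioritizedTask and PriorityPoolDemo are hypothetical names and not the classes in the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

enum OpenPriority { ROOT, META, USER }   // lower ordinal is dequeued first

// Hypothetical task wrapper: the queue orders tasks by priority alone.
class PrioritizedTask implements Runnable, Comparable<PrioritizedTask> {
    final OpenPriority priority;
    private final Runnable body;

    PrioritizedTask(OpenPriority priority, Runnable body) {
        this.priority = priority;
        this.body = body;
    }

    @Override public void run() { body.run(); }

    @Override public int compareTo(PrioritizedTask other) {
        return priority.compareTo(other.priority);
    }
}

class PriorityPoolDemo {
    // Queues USER, META, ROOT opens behind a blocked worker and returns the run order.
    static List<String> run() {
        int poolSize = 1;   // would come from configuration; 1 keeps the order deterministic
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
            new PriorityBlockingQueue<Runnable>());
        List<String> order = Collections.synchronizedList(new ArrayList<String>());
        CountDownLatch gate = new CountDownLatch(1);

        // Use execute(), not submit(): submit() wraps tasks in a FutureTask,
        // which is not Comparable and would fail inside the priority queue.
        pool.execute(new PrioritizedTask(OpenPriority.ROOT, () -> {
            try {
                gate.await();   // hold the only worker while the queue fills
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
        pool.execute(new PrioritizedTask(OpenPriority.USER, () -> order.add("user-region")));
        pool.execute(new PrioritizedTask(OpenPriority.META, () -> order.add(".META.")));
        pool.execute(new PrioritizedTask(OpenPriority.ROOT, () -> order.add("-ROOT-")));

        gate.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return order;
    }
}
```

With more than one worker thread the queue is still drained by priority, but tasks of different priorities can run concurrently, so completion order is no longer guaranteed.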


Summary
-------

This is basically the master rewrite.  It's massive, I apologize.  I can start a cluster, create a table, insert a row, and get the row.  Restarting should work too.

I completely ripped out BaseScanner/MetaScanner as well as the RegionServerOperationQueue stuff.  I'm sold on the former but there's a bit of work to fully replace the latter though I think it's a desirable direction to get rid of it.  The RSOQ stuff has been replaced with EventHandlers.  The BaseScanner is no longer necessary because we are not utilizing META to determine when things are unassigned.

All the table operations (enable/disable, create/delete, addcolumn/del/modify) have been reimplemented in handlers.

All META/ROOT server access is done through the new CatalogTracker.  This uses ZK and ROOT/META to actively track the current locations of the catalog regions.  A bit more work needs to be done on defining our retry semantics for different operations.

The actual read/write operations on ROOT/META are done with new classes RootEditor, MetaEditor, and MetaReader.  A lot of this code was scattered around HMaster, RegionManager, ServerManager, HRegion, etc...

FileSystemManager gains some new methods, gathered from where they were scattered around the code.

This is also the final straw for the old ZKWrapper, which is removed.  All ZK access now goes through the new ZK code.

Open/close now goes via RPC and ZK, no longer piggybacking any of these messages on heartbeats (and with splits not working right now, even the split notification is gone, so it seems nothing will be sent on heartbeats)

RegionManager was completely removed and replaced with a shiny, brand new AssignmentManager.  It's already managed to get a bit hectic but is pretty straightforward.  It is the heart of zk-based region assignment and most logic is in here.  It contains regionsInTransition, the region plans, etc...

I added a new /disabled node in ZK to keep track of tables that are disabled.  I think we need this to be able to reliably do disables across failures.

Among other changes....


Here's my rough list of things left to do (covers most but not all TODOs in the branch)


* make final decisions on root/meta timeouts.  almost everyone is coordinating
  access through CatalogTracker which should make it easy to standardize.
  if there are operations that should just retry indefinitely, they need to
  resubmit themselves to their executor service.

* move splits to RS side, integrate new patch from stack on trunk
  might need a new CREATED unassigned now, or new rpc, but get rid of sending
  split notification on heartbeat?
  how to handle splits concurrent with disable?

* review master startup order

* figure what to do with client table admin ops (flush, split, compact)

* MasterAddressManager move to extending ZooKeeperNodeTracker

* on region open (and wherever split children notify master) should check if
  the table is disabled and should close the regions... maybe.

* regionserver exit needs to be reimplemented in servermanager
  also rs expiration

* add priorities and pool size to handlers (bring in from flush patch)

* in RootEditor there is a race condition between delete and watch?

* migrate TestZKBased* tests to use new handlers

* make sync calls for enable/disable (check and verify methods?)

* integrate load balancing

* finish TODOs on new failover path and remove old code in joinExistingCluster()

* finish TODOs on timeout monitor in assignmentmanager

* review filesystemmanager calls

* figure out how to handle the very rare but possible race condition where two
  RSs will update META and the later one can squash the valid one if there was
  a long gc pause

* synchronize all access to the boolean in ActiveMasterManager
  (this is probably now just a matter of moving it to extend ZooKeeperNodeTracker)

* there are some races with master wanting to connect for rpc
  to regionserver and the rs starting its rpc server, need to address

* migrate TestMasterTransitions or make new?

* fix or remove last couple master tests that used RSOQ

* write new tests!!!
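
For the first item in the list, the resubmit-on-failure idea might look roughly like the following.  This is only a sketch using a plain java.util.concurrent ExecutorService rather than the branch's HBaseExecutorService, and RetryingTask and its members are made-up names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical task that retries indefinitely by resubmitting itself to its
// executor instead of looping (and holding a pool thread) inside run().
class RetryingTask implements Runnable {
  private final ExecutorService executor;
  private final Callable<Boolean> operation;  // returns true on success
  final AtomicInteger attempts = new AtomicInteger();
  private final CountDownLatch done = new CountDownLatch(1);

  RetryingTask(ExecutorService executor, Callable<Boolean> operation) {
    this.executor = executor;
    this.operation = operation;
  }

  @Override
  public void run() {
    attempts.incrementAndGet();
    boolean ok;
    try {
      ok = operation.call();
    } catch (Exception e) {
      ok = false;  // every failure is treated as retryable in this sketch
    }
    if (ok) {
      done.countDown();
    } else {
      executor.execute(this);  // go to the back of the queue and try again
    }
  }

  // Waits up to the given time for success; false on timeout or interrupt.
  boolean awaitDone(long millis) {
    try {
      return done.await(millis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

Resubmitting lets other queued handlers run between attempts.  A real version would presumably also want a delay or backoff between attempts, which is left out here.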


This addresses bug HBASE-2697.
    http://issues.apache.org/jira/browse/HBASE-2697


Diffs (updated)
-----

  branches/0.90_master_rewrite/BRANCH_TODO.txt 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/Abortable.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HServerInfo.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/InvalidFamilyException.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ServerController.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/RootEditor.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseEventHandler.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseExecutorService.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/RegionTransitionData.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AddColumn.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/BaseScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ChangeTableState.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ColumnOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/DeleteColumn.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/FileSystemManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterController.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterStatus.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaRegion.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyColumn.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyTableMeta.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionClose.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionOpen.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionStatusChange.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationListener.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationQueue.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RetryableMetaOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RootScanner.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableDelete.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableOperation.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DisableTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/EnableTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterCloseRegionHandler.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterOpenRegionHandler.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ModifyTableHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableDeleteFamilyHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableModifyFamilyHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/MasterAddressManager.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RSZookeeperUpdater.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerController.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ClusterStatusTracker.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/MetaNodeTracker.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKTableDisable.java PRE-CREATION 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperListener.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java 979909 
  branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestMaster.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestRestartCluster.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestZKBasedCloseRegion.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestZKBasedReopenRegion.java 979909 
  branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/regionserver/TestMasterAddressManager.java 979909 

Diff: http://review.cloudera.org/r/484/diff


Testing
-------

simple cluster tests passing, need to run (and write) more tests


Thanks,

Jonathan


Re: Review Request: HBASE-2697 The master rewrite

Posted by st...@duboce.net.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/484/#review646
-----------------------------------------------------------


I only got to the first file and didn't even finish that... will do more later


branches/0.90_master_rewrite/BRANCH_TODO.txt
<http://review.cloudera.org/r/484/#comment2424>

    Why not let it be on the heartbeat? We have to have a handler for the case where it does not make it across anyway.  Or is it because there is no longer a mechanism for taking action on heartbeat other than the passing of server load?



branches/0.90_master_rewrite/BRANCH_TODO.txt
<http://review.cloudera.org/r/484/#comment2425>

    Disable/Enable has to be sloppy and allow splits to come in.  You could set a disabling flag in zk but there'd be a race between the "point of no return", the updating of meta that parent is offlined, and checking flag in zk.



branches/0.90_master_rewrite/BRANCH_TODO.txt
<http://review.cloudera.org/r/484/#comment2426>

    Do the old ones still work?



branches/0.90_master_rewrite/BRANCH_TODO.txt
<http://review.cloudera.org/r/484/#comment2427>

    Nah


- stack


Re: Review Request: HBASE-2697 The master rewrite

Posted by Jean-Daniel Cryans <jd...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/484/#review636
-----------------------------------------------------------


First pass, still trying to understand what's going on


branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
<http://review.cloudera.org/r/484/#comment2393>

    unused?



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java
<http://review.cloudera.org/r/484/#comment2383>

    Possible NPE



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java
<http://review.cloudera.org/r/484/#comment2388>

    Possible NPE, server can be null



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java
<http://review.cloudera.org/r/484/#comment2384>

    Possible NPE



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java
<http://review.cloudera.org/r/484/#comment2385>

    Possible NPE



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
<http://review.cloudera.org/r/484/#comment2386>

    Possible NPE



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
<http://review.cloudera.org/r/484/#comment2387>

    Possible NPE, metaServer can be null



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
<http://review.cloudera.org/r/484/#comment2389>

    Possible NPE



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
<http://review.cloudera.org/r/484/#comment2397>

    javadoc for this method and the next one



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
<http://review.cloudera.org/r/484/#comment2401>

    This method and a few others in this class look almost the same; I think there's refactoring potential.



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
<http://review.cloudera.org/r/484/#comment2390>

    Possible NPE



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
<http://review.cloudera.org/r/484/#comment2398>

    Somehow this line looks weird to me, not sure if it's how you build the row or that you don't use a final for ",,"



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
<http://review.cloudera.org/r/484/#comment2391>

    NPE



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java
<http://review.cloudera.org/r/484/#comment2392>

    NPE



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java
<http://review.cloudera.org/r/484/#comment2402>

    could enclose in an isDebugEnabled check
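
    For reference, a guard of this kind usually looks something like the sketch below.  It uses java.util.logging so that it runs standalone; with the commons-logging Log used in HBase the equivalent check is LOG.isDebugEnabled().  DebugGuardExample and describeEvent are hypothetical names, not code from EventHandler.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class DebugGuardExample {
  private static final Logger LOG = Logger.getLogger("DebugGuardExample");
  static int expensiveCalls = 0;

  // Hypothetical stand-in for building a costly debug string.
  static String describeEvent(String event) {
    expensiveCalls++;
    return "Processing event " + event;
  }

  static void process(String event) {
    // Only build the message when debug-level logging is actually enabled.
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine(describeEvent(event));
    }
  }
}
```

    The point of the guard is that the message string is never built when debug logging is off.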



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
<http://review.cloudera.org/r/484/#comment2404>

    Javadoc looks anemic compared to the importance of that class



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
<http://review.cloudera.org/r/484/#comment2405>

    Could you handle if regionState is null at this level?



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
<http://review.cloudera.org/r/484/#comment2406>

    Is this code coming back later?



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
<http://review.cloudera.org/r/484/#comment2407>

    Looks fishy, what happens if anyone uses that state outside of that method?



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
<http://review.cloudera.org/r/484/#comment2409>

    Any investigation needed?



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
<http://review.cloudera.org/r/484/#comment2410>

    What happens if it never returns? Also I wonder if it's feasible that the assignment somehow ends up waiting on itself because of region server failure
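
    One common way to bound a wait like this, sketched with a CountDownLatch.  AssignmentWaiter, markAssigned and waitForAssignment are hypothetical names and not methods in AssignmentManager; the recovery action on timeout is left to the caller.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: wait for an assignment to complete, but never forever.
class AssignmentWaiter {
  private final CountDownLatch assigned = new CountDownLatch(1);

  // Called by whoever observes that the region has been opened.
  void markAssigned() { assigned.countDown(); }

  // Returns true if assigned within the timeout, false if the caller
  // should give up waiting and take some recovery action instead.
  boolean waitForAssignment(long timeoutMillis) {
    try {
      return assigned.await(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```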



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java
<http://review.cloudera.org/r/484/#comment2412>

    I see red boxes :)



branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
<http://review.cloudera.org/r/484/#comment2413>

    red


- Jean-Daniel


On 2010-08-03 05:02:49, Jonathan Gray wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://review.cloudera.org/r/484/
> -----------------------------------------------------------
> 
> (Updated 2010-08-03 05:02:49)
> 
> 
> Review request for hbase, stack and Karthik Ranganathan.
> 
> 
> Summary
> -------
> 
> This is basically the master rewrite.  It's massive, I apologize.  I can start a cluster, create a table, insert a row, and get the row.  Restarting should work too.
> 
> I completely ripped out BaseScanner/MetaScanner as well as the RegionServerOperationQueue stuff.  I'm sold on the former but there's a bit of work to fully replace the latter though I think it's a desirable direction to get rid of it.  The RSOQ stuff has been replaced with EventHandlers.  The BaseScanner is no longer necessary because we are not utilizing META to determine when things are unassigned.
> 
> All the table operations (enable/disable, create/delete, addcolumn/del/modify) have been reimplemented in handlers.
> 
> All META/ROOT server access is done through the new CatalogManager.  This uses ZK and ROOT/META to keep active track of the current locations of the catalog regions.  A bit more work needs to be done on defining our retry semantics for different operations.
> 
> The actual read/write operations on ROOT/META are done with new classes RootEditor, MetaEditor, and MetaReader.  A lot of this code was scattered around HMaster, RegionManager, ServerManager, HRegion, etc...
> 
> Some new methods in FileSystemManager, collected from being scattered around.
> 
> This also does final straw to old ZKWrapper and removes it.  All ZK stuff is going through the new stuff.
> 
> Open/close is now going via RPC and ZK, no longer piggybacking any of these messages on heartbeats (and with splits not working right now, even that is gone, so there will be nothing it seems)
> 
> RegionManager was completely removed and replaced with a shiny, brand new AssignmentManager.  It's already managed to get a bit hectic but is pretty straightforward.  It is the heart of zk-based region assignment and most logic is in here.  It contains regionsInTransition, the region plans, etc...
> 
> I added a new /disabled node in ZK to keep track of tables that are disabled.  I think we need this to be able to reliably do disables across failures.
> 
> Among other changes....
> 
> 
> Here's my rough list of things left to do (covers most but not all TODOs in the branch)
> 
> 
> * make final decisions on root/meta timeouts.  almost everyone is coordinating
>   access through CatalogTracker which should make it easy to standardize.
>   if there are operations that should just retry indefinitely, they need to
>   resubmit themselves to their executor service.
> 
> * move splits to RS side, integrate new patch from stack on trunk
>   might need a new CREATED unassigned now, or new rpc, but get rid of sending
>   split notification on heartbeat?
>   how to handle splits concurrent with disable?
> 
> * review master startup order
> 
> * figure out what to do with client table admin ops (flush, split, compact)
> 
> * move MasterAddressManager to extend ZooKeeperNodeTracker
> 
> * on region open (and wherever split children notify master) we should check
>   if the table is disabled and, if so, close the regions... maybe.
> 
> * regionserver exit needs to be reimplemented in servermanager
>   also rs expiration
> 
> * add priorities and pool size to handlers (bring in from flush patch)
> 
> * in RootEditor there is a race condition between delete and watch?
> 
> * migrate TestZKBased* tests to use new handlers
> 
> * make sync calls for enable/disable (check and verify methods?)
> 
> * integrate load balancing
> 
> * finish TODOs on new failover path and remove old code in joinExistingCluster()
> 
> * finish TODOs on timeout monitor in assignmentmanager
> 
> * review filesystemmanager calls
> 
> * figure out how to handle the very rare but possible race condition where two
>   RSs will update META and the later one can squash the valid one if there was
>   a long gc pause
> 
> * synchronize all access to the boolean in ActiveMasterManager
>   (now this is probably just a matter of moving it to extend ZKNodeTracker)
> 
> * there are some races with master wanting to connect for rpc
>   to regionserver and the rs starting its rpc server, need to address
> 
> * migrate TestMasterTransitions or make new?
> 
> * fix or remove last couple master tests that used RSOQ
> 
> * write new tests!!!
> 
> 
> This addresses bug HBASE-2697.
>     http://issues.apache.org/jira/browse/HBASE-2697
> 
> 
> Diffs
> -----
> 
>   branches/0.90_master_rewrite/BRANCH_TODO.txt 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/Abortable.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ClusterStatus.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/HServerInfo.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/InvalidFamilyException.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ServerController.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaEditor.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/MetaReader.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/catalog/RootEditor.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/EventHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseEventHandler.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseExecutorService.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/RegionTransitionData.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AddColumn.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/BaseScanner.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ChangeTableState.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ColumnOperation.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/DeleteColumn.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/FileSystemManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/LoadBalancer.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterController.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MasterStatus.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaRegion.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/MetaScanner.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyColumn.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ModifyTableMeta.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionClose.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionOpen.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionStatusChange.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessServerShutdown.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperation.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationListener.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionServerOperationQueue.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RetryableMetaOperation.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RootScanner.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableDelete.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/TableOperation.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DeleteTableHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/DisableTableHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/EnableTableHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterCloseRegionHandler.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterOpenRegionHandler.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ModifyTableHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/OpenedRegionHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableAddFamilyHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableDeleteFamilyHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableEventHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/TableModifyFamilyHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/MasterAddressManager.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RSZookeeperUpdater.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerController.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/MetaNodeTracker.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKTableDisable.java PRE-CREATION 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperListener.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java 979909 
>   branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWrapper.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestMaster.java 979909 
>   branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestRestartCluster.java 979909 
> 
> Diff: http://review.cloudera.org/r/484/diff
> 
> 
> Testing
> -------
> 
> simple cluster tests passing, need to run (and write) more tests
> 
> 
> Thanks,
> 
> Jonathan
> 
>
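
On the quoted TODO item about operations that "should just retry indefinitely" resubmitting themselves to their executor service: a minimal sketch of that pattern, using only java.util.concurrent. None of these names come from the patch. ResubmitSketch, CatalogOperation and RetryingTask are hypothetical stand-ins, and the real handlers would presumably go through CatalogTracker and the branch's own executor classes rather than a plain ExecutorService.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ResubmitSketch {

    /** Hypothetical stand-in for an operation that needs ROOT/META to be reachable. */
    interface CatalogOperation {
        /** Returns true on success, false if the catalog was not available yet. */
        boolean attempt();
    }

    /** A task that puts itself back on its executor until the operation succeeds. */
    static final class RetryingTask implements Runnable {
        private final ExecutorService executor;
        private final CatalogOperation op;
        private final CountDownLatch done;
        final AtomicInteger attempts = new AtomicInteger();

        RetryingTask(ExecutorService executor, CatalogOperation op, CountDownLatch done) {
            this.executor = executor;
            this.op = op;
            this.done = done;
        }

        @Override
        public void run() {
            attempts.incrementAndGet();
            if (op.attempt()) {
                done.countDown();
            } else {
                // Do not block the worker thread waiting; queue another attempt instead.
                executor.submit(this);
            }
        }
    }

    /** Runs an operation that fails the given number of times; returns attempts made. */
    static int runUntilSuccess(int failuresBeforeSuccess) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AtomicInteger remaining = new AtomicInteger(failuresBeforeSuccess);
        CountDownLatch done = new CountDownLatch(1);
        RetryingTask task =
            new RetryingTask(executor, () -> remaining.getAndDecrement() <= 0, done);
        try {
            executor.submit(task);
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("operation did not complete");
            }
            return task.attempts.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            // Only reached after success or failure, so no resubmit races with shutdown.
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runUntilSuccess(3));
    }
}
```

The sketch retries immediately; a real handler would likely want a delay or a wait on the catalog location between attempts so a missing ROOT/META does not spin the executor.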