Posted to issues@hbase.apache.org by "Jimmy Xiang (JIRA)" <ji...@apache.org> on 2014/03/13 18:57:43 UTC

[jira] [Comment Edited] (HBASE-10569) Co-locate meta and master

    [ https://issues.apache.org/jira/browse/HBASE-10569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13933578#comment-13933578 ] 

Jimmy Xiang edited comment on HBASE-10569 at 3/13/14 5:56 PM:
--------------------------------------------------------------

Attached a patch that passed unit tests, integration tests (including ITBLL), and some live cluster tests. I will put it on RB (Review Board) as soon as RB is back up.

Here is what I have done in this patch:
# Moved RPC-related code out of HRegionServer and HMaster so that both classes are smaller and easier to change and maintain.
# Made HMaster extend HRegionServer so that HMaster is also an HRegionServer, and removed the duplicate code/parameters.
# Due to 2, HMaster#getMetrics is renamed to getMasterMetrics to avoid a naming conflict with HRegionServer#getMetrics. The same has been done for HMaster#getCoprocessors and HMaster#getCoprocessorHost.
# Added HRegionServer#getRpcServices and HMaster#getMasterRpcServices to expose the RPC functionality.
# Changed the references affected by 3 and 4 (there are a lot, especially in tests).
# HMaster and HRegionServer share one RPC server and one InfoServer.
# RpcServiceInterface is changed a little. The methods #startThreads and #openServer are removed since a backup master doesn’t hold the RPC server any more. A parameter HMaster#serviceStarted is introduced to indicate whether a master is active, so that ServerNotRunningYetException can be thrown before a master is active.
# Master recovery in case of ZK connection loss is removed since it doesn’t recover the listeners added in HRegionServer. We can get this feature back if needed. The other reason I didn’t try to get it back is that we are going to use Raft to choose the active master instead of relying on ZK.
# The HRegionServer on the active HMaster communicates with the active HMaster directly instead of going through RPC. The shortcut helps.
# The master (active/backup) web UI contains info about the corresponding region server.
# After becoming active, a backup master moves user regions away (and moves the meta/namespace regions to the master if they are already assigned somewhere else).
# Integration testing doesn’t restart the master as a region server, or restart the region server that holds the meta. One reason is that the startup script can’t tell whether a region server should be a master.
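
Items 2, 3, 4 and 7 above can be sketched roughly as follows. This is only an illustration, not the code in the patch: every class whose name contains Sketch, plus becomeActive, checkServiceStarted and handleMasterRequest, is made up for the example, and the real HBase classes carry far more state. The sketch assumes the naming conflict in 3 comes from the two getMetrics methods returning unrelated types, which Java does not allow for an override.

```java
// Simplified stand-ins for illustration; these are NOT the real HBase classes.
class RegionServerMetricsSketch { final String name = "regionserver-metrics"; }
class MasterMetricsSketch { final String name = "master-metrics"; }

// Stands in for ServerNotRunningYetException.
class ServerNotRunningYetSketchException extends Exception {
    ServerNotRunningYetSketchException(String msg) { super(msg); }
}

class RegionServerSketch {
    // Metrics of the region-server role.
    RegionServerMetricsSketch getMetrics() { return new RegionServerMetricsSketch(); }

    // Accessor for the RPC-related code that was moved out of the server class.
    String getRpcServices() { return "regionserver-rpc"; }
}

// The master is also a region server (item 2).
class MasterSketch extends RegionServerSketch {
    // Becomes true once this master is the active one (item 7).
    private volatile boolean serviceStarted = false;

    // Could not be called getMetrics(): the return type is unrelated to
    // RegionServerMetricsSketch, so it would not compile as an override (item 3).
    MasterMetricsSketch getMasterMetrics() { return new MasterMetricsSketch(); }

    // Master-side counterpart of getRpcServices() (item 4).
    String getMasterRpcServices() { return "master-rpc"; }

    void becomeActive() { serviceStarted = true; }

    // Guard that a master-side RPC handler calls before doing any work.
    void checkServiceStarted() throws ServerNotRunningYetSketchException {
        if (!serviceStarted) {
            throw new ServerNotRunningYetSketchException("Server is not running yet");
        }
    }

    String handleMasterRequest(String request) throws ServerNotRunningYetSketchException {
        checkServiceStarted();
        return "handled:" + request;
    }
}

class CoLocationDemo {
    public static void main(String[] args) throws Exception {
        MasterSketch master = new MasterSketch();
        // Both the inherited and the master-specific accessors are available.
        System.out.println(master.getMetrics().name);
        System.out.println(master.getMasterMetrics().name);
        try {
            master.handleMasterRequest("assign");
        } catch (ServerNotRunningYetSketchException e) {
            System.out.println("rejected while backup: " + e.getMessage());
        }
        master.becomeActive();
        System.out.println(master.handleMasterRequest("assign"));
    }
}
```

In this sketch a backup master rejects master requests until becomeActive is called, while its region-server side (getMetrics, getRpcServices) is usable the whole time.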

Here is a list of things to be done (in separate issues):
# Need to make sure the master also listens on the old ports (RPC + web UI), to support rolling upgrade from old versions (0.96+) and stay backward compatible.
# Need to consolidate(?) chores/threads/handlers in master/regionserver, so that the active master manager in the backup master has a high priority and can grab the ZK node faster, before we move to Raft.
# Clean up MetaServerShutdownHandler and HMaster#assignMeta in the next major release, when rolling upgrade is no longer an issue. This should be done much later.
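
For the first to-do item, listening on both the new and the old ports could look something like the sketch below. MultiPortListener is a hypothetical helper built on plain java.net.ServerSocket, not an HBase class, and it only shows the binding part; how the real RPC server and InfoServer would accept connections on a second port is not covered here.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: opens one listening socket per port
// so that a process can be reached on its new port and on a legacy port.
class MultiPortListener implements AutoCloseable {
    private final List<ServerSocket> sockets = new ArrayList<>();

    MultiPortListener(int... ports) throws IOException {
        try {
            for (int port : ports) {
                // Port 0 asks the OS for any free port, which is handy in tests.
                sockets.add(new ServerSocket(port));
            }
        } catch (IOException e) {
            close(); // release the sockets that were already bound
            throw e;
        }
    }

    // The ports actually bound, in the order they were requested.
    List<Integer> boundPorts() {
        List<Integer> result = new ArrayList<>();
        for (ServerSocket socket : sockets) {
            result.add(socket.getLocalPort());
        }
        return result;
    }

    @Override
    public void close() throws IOException {
        for (ServerSocket socket : sockets) {
            socket.close();
        }
        sockets.clear();
    }
}
```

A caller would pass the configured new port together with the old one, for example new MultiPortListener(newRpcPort, oldRpcPort), where both variable names are placeholders for whatever the configuration provides.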




> Co-locate meta and master
> -------------------------
>
>                 Key: HBASE-10569
>                 URL: https://issues.apache.org/jira/browse/HBASE-10569
>             Project: HBase
>          Issue Type: Improvement
>          Components: master, Region Assignment
>            Reporter: Jimmy Xiang
>            Assignee: Jimmy Xiang
>         Attachments: hbase-10569_v1.patch
>
>
> I was thinking of simplifying/improving region assignment. The first step is to co-locate the meta and the master, as many people agreed on HBASE-5487.


