Posted to common-dev@hadoop.apache.org by "Jim Kellerman (JIRA)" <ji...@apache.org> on 2007/10/03 00:25:50 UTC

[jira] Commented: (HADOOP-1937) [hbase] when the master times out a region server's lease, it is too aggressive in reclaiming the server's log

    [ https://issues.apache.org/jira/browse/HADOOP-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12531933 ] 

Jim Kellerman commented on HADOOP-1937:
---------------------------------------

Revised strategy:

With HADOOP-1960, if the region server cannot talk to the master
before its lease expires, it shuts itself down. Thus the likelihood of
a region server checking in after its lease has expired is low. In the
event this does happen, however, the master will tell the region
server to restart; that is, to close all open regions and flush its
log.

However, the master should defer processing the server's log and
reassigning its regions, because the server may still be in the
process of shutting down. Consequently, all PendingServerShutdowns
will be placed in a delay queue for half a lease period to ensure the
region server has shut down.
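
A minimal sketch of the delay-queue idea, assuming the standard
java.util.concurrent.DelayQueue is used. The class shape, the
serverName field and the leasePeriodMillis parameter are hypothetical
and are not taken from the actual HBase master code:

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the master's pending shutdown work item. */
class PendingServerShutdown implements Delayed {
    final String serverName;          // assumed identifier for the region server
    private final long readyAtNanos;  // earliest time the log may be processed

    PendingServerShutdown(String serverName, long leasePeriodMillis) {
        this.serverName = serverName;
        // Defer processing for half a lease period.
        this.readyAtNanos = System.nanoTime()
            + TimeUnit.MILLISECONDS.toNanos(leasePeriodMillis / 2);
    }

    @Override
    public long getDelay(TimeUnit unit) {
        // Zero or negative once the item is eligible to leave the queue.
        return unit.convert(readyAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                            other.getDelay(TimeUnit.NANOSECONDS));
    }
}

/** Hypothetical holder for the queue the master would drain. */
class ShutdownQueue {
    final DelayQueue<PendingServerShutdown> queue = new DelayQueue<>();

    void leaseExpired(String serverName, long leasePeriodMillis) {
        queue.add(new PendingServerShutdown(serverName, leasePeriodMillis));
    }

    /** Returns the next shutdown whose delay has elapsed, or null if none is ready. */
    PendingServerShutdown nextReady() {
        return queue.poll();
    }
}
```

With this shape, a master loop would call nextReady() periodically and
only split the log of a server once its entry comes out of the queue.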

Finally, we will add the server start code to the log file name, so
that if the region server restarts before the master processes the old
log file, the new log file will not be included.
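
As an illustration only, the start code could be folded into the log
name roughly as follows. The "log_host_port_startcode" layout and the
helper names are assumptions for this sketch, not the format HBase
actually uses:

```java
/** Hypothetical helper; the naming scheme below is an assumed example. */
final class RegionServerLogName {
    private RegionServerLogName() {}

    /** Builds a log name that is unique per server instance (per start code). */
    static String forServer(String host, int port, long startCode) {
        return "log_" + host + "_" + port + "_" + startCode;
    }

    /** True only if the log was written by the instance with this start code. */
    static boolean belongsTo(String logName, String host, int port, long startCode) {
        return logName.equals(forServer(host, port, startCode));
    }
}
```

Because a restarted server gets a new start code, the master can
select the old instance's log by name and skip the log the new
instance has begun writing.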


> [hbase] when the master times out a region server's lease, it is too aggressive in reclaiming the server's log
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1937
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1937
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>    Affects Versions: 0.15.0
>            Reporter: Jim Kellerman
>            Assignee: Jim Kellerman
>             Fix For: 0.15.0
>
>
> When a region server's lease times out, the master immediately begins trying to split the server's log file. There have been cases where a region server was just a little late reporting to the master and the master had already started trying to reclaim the server's log, even though the server was still writing to it. 
> There needs to be some kind of "grace period" in which, if the region server reports in, the master re-instates the server. If the "grace period" expires, then the master should start processing the server's log.
