Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2011/04/19 06:12:05 UTC
[jira] [Commented] (HBASE-3580) Remove RS from DeadServer when new instance checks in
[ https://issues.apache.org/jira/browse/HBASE-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021400#comment-13021400 ]
stack commented on HBASE-3580:
------------------------------
Are port numbers hardcoded?
{code}
+ // Already dead = 127.0.0.1,9090,112321
+ // Coming back alive = 127.0.0.1,9090,223341
{code}
If so, do they have to be?
Otherwise patch looks good to me.
> Remove RS from DeadServer when new instance checks in
> -----------------------------------------------------
>
> Key: HBASE-3580
> URL: https://issues.apache.org/jira/browse/HBASE-3580
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 0.90.0
> Reporter: Jean-Daniel Cryans
> Assignee: Jean-Daniel Cryans
> Fix For: 0.90.3
>
> Attachments: HBASE-3580-Remove-RS-from-DeadServer-when-new-instance-checks-in.patch, HBASE-3580-v3.patch, HBASE-3580_-_Remove_RS_from_dead_server_when_the_RS_when_new_instance_checks_in3.patch
>
>
> Keeping the servers in DeadServer until the list reaches some maximum isn't super friendly; it confuses even the best of our users:
> {quote}
> 09:27 < gbowyer> Hi all, I have apparently three dead RS in my cluster, I cannot find references to them in HDFS or in ZK, how do I still report dead RS
> 09:27 < gbowyer> also the same nodes are reported as live region servers
> {quote}
> The subtle startcode difference can be hard to catch. This behavior also differs from 0.20 (so old users get confused, like I did when debugging this problem) and from Hadoop's handling of dead DataNodes. It was introduced in HBASE-3282.
> I think this should be improved by doing what Hadoop does: removing the RS from DeadServer when a new instance with the same hostname+port checks in. Stack says we should do it in ServerManager.checkIsDead.
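
For illustration only, the behavior proposed in the description could be sketched roughly as below. This is not the attached patch and not HBase's actual DeadServer or ServerManager code: the class DeadServerSketch and the methods hostAndPort and cleanPreviousInstance are hypothetical names, and the sketch assumes server names take the "host,port,startcode" form shown in the review comment.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical sketch; not the HBASE-3580 patch and not HBase's DeadServer class.
final class DeadServerSketch {
    // Assumption: server names look like "host,port,startcode".
    private final Set<String> deadServers = new HashSet<>();

    void add(String serverName) {
        deadServers.add(serverName);
    }

    boolean isDead(String serverName) {
        return deadServers.contains(serverName);
    }

    // Strips the trailing ",startcode" so that two instances started on the
    // same host and port compare equal.
    static String hostAndPort(String serverName) {
        int idx = serverName.lastIndexOf(',');
        if (idx < 0) {
            throw new IllegalArgumentException("Expected host,port,startcode: " + serverName);
        }
        return serverName.substring(0, idx);
    }

    // Meant to be called when a new instance checks in. Removes every dead
    // entry that shares the newcomer's host+port but has a different
    // startcode. Returns true if at least one entry was removed.
    boolean cleanPreviousInstance(String newServerName) {
        String key = hostAndPort(newServerName);
        boolean removed = false;
        Iterator<String> it = deadServers.iterator();
        while (it.hasNext()) {
            String dead = it.next();
            // An exact match is the dead instance itself, not a new one; keep it.
            if (!dead.equals(newServerName) && hostAndPort(dead).equals(key)) {
                it.remove();
                removed = true;
            }
        }
        return removed;
    }
}
```

With the two names from the review comment, adding 127.0.0.1,9090,112321 as dead and then calling cleanPreviousInstance("127.0.0.1,9090,223341") would drop the old entry, while dead entries on other host+port pairs stay in the set.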