Posted to common-dev@hadoop.apache.org by "Jim Kellerman (JIRA)" <ji...@apache.org> on 2007/09/10 18:22:30 UTC

[jira] Commented: (HADOOP-1816) [hbase] Scan of .META. does socket timeout over and over again (rather than

    [ https://issues.apache.org/jira/browse/HADOOP-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12526183 ] 

Jim Kellerman commented on HADOOP-1816:
---------------------------------------

If a region server cannot contact HDFS, it should shut itself down. The master will then notice when the region server's lease times out and reassign the region.
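
As a rough illustration of the behaviour proposed above, here is a minimal Java sketch. It is not the actual HBase code: the class names (LeaseSketch, Master, RegionServer), the method names, and the explicit "now" timestamps are all hypothetical, chosen only to show the idea that a region server which cannot reach the filesystem stops heartbeating, its lease lapses, and the master queues its regions for reassignment.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; these are not HBase's real classes or APIs. */
class LeaseSketch {

    /** Master-side bookkeeping: last heartbeat per server and the regions each server holds. */
    static class Master {
        final long leaseTimeoutMs;
        final Map<String, Long> lastHeartbeat = new HashMap<>();
        final Map<String, List<String>> regionsByServer = new HashMap<>();
        final List<String> unassigned = new ArrayList<>();

        Master(long leaseTimeoutMs) {
            this.leaseTimeoutMs = leaseTimeoutMs;
        }

        /** Renew the lease for a server. */
        void heartbeat(String server, long now) {
            lastHeartbeat.put(server, now);
        }

        /** Record that a server is serving a region. */
        void assign(String server, String region) {
            regionsByServer.computeIfAbsent(server, k -> new ArrayList<>()).add(region);
        }

        /** Expire leases older than the timeout and queue their regions for reassignment. */
        void checkLeases(long now) {
            List<String> expired = new ArrayList<>();
            for (Map.Entry<String, Long> e : lastHeartbeat.entrySet()) {
                if (now - e.getValue() > leaseTimeoutMs) {
                    expired.add(e.getKey());
                }
            }
            for (String server : expired) {
                lastHeartbeat.remove(server);
                List<String> regions = regionsByServer.remove(server);
                if (regions != null) {
                    unassigned.addAll(regions);
                }
            }
        }
    }

    /** Region-server side: stop heartbeating once the filesystem is unreachable. */
    static class RegionServer {
        final String name;
        boolean running = true;

        RegionServer(String name) {
            this.name = name;
        }

        /** One heartbeat cycle; fsReachable stands in for a real filesystem health check. */
        void tick(Master master, boolean fsReachable, long now) {
            if (!running) {
                return;
            }
            if (!fsReachable) {
                running = false; // shut down; the lease is no longer renewed
                return;
            }
            master.heartbeat(name, now);
        }
    }
}
```

In this sketch the master never has to probe the broken server directly: once the server stops renewing its lease, the next call to checkLeases after the timeout moves its regions (for example .META.) onto the unassigned list.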

> [hbase] Scan of .META. does socket timeout over and over again (rather than 
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-1816
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1816
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/hbase
>            Reporter: stack
>            Assignee: Jim Kellerman
>            Priority: Trivial
>         Attachments: excerpt.txt
>
>
> A mismatch in the code on the cluster revealed an infinite loop.  The .META. scanner is doing a socket timeout trying to contact a borked region server (the borked server was having trouble contacting HDFS because of a code version mismatch -- it was sort-of-working).  We retry the timeout up to the retry limit, but then rather than try to redeploy the unreachable .META. we just drop back into scanning at the old location.... I'll attach a log that illustrates the goings-on.
> I think this is likely a trivial issue since it shouldn't really ever happen....

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.