Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2009/07/21 23:43:14 UTC

[jira] Created: (HBASE-1679) Flapping DNS does us more harm than it need to

Flapping DNS does us more harm than it need to
----------------------------------------------

                 Key: HBASE-1679
                 URL: https://issues.apache.org/jira/browse/HBASE-1679
             Project: Hadoop HBase
          Issue Type: Bug
            Reporter: stack


Over in HBASE-1675, JSharp has posted logs where a temporary DNS outage does his cluster a death blow.

When cluster members report in, the master composes the regionserver name by doing a hostname lookup and combining the result with the port and startcode passed over by the regionserver.  During a DNS outage, the host lookup went from returning the name to returning the IP.  The master then thought this regionserver an unknown host and told it to restart... and so on.

If the regionserver composed its name once, it could pass this to the master and avoid a DNS lookup per regionserver report.
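
A minimal sketch of the failure mode and the proposed change, not the actual HBase code. Everything named here is hypothetical: the class ServerNameSketch, the map-backed lookup() that stands in for DNS, the comma-separated name layout, and the address, port, and startcode values. It only illustrates why a name rebuilt from a fresh lookup on every report stops matching during an outage, while a name composed once does not.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ServerNameSketch {

    // Stand-in for a reverse DNS lookup (assumption: no real network calls).
    // While DNS is up it returns the hostname; while it is down it falls
    // back to the raw IP, which is the flip described in the issue.
    static String lookup(Map<String, String> dns, boolean dnsUp, String ip) {
        return dnsUp ? dns.getOrDefault(ip, ip) : ip;
    }

    // Behaviour described in the issue: the master rebuilds the name on
    // every report from a fresh lookup plus the reported port and startcode.
    // The "host,port,startcode" layout is assumed for illustration.
    static String composeOnMaster(Map<String, String> dns, boolean dnsUp,
                                  String ip, int port, long startcode) {
        return lookup(dns, dnsUp, ip) + "," + port + "," + startcode;
    }

    public static void main(String[] args) {
        Map<String, String> dns = new HashMap<>();
        dns.put("10.0.0.5", "rs1.example.com");
        int port = 60020;
        long startcode = 1248212594000L;

        // Name the master recorded while DNS was healthy.
        Set<String> knownServers = new HashSet<>();
        knownServers.add(composeOnMaster(dns, true, "10.0.0.5", port, startcode));

        // Per-report lookup during the outage yields a different name,
        // so the master no longer recognises the server.
        String duringOutage = composeOnMaster(dns, false, "10.0.0.5", port, startcode);
        System.out.println("per-report lookup recognised: "
                + knownServers.contains(duringOutage));

        // Proposed: the regionserver composes its name once at startup and
        // sends the same string with every report; no lookup is repeated.
        String composedOnce = lookup(dns, true, "10.0.0.5") + "," + port + "," + startcode;
        System.out.println("name composed once recognised: "
                + knownServers.contains(composedOnce));
    }
}
```

With the name fixed at startup, a DNS flap after that point cannot change the key the master uses to recognise the regionserver.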

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-1679) Flapping DNS does us more harm than it need to

Posted by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806579#action_12806579 ] 

Kannan Muthukkaruppan commented on HBASE-1679:
----------------------------------------------

Yes, having some of these key fixes in a 0.20.4 release would be very helpful.






[jira] Updated: (HBASE-1679) Flapping DNS does us more harm than it need to

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1679:
-------------------------

         Priority: Critical  (was: Major)
    Fix Version/s: 0.20.4

Upped priority and marked as a fix for 0.20.4 as well as 0.21, if we do a 0.20.4.

@Kannan, you fellas might want a 0.20.4 at some stage?  In general, the thinking has been that 0.20.3 is likely the last release on that branch since focus is over on 0.21 now, but maybe you could do with a release that has fixes for the likes of this?




[jira] Commented: (HBASE-1679) Flapping DNS does us more harm than it need to

Posted by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806543#action_12806543 ] 

Kannan Muthukkaruppan commented on HBASE-1679:
----------------------------------------------

Yup, this is pretty much the issue we ran into.

And when the regionserver was asked to restarted, it registered itself again under /hbase/rs in zookeeper using a new startcode as the element (znode) name.




[jira] Resolved: (HBASE-1679) Flapping DNS does us more harm than it need to

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-1679.
--------------------------

    Resolution: Duplicate

Fixed by HBASE-2174.




[jira] Commented: (HBASE-1679) Flapping DNS does us more harm than it need to

Posted by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806544#action_12806544 ] 

Kannan Muthukkaruppan commented on HBASE-1679:
----------------------------------------------

s/asked to restarted/asked to restart

