Posted to mapreduce-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2009/09/07 17:06:57 UTC

[jira] Created: (MAPREDUCE-958) TT should bail out early when mapred.job.tracker is bound to 0:0:0:0

TT should bail out early when mapred.job.tracker is bound to 0:0:0:0
--------------------------------------------------------------------

                 Key: MAPREDUCE-958
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-958
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: tasktracker
    Affects Versions: 0.21.0
            Reporter: Steve Loughran


It's OK for your job tracker's config to tell the JobTracker to come up on the wildcard address 0:0:0:0, but it's not OK for the TaskTrackers to get the same mapred.job.tracker configuration value, as it stops the TT from being able to report its heartbeat.

This misconfiguration surfaces in the TT's {{offerService()}} routine catching and logging a ConnectionRefused exception every time it tries to heartbeat. Although we have now improved the error message in such a situation, it is still a bit late in the process to encounter a problem which should be obvious the moment the TT looks at its configuration.


Better to have the TT refuse to start up if {{jobTrackAddr.getAddress().isAnyLocalAddress()}} is true.
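
A minimal sketch of what such a fail-fast check could look like, outside of Hadoop itself. The class name {{JobTrackerAddressCheck}}, the helpers {{isWildcard()}} and {{validate()}}, and port 8021 are all hypothetical names chosen for illustration; only the {{java.net}} calls are real API. It is not the actual TaskTracker code.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;

// Hypothetical standalone helper; not part of Hadoop.
final class JobTrackerAddressCheck {

    /** Returns true if the socket address resolved to a wildcard address (0.0.0.0 or ::). */
    static boolean isWildcard(InetSocketAddress jobTrackAddr) {
        InetAddress addr = jobTrackAddr.getAddress();
        // getAddress() returns null for an unresolved hostname; that is a
        // different misconfiguration and is not handled in this sketch.
        return addr != null && addr.isAnyLocalAddress();
    }

    /** Throws at startup instead of letting every heartbeat fail later. */
    static void validate(InetSocketAddress jobTrackAddr) throws IOException {
        if (isWildcard(jobTrackAddr)) {
            throw new IOException("Cannot start: mapred.job.tracker is set to the wildcard address "
                + jobTrackAddr);
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 8021 is an assumption made for this example only.
        validate(new InetSocketAddress("127.0.0.1", 8021)); // loopback is accepted
        try {
            validate(new InetSocketAddress("0.0.0.0", 8021)); // wildcard is rejected
        } catch (IOException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Both addresses are IP literals, so constructing the {{InetSocketAddress}} does not need a DNS lookup.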

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAPREDUCE-958) TT should bail out early when mapred.job.tracker is bound to 0:0:0:0

Posted by "Allen Wittenauer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752622#action_12752622 ] 

Allen Wittenauer commented on MAPREDUCE-958:
--------------------------------------------

I'm likely misunderstanding, but won't this prevent Hadoop from working on machines with only a loopback?  [I'm thinking primarily of single node clusters on laptops on an airplane.]  It would also be interesting to see how Solaris Zones would interpret this change.



[jira] Commented: (MAPREDUCE-958) TT should bail out early when mapred.job.tracker is bound to 0:0:0:0

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753009#action_12753009 ] 

Steve Loughran commented on MAPREDUCE-958:
------------------------------------------

That's correct, loopback still works happily. All we need to check for is the {{isAnyLocalAddress()}} value, and not {{isLoopbackAddress()}}, which is different. Something like this (from a TT subclass that has this test and did find my error):

{code}
        // Fail fast: a wildcard address is not a reachable JobTracker endpoint.
        if (jobTrackAddr.getAddress().isAnyLocalAddress()) {
            throw new IOException("Cannot start the Task Tracker as it has been started with "
                + "mapred.job.tracker set to icp://" + getJobTrackerAddress() + "/");
        }
{code}

Allen, if you fly with colleagues you could set up a WLAN and run work across everyone's machines, though someone needs to bring up a DNS server unless Hadoop works with Bonjour. Me, I stick to virtualized Linux images with hand-edited {{/etc/hosts}} files.



[jira] Commented: (MAPREDUCE-958) TT should bail out early when mapred.job.tracker is bound to 0:0:0:0

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752886#action_12752886 ] 

Vinod K V commented on MAPREDUCE-958:
-------------------------------------

Even I thought so, but when I verified it (http://java.sun.com/javase/6/docs/api/java/net/InetAddress.html#isAnyLocalAddress()), I realized that this method checks only whether the InetAddress is a wildcard address, not whether it is a loopback/localhost address. So I guess single-node clusters will still work fine. Steve?
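
For anyone who wants to confirm the distinction locally, here is a small sketch that classifies the IPv4 and IPv6 wildcard and loopback addresses using only {{java.net.InetAddress}}. The class name {{WildcardVersusLoopback}} and the {{classify()}} helper are made up for this illustration; the addresses are built from raw bytes so no name lookup is involved.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical demo class; only the java.net calls are real API.
final class WildcardVersusLoopback {

    /** Labels an address as "wildcard", "loopback", or "other". */
    static String classify(InetAddress addr) {
        if (addr.isAnyLocalAddress()) {
            return "wildcard";
        }
        if (addr.isLoopbackAddress()) {
            return "loopback";
        }
        return "other";
    }

    public static void main(String[] args) throws UnknownHostException {
        byte[] ipv6Loopback = new byte[16];
        ipv6Loopback[15] = 1; // ::1

        InetAddress[] samples = {
            InetAddress.getByAddress(new byte[] {0, 0, 0, 0}),   // 0.0.0.0
            InetAddress.getByAddress(new byte[] {127, 0, 0, 1}), // 127.0.0.1
            InetAddress.getByAddress(new byte[16]),              // ::
            InetAddress.getByAddress(ipv6Loopback),              // ::1
        };
        for (InetAddress addr : samples) {
            System.out.println(addr.getHostAddress() + " -> " + classify(addr));
        }
    }
}
```

Only the two all-zero addresses are classified as wildcard, so a check based on {{isAnyLocalAddress()}} leaves a 127.0.0.1 single-node setup alone.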
