Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2013/12/30 13:49:03 UTC

[Hadoop Wiki] Update of "YourNetworkYourProblem" by SteveLoughran

The "YourNetworkYourProblem" page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/YourNetworkYourProblem

Comment:
add an explicit page about your network being your problem

New page:
= Your Network Your Problem =

Hadoop is a distributed application that runs across a cluster of machines.

For it to work, all these machines must be able to find each other, to talk to each other, and indeed, simply identify themselves so that other machines in the cluster can find them.

Externally accessible Hadoop clusters must also be visible to the rest of the network that needs access to them.

And of course, all these machines need to be wired together using network switches and routers.

For that reason, network setup is a critical part of setting up a Hadoop cluster. If the network is not set up correctly, Hadoop will not work, and you will be left staring at stack traces in Hadoop logs trying to diagnose what is wrong. You may even file bug reports saying "Help! Hadoop doesn't work!"

It does work for everybody else, and the reason it does not work for you is that your network is misconfigured.

And, because it is your network, nobody else is going to fix it for you --except in the special case that you are using a paid packaging of Hadoop, where you should contact your vendor and ask them for help. The Hadoop developers cannot and will not help you: filing bug reports will simply result in the issue being closed as invalid along with a link to the InvalidJiraIssues page.

Here are some of the common problems in network and host configurations:

 1. DNS and reverse DNS broken/non-existent.
 2. Host tables in the machines invalid.
 3. Firewalls in the hosts blocking connections.
 4. Routers blocking traffic.
 5. Hosts with multiple network cards listening/talking on the wrong NIC.
 6. Differences between the Hadoop configuration files' definition of the cluster (especially hostnames and ports) and the actual cluster setup.
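
A couple of the items above (DNS and reverse DNS, and blocked or mismatched ports) can be probed with a few lines of script before reaching for the Hadoop logs. The following is only a minimal sketch in Python, not part of Hadoop: the function names `check_dns_round_trip` and `can_connect` are made up for this illustration, and a clean result does not prove the network is healthy. A reverse name that differs from the one you started with is a hint to investigate, not a definite fault.

```python
import socket


def check_dns_round_trip(hostname, forward=socket.gethostbyname,
                         reverse=socket.gethostbyaddr):
    """Resolve hostname to an address and back; return a list of problems.

    forward and reverse are injectable so the check can be exercised
    without a live DNS server; by default they use the system resolver.
    """
    try:
        address = forward(hostname)
    except OSError as err:  # socket.gaierror is a subclass of OSError
        return ["forward lookup of %s failed: %s" % (hostname, err)]

    try:
        name, aliases, _addresses = reverse(address)
    except OSError as err:  # socket.herror is a subclass of OSError
        return ["reverse lookup of %s (%s) failed: %s"
                % (address, hostname, err)]

    # The name we started from should be among the names the address maps to.
    known = {n.lower() for n in [name] + list(aliases)}
    if hostname.lower() not in known:
        return ["%s resolves to %s, which maps back to %s"
                % (hostname, address, name)]
    return []


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or unresolvable
        return False
```

Run on each node, `check_dns_round_trip(socket.getfqdn())` reports whether the machine's own name survives a forward and reverse lookup, and `can_connect(host, port)` can be pointed at whatever hostnames and ports your own configuration files declare, from the machines that need to reach them.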

The TroubleShooting page lists some recurrent error messages, possible root causes and ways to track down the problem.

If those steps don't resolve the problem, you could consider asking for help on the Hadoop user list -but remember, it is your network, and nobody else is going to be able to fix it.

The key point to remember is this: it is your network that