Posted to hdfs-user@hadoop.apache.org by "Lanati, Matteo" <Ma...@lrz.de> on 2013/08/06 10:58:23 UTC

Hadoop network setup

Hi all,

The question of how to set up Hadoop on nodes with multiple network cards has been asked many times, but I couldn't find the information I need. Sorry if this is a duplicate; if so, just point me to the right documents.

My nodes have two interfaces: eth0 with a public IP and eth1 with a private one. I would like users to reach the cluster through the public interface, while the private one should be reserved for intra-cluster communication. At the moment I'm focusing on HDFS.
Since eth0, the first interface, is on the public network, the HDFS daemons listen on it, so I was able to use the filesystem from any machine. The same interface is also used for communication between the Namenode and the Datanodes, and among Datanodes, on the ports defined by dfs.datanode.address and fs.default.name. Is it possible to instruct HDFS to use eth1 for this traffic instead?
I experimented with the following options:

- dfs.datanode.dns.interface: it only affects the DFS node status page published by the Namenode
- dfs.client.local.interfaces: I set this on the Datanode, but didn't see any change

Finally, could you please explain the role of the hdfs-site.xml option 'dfs.datanode.ipc.address'? What is this IPC server used for? By default it uses port 50020. I captured the network traffic while performing some operations, but that port was never used.
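
As a side note, a quick way to see which addresses the Datanode ports actually answer on, without a full packet dump, is a small TCP probe. The following is only a sketch: the helper names port_answers and probe are made up for illustration, and it only shows whether a port accepts connections on a given address, not which interface HDFS chooses for its own outgoing traffic.

```python
import socket


def port_answers(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def probe(addresses, ports, timeout=2.0):
    """Map each (address, port) pair to whether it accepts TCP connections."""
    return {
        (addr, port): port_answers(addr, port, timeout)
        for addr in addresses
        for port in ports
    }
```

It could be called as probe([eth0_ip, eth1_ip], [50020]), where eth0_ip and eth1_ip are placeholders for the public and private addresses of a Datanode. A True for both addresses would suggest the daemon is bound to the wildcard address rather than to one interface.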

Thanks,

Matteo

Matteo Lanati
Distributed Resources Group
Leibniz-Rechenzentrum (LRZ)
Boltzmannstrasse 1
85748 Garching b. München (Germany)
Phone: +49 89 35831 8724