Posted to user@hadoop.apache.org by Daniel Watrous <dw...@gmail.com> on 2015/09/24 19:51:04 UTC
Datanodes not connecting to the cluster
I have a multi-node cluster with two datanodes. After running start-dfs.sh,
I see the following processes running:
hadoop@hadoop-master:~$ $HADOOP_HOME/sbin/slaves.sh jps
hadoop-master: 10933 DataNode
hadoop-master: 10759 NameNode
hadoop-master: 11145 SecondaryNameNode
hadoop-master: 11567 Jps
hadoop-data1: 5186 Jps
hadoop-data1: 5059 DataNode
hadoop-data2: 5180 Jps
hadoop-data2: 5053 DataNode
However, the other two DataNodes aren't visible.
http://screencast.com/t/icsLnXXDk
Where can I look for clues?
Re: Datanodes not connecting to the cluster
Posted by Daniel Watrous <dw...@gmail.com>.
Phew, I finally added the property below to yarn-site.xml:
<property>
<name>yarn.resourcemanager.bind-host</name>
<value>0.0.0.0</value>
</property>
I now see the datanodes, but not all at the same time. Under datanodes in
operation I see the master and only one of the other two datanodes at any
given time. Is that typical behavior? Perhaps it's switching between them
for redundancy?
Daniel
On Thu, Sep 24, 2015 at 2:49 PM, Daniel Watrous <dw...@gmail.com>
wrote:
> I'm making a little progress here.
>
> I added the following properties to hdfs-site.xml
>
> <property>
> <name>dfs.namenode.rpc-bind-host</name>
> <value>0.0.0.0</value>
> </property>
> <property>
> <name>dfs.namenode.servicerpc-bind-host</name>
> <value>0.0.0.0</value>
> </property>
>
> I can now connect to hadoop-master:
>
> hadoop@hadoop-data1:~$ telnet hadoop-master 54310
> Trying 192.168.51.4...
> Connected to hadoop-master.
> Escape character is '^]'.
>
> BUT I'm now getting the error below. I'm confused that it's trying to
> connect to 192.168.51.1 because that's not even a valid IP in my
> installation.
>
> 2015-09-24 19:40:08,821 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for
> Block pool BP-852923283-127.0.1.1-1443119668806 (Datanode Uuid null)
> service to hadoop-master/192.168.51.4:54310 Datanode denied communication
> with namenode because hostname cannot be resolved (ip=192.168.51.1,
> hostname=192.168.51.1): DatanodeRegistration(0.0.0.0:50010,
> datanodeUuid=cd8107ac-f1dd-4c33-a054-2dee95cc4a16, infoPort=50075,
> infoSecurePort=0, ipcPort=50020,
> storageInfo=lv=-56;cid=CID-8e9d0e10-5744-448b-a45a-87644364e714;nsid=1441606918;c=0)
>
> Any idea what's happening here?
>
>
> On Thu, Sep 24, 2015 at 2:26 PM, Daniel Watrous <dw...@gmail.com>
> wrote:
>
>> In a further test, I tried connecting to the NameNode from hadoop-master
>> (where it's running) using both the hostname and the IP address.
>>
>> vagrant@hadoop-master:~/src$ telnet 192.168.51.4 54310
>> Trying 192.168.51.4...
>> telnet: Unable to connect to remote host: Connection refused
>> vagrant@hadoop-master:~/src$ telnet localhost 54310
>> Trying 127.0.0.1...
>> telnet: Unable to connect to remote host: Connection refused
>> vagrant@hadoop-master:~/src$ telnet hadoop-master 54310
>> Trying 127.0.1.1...
>> Connected to hadoop-master.
>> Escape character is '^]'.
>>
>>
>> As you can see the IP address or localhost connection is refused, but the
>> hostname connection succeeds. Is there some way to configure the namenode
>> to accept connections from all hosts?
>>
>> On Thu, Sep 24, 2015 at 2:14 PM, Daniel Watrous <dw...@gmail.com>
>> wrote:
>>
>>> On one of the datanodes I have found the following warning:
>>>
>>> 2015-09-24 18:40:17,639 WARN
>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to
>>> server: hadoop-master/192.168.51.4:54310
>>>
>>> On my master node I see that the process is running and has bound that
>>> port
>>>
>>> vagrant@hadoop-master:~/src$ sudo lsof -i :54310
>>> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
>>> java 7480 hadoop 202u IPv4 26931 0t0 TCP hadoop-master:54310 (LISTEN)
>>> java 7480 hadoop 212u IPv4 28758 0t0 TCP hadoop-master:54310->localhost:47226 (ESTABLISHED)
>>> java 7651 hadoop 238u IPv4 28247 0t0 TCP localhost:47226->hadoop-master:54310 (ESTABLISHED)
>>> hadoop@hadoop-master:~$ jps
>>> 7856 SecondaryNameNode
>>> 7651 DataNode
>>> 7480 NameNode
>>> 8106 Jps
>>>
>>> I don't appear to have any firewall rules interfering with traffic
>>> vagrant@hadoop-master:~/src$ sudo iptables --list
>>> Chain INPUT (policy ACCEPT)
>>> target prot opt source destination
>>>
>>> Chain FORWARD (policy ACCEPT)
>>> target prot opt source destination
>>>
>>> Chain OUTPUT (policy ACCEPT)
>>> target prot opt source destination
>>>
>>> The iptables --list output is identical on hadoop-data1. I also show a
>>> process attempting to connect to hadoop-master
>>> vagrant@hadoop-data1:~$ sudo lsof -i :54310
>>> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
>>> java 3823 hadoop 238u IPv4 19304 0t0 TCP hadoop-data1:45600->hadoop-master:54310 (SYN_SENT)
>>>
>>> I am confused by the notation of hostname/IP:port.
>>> All help appreciated.
>>>
>>> Daniel
>>>
>>>
>>>
>>> On Thu, Sep 24, 2015 at 12:51 PM, Daniel Watrous <dw...@gmail.com>
>>> wrote:
>>>
>>>> I have a multi-node cluster with two datanodes. After running
>>>> start-dfs.sh, I show the following processes running
>>>>
>>>> hadoop@hadoop-master:~$ $HADOOP_HOME/sbin/slaves.sh jps
>>>> hadoop-master: 10933 DataNode
>>>> hadoop-master: 10759 NameNode
>>>> hadoop-master: 11145 SecondaryNameNode
>>>> hadoop-master: 11567 Jps
>>>> hadoop-data1: 5186 Jps
>>>> hadoop-data1: 5059 DataNode
>>>> hadoop-data2: 5180 Jps
>>>> hadoop-data2: 5053 DataNode
>>>>
>>>>
>>>> However, the other two DataNodes aren't visible.
>>>> http://screencast.com/t/icsLnXXDk
>>>>
>>>> Where can I look for clues?
>>>>
>>>
>>>
>>
>
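A note on the telnet test quoted above: on the master itself, hadoop-master resolved to 127.0.1.1 while the other nodes saw 192.168.51.4. On Debian/Ubuntu-style systems that usually comes from a line in /etc/hosts. If the NameNode binds to whatever its own hostname resolves to (my reading of the lsof and telnet output, not something confirmed in this thread), it ends up listening on loopback only, which would match the refused connections. Here is a rough sketch for spotting such entries. The function name and the sample hosts text are made up for illustration.

```python
import ipaddress


def loopback_hostnames(hosts_text):
    """Return {hostname: address} for every name mapped to a loopback address."""
    found = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        try:
            is_loopback = ipaddress.ip_address(address).is_loopback
        except ValueError:
            continue  # first field is not an IP address; skip the line
        if is_loopback:
            for name in names:
                found.setdefault(name, address)
    return found


# Hypothetical /etc/hosts contents, modelled on the telnet output in this thread.
SAMPLE_HOSTS = """\
127.0.0.1    localhost
127.0.1.1    hadoop-master
192.168.51.4 hadoop-master
"""

if __name__ == "__main__":
    for name, address in loopback_hostnames(SAMPLE_HOSTS).items():
        if name != "localhost":
            print(f"{name} is mapped to loopback address {address}")
```

To check a real node, pass in the contents of its own /etc/hosts instead of the sample text. Any cluster hostname that shows up here is a candidate for the kind of binding problem described above.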
Re: Datanodes not connecting to the cluster
Posted by Daniel Watrous <dw...@gmail.com>.
I'm making a little progress here.
I added the following properties to hdfs-site.xml
<property>
<name>dfs.namenode.rpc-bind-host</name>
<value>0.0.0.0</value>
</property>
<property>
<name>dfs.namenode.servicerpc-bind-host</name>
<value>0.0.0.0</value>
</property>
I can now connect to hadoop-master:
hadoop@hadoop-data1:~$ telnet hadoop-master 54310
Trying 192.168.51.4...
Connected to hadoop-master.
Escape character is '^]'.
But I'm now getting the error below. I'm confused that it mentions
192.168.51.1, because that's not an IP address used anywhere in my
installation.
2015-09-24 19:40:08,821 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for
Block pool BP-852923283-127.0.1.1-1443119668806 (Datanode Uuid null)
service to hadoop-master/192.168.51.4:54310 Datanode denied communication
with namenode because hostname cannot be resolved (ip=192.168.51.1,
hostname=192.168.51.1): DatanodeRegistration(0.0.0.0:50010,
datanodeUuid=cd8107ac-f1dd-4c33-a054-2dee95cc4a16, infoPort=50075,
infoSecurePort=0, ipcPort=50020,
storageInfo=lv=-56;cid=CID-8e9d0e10-5744-448b-a45a-87644364e714;nsid=1441606918;c=0)
Any idea what's happening here?
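
For what it's worth, my understanding (an assumption, not confirmed in this thread) is that the NameNode reverse-resolves the address a DataNode connects from and refuses the registration when all it gets back is the bare IP. That would match ip=192.168.51.1, hostname=192.168.51.1 in the error above. A minimal sketch of that kind of check follows. The names reverse_name and is_name_resolved are hypothetical and are not Hadoop APIs.

```python
import socket


def reverse_name(ip, lookup=socket.gethostbyaddr):
    """Return the hostname for ip, or ip itself when the reverse lookup fails."""
    try:
        hostname, _aliases, _addresses = lookup(ip)
        return hostname
    except OSError:  # socket.herror and socket.gaierror are both OSError subclasses
        return ip


def is_name_resolved(ip, lookup=socket.gethostbyaddr):
    """True when the address maps back to a name other than the bare IP."""
    return reverse_name(ip, lookup) != ip


if __name__ == "__main__":
    # Addresses taken from the log output in this thread.
    for ip in ("192.168.51.1", "192.168.51.4"):
        state = "resolved" if is_name_resolved(ip) else "NOT resolved"
        print(ip, state)
```

Running this on the NameNode host shows which addresses it can map back to a name. If 192.168.51.1 does not resolve there, the usual remedies are adding it to /etc/hosts on the NameNode or, if your Hadoop version has the property, setting dfs.namenode.datanode.registration.ip-hostname-check to false in hdfs-site.xml.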
On Thu, Sep 24, 2015 at 2:26 PM, Daniel Watrous <dw...@gmail.com>
wrote:
> In a further test, I tried connecting to the NameNode from hadoop-master
> (where it's running) using both the hostname and the IP address.
>
> vagrant@hadoop-master:~/src$ telnet 192.168.51.4 54310
> Trying 192.168.51.4...
> telnet: Unable to connect to remote host: Connection refused
> vagrant@hadoop-master:~/src$ telnet localhost 54310
> Trying 127.0.0.1...
> telnet: Unable to connect to remote host: Connection refused
> vagrant@hadoop-master:~/src$ telnet hadoop-master 54310
> Trying 127.0.1.1...
> Connected to hadoop-master.
> Escape character is '^]'.
>
>
> As you can see the IP address or localhost connection is refused, but the
> hostname connection succeeds. Is there some way to configure the namenode
> to accept connections from all hosts?
>
> On Thu, Sep 24, 2015 at 2:14 PM, Daniel Watrous <dw...@gmail.com>
> wrote:
>
>> On one of the datanodes I have found the following warning:
>>
>> 2015-09-24 18:40:17,639 WARN
>> org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to
>> server: hadoop-master/192.168.51.4:54310
>>
>> On my master node I see that the process is running and has bound that
>> port
>>
>> vagrant@hadoop-master:~/src$ sudo lsof -i :54310
>> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
>> java 7480 hadoop 202u IPv4 26931 0t0 TCP hadoop-master:54310 (LISTEN)
>> java 7480 hadoop 212u IPv4 28758 0t0 TCP hadoop-master:54310->localhost:47226 (ESTABLISHED)
>> java 7651 hadoop 238u IPv4 28247 0t0 TCP localhost:47226->hadoop-master:54310 (ESTABLISHED)
>> hadoop@hadoop-master:~$ jps
>> 7856 SecondaryNameNode
>> 7651 DataNode
>> 7480 NameNode
>> 8106 Jps
>>
>> I don't appear to have any firewall rules interfering with traffic
>> vagrant@hadoop-master:~/src$ sudo iptables --list
>> Chain INPUT (policy ACCEPT)
>> target prot opt source destination
>>
>> Chain FORWARD (policy ACCEPT)
>> target prot opt source destination
>>
>> Chain OUTPUT (policy ACCEPT)
>> target prot opt source destination
>>
>> The iptables --list output is identical on hadoop-data1. I also show a
>> process attempting to connect to hadoop-master
>> vagrant@hadoop-data1:~$ sudo lsof -i :54310
>> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
>> java 3823 hadoop 238u IPv4 19304 0t0 TCP hadoop-data1:45600->hadoop-master:54310 (SYN_SENT)
>>
>> I am confused by the notation of hostname/IP:port.
>> All help appreciated.
>>
>> Daniel
>>
>>
>>
>> On Thu, Sep 24, 2015 at 12:51 PM, Daniel Watrous <dw...@gmail.com>
>> wrote:
>>
>>> I have a multi-node cluster with two datanodes. After running
>>> start-dfs.sh, I show the following processes running
>>>
>>> hadoop@hadoop-master:~$ $HADOOP_HOME/sbin/slaves.sh jps
>>> hadoop-master: 10933 DataNode
>>> hadoop-master: 10759 NameNode
>>> hadoop-master: 11145 SecondaryNameNode
>>> hadoop-master: 11567 Jps
>>> hadoop-data1: 5186 Jps
>>> hadoop-data1: 5059 DataNode
>>> hadoop-data2: 5180 Jps
>>> hadoop-data2: 5053 DataNode
>>>
>>>
>>> However, the other two DataNodes aren't visible.
>>> http://screencast.com/t/icsLnXXDk
>>>
>>> Where can I look for clues?
>>>
>>
>>
>
>>>
>>
>>
>
Re: Datanodes not connecting to the cluster
Posted by Daniel Watrous <dw...@gmail.com>.
In a further test, I tried connecting to the NameNode from hadoop-master
(where it's running) using both the hostname and the IP address.
vagrant@hadoop-master:~/src$ telnet 192.168.51.4 54310
Trying 192.168.51.4...
telnet: Unable to connect to remote host: Connection refused
vagrant@hadoop-master:~/src$ telnet localhost 54310
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection refused
vagrant@hadoop-master:~/src$ telnet hadoop-master 54310
Trying 127.0.1.1...
Connected to hadoop-master.
Escape character is '^]'.
As you can see, the connections to the IP address and to localhost are
refused, but the connection by hostname succeeds. Is there some way to
configure the namenode to accept connections from all hosts?
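The third telnet transcript shows hadoop-master resolving to 127.0.1.1 on the master itself, which would explain why only the hostname connection succeeds: the NameNode appears to listen on whatever address its own hostname resolves to. On Debian and Ubuntu images that mapping usually comes from a `127.0.1.1 <hostname>` line in /etc/hosts, but the thread does not show that file, so treat it as an assumption. A small sketch that lists hostnames mapped to loopback addresses (`loopback_names` is a hypothetical helper and the sample text is made up for illustration):

```python
import ipaddress


def loopback_names(hosts_text):
    """Map each hostname that points at a loopback address to that address."""
    found = {}
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        try:
            is_loopback = ipaddress.ip_address(address).is_loopback
        except ValueError:
            # Not a plain IP address; skip the line.
            continue
        if is_loopback:
            for name in names:
                found[name] = address
    return found


# Made-up sample resembling a default Ubuntu /etc/hosts plus a cluster entry.
sample = """\
127.0.0.1 localhost
127.0.1.1 hadoop-master
192.168.51.4 hadoop-master  # cluster interface
"""
print(loopback_names(sample))
```

To check a real node, pass the contents of /etc/hosts instead of `sample`. Any cluster hostname that shows up in the result is a candidate for the behavior seen in the transcripts.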
On Thu, Sep 24, 2015 at 2:14 PM, Daniel Watrous <dw...@gmail.com>
wrote:
> On one of the namenodes I have found the following warning:
>
> 2015-09-24 18:40:17,639 WARN
> org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to
> server: hadoop-master/192.168.51.4:54310
>
> On my master node I see that the process is running and has bound that port
>
> vagrant@hadoop-master:~/src$ sudo lsof -i :54310
> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
> java 7480 hadoop 202u IPv4 26931 0t0 TCP hadoop-master:54310
> (LISTEN)
> java 7480 hadoop 212u IPv4 28758 0t0 TCP
> hadoop-master:54310->localhost:47226 (ESTABLISHED)
> java 7651 hadoop 238u IPv4 28247 0t0 TCP
> localhost:47226->hadoop-master:54310 (ESTABLISHED)
> hadoop@hadoop-master:~$ jps
> 7856 SecondaryNameNode
> 7651 DataNode
> 7480 NameNode
> 8106 Jps
>
> I don't appear to have any firewall rules interfering with traffic
> vagrant@hadoop-master:~/src$ sudo iptables --list
> Chain INPUT (policy ACCEPT)
> target prot opt source destination
>
> Chain FORWARD (policy ACCEPT)
> target prot opt source destination
>
> Chain OUTPUT (policy ACCEPT)
> target prot opt source destination
>
> The iptables --list output is identical on hadoop-data1. I also show a
> process attempting to connect to hadoop-master
> vagrant@hadoop-data1:~$ sudo lsof -i :54310
> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
> java 3823 hadoop 238u IPv4 19304 0t0 TCP
> hadoop-data1:45600->hadoop-master:54310 (SYN_SENT)
>
> I am confused by the notation of hostname/IP:port.
> All help appreciated.
>
> Daniel
>
>
>
> On Thu, Sep 24, 2015 at 12:51 PM, Daniel Watrous <dw...@gmail.com>
> wrote:
>
>> I have a multi-node cluster with two datanodes. After running
>> start-dfs.sh, I show the following processes running
>>
>> hadoop@hadoop-master:~$ $HADOOP_HOME/sbin/slaves.sh jps
>> hadoop-master: 10933 DataNode
>> hadoop-master: 10759 NameNode
>> hadoop-master: 11145 SecondaryNameNode
>> hadoop-master: 11567 Jps
>> hadoop-data1: 5186 Jps
>> hadoop-data1: 5059 DataNode
>> hadoop-data2: 5180 Jps
>> hadoop-data2: 5053 DataNode
>>
>>
>> However, the other two DataNodes aren't visible.
>> http://screencast.com/t/icsLnXXDk
>>
>> Where can I look for clues?
>>
>
>
Re: Datanodes not connecting to the cluster
Posted by Daniel Watrous <dw...@gmail.com>.
On one of the datanodes I have found the following warning:
2015-09-24 18:40:17,639 WARN
org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to
server: hadoop-master/192.168.51.4:54310
On my master node I see that the process is running and has bound that port
vagrant@hadoop-master:~/src$ sudo lsof -i :54310
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 7480 hadoop 202u IPv4 26931 0t0 TCP hadoop-master:54310 (LISTEN)
java 7480 hadoop 212u IPv4 28758 0t0 TCP hadoop-master:54310->localhost:47226 (ESTABLISHED)
java 7651 hadoop 238u IPv4 28247 0t0 TCP localhost:47226->hadoop-master:54310 (ESTABLISHED)
hadoop@hadoop-master:~$ jps
7856 SecondaryNameNode
7651 DataNode
7480 NameNode
8106 Jps
I don't appear to have any firewall rules interfering with traffic
vagrant@hadoop-master:~/src$ sudo iptables --list
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
The iptables --list output is identical on hadoop-data1. I also show a
process attempting to connect to hadoop-master
vagrant@hadoop-data1:~$ sudo lsof -i :54310
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 3823 hadoop 238u IPv4 19304 0t0 TCP hadoop-data1:45600->hadoop-master:54310 (SYN_SENT)
I am confused by the notation of hostname/IP:port.
All help appreciated.
Daniel
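On the hostname/IP:port notation: this is how Java prints a resolved socket address, with the hostname before the slash, the IP address it resolved to after it, and then the port. So `hadoop-master/192.168.51.4:54310` means the DataNode resolved hadoop-master to 192.168.51.4 and tried port 54310. A short sketch that splits such a string, which can help when comparing log lines across nodes (`parse_endpoint` is a hypothetical helper, not part of Hadoop, and it only handles the IPv4 form seen in this thread):

```python
def parse_endpoint(text):
    """Split 'hostname/ip:port' into (hostname, ip, port)."""
    hostname, slash, rest = text.partition("/")
    if not slash:
        raise ValueError("expected hostname/ip:port, got %r" % text)
    # The port follows the last colon; everything before it is the address.
    ip, colon, port = rest.rpartition(":")
    if not colon:
        raise ValueError("missing port in %r" % text)
    return hostname, ip, int(port)
```

Comparing the IP part of these log entries on each node against the address the NameNode actually listens on is a quick way to spot a resolution mismatch.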
On Thu, Sep 24, 2015 at 12:51 PM, Daniel Watrous <dw...@gmail.com>
wrote:
> I have a multi-node cluster with two datanodes. After running
> start-dfs.sh, I show the following processes running
>
> hadoop@hadoop-master:~$ $HADOOP_HOME/sbin/slaves.sh jps
> hadoop-master: 10933 DataNode
> hadoop-master: 10759 NameNode
> hadoop-master: 11145 SecondaryNameNode
> hadoop-master: 11567 Jps
> hadoop-data1: 5186 Jps
> hadoop-data1: 5059 DataNode
> hadoop-data2: 5180 Jps
> hadoop-data2: 5053 DataNode
>
>
> However, the other two DataNodes aren't visible.
> http://screencast.com/t/icsLnXXDk
>
> Where can I look for clues?
>