Posted to user@hadoop.apache.org by Daniel Watrous <dw...@gmail.com> on 2015/09/23 20:58:27 UTC

Help troubleshooting multi-cluster setup

Hi,

I have deployed a multi-node cluster with one master and two data nodes.
Here's what jps shows:

hadoop@hadoop-master:~$ jps
24641 SecondaryNameNode
24435 DataNode
24261 NameNode
24791 ResourceManager
25483 Jps
24940 NodeManager

hadoop@hadoop-data1:~$ jps
15556 DataNode
16198 NodeManager
16399 Jps

hadoop@hadoop-data2:~$ jps
16418 Jps
15575 DataNode
16216 NodeManager

When I open the web console, I only see one node running:
http://screencast.com/t/E6yehRvUbt

Where are the other two nodes? Why don't they show up?

Next I run one of the example scripts

hadoop@hadoop-master:~$ hadoop jar
/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar
pi 10 30
Number of Maps  = 10
Samples per Map = 30
Wrote input for Map #0
Wrote input for Map #1
...
Job Finished in 2.956 seconds
Estimated value of Pi is 3.14146666666666666667
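For context, the `pi` example estimates pi by scattering sample points over the unit square and counting how many fall inside the quarter-circle. A minimal local sketch of the same idea (plain pseudo-random Monte Carlo; the actual Hadoop example uses a Halton quasi-random sequence, and none of this code comes from the example jar):

```python
import random

def estimate_pi(num_maps=10, samples_per_map=30, seed=42):
    """Monte Carlo estimate of pi: the fraction of uniform random points
    landing inside the unit quarter-circle, scaled by 4. Mirrors the
    maps x samples shape of the Hadoop example but runs in one process."""
    rng = random.Random(seed)
    total = num_maps * samples_per_map
    inside = 0
    for _ in range(total):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / total

print(estimate_pi())
```

With only 300 samples the estimate is coarse, which is why the Hadoop example's quasi-random sequence converges noticeably faster for the same sample count.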

I can't see this anywhere in the web interface. I thought it might show in
the Applications sub-menu. Should I be able to see this? It appears to run
successfully.

Daniel

Re: Help troubleshooting multi-cluster setup

Posted by Kuhu Shukla <ks...@yahoo-inc.com>.
Hi Daniel,
The RM will list only NodeManagers, not DataNodes. You can view the DataNodes on the NameNode page (e.g., 192.168.51.4:50070).
The one node you see on the RM page 'Nodes' list is from this:
hadoop@hadoop-master:~$ jps
24641 SecondaryNameNode
24435 DataNode
24261 NameNode
24791 ResourceManager
25483 Jps
24940 NodeManager   <<<<<<<<<<
In most cases you will want NodeManagers and DataNodes to run on the same physical hosts, AFAIK.
Hope this helps.
Regards,
Kuhu


     On Wednesday, September 23, 2015 3:31 PM, Daniel Watrous <dw...@gmail.com> wrote:
   

 I was able to get the jobs submitting to the cluster by adding the following property to mapred-site.xml
   <property>  
    <name>mapreduce.framework.name</name>  
    <value>yarn</value>  
  </property>

I also had to add the following properties to yarn-site.xml
   <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>

I'm still not sure why the datanodes don't show up in the nodes view. Is the idea that a data node is only used for HDFS and yarn doesn't schedule jobs there? If so, how can I add additional compute hosts? What are those called?
On Wed, Sep 23, 2015 at 3:08 PM, Daniel Watrous <dw...@gmail.com> wrote:

I'm not sure if this is related, but I'm seeing some errors in hadoop-hadoop-namenode-hadoop-master.log
2015-09-23 19:56:27,798 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Unresolved datanode registration: hostname cannot be resolved (ip=192.168.51.1, hostname=192.168.51.1)
2015-09-23 19:56:27,800 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 54310, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 192.168.51.1:54554 Call#373 Retry#0
org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=192.168.51.1, hostname=192.168.51.1): DatanodeRegistration(0.0.0.0:50010, datanodeUuid=8a5d90c8-b909-46d3-80ec-2a3a8f1fe904, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-bc60d031-11b0-4eb5-8f9b-da0f8a069ea6;nsid=1223814533;c=0)
	at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:863)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4529)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1279)
	at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:95)
	at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28539)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

I don't have a server with the IP 192.168.51.1 and I don't think I'm referencing that anywhere. Is there some reason that it's trying to register that host as a datanode?
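The WARN and the DisallowedDatanodeException above mean the NameNode attempted a reverse-DNS lookup on the registering address, got nothing back, and therefore refused the registration. A common fix, sketched here with example hostnames and addresses (not taken from this cluster), is to give every node identical forward and reverse mappings in /etc/hosts on all hosts:

```
# /etc/hosts on every node in the cluster (addresses are examples)
192.168.51.4   hadoop-master
192.168.51.5   hadoop-data1
192.168.51.6   hadoop-data2
```

Alternatively, Hadoop 2.x can skip the reverse lookup entirely by setting dfs.namenode.datanode.registration.ip-hostname-check to false in hdfs-site.xml, though repairing name resolution is the cleaner fix. An address like 192.168.51.1 is often the host side of a host-only or NAT network, which would suggest one DataNode is registering over an unintended interface.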
On Wed, Sep 23, 2015 at 1:58 PM, Daniel Watrous <dw...@gmail.com> wrote:

Hi,
I have deployed a multi-node cluster with one master and two data nodes. Here's what jps shows:
hadoop@hadoop-master:~$ jps
24641 SecondaryNameNode
24435 DataNode
24261 NameNode
24791 ResourceManager
25483 Jps
24940 NodeManager

hadoop@hadoop-data1:~$ jps
15556 DataNode
16198 NodeManager
16399 Jps

hadoop@hadoop-data2:~$ jps
16418 Jps
15575 DataNode
16216 NodeManager
When I open the web console, I only see one node running: http://screencast.com/t/E6yehRvUbt
Where are the other two nodes? Why don't they show up?
Next I run one of the example scripts
hadoop@hadoop-master:~$ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar pi 10 30
Number of Maps  = 10
Samples per Map = 30
Wrote input for Map #0
Wrote input for Map #1
...
Job Finished in 2.956 seconds
Estimated value of Pi is 3.14146666666666666667
I can't see this anywhere in the web interface. I thought it might show in the Applications sub-menu. Should I be able to see this? It appears to run successfully.
Daniel





  


Re: Help troubleshooting multi-cluster setup

Posted by Daniel Watrous <dw...@gmail.com>.
I was able to get the jobs submitting to the cluster by adding the
following property to mapred-site.xml

   <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>

I also had to add the following properties to yarn-site.xml

   <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>

I'm still not sure why the datanodes don't show up in the nodes view. Is
the idea that a data node is only used for HDFS and yarn doesn't schedule
jobs there? If so, how can I add additional compute hosts? What are those
called?
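On the last question: in YARN the compute members are exactly the hosts running a NodeManager; a node contributes storage by running a DataNode and compute by running a NodeManager, and the two usually run together on the same hosts. A sketch of adding another worker, assuming the stock Hadoop 2.7 layout (hadoop-data3 is a hypothetical new host):

```
# etc/hadoop/slaves on the master -- one worker hostname per line
hadoop-data1
hadoop-data2
hadoop-data3
```

With the same configuration files copied to the new host, running sbin/hadoop-daemon.sh start datanode and sbin/yarn-daemon.sh start nodemanager there brings it into HDFS and YARN; the slaves file itself is only consulted by the start-dfs.sh/start-yarn.sh helper scripts.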

On Wed, Sep 23, 2015 at 3:08 PM, Daniel Watrous <dw...@gmail.com>
wrote:

> I'm not sure if this is related, but I'm seeing some errors
> in hadoop-hadoop-namenode-hadoop-master.log
>
> 2015-09-23 19:56:27,798 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Unresolved datanode registration: hostname cannot be resolved (ip=192.168.51.1, hostname=192.168.51.1)
> 2015-09-23 19:56:27,800 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 54310, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 192.168.51.1:54554 Call#373 Retry#0
> org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=192.168.51.1, hostname=192.168.51.1): DatanodeRegistration(0.0.0.0:50010, datanodeUuid=8a5d90c8-b909-46d3-80ec-2a3a8f1fe904, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-bc60d031-11b0-4eb5-8f9b-da0f8a069ea6;nsid=1223814533;c=0)
> 	at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:863)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4529)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1279)
> 	at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:95)
> 	at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28539)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>
>
> I don't have a server with the IP 192.168.51.1 and I don't think I'm
> referencing that anywhere. Is there some reason that it's trying to
> register that host as a datanode?
>
> On Wed, Sep 23, 2015 at 1:58 PM, Daniel Watrous <dw...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I have deployed a multi-node cluster with one master and two data nodes.
>> Here's what jps shows:
>>
>> hadoop@hadoop-master:~$ jps
>> 24641 SecondaryNameNode
>> 24435 DataNode
>> 24261 NameNode
>> 24791 ResourceManager
>> 25483 Jps
>> 24940 NodeManager
>>
>> hadoop@hadoop-data1:~$ jps
>> 15556 DataNode
>> 16198 NodeManager
>> 16399 Jps
>>
>> hadoop@hadoop-data2:~$ jps
>> 16418 Jps
>> 15575 DataNode
>> 16216 NodeManager
>>
>> When I open the web console, I only see one node running:
>> http://screencast.com/t/E6yehRvUbt
>>
>> Where are the other two nodes? Why don't they show up?
>>
>> Next I run one of the example scripts
>>
>> hadoop@hadoop-master:~$ hadoop jar
>> /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar
>> pi 10 30
>> Number of Maps  = 10
>> Samples per Map = 30
>> Wrote input for Map #0
>> Wrote input for Map #1
>> ...
>> Job Finished in 2.956 seconds
>> Estimated value of Pi is 3.14146666666666666667
>>
>> I can't see this anywhere in the web interface. I thought it might show
>> in the Applications sub-menu. Should I be able to see this? It appears to
>> run successfully.
>>
>> Daniel
>>
>
>

Re: Help troubleshooting multi-cluster setup

Posted by Daniel Watrous <dw...@gmail.com>.
I was able to get the jobs submitting to the cluster by adding the
following property to mapred-site.xml

   <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>

I also had to add the following properties to yarn-site.xml

   <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>

I'm still not sure why the datanodes don't show up in the nodes view. Is
the idea that a data node is only used for HDFS and yarn doesn't schedule
jobs there? If so, how can I add additional compute hosts? What are those
called?

On Wed, Sep 23, 2015 at 3:08 PM, Daniel Watrous <dw...@gmail.com>
wrote:

> I'm not sure if this is related, but I'm seeing some errors
> in hadoop-hadoop-namenode-hadoop-master.log
>
> 2015-09-23 19:56:27,798 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Unresolved datanode registration: hostname cannot be resolved (ip=192.168.51.1, hostname=192.168.51.1)
> 2015-09-23 19:56:27,800 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 54310, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 192.168.51.1:54554 Call#373 Retry#0
> org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=192.168.51.1, hostname=192.168.51.1): DatanodeRegistration(0.0.0.0:50010, datanodeUuid=8a5d90c8-b909-46d3-80ec-2a3a8f1fe904, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-bc60d031-11b0-4eb5-8f9b-da0f8a069ea6;nsid=1223814533;c=0)
> 	at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:863)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4529)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1279)
> 	at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:95)
> 	at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28539)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>
>
> I don't have a server with the IP 192.168.51.1 and I don't think I'm
> referencing that anywhere. Is there some reason that it's trying to add
> that host as a namenode?
>
> On Wed, Sep 23, 2015 at 1:58 PM, Daniel Watrous <dw...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I have deployed a multi-node cluster with one master and two data nodes.
>> Here's what jps shows:
>>
>> hadoop@hadoop-master:~$ jps
>> 24641 SecondaryNameNode
>> 24435 DataNode
>> 24261 NameNode
>> 24791 ResourceManager
>> 25483 Jps
>> 24940 NodeManager
>>
>> hadoop@hadoop-data1:~$ jps
>> 15556 DataNode
>> 16198 NodeManager
>> 16399 Jps
>>
>> hadoop@hadoop-data2:~$ jps
>> 16418 Jps
>> 15575 DataNode
>> 16216 NodeManager
>>
>> When I open the web console, I only see one node running:
>> http://screencast.com/t/E6yehRvUbt
>>
>> Where are the other two nodes? Why don't they show up?
>>
>> Next I run one of the example scripts
>>
>> hadoop@hadoop-master:~$ hadoop jar
>> /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar
>> pi 10 30
>> Number of Maps  = 10
>> Samples per Map = 30
>> Wrote input for Map #0
>> Wrote input for Map #1
>> ...
>> Job Finished in 2.956 seconds
>> Estimated value of Pi is 3.14146666666666666667
>>
>> I can't see this anywhere in the web interface. I thought it might show
>> in the Applications sub-menu. Should I be able to see this? It appears to
>> run successfully.
>>
>> Daniel
>>
>
>

Re: Help troubleshooting multi-cluster setup

Posted by Daniel Watrous <dw...@gmail.com>.
I was able to get the jobs submitting to the cluster by adding the
following property to mapred-site.xml

   <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>

I also had to add the following properties to yarn-site.xml

   <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>

I'm still not sure why the datanodes don't show up in the nodes view. Is
the idea that a data node is only used for HDFS and yarn doesn't schedule
jobs there? If so, how can I add additional compute hosts? What are those
called?

On Wed, Sep 23, 2015 at 3:08 PM, Daniel Watrous <dw...@gmail.com>
wrote:

> I'm not sure if this is related, but I'm seeing some errors
> in hadoop-hadoop-namenode-hadoop-master.log
>
> 2015-09-23 19:56:27,798 WARN org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Unresolved datanode registration: hostname cannot be resolved (ip=192.168.51.1, hostname=192.168.51.1)
> 2015-09-23 19:56:27,800 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 54310, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 192.168.51.1:54554 Call#373 Retry#0
> org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=192.168.51.1, hostname=192.168.51.1): DatanodeRegistration(0.0.0.0:50010, datanodeUuid=8a5d90c8-b909-46d3-80ec-2a3a8f1fe904, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-bc60d031-11b0-4eb5-8f9b-da0f8a069ea6;nsid=1223814533;c=0)
> 	at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:863)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4529)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1279)
> 	at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:95)
> 	at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28539)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>
>
> I don't have a server with the IP 192.168.51.1 and I don't think I'm
> referencing it anywhere. Is there some reason it's trying to register
> that host as a datanode?
>
> On Wed, Sep 23, 2015 at 1:58 PM, Daniel Watrous <dw...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I have deployed a multi-node cluster with one master and two data nodes.
>> Here's what jps shows:
>>
>> hadoop@hadoop-master:~$ jps
>> 24641 SecondaryNameNode
>> 24435 DataNode
>> 24261 NameNode
>> 24791 ResourceManager
>> 25483 Jps
>> 24940 NodeManager
>>
>> hadoop@hadoop-data1:~$ jps
>> 15556 DataNode
>> 16198 NodeManager
>> 16399 Jps
>>
>> hadoop@hadoop-data2:~$ jps
>> 16418 Jps
>> 15575 DataNode
>> 16216 NodeManager
>>
>> When I open the web console, I only see one node running:
>> http://screencast.com/t/E6yehRvUbt
>>
>> Where are the other two nodes? Why don't they show up?
>>
>> Next I run one of the example scripts
>>
>> hadoop@hadoop-master:~$ hadoop jar
>> /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar
>> pi 10 30
>> Number of Maps  = 10
>> Samples per Map = 30
>> Wrote input for Map #0
>> Wrote input for Map #1
>> ...
>> Job Finished in 2.956 seconds
>> Estimated value of Pi is 3.14146666666666666667
>>
>> I can't see this anywhere in the web interface. I thought it might show
>> in the Applications sub-menu. Should I be able to see this? It appears to
>> run successfully.
>>
>> Daniel
>>
>
>

Re: Help troubleshooting multi-cluster setup

Posted by Daniel Watrous <dw...@gmail.com>.
I'm not sure if this is related, but I'm seeing some errors
in hadoop-hadoop-namenode-hadoop-master.log

2015-09-23 19:56:27,798 WARN
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager:
Unresolved datanode registration: hostname cannot be resolved
(ip=192.168.51.1, hostname=192.168.51.1)
2015-09-23 19:56:27,800 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 6 on 54310, call
org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode
from 192.168.51.1:54554 Call#373 Retry#0
org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException:
Datanode denied communication with namenode because hostname cannot be
resolved (ip=192.168.51.1, hostname=192.168.51.1):
DatanodeRegistration(0.0.0.0:50010,
datanodeUuid=8a5d90c8-b909-46d3-80ec-2a3a8f1fe904, infoPort=50075,
infoSecurePort=0, ipcPort=50020,
storageInfo=lv=-56;cid=CID-bc60d031-11b0-4eb5-8f9b-da0f8a069ea6;nsid=1223814533;c=0)
	at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:863)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4529)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1279)
	at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:95)
	at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28539)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)


I don't have a server with the IP 192.168.51.1 and I don't think I'm
referencing it anywhere. Is there some reason it's trying to register
that host as a datanode?
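The denial in the log comes from the NameNode failing to reverse-resolve the datanode's source IP before accepting registration. A minimal sketch of that check using Python's socket module (the function name here is mine, not Hadoop's):

```python
import socket

def can_reverse_resolve(ip):
    """Return the hostname for ip, or None when reverse DNS lookup fails.

    A failed lookup is the condition that makes the NameNode reject a
    datanode registration with DisallowedDatanodeException.
    """
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # socket.herror / socket.gaierror derive from OSError
        return None

print(can_reverse_resolve("127.0.0.1"))
```

If 192.168.51.1 does not reverse-resolve on the master, adding a matching entry to /etc/hosts on the NameNode host usually resolves it; alternatively the check can be relaxed with the `dfs.namenode.datanode.registration.ip-hostname-check` property in hdfs-site.xml (set to false), though fixing name resolution is the cleaner route. On host-only virtual networks, the .1 address is often the hypervisor's own gateway interface, which may explain where the unexpected IP comes from.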

On Wed, Sep 23, 2015 at 1:58 PM, Daniel Watrous <dw...@gmail.com>
wrote:

> Hi,
>
> I have deployed a multi-node cluster with one master and two data nodes.
> Here's what jps shows:
>
> hadoop@hadoop-master:~$ jps
> 24641 SecondaryNameNode
> 24435 DataNode
> 24261 NameNode
> 24791 ResourceManager
> 25483 Jps
> 24940 NodeManager
>
> hadoop@hadoop-data1:~$ jps
> 15556 DataNode
> 16198 NodeManager
> 16399 Jps
>
> hadoop@hadoop-data2:~$ jps
> 16418 Jps
> 15575 DataNode
> 16216 NodeManager
>
> When I open the web console, I only see one node running:
> http://screencast.com/t/E6yehRvUbt
>
> Where are the other two nodes? Why don't they show up?
>
> Next I run one of the example scripts
>
> hadoop@hadoop-master:~$ hadoop jar
> /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar
> pi 10 30
> Number of Maps  = 10
> Samples per Map = 30
> Wrote input for Map #0
> Wrote input for Map #1
> ...
> Job Finished in 2.956 seconds
> Estimated value of Pi is 3.14146666666666666667
>
> I can't see this anywhere in the web interface. I thought it might show in
> the Applications sub-menu. Should I be able to see this? It appears to run
> successfully.
>
> Daniel
>
