Posted to common-user@hadoop.apache.org by Sergey Gerasimov <ge...@mlab.cs.msu.su> on 2013/11/07 00:20:05 UTC

access to hadoop cluster to post tasks remotely

Hello,

 

I have problems submitting a jar to my cluster remotely from a client machine
located somewhere on the Internet. I use stock Hadoop 1.2.1.

 

I installed Hadoop on the client machine (same version as in the cluster) and
configured fs.default.name and mapred.job.tracker.
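For reference, a minimal Hadoop 1.x client-side configuration for this kind of setup might look like the following sketch; the hostnames and ports are placeholders, not the poster's actual values:

```xml
<!-- core-site.xml on the client: points HDFS operations at the NameNode -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:9000</value>
  </property>
</configuration>

<!-- mapred-site.xml on the client: points job submission at the JobTracker -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtracker.example.com:9001</value>
  </property>
</configuration>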

Access to DFS works fine remotely. I can successfully play with "hadoop fs"
commands.

 

But when I submit a job, for example:

hadoop jar hadoop-examples-1.2.1.jar sleep 1

 

I see output like:

13/11/07 02:44:42 INFO hdfs.DFSClient: Exception in createBlockOutputStream xx.xx.xx.xx:50010 java.net.ConnectException: Connection timed out
13/11/07 02:44:42 INFO hdfs.DFSClient: Abandoning blk_1089181243677159149_31717
13/11/07 02:44:42 INFO hdfs.DFSClient: Excluding datanode xx.xx.xx.xx:50010
13/11/07 02:45:45 INFO hdfs.DFSClient: Exception in createBlockOutputStream xx.xx.xx.xx:50010 java.net.ConnectException: Connection timed out
13/11/07 02:45:45 INFO hdfs.DFSClient: Abandoning blk_6550586867464091073_31717
13/11/07 02:45:45 INFO hdfs.DFSClient: Excluding datanode xx.xx.xx.xx:50010
13/11/07 02:46:48 INFO hdfs.DFSClient: Exception in createBlockOutputStream xx.xx.xx.xx:50010 java.net.ConnectException: Connection timed out
13/11/07 02:46:48 INFO hdfs.DFSClient: Abandoning blk_5814098597599107248_31717
13/11/07 02:46:48 INFO hdfs.DFSClient: Excluding datanode xx.xx.xx.xx:50010
13/11/07 02:47:51 INFO hdfs.DFSClient: Exception in createBlockOutputStream xx.xx.xx.xx:50010 java.net.ConnectException: Connection timed out
13/11/07 02:47:51 INFO hdfs.DFSClient: Abandoning blk_6368219524592897749_31717

 

The same jar submitted from inside the cluster runs fine.

 

The network where the cluster lives is protected by a firewall, with only the
NameNode and JobTracker ports opened externally.

iptables is off on all nodes.

 

I have no idea what causes these messages. Until now I was sure that the only
entry points to a Hadoop cluster were the NameNode and JobTracker ports.

Both are open.

 

Please help!

 

 

 


Re: access to hadoop cluster to post tasks remotely

Posted by Harsh J <ha...@cloudera.com>.
Data in HDFS is read and written via the individual DNs' 50010 ports,
which you would also need to open up to avoid these errors. Data isn't
written or read through the NameNode.
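To confirm this diagnosis from the client side, a quick TCP probe against each DataNode's data-transfer port can help. This is an illustrative sketch, not part of the thread; the DataNode hostnames in the commented example are placeholders to replace with your own:

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection handles DNS resolution and the timeout for us.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False

# Hypothetical DataNode hostnames -- replace with the cluster's real nodes:
# for dn in ("dn1.example.com", "dn2.example.com"):
#     print(dn, "50010 reachable:", port_open(dn, 50010))
```

If the probe times out from the client but succeeds from inside the cluster, the firewall (not Hadoop) is what needs fixing.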



-- 
Harsh J

Re: access to hadoop cluster to post tasks remotely

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
On Wed, Nov 6, 2013 at 3:55 PM, Sergey Gerasimov
<ge...@mlab.cs.msu.su> wrote:
> But I still don’t understand why hadoop engine tries  to connect to
> DataNodes from client(!) machine during posting jar from client machine to
> the cluster.

Only metadata traffic goes to the NN; once the metadata operations resolve
the block locations, direct streaming to the DNs begins.

Thanks,
Roman.

RE: access to hadoop cluster to post tasks remotely

Posted by Sergey Gerasimov <ge...@mlab.cs.msu.su>.
Oooops.

 

Not all "hadoop fs" commands work fine...

 

-ls is OK.

-put/-get give a similar error.

 

It looks like port 50010 on the DataNodes needs to be accessible externally...
Does anybody know a config parameter to work around this?
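As an aside on where that port comes from: in Hadoop 1.x the DataNode's data-transfer endpoint is controlled by dfs.datanode.address in hdfs-site.xml (sketch below with the default value). Changing it only moves the port; whichever port the DataNodes bind is the one the firewall would have to allow:

```xml
<!-- hdfs-site.xml on each DataNode; the default value is shown -->
<configuration>
  <property>
    <name>dfs.datanode.address</name>
    <value>0.0.0.0:50010</value>
  </property>
</configuration>
```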

 

But I still don't understand why Hadoop tries to connect to the DataNodes
from the client(!) machine while submitting a jar from the client machine to
the cluster.

 

From: Sergey Gerasimov [mailto:gerasimov@mlab.cs.msu.su] 
Sent: Thursday, November 07, 2013 3:20 AM
To: user@hadoop.apache.org
Subject: access to hadoop cluster to post tasks remotely

 

