Posted to user@giraph.apache.org by Robert Davis <dw...@gmail.com> on 2012/04/03 22:37:25 UTC

Exceptions when establishing RPC

Hello,

I was trying to run Giraph on two machines (one master and one slave) but
kept getting exceptions when establishing RPC to the slave machine. Does
anybody have any idea what's going wrong here? I am running the test with
the following parameters.

hadoop jar target/giraph-0.2-SNAPSHOT-jar-with-dependencies.jar
org.apache.giraph.benchmark.PageRankBenchmark -e 10 -s 2 -v -V 2000 -w 2

Thanks,
Robert

12/04/03 01:35:01 DEBUG comm.BasicRPCCommunications: startPeerConnectionThread: hostname ec2-107-20-19-131.compute-1.amazonaws.com, port 30001
12/04/03 01:35:01 DEBUG comm.BasicRPCCommunications: startPeerConnectionThread: Connecting to Worker(hostname=ec2-107-20-19-131.compute-1.amazonaws.com, MRpartition=1, port=30001), addr = ec2-107-20-19-131.compute-1.amazonaws.com:30001 if outMsgMap (null) == null
12/04/03 01:35:11 WARN comm.BasicRPCCommunications: connectAllRPCProxys: Failed on attempt 1 of 5 to connect to (id=0,cur=Worker(hostname=ec2-107-20-19-131.compute-1.amazonaws.com, MRpartition=1, port=30001),prev=null,ckpt_file=null)
java.net.ConnectException: Call to ec2-107-20-19-131.compute-1.amazonaws.com:30001 failed on connection exception: java.net.ConnectException: Connection refused
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
    at org.apache.hadoop.ipc.Client.call(Client.java:1071)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at $Proxy3.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:370)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:420)
    at org.apache.giraph.comm.RPCCommunications$1.run(RPCCommunications.java:194)
    at org.apache.giraph.comm.RPCCommunications$1.run(RPCCommunications.java:190)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
    at org.apache.giraph.comm.RPCCommunications.getRPCProxy(RPCCommunications.java:188)
    at org.apache.giraph.comm.RPCCommunications.getRPCProxy(RPCCommunications.java:58)
    at org.apache.giraph.comm.BasicRPCCommunications.startPeerConnectionThread(BasicRPCCommunications.java:678)
    at org.apache.giraph.comm.BasicRPCCommunications.connectAllRPCProxys(BasicRPCCommunications.java:622)
    at org.apache.giraph.comm.BasicRPCCommunications.setup(BasicRPCCommunications.java:583)
    at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:555)
    at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:474)
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:646)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:656)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
    at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
    at org.apache.hadoop.ipc.Client.call(Client.java:1046)
    ... 25 more

Re: Exceptions when establishing RPC

Posted by Avery Ching <ac...@apache.org>.
If you're using one master and one slave, you need to pass -w 1 rather
than -w 2. Did you see any error about the RPC server starting up?

Avery



Re: Exceptions when establishing RPC

Posted by André Kelpe <ef...@googlemail.com>.
2012/4/4 Robert Davis <dw...@gmail.com>:
> Hi Andre,
>
> Thanks for the note. Yep, they are in the same group.
>
> -Robert
>


Okay, so can they connect to each other on the correct ports? Remember
that distributions like Red Hat, Fedora, and CentOS ship very restrictive
iptables rules by default, which block everything that is not
explicitly allowed. I have run into this on AWS too often myself not to
consider it the problem you are seeing here. Also, make sure that your
security group settings allow the boxes to connect to each other.

HTH

--André
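
One way to act on the question above (can the machines reach each other
on the correct ports?) is a small TCP probe. The following is only a
minimal sketch in Python, not part of Giraph or Hadoop: the function
names extract_workers and can_connect are made up for illustration, and
the regular expression assumes the "Worker(hostname=..., MRpartition=...,
port=...)" shape seen in the log lines posted in this thread.

```python
import re
import socket

# Assumption: worker entries look like the ones in the posted log, e.g.
# Worker(hostname=some-host, MRpartition=1, port=30001)
WORKER_RE = re.compile(
    r"Worker\(hostname=\s*([^,\s]+),\s*MRpartition=\d+,\s*port=(\d+)\)"
)


def extract_workers(log_text):
    """Return unique (hostname, port) pairs found in the log text, in order."""
    seen = []
    for host, port in WORKER_RE.findall(log_text):
        pair = (host, int(port))
        if pair not in seen:
            seen.append(pair)
    return seen


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    Returns False when the connection is refused, times out, or the
    hostname cannot be resolved (all of these raise OSError subclasses).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python probe.py < task-attempt.log
    for host, port in extract_workers(sys.stdin.read()):
        status = "reachable" if can_connect(host, port) else "NOT reachable"
        print(f"{host}:{port} {status}")
```

The probe would have to run on the machine that reports the exception,
and while the job is still running, since the worker's RPC port is
presumably only open while the worker task is alive. A False result does
not by itself say whether iptables, the security group, or a missing
listener is the cause.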

Re: Exceptions when establishing RPC

Posted by Robert Davis <dw...@gmail.com>.
Hi Andre,

Thanks for the note. Yep, they are in the same group.

-Robert

On Tue, Apr 3, 2012 at 2:54 PM, André Kelpe <ef...@googlemail.com> wrote:

> Hi Robert!
>
> > Caused by: java.net.ConnectException: Connection refused
> > at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> > at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> > at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> > at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:656)
> > at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
> > at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
> > at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
> > at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
> > at org.apache.hadoop.ipc.Client.call(Client.java:1046)
> > ... 25 more
>
> You are running on EC2. Did you make sure that your security group
> allows all instances in the same group to connect to each other on any
> port? It seems that the internal firewall of amazon is blocking things
> for you.
>
> André
>

Re: Exceptions when establishing RPC

Posted by André Kelpe <ef...@googlemail.com>.
Hi Robert!

> Caused by: java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:656)
> at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:434)
> at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:560)
> at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
> at org.apache.hadoop.ipc.Client.call(Client.java:1046)
> ... 25 more

You are running on EC2. Did you make sure that your security group
allows all instances in the same group to connect to each other on any
port? It seems that Amazon's internal firewall is blocking things
for you.

André