Posted to user@giraph.apache.org by Bipin Dalbhide <bi...@acellere.com> on 2014/12/31 05:25:18 UTC

Unable to run Giraph Job in Cluster

Hi All,
I have successfully run a custom job on a 46 MB graph. Now I am 
trying to run the same Giraph job on a 3-node cluster with 450 MB of 
data. I have turned off the ZooKeeper server on all machines. The three 
data nodes are *master-hadoop, hnode, INPUN-5KPH622*. I googled this 
error a lot but could not find an exact solution. Is it mandatory to 
run ZooKeeper on all machines for a Giraph job?

I know a little about how ZooKeeper works: by default it runs on port 
2181, but here the job is trying to connect on port 22181.
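
One way to confirm which of these ports actually has a listener is a plain TCP connect test from each node. The sketch below is only an illustration and is not part of Giraph or ZooKeeper: the class name PortProbe and the helper isReachable are made up for this example, and the default host hnode is simply one of the node names listed above. A successful connect only shows that something is listening on the port, not that it is a healthy ZooKeeper server.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for this post; not a Giraph or ZooKeeper class.
public class PortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused", timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed default host; pass another node name as the first argument.
        String host = args.length > 0 ? args[0] : "hnode";
        // 2181 is the usual ZooKeeper client port; 22181 is the port seen in the log.
        int[] ports = {2181, 22181};
        for (int port : ports) {
            System.out.println(host + ":" + port
                    + " reachable=" + isReachable(host, port, 2000));
        }
    }
}
```

Running it as "java PortProbe master-hadoop" (and likewise for the other nodes) while the job is active would show whether the "Connection refused" in the log matches a missing listener on that node.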

Here is how I run the job.

target# hadoop jar 
giraph-examples-1.1.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar org.apache.giraph.GiraphRunner 
org.apache.giraph.examples.CalculateCCWithJSON -vif 
org.apache.giraph.examples.JsonLongTextLongTextVertexInputFormat -vip 
/giraph/input/graphInputDatawithoutroot.txt -vof 
org.apache.giraph.examples.ccOutputFormat -op /giraph/vout2 -w 3

I am receiving this error on all datanodes.

684 INFO org.apache.giraph.graph.GraphTaskManager: setup: Registering health of this worker...
2014-12-29 15:58:30,699 INFO org.apache.giraph.bsp.BspService: getJobState: Job state already exists (/_hadoopBsp/job_201412291209_0010/_masterJobState)
2014-12-29 15:58:30,703 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201412291209_0010/_applicationAttemptsDir already exists!
2014-12-29 15:58:30,706 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201412291209_0010/_applicationAttemptsDir already exists!
2014-12-29 15:58:30,711 INFO org.apache.giraph.worker.BspServiceWorker: registerHealth: Created my health node for attempt=0, superstep=-1 with /_hadoopBsp/job_201412291209_0010/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/inpun-5kph622_1 and workerInfo= Worker(hostname=inpun-5kph622, MRtaskID=1, port=30001)
2014-12-29 16:14:42,385 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x14a9596f1a20002, likely server has closed socket, closing socket connection and attempting reconnect
2014-12-29 16:14:42,486 WARN org.apache.giraph.bsp.BspService: process: Disconnected from ZooKeeper (will automatically try to recover) WatchedEvent state:Disconnected type:None path:null
2014-12-29 16:14:43,525 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server hnode/192.168.2.24:22181. Will not attempt to authenticate using SASL (unknown error)
2014-12-29 16:14:43,527 WARN org.apache.zookeeper.ClientCnxn: Session 0x14a9596f1a20002 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075)
2014-12-29 16:14:43,636 WARN org.apache.giraph.zk.ZooKeeperExt: exists: Connection loss on attempt 0, waiting 5000 msecs before retrying.
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201412291209_0010/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
	at org.apache.giraph.zk.ZooKeeperExt.exists(ZooKeeperExt.java:360)
	at org.apache.giraph.worker.BspServiceWorker.startSuperstep(BspServiceWorker.java:818)
	at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:576)
	at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)


Thanks,
Bipin Dalbhide
Acellere Software Pvt. Ltd.