Posted to user@giraph.apache.org by Christoph Böhm <li...@gmx.net> on 2012/01/03 12:54:21 UTC

java.io.EOFException

Thanks!
The next exception that I cannot explain is the following.
I have one input file of the form:
[2095029,[[1100046950,-1],[952771928,-1]],[[1276522248,0.9829082],[322609086,0.013525307]]]
[5146036,[[947366954,-1],[34019593,-1]],[[1199061143,0.573876],[1024309140,0.98412496]]]
[5270429,[[800028028,-1],[1362541830,-1]],[[164325925,0.92203426],[148512084,0.65505975]]]
... and want to use say 5 workers.
Then worker tenem05 reports what is below.

Cheers.
Christoph

--------------
java.lang.RuntimeException: java.io.IOException: Call to tenem02//172.16.23.151:30003 failed on local exception: java.io.EOFException
	at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionReq(BasicRPCCommunications.java:780)
	at org.apache.giraph.graph.BspServiceWorker.loadVertices(BspServiceWorker.java:304)
	at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:569)
	at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:458)
	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:630)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
	at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: java.io.IOException: Call to tenem02/172.16.23.151:30003 failed on local exception: java.io.EOFException
	at org.apache.hadoop.ipc.Client.wrapException(Client.java:1065)
	at org.apache.hadoop.ipc.Client.call(Client.java:1033)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
	at $Proxy3.putVertexList(Unknown Source)
	at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionReq(BasicRPCCommunications.java:777)
	... 11 more
Caused by: java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:375)
	at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:767)
	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:712)
2012-01-03 12:35:46,259 ERROR org.apache.giraph.graph.GraphMapper: setup: Caught exception just before end of setup
java.lang.IllegalStateException: setup: loadVertices failed
	at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:576)
	at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:458)
	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:630)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
	at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: java.lang.RuntimeException: java.io.IOException: Call to tenem02/172.16.23.151:30003 failed on local exception: java.io.EOFException
	at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionReq(BasicRPCCommunications.java:780)
	at org.apache.giraph.graph.BspServiceWorker.loadVertices(BspServiceWorker.java:304)
	at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:569)
	... 9 more
Caused by: java.io.IOException: Call to tenem02/172.16.23.151:30003 failed on local exception: java.io.EOFException
	at org.apache.hadoop.ipc.Client.wrapException(Client.java:1065)
	at org.apache.hadoop.ipc.Client.call(Client.java:1033)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
	at $Proxy3.putVertexList(Unknown Source)
	at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionReq(BasicRPCCommunications.java:777)
	... 11 more
Caused by: java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:375)
	at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:767)
	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:712)
2012-01-03 12:35:46,260 ERROR org.apache.giraph.graph.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_201112231316_4347/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/tenem05_1 on superstep -1
2012-01-03 12:35:46,270 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-01-03 12:35:46,320 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
2012-01-03 12:35:46,320 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName hadoop00 for UID 503 from the native implementation
2012-01-03 12:35:46,322 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.IllegalStateException: run: Caught an unrecoverable exception setup: Offlining servers due to exception...
	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:641)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
	at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: java.lang.RuntimeException: setup: Offlining servers due to exception...
	at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:466)
	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:630)
	... 7 more
Caused by: java.lang.IllegalStateException: setup: loadVertices failed
	at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:576)
	at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:458)
	... 8 more
Caused by: java.lang.RuntimeException: java.io.IOException: Call to tenem02/172.16.23.151:30003 failed on local exception: java.io.EOFException
	at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionReq(BasicRPCCommunications.java:780)
	at org.apache.giraph.graph.BspServiceWorker.loadVertices(BspServiceWorker.java:304)
	at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:569)
	... 9 more
Caused by: java.io.IOException: Call to tenem02/172.16.23.151:30003 failed on local exception: java.io.EOFException
	at org.apache.hadoop.ipc.Client.wrapException(Client.java:1065)
	at org.apache.hadoop.ipc.Client.call(Client.java:1033)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
	at $Proxy3.putVertexList(Unknown Source)
	at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionReq(BasicRPCCommunications.java:777)
	... 11 more
Caused by: java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:375)
	at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:767)
	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:712)
2012-01-03 12:35:46,337 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task





-------- Original Message --------
> Date: Fri, 23 Dec 2011 09:25:24 -0800
> From: Avery Ching <ac...@apache.org>
> To: giraph-user@incubator.apache.org
> Subject: Re: zookeeper connection issue

> Yeah, some of those errors can seem a little scary.  But I think they are 
> mostly harmless.  Let's go over each one inline.
> 
> On 12/23/11 7:10 AM, "Christoph Böhm" wrote:
> > Hi List,
> >
> > I'm about to get started with Giraph and have a few questions:
> > when running the PageRank example with
> >     hadoop jar giraph-0.70-jar-with-dependencies.jar org.apache.giraph.benchmark.PageRankBenchmark -e 1 -s 3 -v -V 500000 -w 10
> > this finishes but I find the following in one worker's logs:
> >
> > *** Worker:
> > 2011-12-23 15:36:09,468 ERROR org.apache.zookeeper.ClientCnxn: Error while calling watcher
> > java.lang.RuntimeException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201112231316_0010/_masterJobState
> > 	at org.apache.giraph.graph.BspService.getJobState(BspService.java:564)
> > 	at org.apache.giraph.graph.BspServiceWorker.processEvent(BspServiceWorker.java:1414)
> > 	at org.apache.giraph.graph.BspService.process(BspService.java:1017)
> > 	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
> > 	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
> > Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201112231316_0010/_masterJobState
> > 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
> > 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> > 	at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
> > 	at org.apache.giraph.zk.ZooKeeperExt.createExt(ZooKeeperExt.java:99)
> > 	at org.apache.giraph.graph.BspService.getJobState(BspService.java:555)
> > 	... 4 more
> 
> Depends when this happens.  If it's after the worker has let the master 
> know that it was finished with everything, this is fine.
> 
> > *** The Master says:
> > 2011-12-23 15:45:40,564 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException
> > java.net.ConnectException: Connection refused
> > 	at java.net.PlainSocketImpl.socketConnect(Native Method)
> > 	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
> > 	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
> > 	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
> > 	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
> > 	at java.net.Socket.connect(Socket.java:525)
> > 	at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:624)
> > 	at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:408)
> > 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:630)
> > 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
> > 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
> > 	at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
> > 	at java.security.AccessController.doPrivileged(Native Method)
> > 	at javax.security.auth.Subject.doAs(Subject.java:396)
> > 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
> > 	at org.apache.hadoop.mapred.Child.main(Child.java:253)
> >
> >
> >
> > Also, when I'm trying to run my own Job I see the following. All firewalls etc. should be shut down.
> >
> > *** Master (node09.de):
> > 2011-12-23 15:57:47,140 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to node09.de:22181 with poll msecs = 3000
> > 2011-12-23 15:57:47,143 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException
> > java.net.ConnectException: Connection refused
> > 	at java.net.PlainSocketImpl.socketConnect(Native Method)
> > 	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
> > 	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
> > 	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
> > 	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
> > 	at java.net.Socket.connect(Socket.java:525)
> > 	at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:624)
> > 	at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:409)
> > 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:630)
> > 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
> > 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
> > 	at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
> > 	at java.security.AccessController.doPrivileged(Native Method)
> > 	at javax.security.auth.Subject.doAs(Subject.java:396)
> > 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
> > 	at org.apache.hadoop.mapred.Child.main(Child.java:253)
> >
> >
> >
> > Thanks again.
> > Christoph
> These two exceptions on the master are also fine.  It takes some time 
> for the master to start the zk service (hence the multiple connection 
> attempts).

Re: java.io.EOFException

Posted by Avery Ching <ac...@apache.org>.
It appears that you had a problem with the serialization/deserialization 
of your vertex and/or its types (I, E, V, M).  You might want to try to 
test that out separately.
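
One way to test that out separately is a plain round trip outside the cluster: write a value to a byte buffer, read it back into a fresh instance, and check that the fields match and that no bytes are left over. The sketch below is only an illustration under assumptions. The `Writable` interface here is a local stand-in that mirrors the two methods of Hadoop's `Writable` so the example runs without Hadoop on the classpath, and `WeightedEdge`, `serialize` and `deserialize` are hypothetical names, not Giraph classes. In a real check you would substitute your own I, V, E and M types.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class RoundTripCheck {
    /** Local stand-in with the same two methods as Hadoop's Writable (assumption: used only so this runs standalone). */
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    /** Hypothetical edge type: a target id plus a weight, e.g. [1276522248,0.9829082]. */
    static class WeightedEdge implements Writable {
        long targetId;
        float weight;

        WeightedEdge() { }

        WeightedEdge(long targetId, float weight) {
            this.targetId = targetId;
            this.weight = weight;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeLong(targetId);
            out.writeFloat(weight);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            // Must read the same fields, in the same order, as write() emits them.
            targetId = in.readLong();
            weight = in.readFloat();
        }
    }

    /** Serializes a value into a byte array. */
    static byte[] serialize(Writable value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        value.write(out);
        out.flush();
        return buffer.toByteArray();
    }

    /**
     * Fills target from bytes and returns the number of bytes left unread.
     * A non-zero result means readFields() consumed less than write() produced;
     * an EOFException means it tried to consume more.
     */
    static int deserialize(Writable target, byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        target.readFields(in);
        return in.available();
    }

    public static void main(String[] args) throws IOException {
        WeightedEdge original = new WeightedEdge(1276522248L, 0.9829082f);
        byte[] bytes = serialize(original);

        WeightedEdge copy = new WeightedEdge();
        int leftover = deserialize(copy, bytes);

        boolean same = copy.targetId == original.targetId && copy.weight == original.weight;
        System.out.println("bytes written: " + bytes.length + ", left unread: " + leftover);
        System.out.println("fields match: " + same);
    }
}
```

If the round trip throws, leaves bytes unread, or returns different field values, then `write` and `readFields` of that type are not symmetric, which is the kind of problem described above.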

Avery
