Posted to user@giraph.apache.org by Puneet Jain <pu...@gmail.com> on 2013/07/25 00:30:48 UTC

Job settings to run PageRank on 75M vertices

Hello:

I am struggling to run PageRank on a graph with 75M vertices, where each
vertex has between 1 and 75,000 edges.

I constantly get ZooKeeper timeouts, regardless of my configuration.

- I have a 21-node Hadoop cluster; each node has 4 cores and 4GB of memory.
- The data is stored in HBase as an adjacency matrix.
- I am running 21 regionservers and 3 ZooKeeper servers.
- I am using the standard PageRankComputation class; my vertex ID is a long.
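
As a rough sanity check on these sizes, the Java sketch below works out
how many vertices land on each node and how much raw edge data the stated
degree range implies. It is only back-of-envelope arithmetic: the 12 bytes
per edge (an 8-byte long target ID plus a 4-byte float value, with no
object overhead) is an assumption rather than a measured Giraph figure,
4GB is treated as 4 GiB, and the class and method names are made up for
illustration.

```java
// Back-of-envelope sizing for the cluster described above.
class GraphSizeEstimate {
    // Figures taken from the post.
    static final long VERTICES = 75_000_000L;
    static final int NODES = 21;
    static final long MEMORY_PER_NODE_BYTES = 4L << 30; // 4GB read as 4 GiB
    static final long MIN_EDGES_PER_VERTEX = 1L;
    static final long MAX_EDGES_PER_VERTEX = 75_000L;

    // ASSUMPTION: long target ID (8 bytes) + float edge value (4 bytes).
    static final long ASSUMED_BYTES_PER_EDGE = 12L;

    // Vertices per node if the graph is spread evenly (rounded up).
    public static long verticesPerNode(long vertices, int nodes) {
        return (vertices + nodes - 1) / nodes;
    }

    // Total physical memory across the cluster.
    public static long clusterMemoryBytes(int nodes, long memoryPerNodeBytes) {
        return Math.multiplyExact((long) nodes, memoryPerNodeBytes);
    }

    // Raw edge payload if every vertex had exactly edgesPerVertex edges.
    public static long edgeBytes(long vertices, long edgesPerVertex, long bytesPerEdge) {
        return Math.multiplyExact(Math.multiplyExact(vertices, edgesPerVertex), bytesPerEdge);
    }

    // Average degree at which edge payload alone equals cluster memory.
    public static long breakEvenAverageDegree(long clusterBytes, long vertices, long bytesPerEdge) {
        return clusterBytes / Math.multiplyExact(vertices, bytesPerEdge);
    }

    public static void main(String[] args) {
        long cluster = clusterMemoryBytes(NODES, MEMORY_PER_NODE_BYTES);
        System.out.println("vertices per node:   " + verticesPerNode(VERTICES, NODES));
        System.out.println("cluster memory (B):  " + cluster);
        System.out.println("edge bytes, min deg: "
            + edgeBytes(VERTICES, MIN_EDGES_PER_VERTEX, ASSUMED_BYTES_PER_EDGE));
        System.out.println("edge bytes, max deg: "
            + edgeBytes(VERTICES, MAX_EDGES_PER_VERTEX, ASSUMED_BYTES_PER_EDGE));
        System.out.println("break-even avg deg:  "
            + breakEvenAverageDegree(cluster, VERTICES, ASSUMED_BYTES_PER_EDGE));
    }
}
```

Under these assumptions, edge data alone would outgrow the cluster's total
memory once the average degree passes roughly 100, so the real edge
distribution matters more than the vertex count.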

I am setting only these parameters:
GiraphConfiguration.SPLIT_MASTER_WORKER.set(giraphConf, false);
GiraphConfiguration.USE_SUPERSTEP_COUNTERS.set(giraphConf, false);
GiraphConfiguration.CHECKPOINT_FREQUENCY.set(giraphConf, 0);

Most other configuration options are left at their default values.

Thanks
-- 
--Puneet

Re: Job settings to run PageRank on 75M vertices

Posted by Puneet Jain <pu...@gmail.com>.
Just a follow-up note: my master is timing out because the other mappers
are taking too long to finish ...


2013-07-24 22:13:18,874 INFO org.apache.giraph.utils.ProgressableUtils: waitFor: Waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@5f337f6f
2013-07-24 22:13:18,895 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 62377ms for sessionid 0x240129e443a011a, closing socket connection and attempting reconnect
2013-07-24 22:13:18,895 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 63825ms for sessionid 0x240129e443a0118, closing socket connection and attempting reconnect
2013-07-24 22:13:19,123 WARN org.apache.giraph.bsp.BspService: process: Disconnected from ZooKeeper (will automatically try to recover) WatchedEvent state:Disconnected type:None path:null
2013-07-24 22:13:19,137 WARN org.apache.giraph.bsp.BspService: process: Disconnected from ZooKeeper (will automatically try to recover) WatchedEvent state:Disconnected type:None path:null
2013-07-24 22:13:19,491 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server XXXXXX:2181
2013-07-24 22:13:19,492 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to XXXXXX:2181, initiating session
2013-07-24 22:13:19,546 INFO org.apache.zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x240129e443a011a has expired, closing socket connection
2013-07-24 22:13:19,549 WARN org.apache.giraph.bsp.BspService: process: Got unknown null path event WatchedEvent state:Expired type:None path:null
2013-07-24 22:13:19,549 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2013-07-24 22:13:20,045 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server XXXXXX:2181
2013-07-24 22:13:20,046 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to XXXXXX:2181, initiating session
2013-07-24 22:13:20,056 INFO org.apache.zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x240129e443a0118 has expired, closing socket connection
2013-07-24 22:13:20,056 WARN org.apache.giraph.bsp.BspService: process: Got unknown null path event WatchedEvent state:Expired type:None path:null
2013-07-24 22:13:20,056 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2013-07-24 22:13:20,169 ERROR org.apache.giraph.master.MasterThread: masterThread: Master algorithm failed with IllegalStateException
java.lang.IllegalStateException: Failed to create job state path due to KeeperException
    at org.apache.giraph.bsp.BspService.getJobState(BspService.java:676)
    at org.apache.giraph.master.BspServiceMaster.becomeMaster(BspServiceMaster.java:835)
    at org.apache.giraph.master.MasterThread.run(MasterThread.java:97)
Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /_hadoopBsp/job_201307241738_0004/_masterJobState
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
    at org.apache.giraph.zk.ZooKeeperExt.createExt(ZooKeeperExt.java:152)
    at org.apache.giraph.bsp.BspService.getJobState(BspService.java:667)
    ... 2 more
2013-07-24 22:13:20,314 FATAL org.apache.giraph.graph.GraphMapper: uncaughtException: OverrideExceptionHandler on thread org.apache.giraph.master.MasterThread, msg = java.lang.IllegalStateException: Failed to create job state path due to KeeperException, exiting...
java.lang.IllegalStateException: java.lang.IllegalStateException: Failed to create job state path due to KeeperException
    at org.apache.giraph.master.MasterThread.run(MasterThread.java:180)
Caused by: java.lang.IllegalStateException: Failed to create job state path due to KeeperException
    at org.apache.giraph.bsp.BspService.getJobState(BspService.java:676)
    at org.apache.giraph.master.BspServiceMaster.becomeMaster(BspServiceMaster.java:835)
    at org.apache.giraph.master.MasterThread.run(MasterThread.java:97)
Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /_hadoopBsp/job_201307241738_0004/_masterJobState
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
    at org.apache.giraph.zk.ZooKeeperExt.createExt(ZooKeeperExt.java:152)
    at org.apache.giraph.bsp.BspService.getJobState(BspService.java:667)
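
To see how long each ZooKeeper client session went without hearing from
the server before it expired, a small Java sketch like the one below can
pull the durations out of ClientCnxn messages such as those in the log
above. The class and method names are made up for illustration, and the
pattern only targets the "have not heard from server in Nms for sessionid
0x..." wording shown here; it tolerates line wrapping by matching any
whitespace between words.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Extracts "silent for N ms" figures per session from ZooKeeper client log text.
class ZkTimeoutScan {
    // \s+ between words so that mail-wrapped log lines still match.
    private static final Pattern TIMEOUT = Pattern.compile(
        "have\\s+not\\s+heard\\s+from\\s+server\\s+in\\s+(\\d+)ms"
            + "\\s+for\\s+sessionid\\s+(0x[0-9a-fA-F]+)");

    // Returns session ID -> milliseconds of silence, in order of first appearance.
    // If a session appears more than once, the last reported value wins.
    public static Map<String, Long> silentMillisBySession(String log) {
        Map<String, Long> result = new LinkedHashMap<>();
        Matcher m = TIMEOUT.matcher(log);
        while (m.find()) {
            result.put(m.group(2), Long.parseLong(m.group(1)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two messages copied from the log above, including the wrapping.
        String sample =
            "2013-07-24 22:13:18,895 INFO org.apache.zookeeper.ClientCnxn: Client\n"
          + "session timed out, have not heard from server in 62377ms for sessionid\n"
          + "0x240129e443a011a, closing socket connection and attempting reconnect\n"
          + "2013-07-24 22:13:18,895 INFO org.apache.zookeeper.ClientCnxn: Client\n"
          + "session timed out, have not heard from server in 63825ms for sessionid\n"
          + "0x240129e443a0118, closing socket connection and attempting reconnect\n";
        silentMillisBySession(sample).forEach(
            (session, millis) -> System.out.println(session + " silent for " + millis + " ms"));
    }
}
```

Both sessions in this log had been silent for a little over 60 seconds
when the client gave up on them.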


-- 
--Puneet