You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@giraph.apache.org by Bryan Rowe II <br...@yahoo.com> on 2014/07/03 22:06:47 UTC

Errors only with large graphs

Hello,

Giraph: release-1.0.0-RC3

In short, when I use large graphs with the Shortest Paths example, it fails.  But when I use the small graph provided on the Quick Start guide, it succeeds.
I converted all of my large graphs into the format shown in the Quick Start guide to simply things.
I'm using a one-node setup.



Here is the command I'm using:
hadoop jar $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.0.0-for-hadoop-0.20.203.0-jar-with-dependencies.jar 

org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsVertex 

-vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
-vip /user/ubuntu/input/CA.txt -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat
-op /user/ubuntu/output/shortestpaths 

-w 1

(all on one line)

CA.txt is a large graph file: 96,026,228 bytes

The job fails in 10mins, 46sec.

Two Map tasks are created when run.
The first one, task_201407021636_0006_m_000000, is KILLED.  syslog:

===========================================================================

2014-07-02 17:01:34,757 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2014-07-02 17:01:34,945 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2014-07-02 17:01:35,127 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
2014-07-02 17:01:35,159 INFO org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
2014-07-02 17:01:35,159 INFO org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/job.jar for job Giraph: org.apache.giraph.examples.SimpleShortestPathsVertex
2014-07-02 17:01:35,201 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201407021636_0006
2014-07-02 17:01:35,204 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201407021636_0006/_task/ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0
2014-07-02 17:01:35,219 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Got [ec2-54-186-5-159.us-west-2.compute.amazonaws.com] 1 hosts from 1 candidates when 1 required (polling period is 3000) on attempt 0
2014-07-02 17:01:35,220 INFO org.apache.giraph.zk.ZooKeeperManager: createZooKeeperServerList: Creating the final ZooKeeper file '_bsp/_defaultZkManagerDir/job_201407021636_0006/zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 '
2014-07-02 17:01:35,228 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 0, got file 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 ' (polling period is 3000)
2014-07-02 17:01:35,228 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [ec2-54-186-5-159.us-west-2.compute.amazonaws.com, 0] 2 hosts in filename 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 '
2014-07-02 17:01:35,229 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Trying to delete old directory /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper
2014-07-02 17:01:35,234 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Creating file /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper/zoo.cfg in /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper with base port 22181
2014-07-02 17:01:35,234 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Make directory of _bspZooKeeper = true
2014-07-02 17:01:35,235 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Delete of zoo.cfg = false
2014-07-02 17:01:35,236 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Attempting to start ZooKeeper server with command [/usr/lib/jvm/java-7-oracle/jre/bin/java, -Xmx512m, -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPauseMillis=100, -cp, /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/job.jar, org.apache.zookeeper.server.quorum.QuorumPeerMain, /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper/zoo.cfg] in directory /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper
2014-07-02 17:01:35,238 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Shutdown hook added.
2014-07-02 17:01:35,238 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 with poll msecs = 3000
2014-07-02 17:01:35,241 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException
java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:701) at org.apache.giraph.graph.GraphTaskManager.startZooKeeperManager(GraphTaskManager.java:357) at org.apache.giraph.graph.GraphTaskManager.setup(GraphTaskManager.java:188) at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:60) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:90) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763) at
 org.apache.hadoop.mapred.MapTask.run(MapTask.java:369) at org.apache.hadoop.mapred.Child$4.run(Child.java:259) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059) at org.apache.hadoop.mapred.Child.main(Child.java:253)
2014-07-02 17:01:38,242 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 1 of 10 max trying to connect to ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 with poll msecs = 3000
2014-07-02 17:01:38,243 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connected to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181!
2014-07-02 17:01:38,243 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Creating my filestamp _bsp/_defaultZkManagerDir/job_201407021636_0006/_zkServer/ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0
2014-07-02 17:01:40,249 INFO org.apache.giraph.graph.GraphTaskManager: setup: Chosen to run ZooKeeper...
2014-07-02 17:01:40,249 INFO org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceMaster (master thread)...
2014-07-02 17:01:40,260 INFO org.apache.giraph.bsp.BspService: BspService: Connecting to ZooKeeper with job job_201407021636_0006, 0 on ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181
2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=ec2-54-186-5-159.us-west-2.compute.amazonaws.com
2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.7.0_60
2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-oracle/jre
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client
 environment:java.class.path=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/classes:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work:/home/ubuntu/hadoop/bin/../conf:/usr/lib/jvm/java-7-oracle/lib/tools.jar:/home/ubuntu/hadoop/bin/..:/home/ubuntu/hadoop/bin/../hadoop-core-0.20.203.0.jar:/home/ubuntu/hadoop/bin/../lib/aspectjrt-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/aspectjtools-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-1.7.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-core-1.8.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-cli-1.2.jar:/home/ubuntu/hadoop/bin/../lib/commons-codec-1.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-collections-3.2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-configuration-1.6.jar:/home/ubuntu/hadoo
p/bin/../lib/commons-daemon-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-digester-1.8.jar:/home/ubuntu/hadoop/bin/../lib/commons-el-1.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-lang-2.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-1.1.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-math-2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-net-1.4.1.jar:/home/ubuntu/hadoop/bin/../lib/core-3.1.1.jar:/home/ubuntu/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/home/ubuntu/hadoop/bin/../lib/jackson-core-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jackson-mapper-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jets3t-0.6.1.jar:/home/ubuntu/hadoop/bin/../lib/jetty-6.1.26.jar:/home/ubuntu/hadoop/bin/../lib/jetty-util-6.1.26.jar:/home/ubu
ntu/hadoop/bin/../lib/jsch-0.1.42.jar:/home/ubuntu/hadoop/bin/../lib/junit-4.5.jar:/home/ubuntu/hadoop/bin/../lib/kfs-0.2.2.jar:/home/ubuntu/hadoop/bin/../lib/log4j-1.2.15.jar:/home/ubuntu/hadoop/bin/../lib/mockito-all-1.8.5.jar:/home/ubuntu/hadoop/bin/../lib/oro-2.0.8.jar:/home/ubuntu/hadoop/bin/../lib/servlet-api-2.5-20081211.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/xmlenc-0.52.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/home/ubuntu/hadoop/bin/../lib/native/Linux-amd64-64:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work/tmp
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=3.2.0-58-virtual
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=ubuntu
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/ubuntu
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work
2014-07-02 17:01:40,272 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 sessionTimeout=60000 watcher=org.apache.giraph.master.BspServiceMaster@4deb9df0
2014-07-02 17:01:40,284 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181
2014-07-02 17:01:40,296 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, initiating session
2014-07-02 17:01:42,336 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, sessionid = 0x146f806378d0000, negotiated timeout = 600000
2014-07-02 17:01:42,337 INFO org.apache.giraph.bsp.BspService: process: Asynchronous connection complete.
2014-07-02 17:01:42,343 INFO org.apache.giraph.graph.GraphTaskManager: map: No need to do anything when not a worker
2014-07-02 17:01:42,343 INFO org.apache.giraph.graph.GraphTaskManager: cleanup: Starting for MASTER_ZOOKEEPER_ONLY
2014-07-02 17:01:45,033 INFO org.apache.giraph.bsp.BspService: getJobState: Job state already exists (/_hadoopBsp/job_201407021636_0006/_masterJobState)
2014-07-02 17:01:45,038 INFO org.apache.giraph.master.BspServiceMaster: becomeMaster: First child is '/_hadoopBsp/job_201407021636_0006/_masterElectionDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_00000000000' and my bid is '/_hadoopBsp/job_201407021636_0006/_masterElectionDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_00000000000'
2014-07-02 17:01:45,044 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir already exists!
2014-07-02 17:01:45,049 INFO org.apache.giraph.bsp.BspService: process: applicationAttemptChanged signaled
2014-07-02 17:01:45,089 INFO org.apache.giraph.comm.netty.NettyServer: NettyServer: Using execution handler with 8 threads after requestFrameDecoder.
2014-07-02 17:01:45,111 INFO org.apache.giraph.comm.netty.NettyServer: start: Started server communication server: ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:30000 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288 backlog = 1
2014-07-02 17:01:45,116 INFO org.apache.giraph.comm.netty.NettyClient: NettyClient: Using execution handler with 8 threads after requestEncoder.
2014-07-02 17:01:45,119 INFO org.apache.giraph.master.BspServiceMaster: becomeMaster: I am now the master!
2014-07-02 17:01:45,126 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir already exists!
2014-07-02 17:01:45,140 INFO org.apache.giraph.io.formats.GiraphFileInputFormat: Total input paths to process : 1
2014-07-02 17:01:45,149 INFO org.apache.giraph.master.BspServiceMaster: generateVertexInputSplits: Got 2 input splits for 1 input threads
2014-07-02 17:01:45,149 INFO org.apache.giraph.master.BspServiceMaster: createVertexInputSplits: Starting to write input split data to zookeeper with 1 threads
2014-07-02 17:01:45,163 INFO org.apache.giraph.master.BspServiceMaster: createVertexInputSplits: Done writing input split data to zookeeper
2014-07-02 17:01:45,173 INFO org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
2014-07-02 17:01:45,187 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Successfully added 1 connections, (1 total connected) 0 failed, 0 failures total.
2014-07-02 17:01:45,188 INFO org.apache.giraph.partition.PartitionUtils: computePartitionCount: Creating 1, default would have been 1 partitions.
2014-07-02 17:01:45,350 INFO org.apache.giraph.comm.netty.NettyServer: start: Using Netty without authentication.
2014-07-02 17:06:45,347 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 0 out of 1 workers finished on superstep -1 on path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDoneDir
2014-07-02 17:06:45,349 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Waiting on [ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1]
2014-07-02 17:11:45,355 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 0 out of 1 workers finished on superstep -1 on path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDoneDir
2014-07-02 17:11:45,355 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Waiting on [ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1]
2014-07-02 17:12:07,201 INFO org.apache.giraph.zk.ZooKeeperManager: run: Shutdown hook started.
2014-07-02 17:12:07,202 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process.
2014-07-02 17:12:07,518 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x146f806378d0000, likely server has closed socket, closing socket connection and attempting reconnect
2014-07-02 17:12:07,518 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: ZooKeeper process exited with 143 (note that 143 typically means killed).
===========================================================================








The second one, task_201407021636_0006_m_000001, goes to the FAILED state.  syslog:
===========================================================================
2014-07-02 17:01:38,016 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2014-07-02 17:01:38,203 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2014-07-02 17:01:38,379 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
2014-07-02 17:01:40,280 INFO org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
2014-07-02 17:01:40,281 INFO org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/job.jar for job Giraph: org.apache.giraph.examples.SimpleShortestPathsVertex
2014-07-02 17:01:40,317 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201407021636_0006
2014-07-02 17:01:40,320 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201407021636_0006/_task/ec2-54-186-5-159.us-west-2.compute.amazonaws.com 1
2014-07-02 17:01:42,109 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 1, got file 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 ' (polling period is 3000)
2014-07-02 17:01:42,313 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [ec2-54-186-5-159.us-west-2.compute.amazonaws.com, 0] 2 hosts in filename 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 '
2014-07-02 17:01:42,316 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got [ec2-54-186-5-159.us-west-2.compute.amazonaws.com] 1 hosts from 1 ready servers when 1 required (polling period is 3000) on attempt 0
2014-07-02 17:01:42,316 INFO org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceWorker...
2014-07-02 17:01:42,330 INFO org.apache.giraph.bsp.BspService: BspService: Connecting to ZooKeeper with job job_201407021636_0006, 1 on ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181
2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=ec2-54-186-5-159.us-west-2.compute.amazonaws.com
2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.7.0_60
2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-oracle/jre
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client
 environment:java.class.path=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/classes:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work:/home/ubuntu/hadoop/bin/../conf:/usr/lib/jvm/java-7-oracle/lib/tools.jar:/home/ubuntu/hadoop/bin/..:/home/ubuntu/hadoop/bin/../hadoop-core-0.20.203.0.jar:/home/ubuntu/hadoop/bin/../lib/aspectjrt-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/aspectjtools-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-1.7.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-core-1.8.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-cli-1.2.jar:/home/ubuntu/hadoop/bin/../lib/commons-codec-1.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-collections-3.2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-configuration-1.6.jar:/home/ubuntu/hadoo
p/bin/../lib/commons-daemon-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-digester-1.8.jar:/home/ubuntu/hadoop/bin/../lib/commons-el-1.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-lang-2.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-1.1.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-math-2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-net-1.4.1.jar:/home/ubuntu/hadoop/bin/../lib/core-3.1.1.jar:/home/ubuntu/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/home/ubuntu/hadoop/bin/../lib/jackson-core-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jackson-mapper-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jets3t-0.6.1.jar:/home/ubuntu/hadoop/bin/../lib/jetty-6.1.26.jar:/home/ubuntu/hadoop/bin/../lib/jetty-util-6.1.26.jar:/home/ubu
ntu/hadoop/bin/../lib/jsch-0.1.42.jar:/home/ubuntu/hadoop/bin/../lib/junit-4.5.jar:/home/ubuntu/hadoop/bin/../lib/kfs-0.2.2.jar:/home/ubuntu/hadoop/bin/../lib/log4j-1.2.15.jar:/home/ubuntu/hadoop/bin/../lib/mockito-all-1.8.5.jar:/home/ubuntu/hadoop/bin/../lib/oro-2.0.8.jar:/home/ubuntu/hadoop/bin/../lib/servlet-api-2.5-20081211.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/xmlenc-0.52.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/home/ubuntu/hadoop/bin/../lib/native/Linux-amd64-64:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work/tmp
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=3.2.0-58-virtual
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=ubuntu
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/ubuntu
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work
2014-07-02 17:01:42,339 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 sessionTimeout=60000 watcher=org.apache.giraph.worker.BspServiceWorker@54a5733f
2014-07-02 17:01:42,347 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181
2014-07-02 17:01:42,347 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, initiating session
2014-07-02 17:01:42,812 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, sessionid = 0x146f806378d0001, negotiated timeout = 600000
2014-07-02 17:01:42,814 INFO org.apache.giraph.bsp.BspService: process: Asynchronous connection complete.
2014-07-02 17:01:42,819 INFO org.apache.giraph.comm.netty.NettyWorkerServer: createMessageStoreFactory: Using ByteArrayMessagesPerVertexStore since there is no combiner
2014-07-02 17:01:42,881 INFO org.apache.giraph.comm.netty.NettyServer: NettyServer: Using execution handler with 8 threads after requestFrameDecoder.
2014-07-02 17:01:42,904 INFO org.apache.giraph.comm.netty.NettyServer: start: Started server communication server: ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:30001 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288 backlog = 1
2014-07-02 17:01:42,907 INFO org.apache.giraph.comm.netty.NettyClient: NettyClient: Using execution handler with 8 threads after requestEncoder.
2014-07-02 17:01:42,914 INFO org.apache.giraph.graph.GraphTaskManager: setup: Registering health of this worker...
2014-07-02 17:01:45,049 INFO org.apache.giraph.bsp.BspService: process: applicationAttemptChanged signaled
2014-07-02 17:01:45,068 WARN org.apache.giraph.bsp.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir, type=NodeChildrenChanged, state=SyncConnected)
2014-07-02 17:01:45,076 INFO org.apache.giraph.worker.BspServiceWorker: registerHealth: Created my health node for attempt=0, superstep=-1 with /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1 and workerInfo= Worker(hostname=ec2-54-186-5-159.us-west-2.compute.amazonaws.com, MRtaskID=1, port=30001)
2014-07-02 17:01:45,183 INFO org.apache.giraph.comm.netty.NettyServer: start: Using Netty without authentication.
2014-07-02 17:01:45,339 INFO org.apache.giraph.bsp.BspService: process: partitionAssignmentsReadyChanged (partitions are assigned)
2014-07-02 17:01:45,345 INFO org.apache.giraph.worker.BspServiceWorker: startSuperstep: Master(hostname=ec2-54-186-5-159.us-west-2.compute.amazonaws.com, MRtaskID=0, port=30000)
2014-07-02 17:01:45,345 INFO org.apache.giraph.worker.BspServiceWorker: startSuperstep: Ready for computation on superstep -1 since worker selection and vertex range assignments are done in /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions
2014-07-02 17:01:45,346 INFO org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
2014-07-02 17:01:45,354 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Successfully added 1 connections, (1 total connected) 0 failed, 0 failures total.
2014-07-02 17:01:45,359 INFO org.apache.giraph.worker.BspServiceWorker: loadInputSplits: Using 1 thread(s), originally 1 threads(s) for 2 total splits.
2014-07-02 17:01:45,362 INFO org.apache.giraph.comm.SendPartitionCache: SendPartitionCache: maxVerticesPerTransfer = 10000
2014-07-02 17:01:45,362 INFO org.apache.giraph.comm.SendPartitionCache: SendPartitionCache: maxEdgesPerTransfer = 80000
2014-07-02 17:01:45,372 INFO org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit: Reserved input split path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDir/0, overall roughly 0.0% input splits reserved
2014-07-02 17:01:45,373 INFO org.apache.giraph.worker.InputSplitsCallable: getInputSplit: Reserved /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDir/0 from ZooKeeper and got input split 'hdfs://ec2-54-186-5-159.us-west-2.compute.amazonaws.com:54310/user/ubuntu/input/CA.txt:0+67108864'
2014-07-02 17:01:46,615 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 250000 vertices at 200412.01085983627 vertices/sec 697447 edges at 560085.1663971642 edges/sec Memory (free/total/max) = 112.94M / 182.50M / 182.50M
2014-07-02 17:01:47,440 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 500000 vertices at 241131.36490688394 vertices/sec 1419367 edges at 685221.3493060106 edges/sec Memory (free/total/max) = 45.07M / 187.50M / 187.50M
2014-07-02 17:01:51,322 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 750000 vertices at 125921.72750649283 vertices/sec 2149814 edges at 361077.323111284 edges/sec Memory (free/total/max) = 16.73M / 189.50M / 189.50M
2014-07-02 17:01:55,205 ERROR org.apache.giraph.worker.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1 on superstep -1
2014-07-02 17:01:55,213 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2014-07-02 17:01:55,716 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
2014-07-02 17:01:55,717 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName ubuntu for UID 1000 from the native implementation
2014-07-02 17:01:55,718 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@36df8f5 at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:102) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369) at org.apache.hadoop.mapred.Child$4.run(Child.java:259) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059) at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@36df8f5 at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:151) at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:111) at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:73) at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:192) at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:276) at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:323) at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:506) at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:230) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:92) ... 7 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:202) at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:271) at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:143) ... 15 more
Caused by: java.lang.OutOfMemoryError: Java heap space at java.util.concurrent.ConcurrentHashMap$Segment.rehash(ConcurrentHashMap.java:501) at java.util.concurrent.ConcurrentHashMap$Segment.put(ConcurrentHashMap.java:460) at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1130) at org.apache.giraph.partition.SimplePartition.addPartition(SimplePartition.java:87) at org.apache.giraph.partition.SimplePartitionStore.addPartition(SimplePartitionStore.java:71) at org.apache.giraph.comm.requests.SendVertexRequest.doRequest(SendVertexRequest.java:81) at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:470) at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendPartitionRequest(NettyWorkerClientRequestProcessor.java:203) at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendVertexRequest(NettyWorkerClientRequestProcessor.java:267) at
 org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:140) at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:220) at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:161) at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:58) at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745)
2014-07-02 17:01:55,722 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
===========================================================================




I've tried increasing Java Heap Space in hadoop/conf/mapred-site.xml by adding this:
  <property>
    <name>mapred.child.java.opts</name>
   
 <value>-Xmx1024m</value>
  </property>

But that just caused the entire job to fail from the start.

Before using this version of Giraph, I used 1.0.0 and 1.1.0-RC0 and those 
versions provide me with more and different errors to debug that relate 
to problems with Hadoop itself.  So the Giraph version I'm currently 
using seems to be the best for me because these errors seem more 
manageable.


What can I do to fix this error?  I thought Giraph was built for large scale graph processing so I suppose this problem was encountered before by 
someone testing large graphs.  I searched through the mailing archives and couldn't find anything though.

I can provide more information if you need it.  Thanks a lot.
Bryan Rowe

Re: Errors only with large graphs

Posted by Young Han <yo...@uwaterloo.ca>.
Yeah, your input format looks correct. Did you have the graph sorted by
source vertex IDs before conversion? (I'm not sure if duplicate entries
with the same source ID matters, but just in case.)

They're all out of memory errors, so I think Xmx is the culprit. What type
of EC2 instances are you using? You probably want to use something larger
than t1.micro or m1.small.

Young


On Thu, Jul 3, 2014 at 4:38 PM, Bryan Rowe II <br...@yahoo.com> wrote:

> I've tried a larger graph.  It ran for 2 hours then failed with different
> error messages I believe.
>
> Bryan
>
>
>   On Thursday, July 3, 2014 1:37 PM, Bryan Rowe II <br...@yahoo.com>
> wrote:
>
>
> Hi Young,
>
> I believe my graph has the correct format.
> Here is the first 40 lines of the graph I'm using:
> [0,0,[[1,1],[4,1],[5,1]]]
> [1,0,[[0,1],[2,1],[3,1]]]
> [2,0,[[1,1],[6,1],[7,1]]]
> [3,0,[[8,1],[1,1],[9,1]]]
> [4,0,[[0,1],[10,1],[11,1]]]
> [5,0,[[0,1]]]
> [6,0,[[2,1],[12,1]]]
> [7,0,[[2,1],[12,1],[13,1]]]
> [8,0,[[11,1],[3,1],[60,1]]]
> [9,0,[[3,1]]]
> [10,0,[[35,1],[4,1],[38,1]]]
> [11,0,[[8,1],[59,1],[4,1]]]
> [12,0,[[41,1],[6,1],[7,1]]]
> [13,0,[[89,1],[90,1],[91,1],[7,1]]]
> [14,0,[[18,1],[19,1],[15,1]]]
> [15,0,[[16,1],[17,1],[14,1]]]
> [16,0,[[20,1],[21,1],[22,1],[15,1]]]
> [17,0,[[24,1],[23,1],[21,1],[15,1]]]
> [18,0,[[24,1],[14,1]]]
> [19,0,[[25,1],[22,1],[14,1]]]
> [20,0,[[16,1],[25,1],[26,1]]]
> [21,0,[[16,1],[17,1],[30,1]]]
> [22,0,[[16,1],[19,1]]]
> [23,0,[[17,1],[105,1]]]
> [24,0,[[17,1],[18,1],[58,1]]]
> [25,0,[[27,1],[19,1],[20,1]]]
> [26,0,[[28,1],[27,1],[20,1]]]
> [27,0,[[25,1],[26,1],[29,1]]]
> [28,0,[[26,1],[29,1],[30,1]]]
> [29,0,[[27,1],[28,1],[31,1]]]
> [30,0,[[32,1],[33,1],[28,1],[21,1]]]
> [31,0,[[34,1],[29,1]]]
> [32,0,[[105,1],[30,1],[39,1]]]
> [33,0,[[30,1]]]
> [34,0,[[38,1],[31,1]]]
> [35,0,[[10,1],[36,1],[37,1]]]
> [36,0,[[40,1],[35,1],[39,1]]]
> [37,0,[[41,1],[35,1]]]
> [38,0,[[10,1],[34,1]]]
> [39,0,[[32,1],[58,1],[36,1],[119,1]]]
> [40,0,[[90,1],[36,1]]]
>
> Also, sorry about sending the email twice.  My email client messed up.
>
> Thanks,
> Bryan
>
>
>   On Thursday, July 3, 2014 1:27 PM, Young Han <yo...@uwaterloo.ca>
> wrote:
>
>
> 10 minutes seems way too long to load in 91mb from HDFS. Are you sure your
> graph's format is correct? For the Json input formats, each line of your
> file should be:
>
> [vertex id, vertex value, [[dst id, edge weight], [dst id, edge weigth],
> ..., [dst id, edge weight]]]
>
> You can set the vertex value for every vertex to 0 (SSSP will overwrite
> that value in the first superstep). If your graph doesn't have edge
> weights, you can just set them to 1.
>
> Also, have you tried a larger Xmx value? E.g., 4096m or 8192m.
>
> Young
>
>
> On Thu, Jul 3, 2014 at 4:06 PM, Bryan Rowe II <br...@yahoo.com>
> wrote:
>
> Hello,
>
> Giraph: release-1.0.0-RC3
>
> In short, when I use large graphs with the Shortest Paths example, it
> fails.  But when I use the small graph provided on the Quick Start guide,
> it succeeds.
> I converted all of my large graphs into the format shown in the Quick
> Start guide to simply things.
> I'm using a one-node setup.
>
>
> Here is the command I'm using:
> hadoop jar
> $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.0.0-for-hadoop-0.20.203.0-jar-with-dependencies.jar
>
> org.apache.giraph.GiraphRunner
> org.apache.giraph.examples.SimpleShortestPathsVertex
> -vif
> org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
> -vip /user/ubuntu/input/CA.txt -of
> org.apache.giraph.io.formats.IdWithValueTextOutputFormat
> -op /user/ubuntu/output/shortestpaths
> -w 1
>
> (all on one line)
>
> CA.txt is a large graph file: 96,026,228 bytes
>
> The job fails in 10mins, 46sec.
>
> Two Map tasks are created when run.
> The first one, task_201407021636_0006_m_000000, is KILLED.  syslog:
> ===========================================================================
>
> 2014-07-02 17:01:34,757 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2014-07-02 17:01:34,945 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
> 2014-07-02 17:01:35,127 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
> 2014-07-02 17:01:35,159 INFO org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
> 2014-07-02 17:01:35,159 INFO org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/job.jar for job Giraph: org.apache.giraph.examples.SimpleShortestPathsVertex
> 2014-07-02 17:01:35,201 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201407021636_0006
> 2014-07-02 17:01:35,204 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201407021636_0006/_task/ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0
> 2014-07-02 17:01:35,219 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Got [ec2-54-186-5-159.us-west-2.compute.amazonaws.com] 1 hosts from 1 candidates when 1 required (polling period is 3000) on attempt 0
> 2014-07-02 17:01:35,220 INFO org.apache.giraph.zk.ZooKeeperManager: createZooKeeperServerList: Creating the final ZooKeeper file '_bsp/_defaultZkManagerDir/job_201407021636_0006/zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com <http://zkserverlist_ec2-54-186-5-159.us-west-2.compute.amazonaws.com/> 0 '
> 2014-07-02 17:01:35,228 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 0, got file 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com <http://zkserverlist_ec2-54-186-5-159.us-west-2.compute.amazonaws.com/> 0 ' (polling period is 3000)
> 2014-07-02 17:01:35,228 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [ec2-54-186-5-159.us-west-2.compute.amazonaws.com, 0] 2 hosts in filename 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com <http://zkserverlist_ec2-54-186-5-159.us-west-2.compute.amazonaws.com/> 0 '
> 2014-07-02 17:01:35,229 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Trying to delete old directory /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper
> 2014-07-02 17:01:35,234 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Creating file /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper/zoo.cfg in /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper with base port 22181
> 2014-07-02 17:01:35,234 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Make directory of _bspZooKeeper = true
> 2014-07-02 17:01:35,235 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Delete of zoo.cfg = false
> 2014-07-02 17:01:35,236 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Attempting to start ZooKeeper server with command [/usr/lib/jvm/java-7-oracle/jre/bin/java, -Xmx512m, -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPauseMillis=100, -cp, /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/job.jar, org.apache.zookeeper.server.quorum.QuorumPeerMain, /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper/zoo.cfg] in directory /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper
> 2014-07-02 17:01:35,238 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Shutdown hook added.
> 2014-07-02 17:01:35,238 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 with poll msecs = 3000
> 2014-07-02 17:01:35,241 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException
> java.net.ConnectException: Connection refused
> 	at java.net.PlainSocketImpl.socketConnect(Native Method)
> 	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> 	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
> 	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> 	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> 	at java.net.Socket.connect(Socket.java:579)
> 	at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:701)
> 	at org.apache.giraph.graph.GraphTaskManager.startZooKeeperManager(GraphTaskManager.java:357)
> 	at org.apache.giraph.graph.GraphTaskManager.setup(GraphTaskManager.java:188)
> 	at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:60)
> 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:90)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:253)
> 2014-07-02 17:01:38,242 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 1 of 10 max trying to connect to ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 with poll msecs = 3000
> 2014-07-02 17:01:38,243 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connected to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181!
> 2014-07-02 17:01:38,243 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Creating my filestamp _bsp/_defaultZkManagerDir/job_201407021636_0006/_zkServer/ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0
> 2014-07-02 17:01:40,249 INFO org.apache.giraph.graph.GraphTaskManager: setup: Chosen to run ZooKeeper...
> 2014-07-02 17:01:40,249 INFO org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceMaster (master thread)...
> 2014-07-02 17:01:40,260 INFO org.apache.giraph.bsp.BspService: BspService: Connecting to ZooKeeper with job job_201407021636_0006, 0 on ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181
> 2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
> 2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=ec2-54-186-5-159.us-west-2.compute.amazonaws.com
> 2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.7.0_60
> 2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-oracle/jre
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client
>  environment:java.class.path=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/classes:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work:/home/ubuntu/hadoop/bin/../conf:/usr/lib/jvm/java-7-oracle/lib/tools.jar:/home/ubuntu/hadoop/bin/..:/home/ubuntu/hadoop/bin/../hadoop-core-0.20.203.0.jar:/home/ubuntu/hadoop/bin/../lib/aspectjrt-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/aspectjtools-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-1.7.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-core-1.8.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-cli-1.2.jar:/home/ubuntu/hadoop/bin/../lib/commons-codec-1.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-collections-3.2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-configuration-1.6.jar:/home/ubuntu/hadoo
> p/bin/../lib/commons-daemon-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-digester-1.8.jar:/home/ubuntu/hadoop/bin/../lib/commons-el-1.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-lang-2.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-1.1.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-math-2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-net-1.4.1.jar:/home/ubuntu/hadoop/bin/../lib/core-3.1.1.jar:/home/ubuntu/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/home/ubuntu/hadoop/bin/../lib/jackson-core-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jackson-mapper-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jets3t-0.6.1.jar:/home/ubuntu/hadoop/bin/../lib/jetty-6.1.26.jar:/home/ubuntu/hadoop/bin/../lib/jetty-util-6.1.26.jar:/home/ubu
> ntu/hadoop/bin/../lib/jsch-0.1.42.jar:/home/ubuntu/hadoop/bin/../lib/junit-4.5.jar:/home/ubuntu/hadoop/bin/../lib/kfs-0.2.2.jar:/home/ubuntu/hadoop/bin/../lib/log4j-1.2.15.jar:/home/ubuntu/hadoop/bin/../lib/mockito-all-1.8.5.jar:/home/ubuntu/hadoop/bin/../lib/oro-2.0.8.jar:/home/ubuntu/hadoop/bin/../lib/servlet-api-2.5-20081211.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/xmlenc-0.52.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/home/ubuntu/hadoop/bin/../lib/native/Linux-amd64-64:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work/tmp
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=3.2.0-58-virtual
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=ubuntu
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/ubuntu
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work
> 2014-07-02 17:01:40,272 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 sessionTimeout=60000 watcher=org.apache.giraph.master.BspServiceMaster@4deb9df0
> 2014-07-02 17:01:40,284 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181
> 2014-07-02 <http://ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:221812014-07-02> 17:01:40,296 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, initiating session
> 2014-07-02 17:01:42,336 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, sessionid = 0x146f806378d0000, negotiated timeout = 600000
> 2014-07-02 17:01:42,337 INFO org.apache.giraph.bsp.BspService: process: Asynchronous connection complete.
> 2014-07-02 17:01:42,343 INFO org.apache.giraph.graph.GraphTaskManager: map: No need to do anything when not a worker
> 2014-07-02 17:01:42,343 INFO org.apache.giraph.graph.GraphTaskManager: cleanup: Starting for MASTER_ZOOKEEPER_ONLY
> 2014-07-02 17:01:45,033 INFO org.apache.giraph.bsp.BspService: getJobState: Job state already exists (/_hadoopBsp/job_201407021636_0006/_masterJobState)
> 2014-07-02 17:01:45,038 INFO org.apache.giraph.master.BspServiceMaster: becomeMaster: First child is '/_hadoopBsp/job_201407021636_0006/_masterElectionDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_00000000000' and my bid is '/_hadoopBsp/job_201407021636_0006/_masterElectionDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_00000000000'
> 2014-07-02 17:01:45,044 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir already exists!
> 2014-07-02 17:01:45,049 INFO org.apache.giraph.bsp.BspService: process: applicationAttemptChanged signaled
> 2014-07-02 17:01:45,089 INFO org.apache.giraph.comm.netty.NettyServer: NettyServer: Using execution handler with 8 threads after requestFrameDecoder.
> 2014-07-02 17:01:45,111 INFO org.apache.giraph.comm.netty.NettyServer: start: Started server communication server: ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:30000 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288 backlog = 1
> 2014-07-02 17:01:45,116 INFO org.apache.giraph.comm.netty.NettyClient: NettyClient: Using execution handler with 8 threads after requestEncoder.
> 2014-07-02 17:01:45,119 INFO org.apache.giraph.master.BspServiceMaster: becomeMaster: I am now the master!
> 2014-07-02 17:01:45,126 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir already exists!
> 2014-07-02 17:01:45,140 INFO org.apache.giraph.io.formats.GiraphFileInputFormat: Total input paths to process : 1
> 2014-07-02 17:01:45,149 INFO org.apache.giraph.master.BspServiceMaster: generateVertexInputSplits: Got 2 input splits for 1 input threads
> 2014-07-02 17:01:45,149 INFO org.apache.giraph.master.BspServiceMaster: createVertexInputSplits: Starting to write input split data to zookeeper with 1 threads
> 2014-07-02 17:01:45,163 INFO org.apache.giraph.master.BspServiceMaster: createVertexInputSplits: Done writing input split data to zookeeper
> 2014-07-02 17:01:45,173 INFO org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
> 2014-07-02 17:01:45,187 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Successfully added 1 connections, (1 total connected) 0 failed, 0 failures total.
> 2014-07-02 17:01:45,188 INFO org.apache.giraph.partition.PartitionUtils: computePartitionCount: Creating 1, default would have been 1 partitions.
> 2014-07-02 17:01:45,350 INFO org.apache.giraph.comm.netty.NettyServer: start: Using Netty without authentication.
> 2014-07-02 17:06:45,347 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 0 out of 1 workers finished on superstep -1 on path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDoneDir
> 2014-07-02 17:06:45,349 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Waiting on [ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1]
> 2014-07-02 17:11:45,355 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 0 out of 1 workers finished on superstep -1 on path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDoneDir
> 2014-07-02 17:11:45,355 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Waiting on [ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1]
> 2014-07-02 17:12:07,201 INFO org.apache.giraph.zk.ZooKeeperManager: run: Shutdown hook started.
> 2014-07-02 17:12:07,202 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process.
> 2014-07-02 17:12:07,518 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x146f806378d0000, likely server has closed socket, closing socket connection and attempting reconnect
> 2014-07-02 17:12:07,518 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: ZooKeeper process exited with 143 (note that 143 typically means killed).
> ===========================================================================
>
>
>
>
>
>
>
>
> The second one, task_201407021636_0006_m_000001, goes to the FAILED
> state.  syslog:
> ===========================================================================
>
> 2014-07-02 17:01:38,016 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2014-07-02 17:01:38,203 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
> 2014-07-02 17:01:38,379 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
> 2014-07-02 17:01:40,280 INFO org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
> 2014-07-02 17:01:40,281 INFO org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/job.jar for job Giraph: org.apache.giraph.examples.SimpleShortestPathsVertex
> 2014-07-02 17:01:40,317 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201407021636_0006
> 2014-07-02 17:01:40,320 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201407021636_0006/_task/ec2-54-186-5-159.us-west-2.compute.amazonaws.com 1
> 2014-07-02 17:01:42,109 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 1, got file 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com <http://zkserverlist_ec2-54-186-5-159.us-west-2.compute.amazonaws.com/> 0 ' (polling period is 3000)
> 2014-07-02 17:01:42,313 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [ec2-54-186-5-159.us-west-2.compute.amazonaws.com, 0] 2 hosts in filename 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com <http://zkserverlist_ec2-54-186-5-159.us-west-2.compute.amazonaws.com/> 0 '
> 2014-07-02 17:01:42,316 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got [ec2-54-186-5-159.us-west-2.compute.amazonaws.com] 1 hosts from 1 ready servers when 1 required (polling period is 3000) on attempt 0
> 2014-07-02 17:01:42,316 INFO org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceWorker...
> 2014-07-02 17:01:42,330 INFO org.apache.giraph.bsp.BspService: BspService: Connecting to ZooKeeper with job job_201407021636_0006, 1 on ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181
> 2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
> 2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=ec2-54-186-5-159.us-west-2.compute.amazonaws.com
> 2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.7.0_60
> 2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
> 2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-oracle/jre
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client
>  environment:java.class.path=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/classes:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work:/home/ubuntu/hadoop/bin/../conf:/usr/lib/jvm/java-7-oracle/lib/tools.jar:/home/ubuntu/hadoop/bin/..:/home/ubuntu/hadoop/bin/../hadoop-core-0.20.203.0.jar:/home/ubuntu/hadoop/bin/../lib/aspectjrt-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/aspectjtools-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-1.7.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-core-1.8.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-cli-1.2.jar:/home/ubuntu/hadoop/bin/../lib/commons-codec-1.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-collections-3.2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-configuration-1.6.jar:/home/ubuntu/hadoo
> p/bin/../lib/commons-daemon-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-digester-1.8.jar:/home/ubuntu/hadoop/bin/../lib/commons-el-1.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-lang-2.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-1.1.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-math-2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-net-1.4.1.jar:/home/ubuntu/hadoop/bin/../lib/core-3.1.1.jar:/home/ubuntu/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/home/ubuntu/hadoop/bin/../lib/jackson-core-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jackson-mapper-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jets3t-0.6.1.jar:/home/ubuntu/hadoop/bin/../lib/jetty-6.1.26.jar:/home/ubuntu/hadoop/bin/../lib/jetty-util-6.1.26.jar:/home/ubu
> ntu/hadoop/bin/../lib/jsch-0.1.42.jar:/home/ubuntu/hadoop/bin/../lib/junit-4.5.jar:/home/ubuntu/hadoop/bin/../lib/kfs-0.2.2.jar:/home/ubuntu/hadoop/bin/../lib/log4j-1.2.15.jar:/home/ubuntu/hadoop/bin/../lib/mockito-all-1.8.5.jar:/home/ubuntu/hadoop/bin/../lib/oro-2.0.8.jar:/home/ubuntu/hadoop/bin/../lib/servlet-api-2.5-20081211.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/xmlenc-0.52.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/home/ubuntu/hadoop/bin/../lib/native/Linux-amd64-64:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work/tmp
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=3.2.0-58-virtual
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=ubuntu
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/ubuntu
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work
> 2014-07-02 17:01:42,339 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 sessionTimeout=60000 watcher=org.apache.giraph.worker.BspServiceWorker@54a5733f
> 2014-07-02 17:01:42,347 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181
> 2014-07-02 <http://ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:221812014-07-02> 17:01:42,347 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, initiating session
> 2014-07-02 17:01:42,812 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, sessionid = 0x146f806378d0001, negotiated timeout = 600000
> 2014-07-02 17:01:42,814 INFO org.apache.giraph.bsp.BspService: process: Asynchronous connection complete.
> 2014-07-02 17:01:42,819 INFO org.apache.giraph.comm.netty.NettyWorkerServer: createMessageStoreFactory: Using ByteArrayMessagesPerVertexStore since there is no combiner
> 2014-07-02 17:01:42,881 INFO org.apache.giraph.comm.netty.NettyServer: NettyServer: Using execution handler with 8 threads after requestFrameDecoder.
> 2014-07-02 17:01:42,904 INFO org.apache.giraph.comm.netty.NettyServer: start: Started server communication server: ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:30001 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288 backlog = 1
> 2014-07-02 17:01:42,907 INFO org.apache.giraph.comm.netty.NettyClient: NettyClient: Using execution handler with 8 threads after requestEncoder.
> 2014-07-02 17:01:42,914 INFO org.apache.giraph.graph.GraphTaskManager: setup: Registering health of this worker...
> 2014-07-02 17:01:45,049 INFO org.apache.giraph.bsp.BspService: process: applicationAttemptChanged signaled
> 2014-07-02 17:01:45,068 WARN org.apache.giraph.bsp.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir, type=NodeChildrenChanged, state=SyncConnected)
> 2014-07-02 17:01:45,076 INFO org.apache.giraph.worker.BspServiceWorker: registerHealth: Created my health node for attempt=0, superstep=-1 with /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1 and workerInfo= Worker(hostname=ec2-54-186-5-159.us-west-2.compute.amazonaws.com, MRtaskID=1, port=30001)
> 2014-07-02 17:01:45,183 INFO org.apache.giraph.comm.netty.NettyServer: start: Using Netty without authentication.
> 2014-07-02 17:01:45,339 INFO org.apache.giraph.bsp.BspService: process: partitionAssignmentsReadyChanged (partitions are assigned)
> 2014-07-02 17:01:45,345 INFO org.apache.giraph.worker.BspServiceWorker: startSuperstep: Master(hostname=ec2-54-186-5-159.us-west-2.compute.amazonaws.com, MRtaskID=0, port=30000)
> 2014-07-02 17:01:45,345 INFO org.apache.giraph.worker.BspServiceWorker: startSuperstep: Ready for computation on superstep -1 since worker selection and vertex range assignments are done in /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions
> 2014-07-02 17:01:45,346 INFO org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
> 2014-07-02 17:01:45,354 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Successfully added 1 connections, (1 total connected) 0 failed, 0 failures total.
> 2014-07-02 17:01:45,359 INFO org.apache.giraph.worker.BspServiceWorker: loadInputSplits: Using 1 thread(s), originally 1 threads(s) for 2 total splits.
> 2014-07-02 17:01:45,362 INFO org.apache.giraph.comm.SendPartitionCache: SendPartitionCache: maxVerticesPerTransfer = 10000
> 2014-07-02 17:01:45,362 INFO org.apache.giraph.comm.SendPartitionCache: SendPartitionCache: maxEdgesPerTransfer = 80000
> 2014-07-02 17:01:45,372 INFO org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit: Reserved input split path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDir/0, overall roughly 0.0% input splits reserved
> 2014-07-02 17:01:45,373 INFO org.apache.giraph.worker.InputSplitsCallable: getInputSplit: Reserved /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDir/0 from ZooKeeper and got input split 'hdfs://ec2-54-186-5-159.us-west-2.compute.amazonaws.com:54310/user/ubuntu/input/CA.txt:0+67108864'
> 2014-07-02 17:01:46,615 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 250000 vertices at 200412.01085983627 vertices/sec 697447 edges at 560085.1663971642 edges/sec Memory (free/total/max) = 112.94M / 182.50M / 182.50M
> 2014-07-02 17:01:47,440 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 500000 vertices at 241131.36490688394 vertices/sec 1419367 edges at 685221.3493060106 edges/sec Memory (free/total/max) = 45.07M / 187.50M / 187.50M
> 2014-07-02 17:01:51,322 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 750000 vertices at 125921.72750649283 vertices/sec 2149814 edges at 361077.323111284 edges/sec Memory (free/total/max) = 16.73M / 189.50M / 189.50M
> 2014-07-02 17:01:55,205 ERROR org.apache.giraph.worker.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1 on superstep -1
> 2014-07-02 17:01:55,213 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2014-07-02 17:01:55,716 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
> 2014-07-02 17:01:55,717 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName ubuntu for UID 1000 from the native implementation
> 2014-07-02 17:01:55,718 WARN org.apache.hadoop.mapred.Child: Error running child
> java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@36df8f5
> 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:102)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:253)
> Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@36df8f5
> 	at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:151)
> 	at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:111)
> 	at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:73)
> 	at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:192)
> 	at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:276)
> 	at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:323)
> 	at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:506)
> 	at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:230)
> 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:92)
> 	... 7 more
> Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
> 	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> 	at java.util.concurrent.FutureTask.get(FutureTask.java:202)
> 	at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:271)
> 	at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:143)
> 	... 15 more
> Caused by: java.lang.OutOfMemoryError: Java heap space
> 	at java.util.concurrent.ConcurrentHashMap$Segment.rehash(ConcurrentHashMap.java:501)
> 	at java.util.concurrent.ConcurrentHashMap$Segment.put(ConcurrentHashMap.java:460)
> 	at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1130)
> 	at org.apache.giraph.partition.SimplePartition.addPartition(SimplePartition.java:87)
> 	at org.apache.giraph.partition.SimplePartitionStore.addPartition(SimplePartitionStore.java:71)
> 	at org.apache.giraph.comm.requests.SendVertexRequest.doRequest(SendVertexRequest.java:81)
> 	at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:470)
> 	at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendPartitionRequest(NettyWorkerClientRequestProcessor.java:203)
> 	at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendVertexRequest(NettyWorkerClientRequestProcessor.java:267)
> 	at org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:140)
> 	at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:220)
> 	at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:161)
> 	at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:58)
> 	at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> 2014-07-02 17:01:55,722 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
>
> ===========================================================================
>
>
>
>
> I've tried increasing Java Heap Space in hadoop/conf/mapred-site.xml by
> adding this:
>   <property>
>     <name>mapred.child.java.opts</name>
>     <value>-Xmx1024m</value>
>   </property>
>
> But that just caused the entire job to fail from the start.
>
> Before using this version of Giraph, I used 1.0.0 and 1.1.0-RC0 and those
> versions provide me with more and different errors to debug that relate to
> problems with Hadoop itself.  So the Giraph version I'm currently using
> seems to be the best for me because these errors seem more manageable.
>
> What can I do to fix this error?  I thought Giraph was built for large
> scale graph processing so I suppose this problem was encountered before by
> someone testing large graphs.  I searched through the mailing archives and
> couldn't find anything though.
>
> I can provide more information if you need it.  Thanks a lot.
>
> Bryan Rowe
>
>
>
>
>
>
>

Re: Errors only with large graphs

Posted by Bryan Rowe II <br...@yahoo.com>.
I've tried a larger graph.  It ran for 2 hours then failed with different error messages I believe.

Bryan



On Thursday, July 3, 2014 1:37 PM, Bryan Rowe II <br...@yahoo.com> wrote:
 


Hi Young,
I believe my graph has the correct format.
Here is the first 40 lines of the graph I'm using:
[0,0,[[1,1],[4,1],[5,1]]]
[1,0,[[0,1],[2,1],[3,1]]]
[2,0,[[1,1],[6,1],[7,1]]]
[3,0,[[8,1],[1,1],[9,1]]]
[4,0,[[0,1],[10,1],[11,1]]]
[5,0,[[0,1]]]
[6,0,[[2,1],[12,1]]]
[7,0,[[2,1],[12,1],[13,1]]]
[8,0,[[11,1],[3,1],[60,1]]]
[9,0,[[3,1]]]
[10,0,[[35,1],[4,1],[38,1]]]
[11,0,[[8,1],[59,1],[4,1]]]
[12,0,[[41,1],[6,1],[7,1]]]
[13,0,[[89,1],[90,1],[91,1],[7,1]]]
[14,0,[[18,1],[19,1],[15,1]]]
[15,0,[[16,1],[17,1],[14,1]]]
[16,0,[[20,1],[21,1],[22,1],[15,1]]]
[17,0,[[24,1],[23,1],[21,1],[15,1]]]
[18,0,[[24,1],[14,1]]]
[19,0,[[25,1],[22,1],[14,1]]]
[20,0,[[16,1],[25,1],[26,1]]]
[21,0,[[16,1],[17,1],[30,1]]]
[22,0,[[16,1],[19,1]]]
[23,0,[[17,1],[105,1]]]
[24,0,[[17,1],[18,1],[58,1]]]
[25,0,[[27,1],[19,1],[20,1]]]
[26,0,[[28,1],[27,1],[20,1]]]
[27,0,[[25,1],[26,1],[29,1]]]
[28,0,[[26,1],[29,1],[30,1]]]
[29,0,[[27,1],[28,1],[31,1]]]
[30,0,[[32,1],[33,1],[28,1],[21,1]]]
[31,0,[[34,1],[29,1]]]
[32,0,[[105,1],[30,1],[39,1]]]
[33,0,[[30,1]]]
[34,0,[[38,1],[31,1]]]
[35,0,[[10,1],[36,1],[37,1]]]
[36,0,[[40,1],[35,1],[39,1]]]
[37,0,[[41,1],[35,1]]]
[38,0,[[10,1],[34,1]]]
[39,0,[[32,1],[58,1],[36,1],[119,1]]]
[40,0,[[90,1],[36,1]]]

Also, sorry about sending the email twice.  My email client messed up.

Thanks,
Bryan



On Thursday, July 3, 2014 1:27 PM, Young Han <yo...@uwaterloo.ca> wrote:
 


10 minutes seems way too long to load in 91mb from HDFS. Are you sure your graph's format is correct? For the Json input formats, each line of your file should be:

[vertex id, vertex value, [[dst id, edge weight], [dst id, edge weigth], ..., [dst id, edge weight]]]


You can set the vertex value for every vertex to 0 (SSSP will overwrite that value in the first superstep). If your graph doesn't have edge weights, you can just set them to 1.


Also, have you tried a larger Xmx value? E.g., 4096m or 8192m.


Young




On Thu, Jul 3, 2014 at 4:06 PM, Bryan Rowe II <br...@yahoo.com> wrote:

Hello,
>
>
>Giraph: release-1.0.0-RC3
>
>
>In short, when I use large graphs with the Shortest Paths example, it fails.  But when I use the small graph provided on the Quick Start guide, it succeeds.
>I converted all of my large graphs into the format shown in the Quick Start guide to simply things.
>I'm using a one-node setup.
>
>
>
>
>
>Here is the command I'm using:
>hadoop jar $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.0.0-for-hadoop-0.20.203.0-jar-with-dependencies.jar 
>
>org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsVertex 
>
>-vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
>-vip /user/ubuntu/input/CA.txt -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat
>-op /user/ubuntu/output/shortestpaths 
>
>-w 1
>
>
>(all on one line)
>
>
>CA.txt is a large graph file: 96,026,228 bytes
>
>
>The job fails in 10mins, 46sec.
>
>
>Two Map tasks are created when run.
>The first one, task_201407021636_0006_m_000000, is KILLED.  syslog:
>
>===========================================================================
>
>2014-07-02 17:01:34,757 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2014-07-02 17:01:34,945 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2014-07-02 17:01:35,127 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
2014-07-02 17:01:35,159 INFO org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
2014-07-02 17:01:35,159 INFO org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/job.jar for job Giraph: org.apache.giraph.examples.SimpleShortestPathsVertex
2014-07-02 17:01:35,201 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201407021636_0006
2014-07-02 17:01:35,204 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201407021636_0006/_task/ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0
2014-07-02 17:01:35,219 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Got [ec2-54-186-5-159.us-west-2.compute.amazonaws.com] 1 hosts from 1 candidates when 1 required (polling period is 3000) on attempt 0
2014-07-02 17:01:35,220 INFO org.apache.giraph.zk.ZooKeeperManager: createZooKeeperServerList: Creating the final ZooKeeper file '_bsp/_defaultZkManagerDir/job_201407021636_0006/zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 '
2014-07-02 17:01:35,228 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 0, got file 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 ' (polling period is 3000)
2014-07-02 17:01:35,228 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [ec2-54-186-5-159.us-west-2.compute.amazonaws.com, 0] 2 hosts in filename 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 '
2014-07-02 17:01:35,229 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Trying to delete old directory /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper
2014-07-02 17:01:35,234 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Creating file /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper/zoo.cfg in /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper with base port 22181
2014-07-02 17:01:35,234 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Make directory of _bspZooKeeper = true
2014-07-02 17:01:35,235 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Delete of zoo.cfg = false
2014-07-02 17:01:35,236 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Attempting to start ZooKeeper server with command [/usr/lib/jvm/java-7-oracle/jre/bin/java, -Xmx512m, -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPauseMillis=100, -cp, /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/job.jar, org.apache.zookeeper.server.quorum.QuorumPeerMain, /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper/zoo.cfg] in directory /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper
2014-07-02 17:01:35,238 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Shutdown hook added.
2014-07-02 17:01:35,238 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 with poll msecs = 3000
2014-07-02 17:01:35,241 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException
java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:701) at org.apache.giraph.graph.GraphTaskManager.startZooKeeperManager(GraphTaskManager.java:357) at org.apache.giraph.graph.GraphTaskManager.setup(GraphTaskManager.java:188) at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:60) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:90) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763) at
 org.apache.hadoop.mapred.MapTask.run(MapTask.java:369) at org.apache.hadoop.mapred.Child$4.run(Child.java:259) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059) at org.apache.hadoop.mapred.Child.main(Child.java:253)
2014-07-02 17:01:38,242 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 1 of 10 max trying to connect to ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 with poll msecs = 3000
2014-07-02 17:01:38,243 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connected to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181!
2014-07-02 17:01:38,243 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Creating my filestamp _bsp/_defaultZkManagerDir/job_201407021636_0006/_zkServer/ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0
2014-07-02 17:01:40,249 INFO org.apache.giraph.graph.GraphTaskManager: setup: Chosen to run ZooKeeper...
2014-07-02 17:01:40,249 INFO org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceMaster (master thread)...
2014-07-02 17:01:40,260 INFO org.apache.giraph.bsp.BspService: BspService: Connecting to ZooKeeper with job job_201407021636_0006, 0 on ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=ec2-54-186-5-159.us-west-2.compute.amazonaws.com 2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.7.0_60
2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-oracle/jre
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client
 environment:java.class.path=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/classes:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work:/home/ubuntu/hadoop/bin/../conf:/usr/lib/jvm/java-7-oracle/lib/tools.jar:/home/ubuntu/hadoop/bin/..:/home/ubuntu/hadoop/bin/../hadoop-core-0.20.203.0.jar:/home/ubuntu/hadoop/bin/../lib/aspectjrt-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/aspectjtools-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-1.7.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-core-1.8.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-cli-1.2.jar:/home/ubuntu/hadoop/bin/../lib/commons-codec-1.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-collections-3.2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-configuration-1.6.jar:/home/ubuntu/hadoo
p/bin/../lib/commons-daemon-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-digester-1.8.jar:/home/ubuntu/hadoop/bin/../lib/commons-el-1.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-lang-2.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-1.1.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-math-2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-net-1.4.1.jar:/home/ubuntu/hadoop/bin/../lib/core-3.1.1.jar:/home/ubuntu/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/home/ubuntu/hadoop/bin/../lib/jackson-core-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jackson-mapper-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jets3t-0.6.1.jar:/home/ubuntu/hadoop/bin/../lib/jetty-6.1.26.jar:/home/ubuntu/hadoop/bin/../lib/jetty-util-6.1.26.jar:/home/ubu
ntu/hadoop/bin/../lib/jsch-0.1.42.jar:/home/ubuntu/hadoop/bin/../lib/junit-4.5.jar:/home/ubuntu/hadoop/bin/../lib/kfs-0.2.2.jar:/home/ubuntu/hadoop/bin/../lib/log4j-1.2.15.jar:/home/ubuntu/hadoop/bin/../lib/mockito-all-1.8.5.jar:/home/ubuntu/hadoop/bin/../lib/oro-2.0.8.jar:/home/ubuntu/hadoop/bin/../lib/servlet-api-2.5-20081211.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/xmlenc-0.52.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/home/ubuntu/hadoop/bin/../lib/native/Linux-amd64-64:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work/tmp
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=3.2.0-58-virtual
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=ubuntu
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/ubuntu
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work
2014-07-02 17:01:40,272 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 sessionTimeout=60000 watcher=org.apache.giraph.master.BspServiceMaster@4deb9df0
2014-07-02 17:01:40,284 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181
2014-07-02 17:01:40,296 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, initiating session
2014-07-02 17:01:42,336 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, sessionid = 0x146f806378d0000, negotiated timeout = 600000
2014-07-02 17:01:42,337 INFO org.apache.giraph.bsp.BspService: process: Asynchronous connection complete.
2014-07-02 17:01:42,343 INFO org.apache.giraph.graph.GraphTaskManager: map: No need to do anything when not a worker
2014-07-02 17:01:42,343 INFO org.apache.giraph.graph.GraphTaskManager: cleanup: Starting for MASTER_ZOOKEEPER_ONLY
2014-07-02 17:01:45,033 INFO org.apache.giraph.bsp.BspService: getJobState: Job state already exists (/_hadoopBsp/job_201407021636_0006/_masterJobState)
2014-07-02 17:01:45,038 INFO org.apache.giraph.master.BspServiceMaster: becomeMaster: First child is '/_hadoopBsp/job_201407021636_0006/_masterElectionDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_00000000000' and my bid is '/_hadoopBsp/job_201407021636_0006/_masterElectionDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_00000000000'
2014-07-02 17:01:45,044 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir already exists!
2014-07-02 17:01:45,049 INFO org.apache.giraph.bsp.BspService: process: applicationAttemptChanged signaled
2014-07-02 17:01:45,089 INFO org.apache.giraph.comm.netty.NettyServer: NettyServer: Using execution handler with 8 threads after requestFrameDecoder.
2014-07-02 17:01:45,111 INFO org.apache.giraph.comm.netty.NettyServer: start: Started server communication server: ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:30000 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288 backlog = 1
2014-07-02 17:01:45,116 INFO org.apache.giraph.comm.netty.NettyClient: NettyClient: Using execution handler with 8 threads after requestEncoder.
2014-07-02 17:01:45,119 INFO org.apache.giraph.master.BspServiceMaster: becomeMaster: I am now the master!
2014-07-02 17:01:45,126 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir already exists!
2014-07-02 17:01:45,140 INFO org.apache.giraph.io.formats.GiraphFileInputFormat: Total input paths to process : 1
2014-07-02 17:01:45,149 INFO org.apache.giraph.master.BspServiceMaster: generateVertexInputSplits: Got 2 input splits for 1 input threads
2014-07-02 17:01:45,149 INFO org.apache.giraph.master.BspServiceMaster: createVertexInputSplits: Starting to write input split data to zookeeper with 1 threads
2014-07-02 17:01:45,163 INFO org.apache.giraph.master.BspServiceMaster: createVertexInputSplits: Done writing input split data to zookeeper
2014-07-02 17:01:45,173 INFO org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
2014-07-02 17:01:45,187 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Successfully added 1 connections, (1 total connected) 0 failed, 0 failures total.
2014-07-02 17:01:45,188 INFO org.apache.giraph.partition.PartitionUtils: computePartitionCount: Creating 1, default would have been 1 partitions.
2014-07-02 17:01:45,350 INFO org.apache.giraph.comm.netty.NettyServer: start: Using Netty without authentication.
2014-07-02 17:06:45,347 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 0 out of 1 workers finished on superstep -1 on path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDoneDir
2014-07-02 17:06:45,349 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Waiting on [ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1]
2014-07-02 17:11:45,355 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 0 out of 1 workers finished on superstep -1 on path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDoneDir
2014-07-02 17:11:45,355 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Waiting on [ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1]
2014-07-02 17:12:07,201 INFO org.apache.giraph.zk.ZooKeeperManager: run: Shutdown hook started.
2014-07-02 17:12:07,202 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process.
2014-07-02 17:12:07,518 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x146f806378d0000, likely server has closed socket, closing socket connection and attempting reconnect
2014-07-02 17:12:07,518 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: ZooKeeper process exited with 143 (note that 143 typically means killed).
>===========================================================================
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>The second one, task_201407021636_0006_m_000001, goes to the FAILED state.  syslog:
>===========================================================================
>2014-07-02 17:01:38,016 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2014-07-02 17:01:38,203 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2014-07-02 17:01:38,379 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
2014-07-02 17:01:40,280 INFO org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
2014-07-02 17:01:40,281 INFO org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/job.jar for job Giraph: org.apache.giraph.examples.SimpleShortestPathsVertex
2014-07-02 17:01:40,317 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201407021636_0006
2014-07-02 17:01:40,320 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201407021636_0006/_task/ec2-54-186-5-159.us-west-2.compute.amazonaws.com 1
2014-07-02 17:01:42,109 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 1, got file 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 ' (polling period is 3000)
2014-07-02 17:01:42,313 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [ec2-54-186-5-159.us-west-2.compute.amazonaws.com, 0] 2 hosts in filename 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 '
2014-07-02 17:01:42,316 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got [ec2-54-186-5-159.us-west-2.compute.amazonaws.com] 1 hosts from 1 ready servers when 1 required (polling period is 3000) on attempt 0
2014-07-02 17:01:42,316 INFO org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceWorker...
2014-07-02 17:01:42,330 INFO org.apache.giraph.bsp.BspService: BspService: Connecting to ZooKeeper with job job_201407021636_0006, 1 on ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=ec2-54-186-5-159.us-west-2.compute.amazonaws.com 2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.7.0_60
2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-oracle/jre
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client
 environment:java.class.path=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/classes:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work:/home/ubuntu/hadoop/bin/../conf:/usr/lib/jvm/java-7-oracle/lib/tools.jar:/home/ubuntu/hadoop/bin/..:/home/ubuntu/hadoop/bin/../hadoop-core-0.20.203.0.jar:/home/ubuntu/hadoop/bin/../lib/aspectjrt-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/aspectjtools-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-1.7.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-core-1.8.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-cli-1.2.jar:/home/ubuntu/hadoop/bin/../lib/commons-codec-1.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-collections-3.2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-configuration-1.6.jar:/home/ubuntu/hadoo
p/bin/../lib/commons-daemon-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-digester-1.8.jar:/home/ubuntu/hadoop/bin/../lib/commons-el-1.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-lang-2.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-1.1.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-math-2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-net-1.4.1.jar:/home/ubuntu/hadoop/bin/../lib/core-3.1.1.jar:/home/ubuntu/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/home/ubuntu/hadoop/bin/../lib/jackson-core-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jackson-mapper-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jets3t-0.6.1.jar:/home/ubuntu/hadoop/bin/../lib/jetty-6.1.26.jar:/home/ubuntu/hadoop/bin/../lib/jetty-util-6.1.26.jar:/home/ubu
ntu/hadoop/bin/../lib/jsch-0.1.42.jar:/home/ubuntu/hadoop/bin/../lib/junit-4.5.jar:/home/ubuntu/hadoop/bin/../lib/kfs-0.2.2.jar:/home/ubuntu/hadoop/bin/../lib/log4j-1.2.15.jar:/home/ubuntu/hadoop/bin/../lib/mockito-all-1.8.5.jar:/home/ubuntu/hadoop/bin/../lib/oro-2.0.8.jar:/home/ubuntu/hadoop/bin/../lib/servlet-api-2.5-20081211.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/xmlenc-0.52.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/home/ubuntu/hadoop/bin/../lib/native/Linux-amd64-64:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work/tmp
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=3.2.0-58-virtual
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=ubuntu
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/ubuntu
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work
2014-07-02 17:01:42,339 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 sessionTimeout=60000 watcher=org.apache.giraph.worker.BspServiceWorker@54a5733f
2014-07-02 17:01:42,347 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181
2014-07-02 17:01:42,347 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, initiating session
2014-07-02 17:01:42,812 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, sessionid = 0x146f806378d0001, negotiated timeout = 600000
2014-07-02 17:01:42,814 INFO org.apache.giraph.bsp.BspService: process: Asynchronous connection complete.
2014-07-02 17:01:42,819 INFO org.apache.giraph.comm.netty.NettyWorkerServer: createMessageStoreFactory: Using ByteArrayMessagesPerVertexStore since there is no combiner
2014-07-02 17:01:42,881 INFO org.apache.giraph.comm.netty.NettyServer: NettyServer: Using execution handler with 8 threads after requestFrameDecoder.
2014-07-02 17:01:42,904 INFO org.apache.giraph.comm.netty.NettyServer: start: Started server communication server: ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:30001 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288 backlog = 1
2014-07-02 17:01:42,907 INFO org.apache.giraph.comm.netty.NettyClient: NettyClient: Using execution handler with 8 threads after requestEncoder.
2014-07-02 17:01:42,914 INFO org.apache.giraph.graph.GraphTaskManager: setup: Registering health of this worker...
2014-07-02 17:01:45,049 INFO org.apache.giraph.bsp.BspService: process: applicationAttemptChanged signaled
2014-07-02 17:01:45,068 WARN org.apache.giraph.bsp.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir, type=NodeChildrenChanged, state=SyncConnected)
2014-07-02 17:01:45,076 INFO org.apache.giraph.worker.BspServiceWorker: registerHealth: Created my health node for attempt=0, superstep=-1 with /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1 and workerInfo= Worker(hostname=ec2-54-186-5-159.us-west-2.compute.amazonaws.com, MRtaskID=1, port=30001)
2014-07-02 17:01:45,183 INFO org.apache.giraph.comm.netty.NettyServer: start: Using Netty without authentication.
2014-07-02 17:01:45,339 INFO org.apache.giraph.bsp.BspService: process: partitionAssignmentsReadyChanged (partitions are assigned)
2014-07-02 17:01:45,345 INFO org.apache.giraph.worker.BspServiceWorker: startSuperstep: Master(hostname=ec2-54-186-5-159.us-west-2.compute.amazonaws.com, MRtaskID=0, port=30000)
2014-07-02 17:01:45,345 INFO org.apache.giraph.worker.BspServiceWorker: startSuperstep: Ready for computation on superstep -1 since worker selection and vertex range assignments are done in /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions
2014-07-02 17:01:45,346 INFO org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
2014-07-02 17:01:45,354 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Successfully added 1 connections, (1 total connected) 0 failed, 0 failures total.
2014-07-02 17:01:45,359 INFO org.apache.giraph.worker.BspServiceWorker: loadInputSplits: Using 1 thread(s), originally 1 threads(s) for 2 total splits.
2014-07-02 17:01:45,362 INFO org.apache.giraph.comm.SendPartitionCache: SendPartitionCache: maxVerticesPerTransfer = 10000
2014-07-02 17:01:45,362 INFO org.apache.giraph.comm.SendPartitionCache: SendPartitionCache: maxEdgesPerTransfer = 80000
2014-07-02 17:01:45,372 INFO org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit: Reserved input split path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDir/0, overall roughly 0.0% input splits reserved
2014-07-02 17:01:45,373 INFO org.apache.giraph.worker.InputSplitsCallable: getInputSplit: Reserved /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDir/0 from ZooKeeper and got input split 'hdfs://ec2-54-186-5-159.us-west-2.compute.amazonaws.com:54310/user/ubuntu/input/CA.txt:0+67108864'
2014-07-02 17:01:46,615 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 250000 vertices at 200412.01085983627 vertices/sec 697447 edges at 560085.1663971642 edges/sec Memory (free/total/max) = 112.94M / 182.50M / 182.50M
2014-07-02 17:01:47,440 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 500000 vertices at 241131.36490688394 vertices/sec 1419367 edges at 685221.3493060106 edges/sec Memory (free/total/max) = 45.07M / 187.50M / 187.50M
2014-07-02 17:01:51,322 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 750000 vertices at 125921.72750649283 vertices/sec 2149814 edges at 361077.323111284 edges/sec Memory (free/total/max) = 16.73M / 189.50M / 189.50M
2014-07-02 17:01:55,205 ERROR org.apache.giraph.worker.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1 on superstep -1
2014-07-02 17:01:55,213 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2014-07-02 17:01:55,716 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
2014-07-02 17:01:55,717 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName ubuntu for UID 1000 from the native implementation
2014-07-02 17:01:55,718 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@36df8f5 at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:102) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369) at org.apache.hadoop.mapred.Child$4.run(Child.java:259) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059) at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@36df8f5 at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:151) at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:111) at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:73) at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:192) at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:276) at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:323) at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:506) at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:230) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:92) ... 7 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:202) at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:271) at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:143) ... 15 more
Caused by: java.lang.OutOfMemoryError: Java heap space at java.util.concurrent.ConcurrentHashMap$Segment.rehash(ConcurrentHashMap.java:501) at java.util.concurrent.ConcurrentHashMap$Segment.put(ConcurrentHashMap.java:460) at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1130) at org.apache.giraph.partition.SimplePartition.addPartition(SimplePartition.java:87) at org.apache.giraph.partition.SimplePartitionStore.addPartition(SimplePartitionStore.java:71) at org.apache.giraph.comm.requests.SendVertexRequest.doRequest(SendVertexRequest.java:81) at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:470) at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendPartitionRequest(NettyWorkerClientRequestProcessor.java:203) at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendVertexRequest(NettyWorkerClientRequestProcessor.java:267) at
 org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:140) at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:220) at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:161) at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:58) at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745)
2014-07-02 17:01:55,722 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
>===========================================================================
>
>
>
>
>
>
>
>
>I've tried increasing Java Heap Space in hadoop/conf/mapred-site.xml by adding this:
>  <property>
>    <name>mapred.child.java.opts</name>
>   
 <value>-Xmx1024m</value>
>  </property>
>
>
>But that just caused the entire job to fail from the start.
>
>
>Before using this version of Giraph, I used 1.0.0 and 1.1.0-RC0 and those 
versions provide me with more and different errors to debug that relate 
to problems with Hadoop itself.  So the Giraph version I'm currently 
using seems to be the best for me because these errors seem more 
manageable.
>
>
>
>What can I do to fix this error?  I thought Giraph was built for large scale graph processing so I suppose this problem was encountered before by 
someone testing large graphs.  I searched through the mailing archives and couldn't find anything though.
>
>
>I can provide more information if you need it.  Thanks a lot.
>
>Bryan Rowe
>
>

Re: Errors only with large graphs

Posted by Bryan Rowe II <br...@yahoo.com>.
Hi Young,
I believe my graph has the correct format.
Here is the first 40 lines of the graph I'm using:
[0,0,[[1,1],[4,1],[5,1]]]
[1,0,[[0,1],[2,1],[3,1]]]
[2,0,[[1,1],[6,1],[7,1]]]
[3,0,[[8,1],[1,1],[9,1]]]
[4,0,[[0,1],[10,1],[11,1]]]
[5,0,[[0,1]]]
[6,0,[[2,1],[12,1]]]
[7,0,[[2,1],[12,1],[13,1]]]
[8,0,[[11,1],[3,1],[60,1]]]
[9,0,[[3,1]]]
[10,0,[[35,1],[4,1],[38,1]]]
[11,0,[[8,1],[59,1],[4,1]]]
[12,0,[[41,1],[6,1],[7,1]]]
[13,0,[[89,1],[90,1],[91,1],[7,1]]]
[14,0,[[18,1],[19,1],[15,1]]]
[15,0,[[16,1],[17,1],[14,1]]]
[16,0,[[20,1],[21,1],[22,1],[15,1]]]
[17,0,[[24,1],[23,1],[21,1],[15,1]]]
[18,0,[[24,1],[14,1]]]
[19,0,[[25,1],[22,1],[14,1]]]
[20,0,[[16,1],[25,1],[26,1]]]
[21,0,[[16,1],[17,1],[30,1]]]
[22,0,[[16,1],[19,1]]]
[23,0,[[17,1],[105,1]]]
[24,0,[[17,1],[18,1],[58,1]]]
[25,0,[[27,1],[19,1],[20,1]]]
[26,0,[[28,1],[27,1],[20,1]]]
[27,0,[[25,1],[26,1],[29,1]]]
[28,0,[[26,1],[29,1],[30,1]]]
[29,0,[[27,1],[28,1],[31,1]]]
[30,0,[[32,1],[33,1],[28,1],[21,1]]]
[31,0,[[34,1],[29,1]]]
[32,0,[[105,1],[30,1],[39,1]]]
[33,0,[[30,1]]]
[34,0,[[38,1],[31,1]]]
[35,0,[[10,1],[36,1],[37,1]]]
[36,0,[[40,1],[35,1],[39,1]]]
[37,0,[[41,1],[35,1]]]
[38,0,[[10,1],[34,1]]]
[39,0,[[32,1],[58,1],[36,1],[119,1]]]
[40,0,[[90,1],[36,1]]]

Also, sorry about sending the email twice.  My email client messed up.

Thanks,
Bryan



On Thursday, July 3, 2014 1:27 PM, Young Han <yo...@uwaterloo.ca> wrote:
 


10 minutes seems way too long to load in 91mb from HDFS. Are you sure your graph's format is correct? For the Json input formats, each line of your file should be:

[vertex id, vertex value, [[dst id, edge weight], [dst id, edge weigth], ..., [dst id, edge weight]]]


You can set the vertex value for every vertex to 0 (SSSP will overwrite that value in the first superstep). If your graph doesn't have edge weights, you can just set them to 1.


Also, have you tried a larger Xmx value? E.g., 4096m or 8192m.


Young




On Thu, Jul 3, 2014 at 4:06 PM, Bryan Rowe II <br...@yahoo.com> wrote:

Hello,
>
>
>Giraph: release-1.0.0-RC3
>
>
>In short, when I use large graphs with the Shortest Paths example, it fails.  But when I use the small graph provided on the Quick Start guide, it succeeds.
>I converted all of my large graphs into the format shown in the Quick Start guide to simply things.
>I'm using a one-node setup.
>
>
>
>
>
>Here is the command I'm using:
>hadoop jar $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.0.0-for-hadoop-0.20.203.0-jar-with-dependencies.jar 
>
>org.apache.giraph.GiraphRunner org.apache.giraph.examples.SimpleShortestPathsVertex 
>
>-vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
>-vip /user/ubuntu/input/CA.txt -of org.apache.giraph.io.formats.IdWithValueTextOutputFormat
>-op /user/ubuntu/output/shortestpaths 
>
>-w 1
>
>
>(all on one line)
>
>
>CA.txt is a large graph file: 96,026,228 bytes
>
>
>The job fails in 10mins, 46sec.
>
>
>Two Map tasks are created when run.
>The first one, task_201407021636_0006_m_000000, is KILLED.  syslog:
>
>===========================================================================
>
>2014-07-02 17:01:34,757 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2014-07-02 17:01:34,945 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2014-07-02 17:01:35,127 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
2014-07-02 17:01:35,159 INFO org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
2014-07-02 17:01:35,159 INFO org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/job.jar for job Giraph: org.apache.giraph.examples.SimpleShortestPathsVertex
2014-07-02 17:01:35,201 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201407021636_0006
2014-07-02 17:01:35,204 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201407021636_0006/_task/ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0
2014-07-02 17:01:35,219 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Got [ec2-54-186-5-159.us-west-2.compute.amazonaws.com] 1 hosts from 1 candidates when 1 required (polling period is 3000) on attempt 0
2014-07-02 17:01:35,220 INFO org.apache.giraph.zk.ZooKeeperManager: createZooKeeperServerList: Creating the final ZooKeeper file '_bsp/_defaultZkManagerDir/job_201407021636_0006/zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 '
2014-07-02 17:01:35,228 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 0, got file 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 ' (polling period is 3000)
2014-07-02 17:01:35,228 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [ec2-54-186-5-159.us-west-2.compute.amazonaws.com, 0] 2 hosts in filename 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 '
2014-07-02 17:01:35,229 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Trying to delete old directory /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper
2014-07-02 17:01:35,234 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Creating file /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper/zoo.cfg in /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper with base port 22181
2014-07-02 17:01:35,234 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Make directory of _bspZooKeeper = true
2014-07-02 17:01:35,235 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Delete of zoo.cfg = false
2014-07-02 17:01:35,236 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Attempting to start ZooKeeper server with command [/usr/lib/jvm/java-7-oracle/jre/bin/java, -Xmx512m, -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPauseMillis=100, -cp, /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/job.jar, org.apache.zookeeper.server.quorum.QuorumPeerMain, /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper/zoo.cfg] in directory /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper
2014-07-02 17:01:35,238 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Shutdown hook added.
2014-07-02 17:01:35,238 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 with poll msecs = 3000
2014-07-02 17:01:35,241 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException
java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:701) at org.apache.giraph.graph.GraphTaskManager.startZooKeeperManager(GraphTaskManager.java:357) at org.apache.giraph.graph.GraphTaskManager.setup(GraphTaskManager.java:188) at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:60) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:90) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763) at
 org.apache.hadoop.mapred.MapTask.run(MapTask.java:369) at org.apache.hadoop.mapred.Child$4.run(Child.java:259) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059) at org.apache.hadoop.mapred.Child.main(Child.java:253)
2014-07-02 17:01:38,242 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 1 of 10 max trying to connect to ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 with poll msecs = 3000
2014-07-02 17:01:38,243 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connected to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181!
2014-07-02 17:01:38,243 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Creating my filestamp _bsp/_defaultZkManagerDir/job_201407021636_0006/_zkServer/ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0
2014-07-02 17:01:40,249 INFO org.apache.giraph.graph.GraphTaskManager: setup: Chosen to run ZooKeeper...
2014-07-02 17:01:40,249 INFO org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceMaster (master thread)...
2014-07-02 17:01:40,260 INFO org.apache.giraph.bsp.BspService: BspService: Connecting to ZooKeeper with job job_201407021636_0006, 0 on ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=ec2-54-186-5-159.us-west-2.compute.amazonaws.com 2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.7.0_60
2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-oracle/jre
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client
 environment:java.class.path=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/classes:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work:/home/ubuntu/hadoop/bin/../conf:/usr/lib/jvm/java-7-oracle/lib/tools.jar:/home/ubuntu/hadoop/bin/..:/home/ubuntu/hadoop/bin/../hadoop-core-0.20.203.0.jar:/home/ubuntu/hadoop/bin/../lib/aspectjrt-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/aspectjtools-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-1.7.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-core-1.8.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-cli-1.2.jar:/home/ubuntu/hadoop/bin/../lib/commons-codec-1.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-collections-3.2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-configuration-1.6.jar:/home/ubuntu/hadoo
p/bin/../lib/commons-daemon-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-digester-1.8.jar:/home/ubuntu/hadoop/bin/../lib/commons-el-1.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-lang-2.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-1.1.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-math-2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-net-1.4.1.jar:/home/ubuntu/hadoop/bin/../lib/core-3.1.1.jar:/home/ubuntu/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/home/ubuntu/hadoop/bin/../lib/jackson-core-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jackson-mapper-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jets3t-0.6.1.jar:/home/ubuntu/hadoop/bin/../lib/jetty-6.1.26.jar:/home/ubuntu/hadoop/bin/../lib/jetty-util-6.1.26.jar:/home/ubu
ntu/hadoop/bin/../lib/jsch-0.1.42.jar:/home/ubuntu/hadoop/bin/../lib/junit-4.5.jar:/home/ubuntu/hadoop/bin/../lib/kfs-0.2.2.jar:/home/ubuntu/hadoop/bin/../lib/log4j-1.2.15.jar:/home/ubuntu/hadoop/bin/../lib/mockito-all-1.8.5.jar:/home/ubuntu/hadoop/bin/../lib/oro-2.0.8.jar:/home/ubuntu/hadoop/bin/../lib/servlet-api-2.5-20081211.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/xmlenc-0.52.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/home/ubuntu/hadoop/bin/../lib/native/Linux-amd64-64:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work/tmp
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=3.2.0-58-virtual
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=ubuntu
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/ubuntu
2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work
2014-07-02 17:01:40,272 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 sessionTimeout=60000 watcher=org.apache.giraph.master.BspServiceMaster@4deb9df0
2014-07-02 17:01:40,284 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181
2014-07-02 17:01:40,296 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, initiating session
2014-07-02 17:01:42,336 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, sessionid = 0x146f806378d0000, negotiated timeout = 600000
2014-07-02 17:01:42,337 INFO org.apache.giraph.bsp.BspService: process: Asynchronous connection complete.
2014-07-02 17:01:42,343 INFO org.apache.giraph.graph.GraphTaskManager: map: No need to do anything when not a worker
2014-07-02 17:01:42,343 INFO org.apache.giraph.graph.GraphTaskManager: cleanup: Starting for MASTER_ZOOKEEPER_ONLY
2014-07-02 17:01:45,033 INFO org.apache.giraph.bsp.BspService: getJobState: Job state already exists (/_hadoopBsp/job_201407021636_0006/_masterJobState)
2014-07-02 17:01:45,038 INFO org.apache.giraph.master.BspServiceMaster: becomeMaster: First child is '/_hadoopBsp/job_201407021636_0006/_masterElectionDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_00000000000' and my bid is '/_hadoopBsp/job_201407021636_0006/_masterElectionDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_00000000000'
2014-07-02 17:01:45,044 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir already exists!
2014-07-02 17:01:45,049 INFO org.apache.giraph.bsp.BspService: process: applicationAttemptChanged signaled
2014-07-02 17:01:45,089 INFO org.apache.giraph.comm.netty.NettyServer: NettyServer: Using execution handler with 8 threads after requestFrameDecoder.
2014-07-02 17:01:45,111 INFO org.apache.giraph.comm.netty.NettyServer: start: Started server communication server: ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:30000 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288 backlog = 1
2014-07-02 17:01:45,116 INFO org.apache.giraph.comm.netty.NettyClient: NettyClient: Using execution handler with 8 threads after requestEncoder.
2014-07-02 17:01:45,119 INFO org.apache.giraph.master.BspServiceMaster: becomeMaster: I am now the master!
2014-07-02 17:01:45,126 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir already exists!
2014-07-02 17:01:45,140 INFO org.apache.giraph.io.formats.GiraphFileInputFormat: Total input paths to process : 1
2014-07-02 17:01:45,149 INFO org.apache.giraph.master.BspServiceMaster: generateVertexInputSplits: Got 2 input splits for 1 input threads
2014-07-02 17:01:45,149 INFO org.apache.giraph.master.BspServiceMaster: createVertexInputSplits: Starting to write input split data to zookeeper with 1 threads
2014-07-02 17:01:45,163 INFO org.apache.giraph.master.BspServiceMaster: createVertexInputSplits: Done writing input split data to zookeeper
2014-07-02 17:01:45,173 INFO org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
2014-07-02 17:01:45,187 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Successfully added 1 connections, (1 total connected) 0 failed, 0 failures total.
2014-07-02 17:01:45,188 INFO org.apache.giraph.partition.PartitionUtils: computePartitionCount: Creating 1, default would have been 1 partitions.
2014-07-02 17:01:45,350 INFO org.apache.giraph.comm.netty.NettyServer: start: Using Netty without authentication.
2014-07-02 17:06:45,347 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 0 out of 1 workers finished on superstep -1 on path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDoneDir
2014-07-02 17:06:45,349 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Waiting on [ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1]
2014-07-02 17:11:45,355 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 0 out of 1 workers finished on superstep -1 on path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDoneDir
2014-07-02 17:11:45,355 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Waiting on [ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1]
2014-07-02 17:12:07,201 INFO org.apache.giraph.zk.ZooKeeperManager: run: Shutdown hook started.
2014-07-02 17:12:07,202 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process.
2014-07-02 17:12:07,518 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x146f806378d0000, likely server has closed socket, closing socket connection and attempting reconnect
2014-07-02 17:12:07,518 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: ZooKeeper process exited with 143 (note that 143 typically means killed).
>===========================================================================
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>The second one, task_201407021636_0006_m_000001, goes to the FAILED state.  syslog:
>===========================================================================
>2014-07-02 17:01:38,016 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2014-07-02 17:01:38,203 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2014-07-02 17:01:38,379 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
2014-07-02 17:01:40,280 INFO org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
2014-07-02 17:01:40,281 INFO org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/job.jar for job Giraph: org.apache.giraph.examples.SimpleShortestPathsVertex
2014-07-02 17:01:40,317 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201407021636_0006
2014-07-02 17:01:40,320 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201407021636_0006/_task/ec2-54-186-5-159.us-west-2.compute.amazonaws.com 1
2014-07-02 17:01:42,109 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 1, got file 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 ' (polling period is 3000)
2014-07-02 17:01:42,313 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [ec2-54-186-5-159.us-west-2.compute.amazonaws.com, 0] 2 hosts in filename 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 '
2014-07-02 17:01:42,316 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got [ec2-54-186-5-159.us-west-2.compute.amazonaws.com] 1 hosts from 1 ready servers when 1 required (polling period is 3000) on attempt 0
2014-07-02 17:01:42,316 INFO org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceWorker...
2014-07-02 17:01:42,330 INFO org.apache.giraph.bsp.BspService: BspService: Connecting to ZooKeeper with job job_201407021636_0006, 1 on ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=ec2-54-186-5-159.us-west-2.compute.amazonaws.com 2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.7.0_60
2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-oracle/jre
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client
 environment:java.class.path=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/classes:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work:/home/ubuntu/hadoop/bin/../conf:/usr/lib/jvm/java-7-oracle/lib/tools.jar:/home/ubuntu/hadoop/bin/..:/home/ubuntu/hadoop/bin/../hadoop-core-0.20.203.0.jar:/home/ubuntu/hadoop/bin/../lib/aspectjrt-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/aspectjtools-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-1.7.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-core-1.8.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-cli-1.2.jar:/home/ubuntu/hadoop/bin/../lib/commons-codec-1.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-collections-3.2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-configuration-1.6.jar:/home/ubuntu/hadoo
p/bin/../lib/commons-daemon-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-digester-1.8.jar:/home/ubuntu/hadoop/bin/../lib/commons-el-1.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-lang-2.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-1.1.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-math-2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-net-1.4.1.jar:/home/ubuntu/hadoop/bin/../lib/core-3.1.1.jar:/home/ubuntu/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/home/ubuntu/hadoop/bin/../lib/jackson-core-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jackson-mapper-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jets3t-0.6.1.jar:/home/ubuntu/hadoop/bin/../lib/jetty-6.1.26.jar:/home/ubuntu/hadoop/bin/../lib/jetty-util-6.1.26.jar:/home/ubu
ntu/hadoop/bin/../lib/jsch-0.1.42.jar:/home/ubuntu/hadoop/bin/../lib/junit-4.5.jar:/home/ubuntu/hadoop/bin/../lib/kfs-0.2.2.jar:/home/ubuntu/hadoop/bin/../lib/log4j-1.2.15.jar:/home/ubuntu/hadoop/bin/../lib/mockito-all-1.8.5.jar:/home/ubuntu/hadoop/bin/../lib/oro-2.0.8.jar:/home/ubuntu/hadoop/bin/../lib/servlet-api-2.5-20081211.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/xmlenc-0.52.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/home/ubuntu/hadoop/bin/../lib/native/Linux-amd64-64:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work/tmp
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=3.2.0-58-virtual
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=ubuntu
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/ubuntu
2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work
2014-07-02 17:01:42,339 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 sessionTimeout=60000 watcher=org.apache.giraph.worker.BspServiceWorker@54a5733f
2014-07-02 17:01:42,347 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181
2014-07-02 17:01:42,347 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, initiating session
2014-07-02 17:01:42,812 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, sessionid = 0x146f806378d0001, negotiated timeout = 600000
2014-07-02 17:01:42,814 INFO org.apache.giraph.bsp.BspService: process: Asynchronous connection complete.
2014-07-02 17:01:42,819 INFO org.apache.giraph.comm.netty.NettyWorkerServer: createMessageStoreFactory: Using ByteArrayMessagesPerVertexStore since there is no combiner
2014-07-02 17:01:42,881 INFO org.apache.giraph.comm.netty.NettyServer: NettyServer: Using execution handler with 8 threads after requestFrameDecoder.
2014-07-02 17:01:42,904 INFO org.apache.giraph.comm.netty.NettyServer: start: Started server communication server: ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:30001 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288 backlog = 1
2014-07-02 17:01:42,907 INFO org.apache.giraph.comm.netty.NettyClient: NettyClient: Using execution handler with 8 threads after requestEncoder.
2014-07-02 17:01:42,914 INFO org.apache.giraph.graph.GraphTaskManager: setup: Registering health of this worker...
2014-07-02 17:01:45,049 INFO org.apache.giraph.bsp.BspService: process: applicationAttemptChanged signaled
2014-07-02 17:01:45,068 WARN org.apache.giraph.bsp.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir, type=NodeChildrenChanged, state=SyncConnected)
2014-07-02 17:01:45,076 INFO org.apache.giraph.worker.BspServiceWorker: registerHealth: Created my health node for attempt=0, superstep=-1 with /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1 and workerInfo= Worker(hostname=ec2-54-186-5-159.us-west-2.compute.amazonaws.com, MRtaskID=1, port=30001)
2014-07-02 17:01:45,183 INFO org.apache.giraph.comm.netty.NettyServer: start: Using Netty without authentication.
2014-07-02 17:01:45,339 INFO org.apache.giraph.bsp.BspService: process: partitionAssignmentsReadyChanged (partitions are assigned)
2014-07-02 17:01:45,345 INFO org.apache.giraph.worker.BspServiceWorker: startSuperstep: Master(hostname=ec2-54-186-5-159.us-west-2.compute.amazonaws.com, MRtaskID=0, port=30000)
2014-07-02 17:01:45,345 INFO org.apache.giraph.worker.BspServiceWorker: startSuperstep: Ready for computation on superstep -1 since worker selection and vertex range assignments are done in /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions
2014-07-02 17:01:45,346 INFO org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
2014-07-02 17:01:45,354 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Successfully added 1 connections, (1 total connected) 0 failed, 0 failures total.
2014-07-02 17:01:45,359 INFO org.apache.giraph.worker.BspServiceWorker: loadInputSplits: Using 1 thread(s), originally 1 threads(s) for 2 total splits.
2014-07-02 17:01:45,362 INFO org.apache.giraph.comm.SendPartitionCache: SendPartitionCache: maxVerticesPerTransfer = 10000
2014-07-02 17:01:45,362 INFO org.apache.giraph.comm.SendPartitionCache: SendPartitionCache: maxEdgesPerTransfer = 80000
2014-07-02 17:01:45,372 INFO org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit: Reserved input split path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDir/0, overall roughly 0.0% input splits reserved
2014-07-02 17:01:45,373 INFO org.apache.giraph.worker.InputSplitsCallable: getInputSplit: Reserved /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDir/0 from ZooKeeper and got input split 'hdfs://ec2-54-186-5-159.us-west-2.compute.amazonaws.com:54310/user/ubuntu/input/CA.txt:0+67108864'
2014-07-02 17:01:46,615 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 250000 vertices at 200412.01085983627 vertices/sec 697447 edges at 560085.1663971642 edges/sec Memory (free/total/max) = 112.94M / 182.50M / 182.50M
2014-07-02 17:01:47,440 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 500000 vertices at 241131.36490688394 vertices/sec 1419367 edges at 685221.3493060106 edges/sec Memory (free/total/max) = 45.07M / 187.50M / 187.50M
2014-07-02 17:01:51,322 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 750000 vertices at 125921.72750649283 vertices/sec 2149814 edges at 361077.323111284 edges/sec Memory (free/total/max) = 16.73M / 189.50M / 189.50M
2014-07-02 17:01:55,205 ERROR org.apache.giraph.worker.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1 on superstep -1
2014-07-02 17:01:55,213 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2014-07-02 17:01:55,716 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
2014-07-02 17:01:55,717 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName ubuntu for UID 1000 from the native implementation
2014-07-02 17:01:55,718 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@36df8f5 at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:102) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369) at org.apache.hadoop.mapred.Child$4.run(Child.java:259) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059) at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@36df8f5 at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:151) at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:111) at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:73) at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:192) at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:276) at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:323) at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:506) at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:230) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:92) ... 7 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:202) at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:271) at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:143) ... 15 more
Caused by: java.lang.OutOfMemoryError: Java heap space at java.util.concurrent.ConcurrentHashMap$Segment.rehash(ConcurrentHashMap.java:501) at java.util.concurrent.ConcurrentHashMap$Segment.put(ConcurrentHashMap.java:460) at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1130) at org.apache.giraph.partition.SimplePartition.addPartition(SimplePartition.java:87) at org.apache.giraph.partition.SimplePartitionStore.addPartition(SimplePartitionStore.java:71) at org.apache.giraph.comm.requests.SendVertexRequest.doRequest(SendVertexRequest.java:81) at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:470) at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendPartitionRequest(NettyWorkerClientRequestProcessor.java:203) at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendVertexRequest(NettyWorkerClientRequestProcessor.java:267) at
 org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:140) at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:220) at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:161) at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:58) at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745)
2014-07-02 17:01:55,722 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
>===========================================================================
>
>
>
>
>
>
>
>
>I've tried increasing Java Heap Space in hadoop/conf/mapred-site.xml by adding this:
>  <property>
>    <name>mapred.child.java.opts</name>
>   
 <value>-Xmx1024m</value>
>  </property>
>
>
>But that just caused the entire job to fail from the start.
>
>
>Before using this version of Giraph, I used 1.0.0 and 1.1.0-RC0 and those 
versions provide me with more and different errors to debug that relate 
to problems with Hadoop itself.  So the Giraph version I'm currently 
using seems to be the best for me because these errors seem more 
manageable.
>
>
>
>What can I do to fix this error?  I thought Giraph was built for large scale graph processing so I suppose this problem was encountered before by 
someone testing large graphs.  I searched through the mailing archives and couldn't find anything though.
>
>
>I can provide more information if you need it.  Thanks a lot.
>
>Bryan Rowe
>
>

Re: Errors only with large graphs

Posted by Young Han <yo...@uwaterloo.ca>.
10 minutes seems way too long to load in 91mb from HDFS. Are you sure your
graph's format is correct? For the Json input formats, each line of your
file should be:

[vertex id, vertex value, [[dst id, edge weight], [dst id, edge weigth],
..., [dst id, edge weight]]]

You can set the vertex value for every vertex to 0 (SSSP will overwrite
that value in the first superstep). If your graph doesn't have edge
weights, you can just set them to 1.

Also, have you tried a larger Xmx value? E.g., 4096m or 8192m.

Young


On Thu, Jul 3, 2014 at 4:06 PM, Bryan Rowe II <br...@yahoo.com> wrote:

> Hello,
>
> Giraph: release-1.0.0-RC3
>
> In short, when I use large graphs with the Shortest Paths example, it
> fails.  But when I use the small graph provided on the Quick Start guide,
> it succeeds.
> I converted all of my large graphs into the format shown in the Quick
> Start guide to simply things.
> I'm using a one-node setup.
>
>
> Here is the command I'm using:
> hadoop jar
> $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.0.0-for-hadoop-0.20.203.0-jar-with-dependencies.jar
>
> org.apache.giraph.GiraphRunner
> org.apache.giraph.examples.SimpleShortestPathsVertex
> -vif
> org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
> -vip /user/ubuntu/input/CA.txt -of
> org.apache.giraph.io.formats.IdWithValueTextOutputFormat
> -op /user/ubuntu/output/shortestpaths
> -w 1
>
> (all on one line)
>
> CA.txt is a large graph file: 96,026,228 bytes
>
> The job fails in 10mins, 46sec.
>
> Two Map tasks are created when run.
> The first one, task_201407021636_0006_m_000000, is KILLED.  syslog:
> ===========================================================================
>
> 2014-07-02 17:01:34,757 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2014-07-02 17:01:34,945 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
> 2014-07-02 17:01:35,127 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
> 2014-07-02 17:01:35,159 INFO org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
> 2014-07-02 17:01:35,159 INFO org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/job.jar for job Giraph: org.apache.giraph.examples.SimpleShortestPathsVertex
> 2014-07-02 17:01:35,201 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201407021636_0006
> 2014-07-02 17:01:35,204 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201407021636_0006/_task/ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0
> 2014-07-02 17:01:35,219 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Got [ec2-54-186-5-159.us-west-2.compute.amazonaws.com] 1 hosts from 1 candidates when 1 required (polling period is 3000) on attempt 0
> 2014-07-02 17:01:35,220 INFO org.apache.giraph.zk.ZooKeeperManager: createZooKeeperServerList: Creating the final ZooKeeper file '_bsp/_defaultZkManagerDir/job_201407021636_0006/zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 '
> 2014-07-02 17:01:35,228 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 0, got file 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 ' (polling period is 3000)
> 2014-07-02 17:01:35,228 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [ec2-54-186-5-159.us-west-2.compute.amazonaws.com, 0] 2 hosts in filename 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 '
> 2014-07-02 17:01:35,229 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Trying to delete old directory /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper
> 2014-07-02 17:01:35,234 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Creating file /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper/zoo.cfg in /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper with base port 22181
> 2014-07-02 17:01:35,234 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Make directory of _bspZooKeeper = true
> 2014-07-02 17:01:35,235 INFO org.apache.giraph.zk.ZooKeeperManager: generateZooKeeperConfigFile: Delete of zoo.cfg = false
> 2014-07-02 17:01:35,236 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Attempting to start ZooKeeper server with command [/usr/lib/jvm/java-7-oracle/jre/bin/java, -Xmx512m, -XX:ParallelGCThreads=4, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=70, -XX:MaxGCPauseMillis=100, -cp, /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/job.jar, org.apache.zookeeper.server.quorum.QuorumPeerMain, /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper/zoo.cfg] in directory /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/work/_bspZooKeeper
> 2014-07-02 17:01:35,238 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Shutdown hook added.
> 2014-07-02 17:01:35,238 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 0 of 10 max trying to connect to ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 with poll msecs = 3000
> 2014-07-02 17:01:35,241 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got ConnectException
> java.net.ConnectException: Connection refused
> 	at java.net.PlainSocketImpl.socketConnect(Native Method)
> 	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
> 	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
> 	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
> 	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> 	at java.net.Socket.connect(Socket.java:579)
> 	at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:701)
> 	at org.apache.giraph.graph.GraphTaskManager.startZooKeeperManager(GraphTaskManager.java:357)
> 	at org.apache.giraph.graph.GraphTaskManager.setup(GraphTaskManager.java:188)
> 	at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:60)
> 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:90)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:253)
> 2014-07-02 17:01:38,242 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connect attempt 1 of 10 max trying to connect to ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 with poll msecs = 3000
> 2014-07-02 17:01:38,243 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Connected to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181!
> 2014-07-02 17:01:38,243 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Creating my filestamp _bsp/_defaultZkManagerDir/job_201407021636_0006/_zkServer/ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0
> 2014-07-02 17:01:40,249 INFO org.apache.giraph.graph.GraphTaskManager: setup: Chosen to run ZooKeeper...
> 2014-07-02 17:01:40,249 INFO org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceMaster (master thread)...
> 2014-07-02 17:01:40,260 INFO org.apache.giraph.bsp.BspService: BspService: Connecting to ZooKeeper with job job_201407021636_0006, 0 on ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181
> 2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
> 2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=ec2-54-186-5-159.us-west-2.compute.amazonaws.com
> 2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.7.0_60
> 2014-07-02 17:01:40,270 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-oracle/jre
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client
>  environment:java.class.path=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/classes:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work:/home/ubuntu/hadoop/bin/../conf:/usr/lib/jvm/java-7-oracle/lib/tools.jar:/home/ubuntu/hadoop/bin/..:/home/ubuntu/hadoop/bin/../hadoop-core-0.20.203.0.jar:/home/ubuntu/hadoop/bin/../lib/aspectjrt-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/aspectjtools-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-1.7.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-core-1.8.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-cli-1.2.jar:/home/ubuntu/hadoop/bin/../lib/commons-codec-1.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-collections-3.2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-configuration-1.6.jar:/home/ubuntu/hadoo
> p/bin/../lib/commons-daemon-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-digester-1.8.jar:/home/ubuntu/hadoop/bin/../lib/commons-el-1.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-lang-2.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-1.1.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-math-2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-net-1.4.1.jar:/home/ubuntu/hadoop/bin/../lib/core-3.1.1.jar:/home/ubuntu/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/home/ubuntu/hadoop/bin/../lib/jackson-core-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jackson-mapper-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jets3t-0.6.1.jar:/home/ubuntu/hadoop/bin/../lib/jetty-6.1.26.jar:/home/ubuntu/hadoop/bin/../lib/jetty-util-6.1.26.jar:/home/ubu
> ntu/hadoop/bin/../lib/jsch-0.1.42.jar:/home/ubuntu/hadoop/bin/../lib/junit-4.5.jar:/home/ubuntu/hadoop/bin/../lib/kfs-0.2.2.jar:/home/ubuntu/hadoop/bin/../lib/log4j-1.2.15.jar:/home/ubuntu/hadoop/bin/../lib/mockito-all-1.8.5.jar:/home/ubuntu/hadoop/bin/../lib/oro-2.0.8.jar:/home/ubuntu/hadoop/bin/../lib/servlet-api-2.5-20081211.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/xmlenc-0.52.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/home/ubuntu/hadoop/bin/../lib/native/Linux-amd64-64:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work/tmp
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=3.2.0-58-virtual
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=ubuntu
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/ubuntu
> 2014-07-02 17:01:40,271 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000000_0/work
> 2014-07-02 17:01:40,272 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 sessionTimeout=60000 watcher=org.apache.giraph.master.BspServiceMaster@4deb9df0
> 2014-07-02 17:01:40,284 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181
> 2014-07-02 <http://ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:221812014-07-02> 17:01:40,296 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, initiating session
> 2014-07-02 17:01:42,336 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, sessionid = 0x146f806378d0000, negotiated timeout = 600000
> 2014-07-02 17:01:42,337 INFO org.apache.giraph.bsp.BspService: process: Asynchronous connection complete.
> 2014-07-02 17:01:42,343 INFO org.apache.giraph.graph.GraphTaskManager: map: No need to do anything when not a worker
> 2014-07-02 17:01:42,343 INFO org.apache.giraph.graph.GraphTaskManager: cleanup: Starting for MASTER_ZOOKEEPER_ONLY
> 2014-07-02 17:01:45,033 INFO org.apache.giraph.bsp.BspService: getJobState: Job state already exists (/_hadoopBsp/job_201407021636_0006/_masterJobState)
> 2014-07-02 17:01:45,038 INFO org.apache.giraph.master.BspServiceMaster: becomeMaster: First child is '/_hadoopBsp/job_201407021636_0006/_masterElectionDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_00000000000' and my bid is '/_hadoopBsp/job_201407021636_0006/_masterElectionDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_00000000000'
> 2014-07-02 17:01:45,044 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir already exists!
> 2014-07-02 17:01:45,049 INFO org.apache.giraph.bsp.BspService: process: applicationAttemptChanged signaled
> 2014-07-02 17:01:45,089 INFO org.apache.giraph.comm.netty.NettyServer: NettyServer: Using execution handler with 8 threads after requestFrameDecoder.
> 2014-07-02 17:01:45,111 INFO org.apache.giraph.comm.netty.NettyServer: start: Started server communication server: ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:30000 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288 backlog = 1
> 2014-07-02 17:01:45,116 INFO org.apache.giraph.comm.netty.NettyClient: NettyClient: Using execution handler with 8 threads after requestEncoder.
> 2014-07-02 17:01:45,119 INFO org.apache.giraph.master.BspServiceMaster: becomeMaster: I am now the master!
> 2014-07-02 17:01:45,126 INFO org.apache.giraph.bsp.BspService: getApplicationAttempt: Node /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir already exists!
> 2014-07-02 17:01:45,140 INFO org.apache.giraph.io.formats.GiraphFileInputFormat: Total input paths to process : 1
> 2014-07-02 17:01:45,149 INFO org.apache.giraph.master.BspServiceMaster: generateVertexInputSplits: Got 2 input splits for 1 input threads
> 2014-07-02 17:01:45,149 INFO org.apache.giraph.master.BspServiceMaster: createVertexInputSplits: Starting to write input split data to zookeeper with 1 threads
> 2014-07-02 17:01:45,163 INFO org.apache.giraph.master.BspServiceMaster: createVertexInputSplits: Done writing input split data to zookeeper
> 2014-07-02 17:01:45,173 INFO org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
> 2014-07-02 17:01:45,187 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Successfully added 1 connections, (1 total connected) 0 failed, 0 failures total.
> 2014-07-02 17:01:45,188 INFO org.apache.giraph.partition.PartitionUtils: computePartitionCount: Creating 1, default would have been 1 partitions.
> 2014-07-02 17:01:45,350 INFO org.apache.giraph.comm.netty.NettyServer: start: Using Netty without authentication.
> 2014-07-02 17:06:45,347 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 0 out of 1 workers finished on superstep -1 on path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDoneDir
> 2014-07-02 17:06:45,349 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Waiting on [ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1]
> 2014-07-02 17:11:45,355 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 0 out of 1 workers finished on superstep -1 on path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDoneDir
> 2014-07-02 17:11:45,355 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Waiting on [ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1]
> 2014-07-02 17:12:07,201 INFO org.apache.giraph.zk.ZooKeeperManager: run: Shutdown hook started.
> 2014-07-02 17:12:07,202 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process.
> 2014-07-02 17:12:07,518 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x146f806378d0000, likely server has closed socket, closing socket connection and attempting reconnect
> 2014-07-02 17:12:07,518 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: ZooKeeper process exited with 143 (note that 143 typically means killed).
> ===========================================================================
>
>
>
>
>
>
>
>
> The second one, task_201407021636_0006_m_000001, goes to the FAILED
> state.  syslog:
> ===========================================================================
>
> 2014-07-02 17:01:38,016 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2014-07-02 17:01:38,203 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
> 2014-07-02 17:01:38,379 INFO org.apache.giraph.graph.GraphTaskManager: setup: Log level remains at info
> 2014-07-02 17:01:40,280 INFO org.apache.giraph.graph.GraphTaskManager: Distributed cache is empty. Assuming fatjar.
> 2014-07-02 17:01:40,281 INFO org.apache.giraph.graph.GraphTaskManager: setup: classpath @ /home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/job.jar for job Giraph: org.apache.giraph.examples.SimpleShortestPathsVertex
> 2014-07-02 17:01:40,317 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Made the directory _bsp/_defaultZkManagerDir/job_201407021636_0006
> 2014-07-02 17:01:40,320 INFO org.apache.giraph.zk.ZooKeeperManager: createCandidateStamp: Creating my filestamp _bsp/_defaultZkManagerDir/job_201407021636_0006/_task/ec2-54-186-5-159.us-west-2.compute.amazonaws.com 1
> 2014-07-02 17:01:42,109 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: For task 1, got file 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 ' (polling period is 3000)
> 2014-07-02 17:01:42,313 INFO org.apache.giraph.zk.ZooKeeperManager: getZooKeeperServerList: Found [ec2-54-186-5-159.us-west-2.compute.amazonaws.com, 0] 2 hosts in filename 'zkServerList_ec2-54-186-5-159.us-west-2.compute.amazonaws.com 0 '
> 2014-07-02 17:01:42,316 INFO org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Got [ec2-54-186-5-159.us-west-2.compute.amazonaws.com] 1 hosts from 1 ready servers when 1 required (polling period is 3000) on attempt 0
> 2014-07-02 17:01:42,316 INFO org.apache.giraph.graph.GraphTaskManager: setup: Starting up BspServiceWorker...
> 2014-07-02 17:01:42,330 INFO org.apache.giraph.bsp.BspService: BspService: Connecting to ZooKeeper with job job_201407021636_0006, 1 on ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181
> 2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-1073969, built on 02/23/2011 22:27 GMT
> 2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=ec2-54-186-5-159.us-west-2.compute.amazonaws.com
> 2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.version=1.7.0_60
> 2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
> 2014-07-02 17:01:42,337 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-7-oracle/jre
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client
>  environment:java.class.path=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars/classes:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/jars:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work:/home/ubuntu/hadoop/bin/../conf:/usr/lib/jvm/java-7-oracle/lib/tools.jar:/home/ubuntu/hadoop/bin/..:/home/ubuntu/hadoop/bin/../hadoop-core-0.20.203.0.jar:/home/ubuntu/hadoop/bin/../lib/aspectjrt-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/aspectjtools-1.6.5.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-1.7.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-beanutils-core-1.8.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-cli-1.2.jar:/home/ubuntu/hadoop/bin/../lib/commons-codec-1.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-collections-3.2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-configuration-1.6.jar:/home/ubuntu/hadoo
> p/bin/../lib/commons-daemon-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-digester-1.8.jar:/home/ubuntu/hadoop/bin/../lib/commons-el-1.0.jar:/home/ubuntu/hadoop/bin/../lib/commons-httpclient-3.0.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-lang-2.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-1.1.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-logging-api-1.0.4.jar:/home/ubuntu/hadoop/bin/../lib/commons-math-2.1.jar:/home/ubuntu/hadoop/bin/../lib/commons-net-1.4.1.jar:/home/ubuntu/hadoop/bin/../lib/core-3.1.1.jar:/home/ubuntu/hadoop/bin/../lib/hsqldb-1.8.0.10.jar:/home/ubuntu/hadoop/bin/../lib/jackson-core-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jackson-mapper-asl-1.0.1.jar:/home/ubuntu/hadoop/bin/../lib/jasper-compiler-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jasper-runtime-5.5.12.jar:/home/ubuntu/hadoop/bin/../lib/jets3t-0.6.1.jar:/home/ubuntu/hadoop/bin/../lib/jetty-6.1.26.jar:/home/ubuntu/hadoop/bin/../lib/jetty-util-6.1.26.jar:/home/ubu
> ntu/hadoop/bin/../lib/jsch-0.1.42.jar:/home/ubuntu/hadoop/bin/../lib/junit-4.5.jar:/home/ubuntu/hadoop/bin/../lib/kfs-0.2.2.jar:/home/ubuntu/hadoop/bin/../lib/log4j-1.2.15.jar:/home/ubuntu/hadoop/bin/../lib/mockito-all-1.8.5.jar:/home/ubuntu/hadoop/bin/../lib/oro-2.0.8.jar:/home/ubuntu/hadoop/bin/../lib/servlet-api-2.5-20081211.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-api-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/slf4j-log4j12-1.4.3.jar:/home/ubuntu/hadoop/bin/../lib/xmlenc-0.52.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-2.1.jar:/home/ubuntu/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/home/ubuntu/hadoop/bin/../lib/native/Linux-amd64-64:/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work/tmp
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=3.2.0-58-virtual
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=ubuntu
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/ubuntu
> 2014-07-02 17:01:42,338 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/home/ubuntu/hdfstmp/mapred/local/taskTracker/ubuntu/jobcache/job_201407021636_0006/attempt_201407021636_0006_m_000001_0/work
> 2014-07-02 17:01:42,339 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=ec2-54-186-5-159.us-west-2.compute.amazonaws.com:22181 sessionTimeout=60000 watcher=org.apache.giraph.worker.BspServiceWorker@54a5733f
> 2014-07-02 17:01:42,347 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181
> 2014-07-02 <http://ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:221812014-07-02> 17:01:42,347 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, initiating session
> 2014-07-02 17:01:42,812 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:22181, sessionid = 0x146f806378d0001, negotiated timeout = 600000
> 2014-07-02 17:01:42,814 INFO org.apache.giraph.bsp.BspService: process: Asynchronous connection complete.
> 2014-07-02 17:01:42,819 INFO org.apache.giraph.comm.netty.NettyWorkerServer: createMessageStoreFactory: Using ByteArrayMessagesPerVertexStore since there is no combiner
> 2014-07-02 17:01:42,881 INFO org.apache.giraph.comm.netty.NettyServer: NettyServer: Using execution handler with 8 threads after requestFrameDecoder.
> 2014-07-02 17:01:42,904 INFO org.apache.giraph.comm.netty.NettyServer: start: Started server communication server: ec2-54-186-5-159.us-west-2.compute.amazonaws.com/172.31.45.24:30001 with up to 16 threads on bind attempt 0 with sendBufferSize = 32768 receiveBufferSize = 524288 backlog = 1
> 2014-07-02 17:01:42,907 INFO org.apache.giraph.comm.netty.NettyClient: NettyClient: Using execution handler with 8 threads after requestEncoder.
> 2014-07-02 17:01:42,914 INFO org.apache.giraph.graph.GraphTaskManager: setup: Registering health of this worker...
> 2014-07-02 17:01:45,049 INFO org.apache.giraph.bsp.BspService: process: applicationAttemptChanged signaled
> 2014-07-02 17:01:45,068 WARN org.apache.giraph.bsp.BspService: process: Unknown and unprocessed event (path=/_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir, type=NodeChildrenChanged, state=SyncConnected)
> 2014-07-02 17:01:45,076 INFO org.apache.giraph.worker.BspServiceWorker: registerHealth: Created my health node for attempt=0, superstep=-1 with /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1 and workerInfo= Worker(hostname=ec2-54-186-5-159.us-west-2.compute.amazonaws.com, MRtaskID=1, port=30001)
> 2014-07-02 17:01:45,183 INFO org.apache.giraph.comm.netty.NettyServer: start: Using Netty without authentication.
> 2014-07-02 17:01:45,339 INFO org.apache.giraph.bsp.BspService: process: partitionAssignmentsReadyChanged (partitions are assigned)
> 2014-07-02 17:01:45,345 INFO org.apache.giraph.worker.BspServiceWorker: startSuperstep: Master(hostname=ec2-54-186-5-159.us-west-2.compute.amazonaws.com, MRtaskID=0, port=30000)
> 2014-07-02 17:01:45,345 INFO org.apache.giraph.worker.BspServiceWorker: startSuperstep: Ready for computation on superstep -1 since worker selection and vertex range assignments are done in /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions
> 2014-07-02 17:01:45,346 INFO org.apache.giraph.comm.netty.NettyClient: Using Netty without authentication.
> 2014-07-02 17:01:45,354 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Successfully added 1 connections, (1 total connected) 0 failed, 0 failures total.
> 2014-07-02 17:01:45,359 INFO org.apache.giraph.worker.BspServiceWorker: loadInputSplits: Using 1 thread(s), originally 1 threads(s) for 2 total splits.
> 2014-07-02 17:01:45,362 INFO org.apache.giraph.comm.SendPartitionCache: SendPartitionCache: maxVerticesPerTransfer = 10000
> 2014-07-02 17:01:45,362 INFO org.apache.giraph.comm.SendPartitionCache: SendPartitionCache: maxEdgesPerTransfer = 80000
> 2014-07-02 17:01:45,372 INFO org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit: Reserved input split path /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDir/0, overall roughly 0.0% input splits reserved
> 2014-07-02 17:01:45,373 INFO org.apache.giraph.worker.InputSplitsCallable: getInputSplit: Reserved /_hadoopBsp/job_201407021636_0006/_vertexInputSplitDir/0 from ZooKeeper and got input split 'hdfs://ec2-54-186-5-159.us-west-2.compute.amazonaws.com:54310/user/ubuntu/input/CA.txt:0+67108864'
> 2014-07-02 17:01:46,615 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 250000 vertices at 200412.01085983627 vertices/sec 697447 edges at 560085.1663971642 edges/sec Memory (free/total/max) = 112.94M / 182.50M / 182.50M
> 2014-07-02 17:01:47,440 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 500000 vertices at 241131.36490688394 vertices/sec 1419367 edges at 685221.3493060106 edges/sec Memory (free/total/max) = 45.07M / 187.50M / 187.50M
> 2014-07-02 17:01:51,322 INFO org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: Loaded 750000 vertices at 125921.72750649283 vertices/sec 2149814 edges at 361077.323111284 edges/sec Memory (free/total/max) = 16.73M / 189.50M / 189.50M
> 2014-07-02 17:01:55,205 ERROR org.apache.giraph.worker.BspServiceWorker: unregisterHealth: Got failure, unregistering health on /_hadoopBsp/job_201407021636_0006/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/ec2-54-186-5-159.us-west-2.compute.amazonaws.com_1 on superstep -1
> 2014-07-02 17:01:55,213 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2014-07-02 17:01:55,716 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
> 2014-07-02 17:01:55,717 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName ubuntu for UID 1000 from the native implementation
> 2014-07-02 17:01:55,718 WARN org.apache.hadoop.mapred.Child: Error running child
> java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@36df8f5
> 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:102)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:253)
> Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@36df8f5
> 	at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:151)
> 	at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:111)
> 	at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:73)
> 	at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:192)
> 	at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:276)
> 	at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:323)
> 	at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:506)
> 	at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:230)
> 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:92)
> 	... 7 more
> Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
> 	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> 	at java.util.concurrent.FutureTask.get(FutureTask.java:202)
> 	at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:271)
> 	at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:143)
> 	... 15 more
> Caused by: java.lang.OutOfMemoryError: Java heap space
> 	at java.util.concurrent.ConcurrentHashMap$Segment.rehash(ConcurrentHashMap.java:501)
> 	at java.util.concurrent.ConcurrentHashMap$Segment.put(ConcurrentHashMap.java:460)
> 	at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:1130)
> 	at org.apache.giraph.partition.SimplePartition.addPartition(SimplePartition.java:87)
> 	at org.apache.giraph.partition.SimplePartitionStore.addPartition(SimplePartitionStore.java:71)
> 	at org.apache.giraph.comm.requests.SendVertexRequest.doRequest(SendVertexRequest.java:81)
> 	at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:470)
> 	at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendPartitionRequest(NettyWorkerClientRequestProcessor.java:203)
> 	at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.sendVertexRequest(NettyWorkerClientRequestProcessor.java:267)
> 	at org.apache.giraph.worker.VertexInputSplitsCallable.readInputSplit(VertexInputSplitsCallable.java:140)
> 	at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:220)
> 	at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:161)
> 	at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:58)
> 	at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> 2014-07-02 17:01:55,722 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
>
> ===========================================================================
>
>
>
>
> I've tried increasing Java Heap Space in hadoop/conf/mapred-site.xml by
> adding this:
>   <property>
>     <name>mapred.child.java.opts</name>
>     <value>-Xmx1024m</value>
>   </property>
>
> But that just caused the entire job to fail from the start.
>
> Before using this version of Giraph, I used 1.0.0 and 1.1.0-RC0 and those
> versions provide me with more and different errors to debug that relate to
> problems with Hadoop itself.  So the Giraph version I'm currently using
> seems to be the best for me because these errors seem more manageable.
>
> What can I do to fix this error?  I thought Giraph was built for large
> scale graph processing so I suppose this problem was encountered before by
> someone testing large graphs.  I searched through the mailing archives and
> couldn't find anything though.
>
> I can provide more information if you need it.  Thanks a lot.
>
> Bryan Rowe
>
>