Posted to user@giraph.apache.org by Alessio Arleo <in...@icloud.com> on 2015/05/06 11:37:19 UTC

Split Master Worker and Zookeeper: "waitUntilAllTasksDone: Still waiting on tasks 1"

Hello everybody

I migrated from Hadoop 1.2.1 to Hadoop 2.4.0 and I am running some tests on my local machine in pseudo-distributed mode. To make it work, I deactivated the “giraph.splitMasterWorker” option, as suggested here: http://stackoverflow.com/questions/26175116/apache-giraph-cannot-run-in-split-master-worker-mode-since-there-is-only-1-t

The job completed and worked fine, but it could not complete the output commit phase because it was unable to shut down ZooKeeper. Here is the last part of the log:

2015-05-06 11:25:01,741 INFO [main] org.apache.giraph.zk.ZooKeeperManager: waitUntilAllTasksDone: Still waiting on tasks 1, 
2015-05-06 11:25:04,744 INFO [main] org.apache.giraph.zk.ZooKeeperManager: waitUntilAllTasksDone: Got 1 and 2 desired (polling period is 3000) on attempt 299
2015-05-06 11:25:04,744 INFO [main] org.apache.giraph.zk.ZooKeeperManager: waitUntilAllTasksDone: Still waiting on tasks 1, 
2015-05-06 11:25:07,744 ERROR [main] org.apache.giraph.graph.GraphTaskManager: cleanup: Error offlining zookeeper
java.lang.IllegalStateException: waitUntilAllTasksDone: Tasks did not finish by the maximum time of 900000 milliseconds
	at org.apache.giraph.zk.ZooKeeperManager.waitUntilAllTasksDone(ZooKeeperManager.java:890)
	at org.apache.giraph.zk.ZooKeeperManager.offlineZooKeeperServers(ZooKeeperManager.java:919)
	at org.apache.giraph.graph.GraphTaskManager.cleanup(GraphTaskManager.java:921)
	at org.apache.giraph.graph.GraphMapper.cleanup(GraphMapper.java:83)
	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:95)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
2015-05-06 11:25:08,555 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1430743932099_0004_m_000000_0 is done. And is in the process of committing
2015-05-06 11:25:08,644 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1430743932099_0004_m_000000_0' done.
2015-05-06 11:25:08,717 INFO [Thread-13] org.apache.giraph.zk.ZooKeeperManager: run: Shutdown hook started.
2015-05-06 11:25:08,717 WARN [Thread-13] org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process.

Why am I experiencing this behaviour? Is ZK expecting two tasks to complete when only one was spawned in the first place (because of the split master/worker option)?
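
To make the timing in the log concrete, here is a minimal sketch of a fixed-interval polling loop that would produce the same numbers. It is not Giraph's actual code: the class name WaitForTasksSketch, both method names, and the simulated clock are my own assumptions for illustration. Only the values 3000 ms (polling period), 900000 ms (maximum time), "1 done" and "2 desired" are taken from the log above.

```java
import java.util.function.IntSupplier;

// Hypothetical sketch, NOT the real ZooKeeperManager implementation.
class WaitForTasksSketch {

    // How many polls fit into the time budget (integer division).
    public static long maxAttempts(long maxWaitMs, long pollPeriodMs) {
        return maxWaitMs / pollPeriodMs;
    }

    // Polls the number of finished tasks until it reaches `desired`.
    // Returns the zero-based attempt on which the condition was met,
    // or throws IllegalStateException once the time budget is used up.
    // Time is simulated (no real sleeping) so the sketch runs instantly.
    public static long waitUntilDone(IntSupplier finishedTasks, int desired,
                                     long maxWaitMs, long pollPeriodMs) {
        long attempts = maxAttempts(maxWaitMs, pollPeriodMs);
        for (long attempt = 0; attempt < attempts; attempt++) {
            int got = finishedTasks.getAsInt();
            if (got >= desired) {
                return attempt;
            }
            // A real loop would wait pollPeriodMs here before polling again.
        }
        throw new IllegalStateException(
            "Tasks did not finish by the maximum time of " + maxWaitMs + " milliseconds");
    }

    public static void main(String[] args) {
        // Values from the log: 900000 ms budget, 3000 ms polling period.
        System.out.println("attempts: " + maxAttempts(900_000L, 3_000L));
        try {
            // Only 1 task ever reports done while 2 are desired, as in the log.
            waitUntilDone(() -> 1, 2, 900_000L, 3_000L);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions, a 900000 ms budget polled every 3000 ms allows 300 attempts (numbered 0 to 299), which matches the log giving up right after "attempt 299". If the desired count stays at 2 while only one task ever reports done, a loop like this can only end in the timeout.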

I am really sorry but I am a bit confused. Thanks in advance for any help.

~~~~~~~~~~~~~~~~~~~

Ing. Alessio Arleo

Dottorando in Ingegneria Industriale e dell’Informazione

Dottore Magistrale in Ingegneria Informatica e dell’Automazione
Dottore in Ingegneria Informatica ed Elettronica

Linkedin: http://it.linkedin.com/in/IngArleo
Skype: Ing. Alessio Arleo

Tel: +39 075 5853920
Cell: +39 349 0575782

~~~~~~~~~~~~~~~~~~~