Posted to dev@giraph.apache.org by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org> on 2011/09/04 07:30:09 UTC

[jira] [Commented] (GIRAPH-25) NPE in BspServiceMaster when failing a job

    [ https://issues.apache.org/jira/browse/GIRAPH-25?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096816#comment-13096816 ] 

Dmitriy V. Ryaboy commented on GIRAPH-25:
-----------------------------------------

Here's the log I saw on a master that timed out waiting for workers:

{code}
2011-09-04 05:22:11,115 INFO org.apache.giraph.graph.BspServiceMaster: checkWorkers: Only found 182 responses of 186 needed to start superstep -1.  Sleeping for 30000 msecs and used 9 of 10 attempts.
2011-09-04 05:22:11,115 WARN org.apache.giraph.graph.BspServiceMaster: checkWorkers: Did not receive enough processes in time (only 182 of 186 required)
2011-09-04 05:22:11,120 INFO org.apache.giraph.graph.BspServiceMaster: setJobState: {"_stateKey":"FAILED","_applicationAttemptKey":-1,"_superstepKey":-1} on superstep -1
2011-09-04 05:22:11,129 FATAL org.apache.giraph.graph.BspServiceMaster: failJob: Killing job job_201109012213_17306
2011-09-04 05:22:11,159 ERROR org.apache.giraph.graph.MasterThread: masterThread: Master algorithm failed: 
java.lang.NullPointerException
	at org.apache.giraph.graph.BspServiceMaster.createInputSplits(BspServiceMaster.java:486)
	at org.apache.giraph.graph.MasterThread.run(MasterThread.java:94)
2011-09-04 05:22:11,160 FATAL org.apache.giraph.graph.GraphMapper: uncaughtException: OverrideExceptionHandler on thread org.apache.giraph.graph.MasterThread, msg = java.lang.NullPointerException, exiting...
java.lang.RuntimeException: java.lang.NullPointerException
	at org.apache.giraph.graph.MasterThread.run(MasterThread.java:177)
Caused by: java.lang.NullPointerException
	at org.apache.giraph.graph.BspServiceMaster.createInputSplits(BspServiceMaster.java:486)
	at org.apache.giraph.graph.MasterThread.run(MasterThread.java:94)
2011-09-04 05:22:11,161 WARN org.apache.giraph.zk.ZooKeeperManager: onlineZooKeeperServers: Forced a shutdown hook kill of the ZooKeeper process.
{code}
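
One plausible reading of the trace, offered as an assumption and not confirmed against the source: the worker check signals the timeout by returning null, and createInputSplits then dereferences that result. A minimal sketch of a more graceful path follows. The class WorkerCheckSketch, its method signatures, the String worker names, and the -1 sentinel are all hypothetical stand-ins, not the actual BspServiceMaster code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; this is not the real BspServiceMaster. */
final class WorkerCheckSketch {
    /** Assumed sentinel meaning "input splits were not created". */
    static final int FAILED = -1;

    /**
     * Stand-in for a worker check that has already exhausted its attempts.
     * Returns null when fewer than `required` workers responded
     * (null as the failure signal is an assumption of this sketch).
     */
    static List<String> checkWorkers(List<String> responded, int required) {
        if (responded.size() < required) {
            return null;
        }
        return new ArrayList<>(responded);
    }

    /**
     * Guards against the null result instead of dereferencing it,
     * so the caller sees a failure code and not a NullPointerException.
     * On success it returns the number of healthy workers as a placeholder
     * for the real split count.
     */
    static int createInputSplits(List<String> responded, int required) {
        List<String> healthyWorkers = checkWorkers(responded, required);
        if (healthyWorkers == null) {
            // The job is assumed to be marked FAILED already; just stop cleanly.
            return FAILED;
        }
        return healthyWorkers.size();
    }
}
```

With a guard like this, the master thread could test for the sentinel and exit its loop, so the log would end at the failJob line without the trailing stack traces.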

> NPE in BspServiceMaster when failing a job
> ------------------------------------------
>
>                 Key: GIRAPH-25
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-25
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Dmitriy V. Ryaboy
>            Priority: Minor
>
> When BspServiceMaster times out waiting for all workers to check in, it dies with a NullPointerException.
> This can perhaps be handled a bit more gracefully.
