Posted to dev@giraph.apache.org by "Nitay Joffe (JIRA)" <ji...@apache.org> on 2013/03/20 20:07:15 UTC

[jira] [Commented] (GIRAPH-576) BspServiceMaster.failureCleanup() shouldn't pass null in observers' applicationFailed() method

    [ https://issues.apache.org/jira/browse/GIRAPH-576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608023#comment-13608023 ] 

Nitay Joffe commented on GIRAPH-576:
------------------------------------

No need to put the patch in a comment; the attached file is good enough.

This looks great. Can we be a bit more descriptive in the first case (the "FAILED" one)? Looking at the code, it seems to be called only when checkWorkers() returns an error. Since that is the only case, we could put that in directly as the message, but let's do a bit better: how about creating a function like setJobStateFailed() that takes a String message describing the reason? It would call setJobState() and then call failJob(), passing in the given message. The callers would then pass something like "Not enough healthy workers to (create input splits / coordinate superstep X)", where the two options correspond to the two call sites I see. Does that make sense?
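
To make the suggestion concrete, here is a rough, self-contained sketch of the shape I have in mind. It is not the real BspServiceMaster code: FailureObserver, MasterSketch, the JobState enum, and the choice of IllegalStateException are all made up for illustration, and the real setJobState() and failJob() signatures may well differ. The only point is that the reason string travels from the call site to the observers, so applicationFailed() never sees null.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an observer; only the callback discussed here. */
interface FailureObserver {
  void applicationFailed(Exception e);
}

/** Hypothetical stand-in for the master service, NOT the real BspServiceMaster. */
class MasterSketch {
  enum JobState { RUNNING, FAILED }

  private final List<FailureObserver> observers = new ArrayList<>();
  private JobState jobState = JobState.RUNNING;

  void addObserver(FailureObserver observer) {
    observers.add(observer);
  }

  JobState getJobState() {
    return jobState;
  }

  /** Marks the job as FAILED, then fails it with a descriptive reason. */
  void setJobStateFailed(String reason) {
    jobState = JobState.FAILED;
    // Assumption: wrapping the reason in an exception; the patch may choose another type.
    failJob(new IllegalStateException(reason));
  }

  private void failJob(Exception cause) {
    failureCleanup(cause);
  }

  /** Hands every observer the actual cause instead of null. */
  private void failureCleanup(Exception cause) {
    for (FailureObserver observer : observers) {
      observer.applicationFailed(cause);
    }
  }
}

class JobFailureSketch {
  public static void main(String[] args) {
    MasterSketch master = new MasterSketch();
    master.addObserver(e -> System.out.println("Job failed: " + e.getMessage()));
    // One of the two call sites mentioned above; the other would name the superstep.
    master.setJobStateFailed("Not enough healthy workers to create input splits");
  }
}
```

With something like this, each call site only has to supply its own message, and the state change and the failure notification cannot drift apart.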
                
> BspServiceMaster.failureCleanup() shouldn't pass null in observers' applicationFailed() method
> ----------------------------------------------------------------------------------------------
>
>                 Key: GIRAPH-576
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-576
>             Project: Giraph
>          Issue Type: Bug
>          Components: bsp
>    Affects Versions: 0.2.0
>         Environment: Linux
>            Reporter: Jess Garms
>            Priority: Minor
>              Labels: easy, newbie, patch
>             Fix For: 0.2.0
>
>         Attachments: GIRAPH-576.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> failureCleanup() in BspServiceMaster gets called with a null exception from failJob(). That in turn passes a null exception to the set of MasterObservers, in their applicationFailed() method. They probably aren't expecting that. Instead we should pass an appropriate exception around depending on the cause of the failure.
> I'll attach a patch.
