Posted to issues@tez.apache.org by "Mohammad Kamrul Islam (JIRA)" <ji...@apache.org> on 2014/03/14 19:26:51 UTC

[jira] [Comment Edited] (TEZ-694) Remove task commit burden from user code

    [ https://issues.apache.org/jira/browse/TEZ-694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934640#comment-13934640 ] 

Mohammad Kamrul Islam edited comment on TEZ-694 at 3/14/14 6:25 PM:
--------------------------------------------------------------------

[~bikassaha] Which one would be cleaner?
1. Adding it to LIORuntimeTask.
2. Adding it to AbstractLIProcessor: we can commit in the LogicalProcessor abstract class (a patch is ready in TEZ-695). In that case, we will need to provide another method, say execute(), which calls run() and then commits the output. In addition, we can move all the input.start and output.start calls into this execute() method. I actually prefer this approach.
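
To make option 2 concrete, here is a rough sketch of the execute()/run() split. All the type names below (SketchInput, SketchOutput, AbstractSketchProcessor, RecordingOutput) are hypothetical, simplified stand-ins for illustration and are not the actual Tez interfaces; the real run() signature and commit API may well differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; NOT the real Tez interfaces.
interface SketchInput {
    void start();
}

interface SketchOutput {
    void start();
    void commit();
}

abstract class AbstractSketchProcessor {
    // User code implements only the processing logic.
    protected abstract void run(List<SketchInput> inputs, List<SketchOutput> outputs);

    // Framework entry point: start inputs and outputs, run the user logic, then commit.
    public final void execute(List<SketchInput> inputs, List<SketchOutput> outputs) {
        for (SketchInput in : inputs) {
            in.start();
        }
        for (SketchOutput out : outputs) {
            out.start();
        }
        run(inputs, outputs);
        // Reached only if run() returned normally, so a failed run never commits.
        for (SketchOutput out : outputs) {
            out.commit();
        }
    }
}

// Test double that records the order of lifecycle calls.
class RecordingOutput implements SketchOutput {
    final List<String> events = new ArrayList<>();

    public void start() { events.add("start"); }

    public void commit() { events.add("commit"); }
}

class CommitSketchDemo {
    public static void main(String[] args) {
        RecordingOutput out = new RecordingOutput();
        List<SketchOutput> outputs = new ArrayList<>();
        outputs.add(out);

        AbstractSketchProcessor processor = new AbstractSketchProcessor() {
            @Override
            protected void run(List<SketchInput> ins, List<SketchOutput> outs) {
                out.events.add("run");
            }
        };
        processor.execute(new ArrayList<SketchInput>(), outputs);
        System.out.println(out.events); // prints [start, run, commit]
    }
}
```

With this shape the user overrides only run(), and because execute() is final the commit step cannot be forgotten or skipped by a subclass.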

Please give your comments.


was (Author: kamrul):
There are a few things that I want to clarify:
1. Commit logic exists only in MROutput, not in its base class LogicalOutput.
2. LogicalIOProcessor normally deals with LogicalO. Adding commit logic in LIORuntimeTask.run() might not be clean. We could do it by checking whether the output is an instance of MROutput and committing it, but again that is not a clean approach.
3. Alternatively, we can commit in the LogicalProcessor abstract class (a patch is ready in TEZ-695). In that case, we will need to provide another method, say execute(), which calls run() and then commits the output.
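
For comparison, the instance check from point 2 would look roughly like the sketch below. BaseOutputSketch, CommittableOutputSketch and RuntimeTaskSketch are hypothetical names used only for illustration; they stand in for the base output type, an output that has commit logic, and the runtime task, and are not the real Tez classes.

```java
import java.util.List;

// Hypothetical stand-ins, NOT the real Tez classes.
class BaseOutputSketch {
    // The base type has no commit logic.
}

class CommittableOutputSketch extends BaseOutputSketch {
    boolean committed = false;

    void commit() {
        committed = true;
    }
}

class RuntimeTaskSketch {
    // Runs the processor, then commits only the outputs that support it.
    // Returns how many outputs were committed.
    static int runThenCommit(Runnable processor, List<BaseOutputSketch> outputs) {
        processor.run();
        int count = 0;
        for (BaseOutputSketch out : outputs) {
            // The type check is what makes this approach unclean: the task
            // has to know about one specific output subclass.
            if (out instanceof CommittableOutputSketch) {
                ((CommittableOutputSketch) out).commit();
                count++;
            }
        }
        return count;
    }
}
```

Every new output type with commit logic would need another branch here, which is why moving the commit into a base processor class looks more attractive.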

Please give your comments.

> Remove task commit burden from user code
> ----------------------------------------
>
>                 Key: TEZ-694
>                 URL: https://issues.apache.org/jira/browse/TEZ-694
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Mohammad Kamrul Islam
>
> It's easy to forget to add commit logic in the processor after the logic in the run() method is done. After that, it's hard to debug why the outputs are not appearing after the DAG completes even though all the tasks have completed successfully. Since commit is an operation that should happen after the processor completes, it cannot be delegated entirely to the outputs. We can either do it in the LIORuntimeTask after processor.run() completes or create an abstract base processor class that does this after the real run() method completes.



--
This message was sent by Atlassian JIRA
(v6.2#6252)