Posted to dev@pig.apache.org by "Rohini Palaniswamy (JIRA)" <ji...@apache.org> on 2014/02/17 22:52:19 UTC

[jira] [Updated] (PIG-3767) Work with TEZ-668 which allows starting and closing of inputs and outputs

     [ https://issues.apache.org/jira/browse/PIG-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-3767:
------------------------------------

    Attachment: PIG-3767-1.patch

Initial patch. TEZ-668 is not committed yet. I am facing an issue with it: the output directory is empty even though records are written to MROutput and commit is called. Will check with [~sseth] on that.

> Work with TEZ-668 which allows starting and closing of inputs and outputs
> -------------------------------------------------------------------------
>
>                 Key: PIG-3767
>                 URL: https://issues.apache.org/jira/browse/PIG-3767
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: tez-branch
>
>         Attachments: PIG-3767-1.patch
>
>
> From [~bikassaha]:
> https://issues.apache.org/jira/browse/TEZ-668 is a breaking change in TEZ trunk
>  
> This adds a start method to the Input/Output, and the processor is expected to call input.start()/output.start() for the input/output to actually start fetching/writing data. After this is committed, the Hive and Pig processors need to call start() on each input/output that they want to start. A processor may decide not to call start() for an input it does not want to read (e.g. data already in ObjectRegistry), or it may choose to stagger the inputs in a certain order based on memory or processing requirements.
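
For illustration only, the start() contract described above could be sketched roughly as follows. This does not use the actual Tez interfaces: Startable, RecordingEndpoint, SketchProcessor and the alreadyCached set are hypothetical names made up for this sketch, and starting outputs before inputs is an arbitrary choice here, not something TEZ-668 requires.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for an input or output that stays idle until started.
// This is NOT the Tez API; it only models the "nothing happens before start()" idea.
interface Startable {
    void start();
}

// Records the order of start() calls so the processor's choices can be inspected.
class RecordingEndpoint implements Startable {
    private final String name;
    private final List<String> log;
    private boolean started = false;

    RecordingEndpoint(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    @Override
    public void start() {
        if (!started) {          // repeated start() calls are harmless in this sketch
            started = true;
            log.add(name);
        }
    }

    boolean isStarted() {
        return started;
    }
}

class SketchProcessor {
    // Starts every output, then only those inputs whose data is not already cached.
    // Inputs are started in map iteration order, which is how a processor could
    // stagger them (assumption: the caller passes an ordered map).
    static void run(Map<String, RecordingEndpoint> inputs,
                    Map<String, RecordingEndpoint> outputs,
                    Set<String> alreadyCached) {
        for (RecordingEndpoint out : outputs.values()) {
            out.start();
        }
        for (Map.Entry<String, RecordingEndpoint> e : inputs.entrySet()) {
            if (alreadyCached.contains(e.getKey())) {
                continue;        // skip: this input is never started, so it never fetches
            }
            e.getValue().start();
        }
    }
}

class StartSketchDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Map<String, RecordingEndpoint> inputs = new LinkedHashMap<>();
        inputs.put("scan", new RecordingEndpoint("scan", log));
        inputs.put("replicated", new RecordingEndpoint("replicated", log));
        Map<String, RecordingEndpoint> outputs = new LinkedHashMap<>();
        outputs.put("store", new RecordingEndpoint("store", log));

        // Pretend the "replicated" input is already held in a cache.
        SketchProcessor.run(inputs, outputs, Collections.singleton("replicated"));
        System.out.println(log); // prints [store, scan]
    }
}
```

The point of the sketch is that skipping start() is what keeps an input from doing any work, so a processor that forgets to call it after this change would read or write nothing.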


