Posted to dev@pig.apache.org by "Mohit Sabharwal (JIRA)" <ji...@apache.org> on 2015/05/09 00:29:00 UTC

[jira] [Updated] (PIG-4542) OutputConsumerIterator should flush buffered records

     [ https://issues.apache.org/jira/browse/PIG-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohit Sabharwal updated PIG-4542:
---------------------------------
    Attachment: PIG-4542.patch

> OutputConsumerIterator should flush buffered records
> ----------------------------------------------------
>
>                 Key: PIG-4542
>                 URL: https://issues.apache.org/jira/browse/PIG-4542
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>    Affects Versions: spark-branch
>            Reporter: Mohit Sabharwal
>            Assignee: Mohit Sabharwal
>             Fix For: spark-branch
>
>         Attachments: PIG-4542.patch
>
>
> Certain operators may buffer their output. When we reach the last input record, we need to flush the remaining records from such operators before calling getNextTuple() for the last time.
> Currently, to flush the last set of records, we compute RDD.count() and compare the count with a running counter to determine whether we have reached the last record. This is unnecessary and inefficient.
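
The idea in the description can be illustrated with a minimal sketch. It is not the attached patch and does not use Pig's actual classes: BufferingOperator, FlushingIterator, PairSum, process() and flush() are hypothetical names chosen for illustration. The sketch assumes the input is available as a java.util.Iterator, so the end of input can be detected with hasNext() and no record count is needed.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Queue;

/** Hypothetical stand-in for an operator that may buffer its output. */
interface BufferingOperator<I, O> {
    /** Consumes one input record and returns the output that is ready (possibly none). */
    List<O> process(I record);

    /** Called once after the last input record; returns the records still buffered. */
    List<O> flush();
}

/** Detects the end of input with hasNext() instead of a precomputed record count. */
class FlushingIterator<I, O> implements Iterator<O> {
    private final Iterator<I> input;
    private final BufferingOperator<I, O> op;
    private final Queue<O> ready = new ArrayDeque<>();
    private boolean flushed = false;

    FlushingIterator(Iterator<I> input, BufferingOperator<I, O> op) {
        this.input = input;
        this.op = op;
    }

    @Override
    public boolean hasNext() {
        // Feed input to the operator until some output is ready or the input runs out.
        while (ready.isEmpty() && input.hasNext()) {
            ready.addAll(op.process(input.next()));
        }
        // Input is exhausted and nothing is pending: flush the operator exactly once.
        if (ready.isEmpty() && !flushed) {
            flushed = true;
            ready.addAll(op.flush());
        }
        return !ready.isEmpty();
    }

    @Override
    public O next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return ready.remove();
    }
}

/** Example operator: emits the sum of each pair of inputs and buffers an unpaired one. */
class PairSum implements BufferingOperator<Integer, Integer> {
    private Integer pending = null;

    @Override
    public List<Integer> process(Integer record) {
        if (pending == null) {
            pending = record;               // buffer until a partner arrives
            return Collections.emptyList();
        }
        int sum = pending + record;
        pending = null;
        return Collections.singletonList(sum);
    }

    @Override
    public List<Integer> flush() {
        if (pending == null) {
            return Collections.emptyList();
        }
        Integer last = pending;             // emit the unpaired leftover
        pending = null;
        return Collections.singletonList(last);
    }
}
```

In this sketch the flush is triggered lazily by the first hasNext() call that finds the input exhausted, so the wrapper makes a single pass over the input and never needs to know the total number of records in advance.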



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)