Posted to dev@hive.apache.org by "Deepak Jaiswal (JIRA)" <ji...@apache.org> on 2018/06/11 01:04:00 UTC

[jira] [Created] (HIVE-19849) ReduceRecordSource should flush the last record when reader runs out of records

Deepak Jaiswal created HIVE-19849:
-------------------------------------

             Summary: ReduceRecordSource should flush the last record when reader runs out of records
                 Key: HIVE-19849
                 URL: https://issues.apache.org/jira/browse/HIVE-19849
             Project: Hive
          Issue Type: Task
            Reporter: Deepak Jaiswal
            Assignee: Deepak Jaiswal


ReduceRecordSource pushes all the records to the reducer operator. It is up to that operator to forward them down the pipeline. In the case of operators such as GBY, the last record is flushed only when the operator is closed, which may cause joins to miss records.

This has been fixed for SMB Join when it happens on the reducer; however, it may be a good idea to just flush recursively (see flushRecursive) when the reader is exhausted, to ensure that the last record or set of records is not held back.
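
To illustrate the behaviour described above, here is a minimal, self-contained sketch. It is not Hive code: the classes Op, CountingGroupBy and Sink and the method run are hypothetical stand-ins, and only the name flushRecursive is borrowed from this issue. The sketch assumes a GBY-like operator that emits a group only when the next key arrives, so the final group stays buffered unless something flushes it once the input is exhausted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FlushSketch {
    /** Hypothetical stand-in for an operator in a pipeline; not Hive's Operator class. */
    static abstract class Op {
        Op child;

        abstract void process(String row);

        // Emit whatever this operator is still holding; by default nothing is held.
        void flush() {}

        // Flush this operator first, then its child, so held rows travel all the way down.
        void flushRecursive() {
            flush();
            if (child != null) {
                child.flushRecursive();
            }
        }
    }

    /** GBY-like operator: emits "key=count" for a group only when a different key arrives. */
    static class CountingGroupBy extends Op {
        private String currentKey;
        private int count;

        @Override
        void process(String key) {
            if (currentKey != null && !currentKey.equals(key)) {
                flush(); // key changed, so the previous group is complete
            }
            currentKey = key;
            count++;
        }

        @Override
        void flush() {
            if (currentKey != null) {
                child.process(currentKey + "=" + count);
                currentKey = null;
                count = 0;
            }
        }
    }

    /** Records what reaches the end of the pipeline. */
    static class Sink extends Op {
        final List<String> rows = new ArrayList<>();

        @Override
        void process(String row) {
            rows.add(row);
        }
    }

    /** Feeds sorted keys through the pipeline and returns what the sink saw before any close. */
    static List<String> run(List<String> sortedKeys, boolean flushWhenExhausted) {
        CountingGroupBy gby = new CountingGroupBy();
        Sink sink = new Sink();
        gby.child = sink;
        for (String key : sortedKeys) {
            gby.process(key);
        }
        if (flushWhenExhausted) {
            gby.flushRecursive(); // the reader has no more records
        }
        return sink.rows;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "a", "b");
        System.out.println(run(keys, false)); // last group "b" is still buffered
        System.out.println(run(keys, true));  // last group is pushed downstream
    }
}
```

In this sketch, without the flush the sink sees only the group for "a"; the group for "b" would appear only at close time, which is the window in which a downstream join could miss it.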



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)