Posted to dev@hive.apache.org by "Ning Zhang (JIRA)" <ji...@apache.org> on 2010/08/16 23:22:16 UTC

[jira] Commented: (HIVE-1544) Filtering out NULL-keyed rows in ReduceSinkOperator when no outer join involved

    [ https://issues.apache.org/jira/browse/HIVE-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899096#action_12899096 ] 

Ning Zhang commented on HIVE-1544:
----------------------------------

The JoinDesc already has a flag, noOuterJoin, that tracks whether any outer joins are involved in the join operator. Based on that flag, we should set a flag in the ReduceSinkDesc to indicate whether NULL-keyed rows will be filtered out.
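
A minimal sketch of that idea follows. It is not Hive's actual code: the classes JoinDescSketch, ReduceSinkDescSketch and ReduceSinkSketch, the field filterNullKeys, and the single-key-column row layout are all assumptions made for illustration. Only the noOuterJoin flag comes from the discussion above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for JoinDesc; only the noOuterJoin flag is taken from the thread.
class JoinDescSketch {
    final boolean noOuterJoin;

    JoinDescSketch(boolean noOuterJoin) {
        this.noOuterJoin = noOuterJoin;
    }
}

// Hypothetical stand-in for ReduceSinkDesc; the filterNullKeys field name is invented.
class ReduceSinkDescSketch {
    final boolean filterNullKeys;

    ReduceSinkDescSketch(boolean filterNullKeys) {
        this.filterNullKeys = filterNullKeys;
    }

    // Derive the reduce-sink flag from the join descriptor at plan time:
    // NULL-keyed rows can be dropped only when no outer join is involved.
    static ReduceSinkDescSketch forJoin(JoinDescSketch join) {
        return new ReduceSinkDescSketch(join.noOuterJoin);
    }
}

// Hypothetical stand-in for the operator's per-row forwarding logic.
class ReduceSinkSketch {
    // Assumes row[0] holds the (single) join key; returns the rows that
    // would actually be sent to the reducer.
    static List<Object[]> forward(ReduceSinkDescSketch desc, List<Object[]> rows) {
        List<Object[]> out = new ArrayList<>();
        for (Object[] row : rows) {
            if (desc.filterNullKeys && row[0] == null) {
                continue; // a NULL key cannot match anything in a non-outer join
            }
            out.add(row);
        }
        return out;
    }
}
```

With an outer join in the plan the flag stays false and every row is forwarded, since NULL-keyed rows on the preserved side must still appear in the output.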

> Filtering out NULL-keyed rows in ReduceSinkOperator when no outer join involved
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-1544
>                 URL: https://issues.apache.org/jira/browse/HIVE-1544
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: Ning Zhang
>
> As discussed in HIVE-741, if a plan indicates that a non-outer join is the first operator in the reducer, the ReduceSinkOperator should filter out (that is, not send) rows with NULL keys, since they will not generate any results anyway. This should save both bandwidth and processing power.
