Posted to issues@hive.apache.org by "Sungwoo Park (Jira)" <ji...@apache.org> on 2023/03/24 03:02:00 UTC

[jira] [Commented] (HIVE-27138) MapJoinOperator throws NPE when computing OuterJoin with filter expressions on small table

    [ https://issues.apache.org/jira/browse/HIVE-27138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17704423#comment-17704423 ] 

Sungwoo Park commented on HIVE-27138:
-------------------------------------

This commit also fixes a bug in LazyBinaryStruct.java:

https://github.com/apache/hive/blob/a1ff44ccd434373c7eef56fc081b40c343a23f33/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryStruct.java#L224

Here, fieldStart and fieldLength should not be declared as local variables because they shadow the private fields of the same names in the class. Because of this bug, the assert() in the method getShort() is violated.
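
To illustrate the kind of shadowing described above, here is a minimal, self-contained sketch. It is not the actual LazyBinaryStruct code: the class names ShadowingSketch, Buggy, and Fixed, and the methods parse(), getFieldLength(), and readShort(), are hypothetical. The length check in readShort() only stands in for the assert() mentioned above, under the assumption that a short is expected to occupy 2 bytes.

```java
// Hypothetical, simplified illustration of field shadowing.
// This is NOT the real LazyBinaryStruct implementation.
class ShadowingSketch {

    static class Buggy {
        private int fieldStart;   // never written by parse(), so it stays 0
        private int fieldLength;  // never written by parse(), so it stays 0

        void parse(int start, int length) {
            // BUG: these declarations create new local variables that
            // shadow the fields; the fields themselves are not updated.
            int fieldStart = start;
            int fieldLength = length;
        }

        int getFieldLength() {
            return fieldLength;
        }

        // Stand-in for an accessor that checks the expected length
        // (assumed here to be 2 bytes for a short).
        void readShort() {
            if (fieldLength != 2) {
                throw new IllegalStateException(
                    "expected length 2, got " + fieldLength);
            }
        }
    }

    static class Fixed {
        private int fieldStart;
        private int fieldLength;

        void parse(int start, int length) {
            // FIX: assign to the fields instead of declaring locals.
            fieldStart = start;
            fieldLength = length;
        }

        int getFieldLength() {
            return fieldLength;
        }

        void readShort() {
            if (fieldLength != 2) {
                throw new IllegalStateException(
                    "expected length 2, got " + fieldLength);
            }
        }
    }

    public static void main(String[] args) {
        Buggy buggy = new Buggy();
        buggy.parse(4, 2);
        System.out.println(buggy.getFieldLength()); // prints 0

        Fixed fixed = new Fixed();
        fixed.parse(4, 2);
        System.out.println(fixed.getFieldLength()); // prints 2
        fixed.readShort();                          // passes the length check
    }
}
```

In the buggy variant, the length check fails later even though parse() appeared to set the values, which is the same pattern as the violated assertion noted above.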


> MapJoinOperator throws NPE when computing OuterJoin with filter expressions on small table
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-27138
>                 URL: https://issues.apache.org/jira/browse/HIVE-27138
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Seonggon Namgung
>            Assignee: Seonggon Namgung
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Hive throws an NPE when running mapjoin_filter_on_outerjoin.q using the Tez engine. (I used TestMiniLlapCliDriver.)
> The NPE is thrown by CommonJoinOperator.getFilterTag(), which just retrieves the last object from the given list.
> To the best of my knowledge, if Hive selects MapJoin to perform a join, filterTag should be computed and appended to a row before the row is passed to MapJoinOperator.
> In the case of the MapReduce engine, this is done by HashTableSinkOperator.
> However, I cannot find any logic preparing filterTag for small tables when Hive uses the Tez engine.
> I think there are 2 available options:
> 1. Don't use MapJoinOperator if a small table has a filter expression.
> 2. Add new logic that computes and passes filterTag to MapJoinOperator.
> I am working on the second option and am ready to discuss it.
> I would be grateful for any opinions on this issue.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)