Posted to issues@hive.apache.org by "Seonggon Namgung (Jira)" <ji...@apache.org> on 2023/05/24 08:19:00 UTC

[jira] [Commented] (HIVE-27269) VectorizedMapJoin returns wrong result for TPC-DS query 97

    [ https://issues.apache.org/jira/browse/HIVE-27269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725691#comment-17725691 ] 

Seonggon Namgung commented on HIVE-27269:
-----------------------------------------

To reproduce this issue, set the following configuration properties:
 # hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled=true
 # hive.mapjoin.hashtable.load.threads=2 (or any higher integer)
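
As a rough sketch, a session that reproduces the setup might begin with the settings below. The first two properties come from the issue description rather than from this comment, the thread count of 2 is simply the smallest value mentioned above, and running TPC-DS query 97 afterwards is assumed.

```sql
-- Preconditions from the issue description: map join conversion and vectorization enabled.
SET hive.auto.convert.join=true;
SET hive.vectorized.execution.enabled=true;

-- Settings named in this comment as required to reproduce the wrong result.
SET hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled=true;
SET hive.mapjoin.hashtable.load.threads=2;  -- any higher integer should also work

-- Then run TPC-DS query 97 and compare the result against a run with
-- hive.auto.convert.join=false.
```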

 

> VectorizedMapJoin returns wrong result for TPC-DS query 97
> ----------------------------------------------------------
>
>                 Key: HIVE-27269
>                 URL: https://issues.apache.org/jira/browse/HIVE-27269
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Seonggon Namgung
>            Priority: Blocker
>              Labels: hive-4.0.0-must
>
> TPC-DS query 97 returns wrong results when hive.auto.convert.join and hive.vectorized.execution.enabled are set to true.
>  
> Result of query 97 on 1TB text dataset:
> CommonMergeJoinOperator (hive.auto.convert.join=false): 534151529, 284185746, 84163
> MapJoinOperator (hive.auto.convert.join=true, hive.vectorized.execution.enabled=false): 534151529, 284185746, 84163
> VectorMapJoinOperator (hive.auto.convert.join=true, hive.vectorized.execution.enabled=true): 534151529, 284185388, 84163
>  
> I also observed that VectorizedMapJoin returned different results on the 100GB dataset when I ran query 97 twice, but I have not been able to reproduce this since.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)