Posted to issues@hive.apache.org by "Seonggon Namgung (Jira)" <ji...@apache.org> on 2023/05/24 08:19:00 UTC
[jira] [Commented] (HIVE-27269) VectorizedMapJoin returns wrong result for TPC-DS query 97
[ https://issues.apache.org/jira/browse/HIVE-27269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725691#comment-17725691 ]
Seonggon Namgung commented on HIVE-27269:
-----------------------------------------
To reproduce this issue, set the following configuration properties:
# hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled=true
# hive.mapjoin.hashtable.load.threads=2 (or any higher integer)
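
A minimal sketch of how these might be applied in a Hive session before running query 97, assuming the properties are set with SET statements. The first two lines are not part of this comment; they are taken from the issue description quoted below, which reports the wrong result only when both are true.

```sql
-- Conditions from the issue description: map join conversion and vectorization on.
SET hive.auto.convert.join=true;
SET hive.vectorized.execution.enabled=true;

-- Additional properties named in this comment.
SET hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled=true;
-- Any integer of 2 or higher should do; 2 is the smallest value mentioned.
SET hive.mapjoin.hashtable.load.threads=2;
```

With these in place, the result of TPC-DS query 97 can be compared against a run with hive.auto.convert.join=false.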
> VectorizedMapJoin returns wrong result for TPC-DS query 97
> ----------------------------------------------------------
>
> Key: HIVE-27269
> URL: https://issues.apache.org/jira/browse/HIVE-27269
> Project: Hive
> Issue Type: Sub-task
> Reporter: Seonggon Namgung
> Priority: Blocker
> Labels: hive-4.0.0-must
>
> TPC-DS query 97 returns wrong results when hive.auto.convert.join and hive.vectorized.execution.enabled are set to true.
>
> Result of query 97 on 1TB text dataset:
> CommonMergeJoinOperator (hive.auto.convert.join=false): 534151529, 284185746, 84163
> MapJoinOperator (hive.auto.convert.join=true, hive.vectorized.execution.enabled=false): 534151529, 284185746, 84163
> VectorMapJoinOperator (hive.auto.convert.join=true, hive.vectorized.execution.enabled=true): 534151529, 284185388, 84163
>
> I also observed that VectorizedMapJoin returned different results on the 100GB dataset when I ran query 97 twice, but I have not been able to reproduce this since.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)