You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@hive.apache.org by "gaozhan ding (Jira)" <ji...@apache.org> on 2020/06/09 04:29:00 UTC

[jira] [Created] (HIVE-23667) Incorrect output with option hive.auto.convert.join=fasle

gaozhan ding created HIVE-23667:
-----------------------------------

             Summary: Incorrect output with option hive.auto.convert.join=fasle
                 Key: HIVE-23667
                 URL: https://issues.apache.org/jira/browse/HIVE-23667
             Project: Hive
          Issue Type: Bug
            Reporter: gaozhan ding
             Fix For: 3.1.0


We use hive with version 3.1.0 with tez engine 0.9.1.3

I encountered an error when executing a hive SQL. This SQL is as follows
{code:java}
set mapreduce.job.queuename=root.xxx;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
set hive.exec.max.dynamic.partitions.pernode=10000;
set hive.exec.max.dynamic.partitions=10000;
set hive.fileformat.check=false;
set mapred.reduce.tasks=50;
set hive.auto.convert.join=true;
use xxx;

select count(*) from   230_dim_site  join dw_fact_inverter_detail on  dw_fact_inverter_detail.site=230_dim_site.id;{code}
with the output.
{code:java}
+----------+ | _c0 | +----------+ | 4954736 | +----------+
{code}
But when the hive.auto.convert.join option is set to false，the utput is not as expected。

The SQL is as follows
{code:java}
set mapreduce.job.queuename=root.xxx;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
set hive.exec.max.dynamic.partitions.pernode=10000;
set hive.exec.max.dynamic.partitions=10000;
set hive.fileformat.check=false;  
set mapred.reduce.tasks=50;
set hive.auto.convert.join=false; //changed
use xxx;

select count(*) from   230_dim_site  join dw_fact_inverter_detail on  dw_fact_inverter_detail.site=230_dim_site.id;{code}
with output：
{code:java}
+------+ | _c0 | +------+ | 0 | +------+
{code}
Beside，both tables participating in the join are partition tables.

Especially，if the option mapred.reduce.tasks=50 was not set，all above the sql output expected results.

We just upgraded hive from 1.2 to 3.1.0, and we found that these problems only occurred in the old hive table.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)