Posted to issues@hive.apache.org by "kang (Jira)" <ji...@apache.org> on 2021/05/24 13:44:00 UTC
[jira] [Updated] (HIVE-25156) After setting the number of reducers to 2, the query returns half as much data
[ https://issues.apache.org/jira/browse/HIVE-25156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
kang updated HIVE-25156:
------------------------
Description:
c'c My Hive version is 3.1.2. When I set the number of reducers, the query returns only (number of results / number of reducers) rows.
Below is my SQL:
select b.stas_day ,d.year_week_curr_wednesday as curr_week -- current week
,count(distinct a.parent_enterprise_user_code) as ents
,count(distinct b.user_code) as charge_users
,sum(nvl(today_charge_electricity,0)) as today_charge_electricity
from tmp.tmp_ent a,(select year_week_curr_wednesday,year_week_last_wednesday
from dim.dim_date a
where a.date_id >='2021-05-13'
and a.date_id<='2021-05-19'
GROUP BY year_week_curr_wednesday,year_week_last_wednesday) d
left join tmp.dws_user_province_trade_dat b
on a.user_code = b.user_code
and d.year_week_curr_wednesday = b.stas_week
where 1=1
group by b.stas_day ,d.year_week_curr_wednesday -- week
;
was:
My Hive version is 3.1.2. When I set the number of reducers, the query returns only (number of results / number of reducers) rows.
Below is my SQL:
select b.stas_day ,d.year_week_curr_wednesday as curr_week -- current week
,count(distinct a.parent_enterprise_user_code) as ents
,count(distinct b.user_code) as charge_users
,sum(nvl(today_charge_electricity,0)) as today_charge_electricity
from tmp.tmp_ent a,(select year_week_curr_wednesday,year_week_last_wednesday
from dim.dim_date a
where a.date_id >='2021-05-13'
and a.date_id<='2021-05-19'
GROUP BY year_week_curr_wednesday,year_week_last_wednesday) d
left join tmp.dws_user_province_trade_dat b
on a.user_code = b.user_code
and d.year_week_curr_wednesday = b.stas_week
where 1=1
group by b.stas_day ,d.year_week_curr_wednesday -- week
;
> After setting the number of reducers to 2, the query returns half as much data
> ----------------------
>
> Key: HIVE-25156
> URL: https://issues.apache.org/jira/browse/HIVE-25156
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 3.1.2
> Reporter: kang
> Priority: Major
>
> c'c My Hive version is 3.1.2. When I set the number of reducers, the query returns only (number of results / number of reducers) rows.
> Below is my SQL:
> select b.stas_day ,d.year_week_curr_wednesday as curr_week -- current week
> ,count(distinct a.parent_enterprise_user_code) as ents
> ,count(distinct b.user_code) as charge_users
> ,sum(nvl(today_charge_electricity,0)) as today_charge_electricity
> from tmp.tmp_ent a,(select year_week_curr_wednesday,year_week_last_wednesday
> from dim.dim_date a
> where a.date_id >='2021-05-13'
> and a.date_id<='2021-05-19'
> GROUP BY year_week_curr_wednesday,year_week_last_wednesday) d
> left join tmp.dws_user_province_trade_dat b
> on a.user_code = b.user_code
> and d.year_week_curr_wednesday = b.stas_week
> where 1=1
> group by b.stas_day ,d.year_week_curr_wednesday -- week
> ;