Posted to issues@hive.apache.org by "kang (Jira)" <ji...@apache.org> on 2021/05/24 13:44:00 UTC

[jira] [Updated] (HIVE-25156) After setting the number of reducers to 2, the query returns half as many rows

     [ https://issues.apache.org/jira/browse/HIVE-25156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kang updated HIVE-25156:
------------------------
    Description: 
My Hive version is 3.1.2. When I set the number of reducers, the query returns only (result count / number of reducers) rows.

My SQL is below:

select b.stas_day ,d.year_week_curr_wednesday as curr_week -- current week
 ,count(distinct a.parent_enterprise_user_code) as ents
 ,count(distinct b.user_code) as charge_users 
 ,sum(nvl(today_charge_electricity,0)) as today_charge_electricity 
 from tmp.tmp_ent a,(select year_week_curr_wednesday,year_week_last_wednesday
 from dim.dim_date a 
 where a.date_id >='2021-05-13'
 and a.date_id<='2021-05-19'
 GROUP BY year_week_curr_wednesday,year_week_last_wednesday) d 
 left join tmp.dws_user_province_trade_dat b 
 on a.user_code = b.user_code 
 and d.year_week_curr_wednesday = b.stas_week
 where 1=1
 group by b.stas_day ,d.year_week_curr_wednesday -- week
 ;

  was:
My Hive version is 3.1.2. When I set the number of reducers, the query returns only (result count / number of reducers) rows.

My SQL is below:

select b.stas_day ,d.year_week_curr_wednesday as curr_week -- current week
 ,count(distinct a.parent_enterprise_user_code) as ents
 ,count(distinct b.user_code) as charge_users 
 ,sum(nvl(today_charge_electricity,0)) as today_charge_electricity 
 from tmp.tmp_ent a,(select year_week_curr_wednesday,year_week_last_wednesday
 from dim.dim_date a 
 where a.date_id >='2021-05-13'
 and a.date_id<='2021-05-19'
 GROUP BY year_week_curr_wednesday,year_week_last_wednesday) d 
 left join tmp.dws_user_province_trade_dat b 
 on a.user_code = b.user_code 
 and d.year_week_curr_wednesday = b.stas_week
 where 1=1
 group by b.stas_day ,d.year_week_curr_wednesday -- week
;


> After setting the number of reducers to 2, the query returns half as many rows
> ----------------------
>
>                 Key: HIVE-25156
>                 URL: https://issues.apache.org/jira/browse/HIVE-25156
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 3.1.2
>            Reporter: kang
>            Priority: Major
>
> My Hive version is 3.1.2. When I set the number of reducers, the query returns only (result count / number of reducers) rows.
> My SQL is below:
> select b.stas_day ,d.year_week_curr_wednesday as curr_week -- current week
>  ,count(distinct a.parent_enterprise_user_code) as ents
>  ,count(distinct b.user_code) as charge_users 
>  ,sum(nvl(today_charge_electricity,0)) as today_charge_electricity 
>  from tmp.tmp_ent a,(select year_week_curr_wednesday,year_week_last_wednesday
>  from dim.dim_date a 
>  where a.date_id >='2021-05-13'
>  and a.date_id<='2021-05-19'
>  GROUP BY year_week_curr_wednesday,year_week_last_wednesday) d 
>  left join tmp.dws_user_province_trade_dat b 
>  on a.user_code = b.user_code 
>  and d.year_week_curr_wednesday = b.stas_week
>  where 1=1
>  group by b.stas_day ,d.year_week_curr_wednesday -- week
>  ;



--
This message was sent by Atlassian Jira
(v8.3.4#803005)