Posted to issues@flink.apache.org by "Jing Zhang (Jira)" <ji...@apache.org> on 2022/01/13 08:07:00 UTC

[jira] [Updated] (FLINK-25641) Unexpected aggregate plan after load hive module

     [ https://issues.apache.org/jira/browse/FLINK-25641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhang updated FLINK-25641:
-------------------------------
    Description: 
When using Flink batch SQL to run Hive SQL queries, we load the Hive module to use Hive built-in functions.
However, some query plans are unexpected after loading the Hive module.
For example, consider the following SQL:

{code:sql}
load module hive;
use modules hive,core;
set table.sql-dialect=hive;

select
  account_id,
  sum(impression)
from test_db.test_table where dt = '2022-01-10' and hi = '0100' group by account_id
{code}
The plan is:

 !image-2022-01-13-15-55-40-958.png! 

After removing 'load module hive; use modules hive,core;', the plan is:

 !image-2022-01-13-15-52-27-783.png! 

After loading the Hive module, hash aggregate is not chosen for the final plan because the aggregate function is `HiveAggSqlFunction`, whose aggregate buffer is not fixed-length. The buffer type is as follows:
{code:java}
LEGACY('RAW', 'ANY<org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator$AggregationBuffer>')
{code}
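
The reasoning above can be illustrated with a small, self-contained sketch. It is a simplified model and not Flink's actual planner code: the class name `AggStrategySketch`, the method names, and the list of fixed-length type names are all hypothetical. It also assumes that the fallback when hash aggregation is ruled out is a sort-based aggregate. The point is only that a single non-fixed-length buffer field, such as the RAW-typed Hive aggregation buffer, is enough to rule out the hash aggregate.

```java
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of the decision described in this issue.
// This is NOT Flink's planner code; all names here are illustrative.
final class AggStrategySketch {

    // Assumption: type names whose in-memory representation has a fixed length.
    private static final Set<String> FIXED_LENGTH_TYPES =
            Set.of("BOOLEAN", "TINYINT", "SMALLINT", "INT", "BIGINT", "FLOAT", "DOUBLE");

    // A buffer field is "fixed length" if its type is in the assumed set above.
    static boolean isFixedLength(String bufferType) {
        return FIXED_LENGTH_TYPES.contains(bufferType);
    }

    // Hash aggregation is only considered when every aggregate buffer field
    // is fixed-length; otherwise the sketch falls back to a sort aggregate.
    static String chooseStrategy(List<String> aggBufferTypes) {
        boolean allFixed =
                aggBufferTypes.stream().allMatch(AggStrategySketch::isFixedLength);
        return allFixed ? "HashAggregate" : "SortAggregate";
    }

    public static void main(String[] args) {
        // A built-in sum over a BIGINT column keeps a fixed-length buffer.
        System.out.println(chooseStrategy(List.of("BIGINT")));
        // A Hive UDAF buffer exposed as a RAW type is not fixed-length.
        System.out.println(chooseStrategy(List.of("RAW")));
    }
}
```

In this model, the same `sum(impression)` query yields a different strategy depending only on which module resolves `sum`, which matches the behavior reported here.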




  was:
When using Flink batch SQL to run Hive SQL queries, we load the Hive module to use Hive built-in functions.
However, some query plans are unexpected after loading the Hive module.
For example, consider the following SQL:

{code:sql}
load module hive;
use modules hive,core;
set table.sql-dialect=hive;

select
  account_id,
  sum(impression)
from test_db.test_table where dt = '2022-01-10' and hi = '0100' group by account_id
{code}
The plan is:

 !image-2022-01-13-15-55-40-958.png! 

After removing 'load module hive; use modules hive,core;', the plan is:

 !image-2022-01-13-15-52-27-783.png! 

After loading the Hive module, hash aggregate is not chosen for the final plan because the aggregate buffer is not fixed-length. Its type is as follows:
{code:java}
LEGACY('RAW', 'ANY<org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator$AggregationBuffer>')
{code}





> Unexpected aggregate plan after load hive module
> ------------------------------------------------
>
>                 Key: FLINK-25641
>                 URL: https://issues.apache.org/jira/browse/FLINK-25641
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table SQL / Planner
>            Reporter: Jing Zhang
>            Priority: Major
>         Attachments: image-2022-01-13-15-52-27-783.png, image-2022-01-13-15-55-40-958.png
>


