You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Alessandro Solimando (Jira)" <ji...@apache.org> on 2022/11/07 11:22:00 UTC
[jira] [Commented] (HIVE-26678) In the filter criteria associated with multiple tables, the filter result of the subquery by not in or in is incorrect.
[ https://issues.apache.org/jira/browse/HIVE-26678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629757#comment-17629757 ]
Alessandro Solimando commented on HIVE-26678:
---------------------------------------------
Non-CBO codepath has a lot of flaws, it should be probably discontinued at this point given that CBO support is mature and dates back a while.
Any specific reason why you are running without CBO in the first place?
> In the filter criteria associated with multiple tables, the filter result of the subquery by not in or in is incorrect.
> -----------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-26678
> URL: https://issues.apache.org/jira/browse/HIVE-26678
> Project: Hive
> Issue Type: Bug
> Components: CBO
> Affects Versions: 3.1.0
> Reporter: lotan
> Priority: Major
>
> create testtable as follow:
> create table test101 (id string,id2 string);
> create table test102 (id string,id2 string);
> create table test103 (id string,id2 string);
> create table test104 (id string,id2 string);
> when cbo is false,run the following SQL statement:
> explain select count(1) from test101 t1
> left join test102 t2 on t1.id=t2.id
> left join test103 t3 on t1.id=t3.id2
> where t1.id in (select s.id from test104 s)
> and t3.id2='123';
> you will see:
> The filter criteria in the right table are lost.
> The execution plan is as follows:
> +-----------------------------------------------------------------------------------------------------+
> | Explain |
> +-----------------------------------------------------------------------------------------------------+
> | STAGE DEPENDENCIES: |
> | Stage-9 is a root stage |
> | Stage-3 depends on stages: Stage-9 |
> | Stage-0 depends on stages: Stage-3 |
> | |
> | STAGE PLANS: |
> | Stage: Stage-9 |
> | Map Reduce Local Work |
> | Alias -> Map Local Tables: |
> | sq_1:s |
> | Fetch Operator |
> | limit: -1 |
> | t2 |
> | Fetch Operator |
> | limit: -1 |
> | t3 |
> | Fetch Operator |
> | limit: -1 |
> | Alias -> Map Local Operator Tree: |
> | sq_1:s |
> | TableScan |
> | alias: s |
> | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE |
> | Filter Operator |
> | predicate: id is not null (type: boolean) |
> | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE |
> | Select Operator |
> | expressions: id (type: string) |
> | outputColumnNames: _col0 |
> | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE |
> | Group By Operator |
> | keys: _col0 (type: string) |
> | mode: hash |
> | outputColumnNames: _col0 |
> | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE |
> | HashTable Sink Operator |
> | keys: |
> | 0 _col0 (type: string) |
> | 1 _col0 (type: string) |
> | t2 |
> | TableScan |
> | alias: t2 |
> | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE |
> | Filter Operator |
> | predicate: id is not null (type: boolean) |
> | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE |
> | HashTable Sink Operator |
> | keys: |
> | 0 id (type: string) |
> | 1 id (type: string) |
> | 2 id2 (type: string) |
> | t3 |
> | TableScan |
> | alias: t3 |
> | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE |
> | Filter Operator |
> | predicate: id2 is not null (type: boolean) |
> | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE |
> | HashTable Sink Operator |
> | keys: |
> | 0 id (type: string) |
> | 1 id (type: string) |
> | 2 id2 (type: string) |
> | |
> | Stage: Stage-3 |
> | Map Reduce |
> | Map Operator Tree: |
> | TableScan |
> | alias: t1 |
> | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE |
> | Filter Operator |
> | predicate: id is not null (type: boolean) |
> | Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE |
> | Map Join Operator |
> | condition map: |
> | Left Outer Join 0 to 1 |
> | Left Outer Join 0 to 2 |
> | keys: |
> | 0 id (type: string) |
> | 1 id (type: string) |
> | 2 id2 (type: string) |
> | outputColumnNames: _col0 |
> | Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: NONE |
> | Map Join Operator |
> | condition map: |
> | Left Semi Join 0 to 1 |
> | keys: |
> | 0 _col0 (type: string) |
> | 1 _col0 (type: string) |
> | Statistics: Num rows: 2 Data size: 0 Basic stats: PARTIAL Column stats: NONE |
> | Group By Operator |
> | aggregations: count(1) |
> | mode: hash |
> | outputColumnNames: _col0 |
> | Statistics: Num rows: 1 Data size: 8 Basic stats: PARTIAL Column stats: NONE |
> | Reduce Output Operator |
> | sort order: |
> | Statistics: Num rows: 1 Data size: 8 Basic stats: PARTIAL Column stats: NONE |
> | value expressions: _col0 (type: bigint) |
> | Local Work: |
> +-----------------------------------------------------------------------------------------------------+
> | Explain |
> +-----------------------------------------------------------------------------------------------------+
> | Map Reduce Local Work |
> | Reduce Operator Tree: |
> | Group By Operator |
> | aggregations: count(VALUE._col0) |
> | mode: mergepartial |
> | outputColumnNames: _col0 |
> | Statistics: Num rows: 1 Data size: 8 Basic stats: PARTIAL Column stats: NONE |
> | File Output Operator |
> | compressed: false |
> | Statistics: Num rows: 1 Data size: 8 Basic stats: PARTIAL Column stats: NONE |
> | table: |
> | input format: org.apache.hadoop.mapred.SequenceFileInputFormat |
> | output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat |
> | serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe |
> | |
> | Stage: Stage-0 |
> | Fetch Operator |
> | limit: -1 |
> | Processor Tree: |
> | ListSink |
> | |
> +-----------------------------------------------------------------------------------------------------+
--
This message was sent by Atlassian Jira
(v8.20.10#820010)