You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2023/04/06 11:59:00 UTC
[jira] [Updated] (IMPALA-12007) wrong result with Pushdown runtime min-max filters on bitlong and decimal
[ https://issues.apache.org/jira/browse/IMPALA-12007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Quanlong Huang updated IMPALA-12007:
------------------------------------
Priority: Critical (was: Major)
> wrong result with Pushdown runtime min-max filters on bitlong and decimal
> -------------------------------------------------------------------------
>
> Key: IMPALA-12007
> URL: https://issues.apache.org/jira/browse/IMPALA-12007
> Project: IMPALA
> Issue Type: Bug
> Affects Versions: Impala 4.1.1
> Environment: centos
> Reporter: daicheng
> Priority: Critical
> Attachments: image-2023-03-16-18-03-55-375.png, image-2023-03-16-18-05-03-201.png, image-2023-03-16-18-08-55-259.png, image-2023-03-16-18-11-28-239.png, image-2023-03-16-18-12-30-062.png, image-2023-03-16-18-17-57-745.png, image-2023-03-17-15-09-40-463.png, image-2023-03-17-15-13-49-139.png, image-2023-03-17-15-17-23-669.png
>
>
> h1. wrong result with Pushdown runtime min-max filters
> h2. table description
> I have two tables in kudu,prod_{_}tm_jira{_}_AO_69E499_REQ_RELATION is like below,and i create external table in impala:
> {code:java}
> TABLE prod_tm_jira_AO_69E499_REQ_RELATION (
> ID INT32 NOT NULL,
> kudu_change_time UNIXTIME_MICROS NOT NULL,
> kudu_update_time UNIXTIME_MICROS NOT NULL,
> kudu_is_deleted BOOL NOT NULL,
> CHILD INT64 NULLABLE,
> CHILD_PROJECT INT64 NULLABLE,
> PARENT INT64 NULLABLE,
> PARENT_PROJECT INT64 NULLABLE,
> SEQUENCE INT32 NULLABLE,
> SEQ_BALANCER INT32 NULLABLE,
> PRIMARY KEY (ID)
> )
> HASH (ID) PARTITIONS 2,
> RANGE (ID) (
> PARTITION UNBOUNDED
> )
> OWNER yarn
> REPLICAS 3
> COMMENT {code}
> {code:java}
> CREATE EXTERNAL TABLE prod__tm_jira.ao_69e499_req_relation STORED AS KUDU TBLPROPERTIES ('STATS_GENERATED'='TASK', 'accessType'='8', 'impala.lastComputeStatsTime'='1678903202', 'kudu.master_addresses'='kudu-host', 'kudu.table_name'='prod__tm_jira__AO_69E499_REQ_RELATION', 'numFiles'='0', 'numRows'='62204', 'totalSize'='0') {code}
> ‘child’ is bigint in impala,and table AO_69E499_REQ_RELATION table has 6273 rows:
> !image-2023-03-17-15-17-23-669.png! !image-2023-03-16-18-05-03-201.png|width=395,height=295!
> and another table describe below:
> {code:java}
> TABLE prod_tm_jira_jiraissue (
> ID DECIMAL(18, 0) NOT NULL,
> kudu_change_time UNIXTIME_MICROS NOT NULL,
> kudu_update_time UNIXTIME_MICROS NOT NULL,
> kudu_is_deleted BOOL NOT NULL,
> pkey STRING NULLABLE,
> issuenum DECIMAL(18, 0) NULLABLE,
> PROJECT DECIMAL(18, 0) NULLABLE,
> REPORTER STRING NULLABLE,
> ASSIGNEE STRING NULLABLE,
> CREATOR STRING NULLABLE,
> issuetype STRING NULLABLE,
> SUMMARY STRING NULLABLE,
> DESCRIPTION STRING NULLABLE,
> ENVIRONMENT STRING NULLABLE,
> PRIORITY STRING NULLABLE,
> RESOLUTION STRING NULLABLE,
> issuestatus STRING NULLABLE,
> CREATED UNIXTIME_MICROS NULLABLE,
> UPDATED UNIXTIME_MICROS NULLABLE,
> DUEDATE UNIXTIME_MICROS NULLABLE,
> RESOLUTIONDATE UNIXTIME_MICROS NULLABLE,
> VOTES DECIMAL(18, 0) NULLABLE,
> WATCHES DECIMAL(18, 0) NULLABLE,
> TIMEORIGINALESTIMATE DECIMAL(18, 0) NULLABLE,
> TIMEESTIMATE DECIMAL(18, 0) NULLABLE,
> TIMESPENT DECIMAL(18, 0) NULLABLE,
> WORKFLOW_ID DECIMAL(18, 0) NULLABLE,
> SECURITY DECIMAL(18, 0) NULLABLE,
> FIXFOR DECIMAL(18, 0) NULLABLE,
> COMPONENT DECIMAL(18, 0) NULLABLE,
> ARCHIVEDBY STRING NULLABLE,
> ARCHIVEDDATE UNIXTIME_MICROS NULLABLE,
> ARCHIVED STRING NULLABLE,
> PRIMARY KEY (ID)
> )
> HASH (ID) PARTITIONS 2,
> RANGE (ID) (
> PARTITION UNBOUNDED
> )
> OWNER yarn
> REPLICAS 3
> COMMEN {code}
> this is the external table in impala:
> {code:java}
> CREATE EXTERNAL TABLE prod__tm_jira.jiraissue STORED AS KUDU TBLPROPERTIES ('STATS_GENERATED'='TASK', 'accessType'='8', 'impala.lastComputeStatsTime'='1678903206', 'kudu.master_addresses'='kudu-host', 'kudu.table_name'='prod__tm_jira__jiraissue', 'numFiles'='0', 'numRows'='295165', 'totalSize'='0') {code}
> 'id' is decimal in impala,and table jiraissue has 295336 rows
> !image-2023-03-17-15-13-49-139.png!!image-2023-03-16-18-03-55-375.png|width=514,height=452!
> h2. sql description
>
> {code:java}
> select
> c1.id,
> c3.child,
> c1.project,
> c1.summary,
> c1.description,
> c1.created
> from prod__tm_jira.AO_69E499_REQ_RELATION c3
> left join prod__tm_jira.jiraissue c1
> on c1.id=c3.child
> where c1.issuetype IN ('11801','10400') and c1.id=324482 {code}
> !image-2023-03-16-18-08-55-259.png|width=765,height=379!
> when i execute the sql,i got the right result ,and from execute plan i found:
> !image-2023-03-16-18-11-28-239.png|width=724,height=526!
> after i execute follow sql,i expect more rows result but i got 0 result:
> {code:java}
> select
> c1.id,
> c3.child,
> c1.project,
> c1.summary,
> c1.description,
> c1.created
> from prod__tm_jira.AO_69E499_REQ_RELATION c3
> left join prod__tm_jira.jiraissue c1
> on c1.id=c3.child
> where c1.issuetype IN ('11801','10400')
> -- and c1.id=324482 {code}
> !image-2023-03-16-18-12-30-062.png|width=545,height=316!
> and the sql plan like below,i found c3.child didn't cast to decimal,and runtime filter works to id.
> !image-2023-03-16-18-17-57-745.png|width=696,height=692!
> *I got 0 results,and i found impala runtime filter is worked,so is it possible that impala didn't identify data type with runtime filter?in this case 'id' is decimal but 'child' is bigint.*
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org