You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "daicheng (Jira)" <ji...@apache.org> on 2023/03/16 10:29:00 UTC

[jira] [Updated] (IMPALA-12007) wrong result with Pushdown runtime min-max filters on bitlong and decimal

     [ https://issues.apache.org/jira/browse/IMPALA-12007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

daicheng updated IMPALA-12007:
------------------------------
    Description: 
h1. wrong result with Pushdown runtime min-max filters
h2. table description

I have two tables in kudu,prod_{_}tm_jira{_}_AO_69E499_REQ_RELATION is like this:

 
{code:java}
TABLE prod_tm_jira_AO_69E499_REQ_RELATION (
    ID INT32 NOT NULL,
    kudu_change_time UNIXTIME_MICROS NOT NULL,
    kudu_update_time UNIXTIME_MICROS NOT NULL,
    kudu_is_deleted BOOL NOT NULL,
    CHILD INT64 NULLABLE,
    CHILD_PROJECT INT64 NULLABLE,
    PARENT INT64 NULLABLE,
    PARENT_PROJECT INT64 NULLABLE,
    SEQUENCE INT32 NULLABLE,
    SEQ_BALANCER INT32 NULLABLE,
    PRIMARY KEY (ID)
)
HASH (ID) PARTITIONS 2,
RANGE (ID) (
    PARTITION UNBOUNDED
)
OWNER yarn
REPLICAS 3
COMMENT {code}
 

and another table describe below:

 
{code:java}
TABLE prod_tm_jira_jiraissue (
    ID DECIMAL(18, 0) NOT NULL,
    kudu_change_time UNIXTIME_MICROS NOT NULL,
    kudu_update_time UNIXTIME_MICROS NOT NULL,
    kudu_is_deleted BOOL NOT NULL,
    pkey STRING NULLABLE,
    issuenum DECIMAL(18, 0) NULLABLE,
    PROJECT DECIMAL(18, 0) NULLABLE,
    REPORTER STRING NULLABLE,
    ASSIGNEE STRING NULLABLE,
    CREATOR STRING NULLABLE,
    issuetype STRING NULLABLE,
    SUMMARY STRING NULLABLE,
    DESCRIPTION STRING NULLABLE,
    ENVIRONMENT STRING NULLABLE,
    PRIORITY STRING NULLABLE,
    RESOLUTION STRING NULLABLE,
    issuestatus STRING NULLABLE,
    CREATED UNIXTIME_MICROS NULLABLE,
    UPDATED UNIXTIME_MICROS NULLABLE,
    DUEDATE UNIXTIME_MICROS NULLABLE,
    RESOLUTIONDATE UNIXTIME_MICROS NULLABLE,
    VOTES DECIMAL(18, 0) NULLABLE,
    WATCHES DECIMAL(18, 0) NULLABLE,
    TIMEORIGINALESTIMATE DECIMAL(18, 0) NULLABLE,
    TIMEESTIMATE DECIMAL(18, 0) NULLABLE,
    TIMESPENT DECIMAL(18, 0) NULLABLE,
    WORKFLOW_ID DECIMAL(18, 0) NULLABLE,
    SECURITY DECIMAL(18, 0) NULLABLE,
    FIXFOR DECIMAL(18, 0) NULLABLE,
    COMPONENT DECIMAL(18, 0) NULLABLE,
    ARCHIVEDBY STRING NULLABLE,
    ARCHIVEDDATE UNIXTIME_MICROS NULLABLE,
    ARCHIVED STRING NULLABLE,
    PRIMARY KEY (ID)
)
HASH (ID) PARTITIONS 2,
RANGE (ID) (
    PARTITION UNBOUNDED
)
OWNER yarn
REPLICAS 3
COMMEN {code}
 
h2. table jiraissue has 295336 rows

!image-2023-03-16-18-03-55-375.png|width=514,height=452!
h2. table AO_69E499_REQ_RELATION table has 6273 rows:

 

!image-2023-03-16-18-05-03-201.png|width=636,height=475!
h2. sql description
{code:java}
select
c1.id,
c3.child,
c1.project,
c1.summary,
c1.description,
c1.created
 from prod__tm_jira.AO_69E499_REQ_RELATION c3
left join prod__tm_jira.jiraissue c1
 on c1.id=c3.child 
where c1.issuetype IN ('11801','10400') and c1.id=324482 {code}
!image-2023-03-16-18-08-55-259.png!

when i execute the sql,i got the right result ,and from execute plan i found:

!image-2023-03-16-18-11-28-239.png!

after i execute follow sql,i expect more rows result but i got 0 result: 
{code:java}
select
c1.id,
c3.child,
c1.project,
c1.summary,
c1.description,
c1.created
 from prod__tm_jira.AO_69E499_REQ_RELATION c3
left join prod__tm_jira.jiraissue c1
 on c1.id=c3.child 
where c1.issuetype IN ('11801','10400')
-- and c1.id=324482 {code}
!image-2023-03-16-18-12-30-062.png!

and the sql plan like below,i found c3.child didn't cast to decimal,and runtime filter works to id.

!image-2023-03-16-18-17-57-745.png!

I got 0 results,and i found impala runtime filter is worked,so is it possible that  impala didn't identify data type with runtime filter?in this case 'id' is decimal but 'child' is bigint.

  was:
h1. wrong result with Pushdown runtime min-max filters
h2. table description

I have two tables in kudu,prod__tm_jira__AO_69E499_REQ_RELATION is like this:

"""

TABLE prod__tm_jira__AO_69E499_REQ_RELATION (
    ID INT32 NOT NULL,
    kudu_change_time UNIXTIME_MICROS NOT NULL,
    kudu_update_time UNIXTIME_MICROS NOT NULL,
    kudu_is_deleted BOOL NOT NULL,
    CHILD INT64 NULLABLE,
    CHILD_PROJECT INT64 NULLABLE,
    PARENT INT64 NULLABLE,
    PARENT_PROJECT INT64 NULLABLE,
    SEQUENCE INT32 NULLABLE,
    SEQ_BALANCER INT32 NULLABLE,
    PRIMARY KEY (ID)
)
HASH (ID) PARTITIONS 2,
RANGE (ID) (
    PARTITION UNBOUNDED
)
OWNER yarn
REPLICAS 3
COMMENT

"""

and another table describe below:

"""

TABLE prod__tm_jira__jiraissue (
    ID DECIMAL(18, 0) NOT NULL,
    kudu_change_time UNIXTIME_MICROS NOT NULL,
    kudu_update_time UNIXTIME_MICROS NOT NULL,
    kudu_is_deleted BOOL NOT NULL,
    pkey STRING NULLABLE,
    issuenum DECIMAL(18, 0) NULLABLE,
    PROJECT DECIMAL(18, 0) NULLABLE,
    REPORTER STRING NULLABLE,
    ASSIGNEE STRING NULLABLE,
    CREATOR STRING NULLABLE,
    issuetype STRING NULLABLE,
    SUMMARY STRING NULLABLE,
    DESCRIPTION STRING NULLABLE,
    ENVIRONMENT STRING NULLABLE,
    PRIORITY STRING NULLABLE,
    RESOLUTION STRING NULLABLE,
    issuestatus STRING NULLABLE,
    CREATED UNIXTIME_MICROS NULLABLE,
    UPDATED UNIXTIME_MICROS NULLABLE,
    DUEDATE UNIXTIME_MICROS NULLABLE,
    RESOLUTIONDATE UNIXTIME_MICROS NULLABLE,
    VOTES DECIMAL(18, 0) NULLABLE,
    WATCHES DECIMAL(18, 0) NULLABLE,
    TIMEORIGINALESTIMATE DECIMAL(18, 0) NULLABLE,
    TIMEESTIMATE DECIMAL(18, 0) NULLABLE,
    TIMESPENT DECIMAL(18, 0) NULLABLE,
    WORKFLOW_ID DECIMAL(18, 0) NULLABLE,
    SECURITY DECIMAL(18, 0) NULLABLE,
    FIXFOR DECIMAL(18, 0) NULLABLE,
    COMPONENT DECIMAL(18, 0) NULLABLE,
    ARCHIVEDBY STRING NULLABLE,
    ARCHIVEDDATE UNIXTIME_MICROS NULLABLE,
    ARCHIVED STRING NULLABLE,
    PRIMARY KEY (ID)
)
HASH (ID) PARTITIONS 2,
RANGE (ID) (
    PARTITION UNBOUNDED
)
OWNER yarn
REPLICAS 3
COMMENT

"""

 

jiraissue has 295336 rows

!image-2023-03-16-18-03-55-375.png|width=514,height=452!

AO_69E499_REQ_RELATION table has 6273 rows:

 

!image-2023-03-16-18-05-03-201.png|width=636,height=475!
h2. sql description

!image-2023-03-16-18-08-55-259.png!

when i execute the sql,i got the right result ,and from execute plan i found:

!image-2023-03-16-18-11-28-239.png!

after i execute follow sql,i expect more rows result but i got 0 result: 

!image-2023-03-16-18-12-30-062.png!

and the sql plan like below,i found c3.child didn't cast to decimal,and runtime filter works to id.

!image-2023-03-16-18-17-57-745.png!

I got 0 results, is it possible that  impala didn't identify data type with runtime filter?in this case 'id' is decimal but 'child' is bigint.


> wrong result with Pushdown runtime min-max filters on bitlong and decimal
> -------------------------------------------------------------------------
>
>                 Key: IMPALA-12007
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12007
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 4.1.1
>         Environment: centos
>            Reporter: daicheng
>            Priority: Major
>         Attachments: image-2023-03-16-18-03-55-375.png, image-2023-03-16-18-05-03-201.png, image-2023-03-16-18-08-55-259.png, image-2023-03-16-18-11-28-239.png, image-2023-03-16-18-12-30-062.png, image-2023-03-16-18-17-57-745.png
>
>
> h1. wrong result with Pushdown runtime min-max filters
> h2. table description
> I have two tables in kudu,prod_{_}tm_jira{_}_AO_69E499_REQ_RELATION is like this:
>  
> {code:java}
> TABLE prod_tm_jira_AO_69E499_REQ_RELATION (
>     ID INT32 NOT NULL,
>     kudu_change_time UNIXTIME_MICROS NOT NULL,
>     kudu_update_time UNIXTIME_MICROS NOT NULL,
>     kudu_is_deleted BOOL NOT NULL,
>     CHILD INT64 NULLABLE,
>     CHILD_PROJECT INT64 NULLABLE,
>     PARENT INT64 NULLABLE,
>     PARENT_PROJECT INT64 NULLABLE,
>     SEQUENCE INT32 NULLABLE,
>     SEQ_BALANCER INT32 NULLABLE,
>     PRIMARY KEY (ID)
> )
> HASH (ID) PARTITIONS 2,
> RANGE (ID) (
>     PARTITION UNBOUNDED
> )
> OWNER yarn
> REPLICAS 3
> COMMENT {code}
>  
> and another table describe below:
>  
> {code:java}
> TABLE prod_tm_jira_jiraissue (
>     ID DECIMAL(18, 0) NOT NULL,
>     kudu_change_time UNIXTIME_MICROS NOT NULL,
>     kudu_update_time UNIXTIME_MICROS NOT NULL,
>     kudu_is_deleted BOOL NOT NULL,
>     pkey STRING NULLABLE,
>     issuenum DECIMAL(18, 0) NULLABLE,
>     PROJECT DECIMAL(18, 0) NULLABLE,
>     REPORTER STRING NULLABLE,
>     ASSIGNEE STRING NULLABLE,
>     CREATOR STRING NULLABLE,
>     issuetype STRING NULLABLE,
>     SUMMARY STRING NULLABLE,
>     DESCRIPTION STRING NULLABLE,
>     ENVIRONMENT STRING NULLABLE,
>     PRIORITY STRING NULLABLE,
>     RESOLUTION STRING NULLABLE,
>     issuestatus STRING NULLABLE,
>     CREATED UNIXTIME_MICROS NULLABLE,
>     UPDATED UNIXTIME_MICROS NULLABLE,
>     DUEDATE UNIXTIME_MICROS NULLABLE,
>     RESOLUTIONDATE UNIXTIME_MICROS NULLABLE,
>     VOTES DECIMAL(18, 0) NULLABLE,
>     WATCHES DECIMAL(18, 0) NULLABLE,
>     TIMEORIGINALESTIMATE DECIMAL(18, 0) NULLABLE,
>     TIMEESTIMATE DECIMAL(18, 0) NULLABLE,
>     TIMESPENT DECIMAL(18, 0) NULLABLE,
>     WORKFLOW_ID DECIMAL(18, 0) NULLABLE,
>     SECURITY DECIMAL(18, 0) NULLABLE,
>     FIXFOR DECIMAL(18, 0) NULLABLE,
>     COMPONENT DECIMAL(18, 0) NULLABLE,
>     ARCHIVEDBY STRING NULLABLE,
>     ARCHIVEDDATE UNIXTIME_MICROS NULLABLE,
>     ARCHIVED STRING NULLABLE,
>     PRIMARY KEY (ID)
> )
> HASH (ID) PARTITIONS 2,
> RANGE (ID) (
>     PARTITION UNBOUNDED
> )
> OWNER yarn
> REPLICAS 3
> COMMEN {code}
>  
> h2. table jiraissue has 295336 rows
> !image-2023-03-16-18-03-55-375.png|width=514,height=452!
> h2. table AO_69E499_REQ_RELATION table has 6273 rows:
>  
> !image-2023-03-16-18-05-03-201.png|width=636,height=475!
> h2. sql description
> {code:java}
> select
> c1.id,
> c3.child,
> c1.project,
> c1.summary,
> c1.description,
> c1.created
>  from prod__tm_jira.AO_69E499_REQ_RELATION c3
> left join prod__tm_jira.jiraissue c1
>  on c1.id=c3.child 
> where c1.issuetype IN ('11801','10400') and c1.id=324482 {code}
> !image-2023-03-16-18-08-55-259.png!
> when i execute the sql,i got the right result ,and from execute plan i found:
> !image-2023-03-16-18-11-28-239.png!
> after i execute follow sql,i expect more rows result but i got 0 result: 
> {code:java}
> select
> c1.id,
> c3.child,
> c1.project,
> c1.summary,
> c1.description,
> c1.created
>  from prod__tm_jira.AO_69E499_REQ_RELATION c3
> left join prod__tm_jira.jiraissue c1
>  on c1.id=c3.child 
> where c1.issuetype IN ('11801','10400')
> -- and c1.id=324482 {code}
> !image-2023-03-16-18-12-30-062.png!
> and the sql plan like below,i found c3.child didn't cast to decimal,and runtime filter works to id.
> !image-2023-03-16-18-17-57-745.png!
> I got 0 results,and i found impala runtime filter is worked,so is it possible that  impala didn't identify data type with runtime filter?in this case 'id' is decimal but 'child' is bigint.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org