Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/01/17 02:40:00 UTC

[jira] [Commented] (IMPALA-11843) IndexOutOfBoundsException in analytic limit pushdown

    [ https://issues.apache.org/jira/browse/IMPALA-11843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17677578#comment-17677578 ] 

ASF subversion and git services commented on IMPALA-11843:
----------------------------------------------------------

Commit b0009db40b7a532694987a0f4280b322d72f07b7 in impala's branch refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=b0009db40 ]

IMPALA-11843: Fix IndexOutOfBoundsException in analytic limit pushdown

When finding analytic conjuncts for analytic limit pushdown, the
following conditions are checked:
 - The conjunct should be a binary predicate
 - Left hand side is a SlotRef referencing the analytic expression, e.g.
   "rn" of "row_number() as rn"
 - The underlying analytic function is rank(), dense_rank() or row_number()
 - The window frame is UNBOUNDED PRECEDING to CURRENT ROW
 - Right hand side is a valid numeric limit
 - The op is =, <, or <=
See more details in AnalyticPlanner.inferPartitionLimits().
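
The conditions above can be read as a single filter over each conjunct. The
following is only a rough sketch under that reading: the Conjunct class, its
fields, and qualifies() are hypothetical stand-ins and do not mirror the real
types or control flow in AnalyticPlanner.inferPartitionLimits().

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of a conjunct; not Impala's planner types.
class PushdownConditionSketch {
  static final List<String> RANKING_FNS =
      Arrays.asList("rank", "dense_rank", "row_number");
  static final List<String> ALLOWED_OPS = Arrays.asList("=", "<", "<=");

  static final class Conjunct {
    final boolean isBinaryPredicate;
    // Name of the analytic function behind the LHS slot, or null when the
    // LHS is not a SlotRef backed by an analytic expression.
    final String analyticFn;
    // True when the window frame is UNBOUNDED PRECEDING to CURRENT ROW.
    final boolean unboundedPrecedingToCurrentRow;
    final String op;
    // The RHS as a numeric limit, or null when it is not a valid limit.
    final Long numericLimit;

    Conjunct(boolean isBinaryPredicate, String analyticFn,
        boolean unboundedPrecedingToCurrentRow, String op, Long numericLimit) {
      this.isBinaryPredicate = isBinaryPredicate;
      this.analyticFn = analyticFn;
      this.unboundedPrecedingToCurrentRow = unboundedPrecedingToCurrentRow;
      this.op = op;
      this.numericLimit = numericLimit;
    }
  }

  // Returns true only when every listed condition holds.
  static boolean qualifies(Conjunct c) {
    return c.isBinaryPredicate
        && c.analyticFn != null
        && RANKING_FNS.contains(c.analyticFn)
        && c.unboundedPrecedingToCurrentRow
        && c.numericLimit != null
        && ALLOWED_OPS.contains(c.op);
  }

  public static void main(String[] args) {
    // "rn < 10" where rn is row_number(); the frame check is assumed to pass.
    Conjunct rnLessThan10 = new Conjunct(true, "row_number", true, "<", 10L);
    // "id = max_id": the LHS is a plain table column, not an analytic slot.
    Conjunct idEqMaxId = new Conjunct(true, null, false, "=", null);
    System.out.println(qualifies(rnLessThan10));
    System.out.println(qualifies(idEqMaxId));
  }
}
```

In this model a conjunct that fails any check is simply not used for limit
pushdown; it is not an error.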

While checking the 2nd and 3rd conditions, we get the source exprs of the
SlotRef. The source exprs could be empty if the SlotRef is actually
referencing a column of the table, i.e. a column materialized by the
scan node. Currently, we check the first source expr directly, regardless
of whether the list is empty, which causes the IndexOutOfBoundsException.

This patch fixes it by augmenting the check to handle an empty list.
It also fixes similar code in AnalyticEvalNode.
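
As an illustration of the failure mode and one possible guard, here is a
minimal sketch. It is not the committed patch: Expr, AnalyticExpr, ScalarExpr
and both helper methods are hypothetical stand-ins, and the real change in
AnalyticPlanner may be written differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for Impala's Expr hierarchy; not the real classes.
class SourceExprGuardSketch {
  interface Expr {}
  static final class AnalyticExpr implements Expr {}
  static final class ScalarExpr implements Expr {}

  // Mirrors the shape of the pre-fix check: get(0) is reached even when
  // the list is empty, so an empty list throws IndexOutOfBoundsException.
  static boolean skipConjunctUnguarded(List<Expr> lhsSourceExprs) {
    return lhsSourceExprs.size() > 1
        || !(lhsSourceExprs.get(0) instanceof AnalyticExpr);
  }

  // Guarded variant (an assumption about one way to fix it): only a list
  // with exactly one element is inspected; anything else is skipped.
  static boolean skipConjunctGuarded(List<Expr> lhsSourceExprs) {
    return lhsSourceExprs.size() != 1
        || !(lhsSourceExprs.get(0) instanceof AnalyticExpr);
  }

  public static void main(String[] args) {
    // A slot that references a table column has no source exprs.
    List<Expr> tableColumn = new ArrayList<Expr>();
    // A slot such as "rn" is backed by a single analytic expr.
    List<Expr> analyticSlot =
        Collections.<Expr>singletonList(new AnalyticExpr());

    System.out.println(skipConjunctGuarded(tableColumn));
    System.out.println(skipConjunctGuarded(analyticSlot));
    try {
      skipConjunctUnguarded(tableColumn);
    } catch (IndexOutOfBoundsException e) {
      System.out.println("unguarded check threw IndexOutOfBoundsException");
    }
  }
}
```

With the guard, a conjunct whose left hand side is a plain table column is
skipped like any other non-qualifying conjunct instead of aborting planning.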

Tests:
 - Add FE and e2e regression tests

Change-Id: I26d6bd58be58d09a29b8b81972e76665f41cf103
Reviewed-on: http://gerrit.cloudera.org:8080/19422
Reviewed-by: Aman Sinha <am...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> IndexOutOfBoundsException in analytic limit pushdown
> ----------------------------------------------------
>
>                 Key: IMPALA-11843
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11843
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 4.0.0, Impala 4.1.0, Impala 4.2.0, Impala 4.1.1
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>
> The following query fails with IndexOutOfBoundsException:
> {code:sql}
> create table tbl (id int);
> select id from (
>   select id, 
>     row_number() over (order by id) rn, 
>     max(id) over () max_id
>   from tbl
> ) t
> where id = max_id and rn < 10;
> ERROR: IndexOutOfBoundsException: Index: 0, Size: 0
> {code}
> The stacktrace in logs:
> {noformat}
> I0116 15:55:46.766265 23944 Frontend.java:2062] be402cb92ecc5490:11cbe79000000000] Analysis and authorization finished.
> I0116 15:55:46.803608 23944 jni-util.cc:288] be402cb92ecc5490:11cbe79000000000] java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>         at java.util.ArrayList.rangeCheck(ArrayList.java:659)
>         at java.util.ArrayList.get(ArrayList.java:435)
>         at org.apache.impala.planner.AnalyticPlanner.inferPartitionLimits(AnalyticPlanner.java:914)
>         at org.apache.impala.planner.AnalyticPlanner.createSingleNodePlan(AnalyticPlanner.java:115)
>         at org.apache.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:295)
>         at org.apache.impala.planner.SingleNodePlanner.createInlineViewPlan(SingleNodePlanner.java:1244)
>         at org.apache.impala.planner.SingleNodePlanner.createTableRefNode(SingleNodePlanner.java:2208)
>         at org.apache.impala.planner.SingleNodePlanner.createTableRefsPlan(SingleNodePlanner.java:931)
>         at org.apache.impala.planner.SingleNodePlanner.createSelectPlan(SingleNodePlanner.java:750)
>         at org.apache.impala.planner.SingleNodePlanner.createQueryPlan(SingleNodePlanner.java:278)
>         at org.apache.impala.planner.SingleNodePlanner.createSingleNodePlan(SingleNodePlanner.java:170)
>         at org.apache.impala.planner.Planner.createPlanFragments(Planner.java:120)
>         at org.apache.impala.planner.Planner.createPlans(Planner.java:249)
>         at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1733)
>         at org.apache.impala.service.Frontend.getPlannedExecRequest(Frontend.java:2344)
>         at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:2181)
>         at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1967)
>         at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1789)
>         at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:164)
> {noformat}
> There is a predicate "rn < 10" on the row_number() results, so the planner considers pushing the limit down into the inline view to turn it into a TopN query. While considering the conjuncts, the other predicate "id = max_id" is also checked, and it fails in the following code in AnalyticPlanner.inferPartitionLimits():
> {code:java}
>       List<Expr> lhsSourceExprs = ((SlotRef) lhs).getDesc().getSourceExprs();
>       if (lhsSourceExprs.size() > 1 ||
>             !(lhsSourceExprs.get(0) instanceof AnalyticExpr)) {
>         continue;
>       }
> {code}
> [https://github.com/apache/impala/blob/f2f6b4b5804df036a5a7dc8ff23f8a0537b5bf97/fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java#L912-L916]
> 'lhsSourceExprs' is empty since "id" is a slot ref for a source table column. Thus lhsSourceExprs.get(0) throws the IndexOutOfBoundsException.
> To work around this bug, users can disable this optimization with a query option:
> {code:sql}
> set ANALYTIC_RANK_PUSHDOWN_THRESHOLD=0;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
