Posted to issues-all@impala.apache.org by "Quanlong Huang (JIRA)" <ji...@apache.org> on 2019/08/15 14:18:00 UTC
[jira] [Commented] (IMPALA-8790) IllegalStateException: Illegal reference to non-materialized slot
[ https://issues.apache.org/jira/browse/IMPALA-8790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908134#comment-16908134 ]
Quanlong Huang commented on IMPALA-8790:
----------------------------------------
We can work around the bug by rewriting the query to avoid putting the analytic functions and the aggregations in the same query block:
{code:sql}
explain select uid, cid,
rank() over (partition by uid order by cnt desc)
from (select uid, cid, count(*) as cnt from foo group by uid, cid) w;
{code}
Impala has an optimization for analytic functions that applies when the same query block also contains an aggregation with group-by expressions. It trims the exchange slots according to the partition-by and the group-by slots. See this [commit|https://github.com/apache/impala/commit/0b3124ab35402f1ab8141e8ffcca28a5a481c81e] for more details.
In IMPALA-110, there's a bug in this optimization: it uses the wrong group-by expressions, which reference the non-materialized slots of the inline view. The rewritten query keeps the analytic function rank() and the aggregation (count(*) group by ...) in separate query blocks, so the optimization is bypassed and the bug is avoided.
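As a sketch of how to check the workaround against the reproduction in the description (assuming the same table foo, with stats computed, since the failure only shows up with stats), the two forms can be explained side by side. The alias cnt is only illustrative:
{code:sql}
-- Stats are needed to take the plan path that hits the bug.
compute stats foo;

-- Original form: rank() and the aggregation share one query block.
-- Per the description, this fails with the IllegalStateException.
explain select uid, cid,
rank() over (partition by uid order by count(*) desc)
from (select uid, cid from foo) w
group by uid, cid;

-- Workaround form: the aggregation moves into the inline view,
-- so only the analytic function is left in the outer block.
explain select uid, cid,
rank() over (partition by uid order by cnt desc)
from (select uid, cid, count(*) as cnt from foo group by uid, cid) w;
{code}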
Patch for review: https://gerrit.cloudera.org/c/14063
> IllegalStateException: Illegal reference to non-materialized slot
> -----------------------------------------------------------------
>
> Key: IMPALA-8790
> URL: https://issues.apache.org/jira/browse/IMPALA-8790
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Major
> Attachments: foo.parq
>
>
> Reproduce:
> {code:sql}
> $ hdfs dfs -put foo.parq hdfs:///tmp
> impala> create table foo (uid string, cid string) stored as parquet;
> impala> load data inpath 'hdfs:///tmp/foo.parq' into table foo;
> {code}
> With the stats, the following query hits an IllegalStateException:
> {code:sql}
> impala> compute stats foo;
> impala> explain select uid, cid,
> rank() over (partition by uid order by count(*) desc)
> from (select uid, cid from foo) w
> group by uid, cid;
> ERROR: IllegalStateException: Illegal reference to non-materialized slot: tid=1 sid=2
> {code}
> Without the stats, it runs successfully:
> {code:sql}
> impala> drop stats foo;
> impala> explain select uid, cid,
> rank() over (partition by uid order by count(*) desc)
> from (select uid, cid from foo) w
> group by uid, cid;
> +------------------------------------------------------------------------------------+
> | Explain String |
> +------------------------------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=84.02MB Threads=5 |
> | Per-Host Resource Estimates: Memory=304MB |
> | WARNING: The following tables are missing relevant table and/or column statistics. |
> | common_action.foo |
> | |
> | PLAN-ROOT SINK |
> | | |
> | 07:EXCHANGE [UNPARTITIONED] |
> | | |
> | 03:ANALYTIC |
> | | functions: rank() |
> | | partition by: uid |
> | | order by: count(*) DESC |
> | | window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW |
> | | row-size=40B cardinality=1.10K |
> | | |
> | 02:SORT |
> | | order by: uid ASC NULLS FIRST, count(*) DESC |
> | | row-size=32B cardinality=1.10K |
> | | |
> | 06:EXCHANGE [HASH(uid)] |
> | | |
> | 05:AGGREGATE [FINALIZE] |
> | | output: count:merge(*) |
> | | group by: uid, cid |
> | | row-size=32B cardinality=1.10K |
> | | |
> | 04:EXCHANGE [HASH(uid,cid)] |
> | | |
> | 01:AGGREGATE [STREAMING] |
> | | output: count(*) |
> | | group by: uid, cid |
> | | row-size=32B cardinality=1.10K |
> | | |
> | 00:SCAN HDFS [common_action.foo] |
> | HDFS partitions=1/1 files=1 size=5.19KB |
> | row-size=24B cardinality=1.10K |
> +------------------------------------------------------------------------------------+
> Fetched 37 row(s) in 0.03s
> {code}
>