Posted to issues-all@impala.apache.org by "Aman Sinha (Jira)" <ji...@apache.org> on 2021/05/10 03:21:00 UTC
[jira] [Created] (IMPALA-10697) NDV for rank() expression is incorrect
Aman Sinha created IMPALA-10697:
-----------------------------------
Summary: NDV for rank() expression is incorrect
Key: IMPALA-10697
URL: https://issues.apache.org/jira/browse/IMPALA-10697
Project: IMPALA
Issue Type: Bug
Components: Frontend
Reporter: Aman Sinha
In the following query, the estimated cardinality of the final aggregate (04:AGGREGATE) is always 1, regardless of the cardinality of its child (03:SELECT, cardinality=999). This appears to be because the NDV of an analytic expression such as rank() is always computed as 1, which is incorrect.
{noformat}
Query: explain select rnk, count(*) from (
select * from
(SELECT rank() OVER (ORDER BY ss_net_profit ASC) rnk
FROM store_sales ss1
WHERE ss_store_sk = 4) v1
where rnk < 1000) v2
group by rnk
+------------------------------------------------------------------------------------------+
| Explain String |
+------------------------------------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=13.94MB Threads=3 |
| Per-Host Resource Estimates: Memory=142MB |
| Analyzed query: SELECT rnk, count(*) FROM (SELECT * FROM (SELECT rank() OVER |
| (ORDER BY ss_net_profit ASC) rnk FROM tpcds.store_sales ss1 WHERE ss_store_sk = |
| CAST(4 AS INT)) v1 WHERE rnk < CAST(1000 AS BIGINT)) v2 GROUP BY rnk |
| |
| F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 |
| | Per-Host Resources: mem-estimate=14.01MB mem-reservation=5.94MB thread-reservation=1 |
| PLAN-ROOT SINK |
| | output exprs: rnk, count(*) |
| | mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0 |
| | |
| 04:AGGREGATE [FINALIZE] |
| | output: count(*) |
| | group by: rank() |
| | mem-estimate=10.00MB mem-reservation=1.94MB spill-buffer=64.00KB thread-reservation=0 |
| | tuple-ids=5 row-size=16B cardinality=1 |
| | in pipelines: 04(GETNEXT), 06(OPEN) |
| | |
| 03:SELECT |
| | predicates: rank() < CAST(1000 AS BIGINT) |
| | mem-estimate=0B mem-reservation=0B thread-reservation=0 |
| | tuple-ids=8,7 row-size=16B cardinality=999 |
| | in pipelines: 06(GETNEXT) |
| | |
| 02:ANALYTIC |
| | functions: rank() |
| | order by: ss_net_profit ASC |
| | window: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW |
| | mem-estimate=4.00MB mem-reservation=4.00MB spill-buffer=2.00MB thread-reservation=0 |
| | tuple-ids=8,7 row-size=16B cardinality=999 |
| | in pipelines: 06(GETNEXT) |
| | |
| 06:TOP-N |
| | order by: ss_net_profit ASC |
| | limit with ties: 999 |
| | mem-estimate=7.80KB mem-reservation=0B thread-reservation=0 |
| | tuple-ids=8 row-size=8B cardinality=999 |
| | in pipelines: 06(GETNEXT), 01(OPEN) |
| | |
| 05:EXCHANGE [UNPARTITIONED] |
| | mem-estimate=37.72KB mem-reservation=0B thread-reservation=0 |
| | tuple-ids=8 row-size=8B cardinality=999 |
| | in pipelines: 01(GETNEXT) |
| | |
| F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3 |
| Per-Host Resources: mem-estimate=128.01MB mem-reservation=8.00MB thread-reservation=2 |
| 01:TOP-N |
| | order by: ss_net_profit ASC |
| | limit with ties: 999 |
| | source expr: rank() < CAST(1000 AS BIGINT) |
| | mem-estimate=7.80KB mem-reservation=0B thread-reservation=0 |
| | tuple-ids=8 row-size=8B cardinality=999 |
| | in pipelines: 01(GETNEXT), 00(OPEN) |
| | |
| 00:SCAN HDFS [tpcds.store_sales ss1, RANDOM] |
| HDFS partitions=1824/1824 files=1824 size=346.60MB |
| predicates: ss_store_sk = CAST(4 AS INT) |
| stored statistics: |
| table: rows=2.88M size=346.60MB |
| partitions: 1824/1824 rows=2.88M |
| columns: all |
| extrapolated-rows=disabled max-scan-range-rows=130.09K |
| mem-estimate=128.00MB mem-reservation=8.00MB thread-reservation=1 |
| tuple-ids=0 row-size=8B cardinality=480.07K |
| in pipelines: 00(GETNEXT) |
+------------------------------------------------------------------------------------------+
{noformat}
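To illustrate why an NDV of 1 for rank() collapses the aggregate's estimate, here is a minimal sketch of a common way to estimate GROUP BY output cardinality: multiply the NDVs of the grouping expressions and cap the result at the input cardinality. This is a simplified model, not Impala's actual planner code; the function name estimate_agg_cardinality and the convention that a negative NDV means "unknown" are assumptions made for this example.

```python
def estimate_agg_cardinality(grouping_ndvs, input_cardinality):
    """Hypothetical GROUP BY estimate: product of the grouping
    expressions' NDVs, capped at the number of input rows."""
    if not grouping_ndvs:
        # No grouping expressions: the aggregate emits a single row.
        return 1
    estimate = 1
    for ndv in grouping_ndvs:
        if ndv < 0:
            # Assumed convention: unknown NDV, fall back to the input size.
            return input_cardinality
        estimate *= ndv
    return min(estimate, input_cardinality)


# Child of the aggregate in the plan above (03:SELECT) has cardinality=999.
child_cardinality = 999

# With NDV(rank()) computed as 1, the estimate collapses to a single group.
with_ndv_one = estimate_agg_cardinality([1], child_cardinality)

# If NDV(rank()) were instead bounded by the number of input rows
# (an assumption for illustration), the estimate would follow the child.
with_ndv_bounded = estimate_agg_cardinality([999], child_cardinality)

print(with_ndv_one, with_ndv_bounded)
```

Under this model, the cardinality=1 shown for 04:AGGREGATE is what one would expect if the NDV of rank() is 1, whatever the child cardinality is. Since rnk < 1000 admits up to 999 distinct rank values, an estimate closer to the child's cardinality would be more plausible.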