Posted to issues-all@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/04/17 16:49:00 UTC
[jira] [Updated] (IMPALA-9671) Improve SINGULAR ROW SRC Node Explain Output
[ https://issues.apache.org/jira/browse/IMPALA-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Armstrong updated IMPALA-9671:
----------------------------------
Labels: complextype observability (was: complextype)
> Improve SINGULAR ROW SRC Node Explain Output
> --------------------------------------------
>
> Key: IMPALA-9671
> URL: https://issues.apache.org/jira/browse/IMPALA-9671
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Reporter: Shant Hovsepian
> Assignee: Gabor Kaszab
> Priority: Minor
> Labels: complextype, observability
>
> For queries that involve more than one level of unnesting with complex/nested types, the explain output can be tricky to read and reason about. The SUBPLAN node produces a tree shape that's not quite the same as other node types. In particular, it can be tricky to understand what a SINGULAR ROW SRC node is acting on or producing.
> Currently the explain output for a SINGULAR ROW SRC node gives no indication of what it is operating on. It may not be a guarantee, but leaf nodes in an Impala plan tree are usually annotated in square brackets "[]" with the input source they are working on (SCAN and UNNEST nodes, for example), whereas SINGULAR ROW SRC has no such annotation. It would be great to fix this so that, in explain strings:
> {{SINGULAR ROW SRC }}
> _{{becomes}}_
> {{SINGULAR ROW SRC [input]}}
> Take the query below (SET EXPLAIN_LEVEL=3):
>
> {code:java}
> Query: explain select c_custkey, o_orderkey, l_partkey from customer c, c.c_orders o, o.o_lineitems as li
> +----------------------------------------------------------------------------------------+
> | Explain String |
> +----------------------------------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=16.00MB Threads=3 |
> | Per-Host Resource Estimates: Memory=274MB |
> | Analyzed query: SELECT c_custkey, o_orderkey, l_partkey FROM |
> | tpch_nested_parquet.customer c, c.c_orders o, o.o_lineitems li |
> | |
> | F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 |
> | | Per-Host Resources: mem-estimate=10.06MB mem-reservation=0B thread-reservation=1 |
> | PLAN-ROOT SINK |
> | | output exprs: c_custkey, o_orderkey, l_partkey |
> | | mem-estimate=0B mem-reservation=0B thread-reservation=0 |
> | | |
> | 09:EXCHANGE [UNPARTITIONED] |
> | | mem-estimate=10.06MB mem-reservation=0B thread-reservation=0 |
> | | tuple-ids=2,1,0 row-size=48B cardinality=15.00M |
> | | in pipelines: 00(GETNEXT) |
> | | |
> | F00:PLAN FRAGMENT [RANDOM] hosts=1 instances=1 |
> | Per-Host Resources: mem-estimate=264.00MB mem-reservation=16.00MB thread-reservation=2 |
> | 01:SUBPLAN |
> | | mem-estimate=0B mem-reservation=0B thread-reservation=0 |
> | | tuple-ids=2,1,0 row-size=48B cardinality=15.00M |
> | | in pipelines: 00(GETNEXT) |
> | | |
> | |--08:NESTED LOOP JOIN [CROSS JOIN] |
> | | | mem-estimate=20B mem-reservation=0B thread-reservation=0 |
> | | | tuple-ids=2,1,0 row-size=48B cardinality=100 |
> | | | in pipelines: 00(GETNEXT) |
> | | | |
> | | |--02:SINGULAR ROW SRC |
> | | | parent-subplan=01 |
> | | | mem-estimate=0B mem-reservation=0B thread-reservation=0 |
> | | | tuple-ids=0 row-size=20B cardinality=1 |
> | | | in pipelines: 00(GETNEXT) |
> | | | |
> | | 04:SUBPLAN |
> | | | mem-estimate=0B mem-reservation=0B thread-reservation=0 |
> | | | tuple-ids=2,1 row-size=28B cardinality=100 |
> | | | in pipelines: 00(GETNEXT) |
> | | | |
> | | |--07:NESTED LOOP JOIN [CROSS JOIN] |
> | | | | mem-estimate=20B mem-reservation=0B thread-reservation=0 |
> | | | | tuple-ids=2,1 row-size=28B cardinality=10 |
> | | | | in pipelines: 00(GETNEXT) |
> | | | | |
> | | | |--05:SINGULAR ROW SRC |
> | | | | parent-subplan=04 |
> | | | | mem-estimate=0B mem-reservation=0B thread-reservation=0 |
> | | | | tuple-ids=1 row-size=20B cardinality=1 |
> | | | | in pipelines: 00(GETNEXT) |
> | | | | |
> | | | 06:UNNEST [o.o_lineitems li] |
> | | | parent-subplan=04 |
> | | | mem-estimate=0B mem-reservation=0B thread-reservation=0 |
> | | | tuple-ids=2 row-size=0B cardinality=10 |
> | | | in pipelines: 00(GETNEXT) |
> | | | |
> | | 03:UNNEST [c.c_orders o] |
> | | parent-subplan=01 |
> | | mem-estimate=0B mem-reservation=0B thread-reservation=0 |
> | | tuple-ids=1 row-size=0B cardinality=10 |
> | | in pipelines: 00(GETNEXT) |
> | | |
> | 00:SCAN HDFS [tpch_nested_parquet.customer c, RANDOM] |
> | HDFS partitions=1/1 files=4 size=289.13MB |
> | predicates: !empty(c.c_orders) |
> | predicates on o: !empty(o.o_lineitems) |
> | stored statistics: |
> | table: rows=150.00K size=289.13MB |
> | columns missing stats: c_orders |
> | extrapolated-rows=disabled max-scan-range-rows=50.11K |
> | mem-estimate=264.00MB mem-reservation=16.00MB thread-reservation=1 |
> | tuple-ids=0 row-size=20B cardinality=150.00K |
> | in pipelines: 00(GETNEXT) |
> +----------------------------------------------------------------------------------------+
> {code}
>
> It's easy to figure out what node 05 is doing but kind of tricky to understand what 02 is doing.
> One option would be for 02 to have the following annotation or something else more informative:
>
> {{SINGULAR ROW SRC [c.c_orders o, o.o_lineitems li]}}
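
One way to derive such an annotation is to collect the UNNEST annotations found inside the nested plan of the parent SUBPLAN. The following is only a rough sketch under stated assumptions: `PlanNode`, `unnest_refs` and `singular_row_src_label` are hypothetical names made up for illustration, not Impala frontend classes or methods, and the child ordering (input side first, nested side second) is assumed so that the outer collection is listed before the inner one.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanNode:
    """Hypothetical stand-in for a plan node; not Impala's actual class."""
    node_id: int
    name: str                      # e.g. "SUBPLAN", "UNNEST", "SINGULAR ROW SRC"
    detail: Optional[str] = None   # text shown in square brackets, if any
    children: List["PlanNode"] = field(default_factory=list)


def unnest_refs(node: PlanNode) -> List[str]:
    """Pre-order walk that gathers the annotation of every UNNEST node."""
    refs = [node.detail] if node.name == "UNNEST" and node.detail else []
    for child in node.children:
        refs.extend(unnest_refs(child))
    return refs


def singular_row_src_label(nested_root: PlanNode) -> str:
    """Build the label from the nested plan rooted under the parent SUBPLAN."""
    refs = unnest_refs(nested_root)
    if not refs:
        return "SINGULAR ROW SRC"
    return "SINGULAR ROW SRC [" + ", ".join(refs) + "]"


# Nested plan of SUBPLAN 01 from the explain output above.
# Assumed child order: input side first, nested side second.
srs05 = PlanNode(5, "SINGULAR ROW SRC")
unnest06 = PlanNode(6, "UNNEST", "o.o_lineitems li")
join07 = PlanNode(7, "NESTED LOOP JOIN", "CROSS JOIN", [unnest06, srs05])
unnest03 = PlanNode(3, "UNNEST", "c.c_orders o")
subplan04 = PlanNode(4, "SUBPLAN", None, [unnest03, join07])
srs02 = PlanNode(2, "SINGULAR ROW SRC")
join08 = PlanNode(8, "NESTED LOOP JOIN", "CROSS JOIN", [subplan04, srs02])

print(singular_row_src_label(join08))  # label for node 02 (parent-subplan=01)
print(singular_row_src_label(join07))  # label for node 05 (parent-subplan=04)
```

In this sketch node 02 is labelled with both collection references because both UNNEST nodes sit inside the nested plan of SUBPLAN 01, while node 05 only sees the UNNEST inside SUBPLAN 04.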