Posted to issues-all@impala.apache.org by "Shant Hovsepian (Jira)" <ji...@apache.org> on 2020/04/17 14:28:00 UTC

[jira] [Created] (IMPALA-9671) Improve SINGULAR ROW SRC Node Explain Output

Shant Hovsepian created IMPALA-9671:
---------------------------------------

             Summary: Improve SINGULAR ROW SRC Node Explain Output
                 Key: IMPALA-9671
                 URL: https://issues.apache.org/jira/browse/IMPALA-9671
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
            Reporter: Shant Hovsepian
            Assignee: Gabor Kaszab


For queries that involve more than one level of unnesting with complex/nested types, the explain output can be tricky to read and reason about. The SUBPLAN node produces a tree shape that's not quite the same as that of other node types. In particular, it can be tricky to understand what a SINGULAR ROW SRC node is acting on or producing.

Currently the explain output for a SINGULAR ROW SRC node gives no indication of what it operates on. It is not guaranteed, but leaf nodes in an Impala plan tree are usually annotated in square brackets "[]" with the input source they work on, as SCAN and UNNEST nodes are. SINGULAR ROW SRC has no such annotation. It would be great to fix this so that, in explain strings,

{{SINGULAR ROW SRC }}

_{{becomes}}_

{{SINGULAR ROW SRC [input]}}

Take the query below (SET EXPLAIN_LEVEL=3):

 
{code:java}
Query: explain select c_custkey, o_orderkey, l_partkey from customer c, c.c_orders o, o.o_lineitems as li
+----------------------------------------------------------------------------------------+
| Explain String                                                                         |
+----------------------------------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=16.00MB Threads=3                            |
| Per-Host Resource Estimates: Memory=274MB                                              |
| Analyzed query: SELECT c_custkey, o_orderkey, l_partkey FROM                           |
| tpch_nested_parquet.customer c, c.c_orders o, o.o_lineitems li                         |
|                                                                                        |
| F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1                                  |
| |  Per-Host Resources: mem-estimate=10.06MB mem-reservation=0B thread-reservation=1    |
| PLAN-ROOT SINK                                                                         |
| |  output exprs: c_custkey, o_orderkey, l_partkey                                      |
| |  mem-estimate=0B mem-reservation=0B thread-reservation=0                             |
| |                                                                                      |
| 09:EXCHANGE [UNPARTITIONED]                                                            |
| |  mem-estimate=10.06MB mem-reservation=0B thread-reservation=0                        |
| |  tuple-ids=2,1,0 row-size=48B cardinality=15.00M                                     |
| |  in pipelines: 00(GETNEXT)                                                           |
| |                                                                                      |
| F00:PLAN FRAGMENT [RANDOM] hosts=1 instances=1                                         |
| Per-Host Resources: mem-estimate=264.00MB mem-reservation=16.00MB thread-reservation=2 |
| 01:SUBPLAN                                                                             |
| |  mem-estimate=0B mem-reservation=0B thread-reservation=0                             |
| |  tuple-ids=2,1,0 row-size=48B cardinality=15.00M                                     |
| |  in pipelines: 00(GETNEXT)                                                           |
| |                                                                                      |
| |--08:NESTED LOOP JOIN [CROSS JOIN]                                                    |
| |  |  mem-estimate=20B mem-reservation=0B thread-reservation=0                         |
| |  |  tuple-ids=2,1,0 row-size=48B cardinality=100                                     |
| |  |  in pipelines: 00(GETNEXT)                                                        |
| |  |                                                                                   |
| |  |--02:SINGULAR ROW SRC                                                              |
| |  |     parent-subplan=01                                                             |
| |  |     mem-estimate=0B mem-reservation=0B thread-reservation=0                       |
| |  |     tuple-ids=0 row-size=20B cardinality=1                                        |
| |  |     in pipelines: 00(GETNEXT)                                                     |
| |  |                                                                                   |
| |  04:SUBPLAN                                                                          |
| |  |  mem-estimate=0B mem-reservation=0B thread-reservation=0                          |
| |  |  tuple-ids=2,1 row-size=28B cardinality=100                                       |
| |  |  in pipelines: 00(GETNEXT)                                                        |
| |  |                                                                                   |
| |  |--07:NESTED LOOP JOIN [CROSS JOIN]                                                 |
| |  |  |  mem-estimate=20B mem-reservation=0B thread-reservation=0                      |
| |  |  |  tuple-ids=2,1 row-size=28B cardinality=10                                     |
| |  |  |  in pipelines: 00(GETNEXT)                                                     |
| |  |  |                                                                                |
| |  |  |--05:SINGULAR ROW SRC                                                           |
| |  |  |     parent-subplan=04                                                          |
| |  |  |     mem-estimate=0B mem-reservation=0B thread-reservation=0                    |
| |  |  |     tuple-ids=1 row-size=20B cardinality=1                                     |
| |  |  |     in pipelines: 00(GETNEXT)                                                  |
| |  |  |                                                                                |
| |  |  06:UNNEST [o.o_lineitems li]                                                     |
| |  |     parent-subplan=04                                                             |
| |  |     mem-estimate=0B mem-reservation=0B thread-reservation=0                       |
| |  |     tuple-ids=2 row-size=0B cardinality=10                                        |
| |  |     in pipelines: 00(GETNEXT)                                                     |
| |  |                                                                                   |
| |  03:UNNEST [c.c_orders o]                                                            |
| |     parent-subplan=01                                                                |
| |     mem-estimate=0B mem-reservation=0B thread-reservation=0                          |
| |     tuple-ids=1 row-size=0B cardinality=10                                           |
| |     in pipelines: 00(GETNEXT)                                                        |
| |                                                                                      |
| 00:SCAN HDFS [tpch_nested_parquet.customer c, RANDOM]                                  |
|    HDFS partitions=1/1 files=4 size=289.13MB                                           |
|    predicates: !empty(c.c_orders)                                                      |
|    predicates on o: !empty(o.o_lineitems)                                              |
|    stored statistics:                                                                  |
|      table: rows=150.00K size=289.13MB                                                 |
|      columns missing stats: c_orders                                                   |
|    extrapolated-rows=disabled max-scan-range-rows=50.11K                               |
|    mem-estimate=264.00MB mem-reservation=16.00MB thread-reservation=1                  |
|    tuple-ids=0 row-size=20B cardinality=150.00K                                        |
|    in pipelines: 00(GETNEXT)                                                           |
+----------------------------------------------------------------------------------------+
{code}
 

It's easy to figure out what node 05 is doing, but harder to understand what node 02 is doing.
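
With the current output, the only hint is the parent-subplan line under each node. The following is a rough sketch, assuming explain text in the same layout as the output above, that pairs each SINGULAR ROW SRC node with the SUBPLAN it reports as its parent. The class and method names are hypothetical and are not part of Impala.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not Impala code: scans explain text laid out like the
// output above and maps each SINGULAR ROW SRC node id to its parent SUBPLAN id.
final class SingularRowSrcParents {
    // A node header such as "02:SINGULAR ROW SRC" or "06:UNNEST [...]".
    // The lookbehind skips fragment headers such as "F01:PLAN FRAGMENT".
    private static final Pattern NODE_HEADER =
        Pattern.compile("(?<![A-Za-z0-9])(\\d+):([A-Z][A-Z ]*)");
    private static final Pattern PARENT = Pattern.compile("parent-subplan=(\\d+)");

    static Map<String, String> parents(List<String> explainLines) {
        Map<String, String> result = new LinkedHashMap<>();
        String pending = null; // id of the SINGULAR ROW SRC node being read, if any
        for (String line : explainLines) {
            Matcher header = NODE_HEADER.matcher(line);
            if (header.find()) {
                boolean isSrc = header.group(2).trim().equals("SINGULAR ROW SRC");
                pending = isSrc ? header.group(1) : null;
                continue;
            }
            Matcher parent = PARENT.matcher(line);
            if (pending != null && parent.find()) {
                result.put(pending, parent.group(1));
                pending = null;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "| |  |--02:SINGULAR ROW SRC          |",
            "| |  |     parent-subplan=01         |",
            "| |  04:SUBPLAN                      |",
            "| |  |  |--05:SINGULAR ROW SRC       |",
            "| |  |  |     parent-subplan=04      |",
            "| |  |  06:UNNEST [o.o_lineitems li] |",
            "| |  |     parent-subplan=04         |");
        System.out.println(parents(lines));
    }
}
```

This only recovers which SUBPLAN each node belongs to; it does not say what the node acts on, which is the information missing from the explain output.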

One option would be for node 02 to carry the following annotation, or something more informative:

 

{{SINGULAR ROW SRC [c.c_orders o, o.o_lineitems li]}}
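
A minimal sketch of how such a label could be assembled, assuming the planner already knows which inputs to list. The names `ExplainLabelSketch` and `label` are hypothetical and are not taken from Impala's frontend; only the bracketed format follows the proposal above.

```java
import java.util.List;

// Hypothetical helper, not Impala code: builds the first line of a plan node's
// explain output, appending the inputs in square brackets when there are any.
final class ExplainLabelSketch {
    static String label(String nodeId, String displayName, List<String> inputs) {
        StringBuilder sb = new StringBuilder();
        sb.append(nodeId).append(':').append(displayName);
        if (!inputs.isEmpty()) {
            sb.append(" [").append(String.join(", ", inputs)).append(']');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Current form: no annotation.
        System.out.println(label("02", "SINGULAR ROW SRC", List.of()));
        // Proposed form for node 02 in the example above.
        System.out.println(label("02", "SINGULAR ROW SRC",
            List.of("c.c_orders o", "o.o_lineitems li")));
    }
}
```

Which inputs belong in the brackets for a given node is the open design question of this issue; the sketch only covers the formatting.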


