Posted to issues-all@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2019/04/03 18:14:02 UTC
[jira] [Assigned] (IMPALA-7293) Show tuple layout in explain plan

     [ https://issues.apache.org/jira/browse/IMPALA-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong reassigned IMPALA-7293:
-------------------------------------

    Assignee: Abhishek Rawat

> Show tuple layout in explain plan
> ---------------------------------
>
>                 Key: IMPALA-7293
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7293
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>            Reporter: Tim Armstrong
>            Assignee: Abhishek Rawat
>            Priority: Major
>              Labels: observability
>
> There's currently no way to tell in the explain plan what the contents of each tuple are. At explain_level>=2 we include "tuple-ids" but no information about what is actually in the tuples.
> {noformat}
> [localhost:21000] default> explain select min(regexp_replace(l_comment, ".", "x"))
> from tpch.lineitem; summary;
> Query: explain select min(regexp_replace(l_comment, ".", "x"))
> from tpch.lineitem
> +---------------------------------------------------------------------------------------+
> | Explain String                                                                        |
> +---------------------------------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=8.00MB Threads=3                            |
> | Per-Host Resource Estimates: Memory=284.00MB                                          |
> |                                                                                       |
> | F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1                                 |
> | |  Per-Host Resources: mem-estimate=10.00MB mem-reservation=0B thread-reservation=1   |
> | PLAN-ROOT SINK                                                                        |
> | |  mem-estimate=0B mem-reservation=0B thread-reservation=0                            |
> | |                                                                                     |
> | 03:AGGREGATE [FINALIZE]                                                               |
> | |  output: min:merge(regexp_replace(l_comment, '.', 'x'))                             |
> | |  mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB thread-reservation=0   |
> | |  tuple-ids=1 row-size=16B cardinality=1                                             |
> | |                                                                                     |
> | 02:EXCHANGE [UNPARTITIONED]                                                           |
> | |  mem-estimate=0B mem-reservation=0B thread-reservation=0                            |
> | |  tuple-ids=1 row-size=16B cardinality=1                                             |
> | |                                                                                     |
> | F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3                                        |
> | Per-Host Resources: mem-estimate=274.00MB mem-reservation=8.00MB thread-reservation=2 |
> | 01:AGGREGATE                                                                          |
> | |  output: min(regexp_replace(l_comment, '.', 'x'))                                   |
> | |  mem-estimate=10.00MB mem-reservation=0B spill-buffer=2.00MB thread-reservation=0   |
> | |  tuple-ids=1 row-size=16B cardinality=1                                             |
> | |                                                                                     |
> | 00:SCAN HDFS [tpch.lineitem, RANDOM]                                                  |
> |    partitions=1/1 files=1 size=718.94MB                                               |
> |    stored statistics:                                                                 |
> |      table: rows=6001215 size=718.94MB                                                |
> |      columns: all                                                                     |
> |    extrapolated-rows=disabled max-scan-range-rows=1068457                             |
> |    mem-estimate=264.00MB mem-reservation=8.00MB thread-reservation=1                  |
> |    tuple-ids=0 row-size=42B cardinality=6001215                                       |
> +---------------------------------------------------------------------------------------+
> Fetched 32 row(s) in 0.01s
> Summary not available
> {noformat}
> We already have debugString() methods that print a human-readable representation. We could start off by printing one tuple descriptor per line at the end of the explain plan, with basic information. I think we should print only materialised slots, in order of offset. For example, for "select l_comment from tpch.lineitem where l_orderkey < 10" we would print something like this at the end of the explain plan:
> {noformat}
> F01:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
> |  Per-Host Resources: mem-estimate=9.02MB mem-reservation=0B thread-reservation=1
> PLAN-ROOT SINK
> |  mem-estimate=0B mem-reservation=0B thread-reservation=0
> |
> 01:EXCHANGE [UNPARTITIONED]
> |  mem-estimate=9.02MB mem-reservation=0B thread-reservation=0
> |  tuple-ids=0 row-size=46B cardinality=600122
> |  in pipelines: 00(GETNEXT)
> |
> F00:PLAN FRAGMENT [RANDOM] hosts=3 instances=3
> Per-Host Resources: mem-estimate=264.00MB mem-reservation=8.00MB thread-reservation=2
> 00:SCAN HDFS [tpch.lineitem, RANDOM]
>    partitions=1/1 files=1 size=718.94MB
>    predicates: l_orderkey < CAST(10 AS BIGINT)
>    stored statistics:
>      table: rows=6001215 size=718.94MB
>      columns: all
>    extrapolated-rows=disabled max-scan-range-rows=1068457
>    mem-estimate=264.00MB mem-reservation=8.00MB thread-reservation=1
>    tuple-ids=0 row-size=46B cardinality=600122
>    in pipelines: 00(GETNEXT)
> Tuple 1:
> Slot 1: offset=0 type=STRING path=tpch.lineitem.l_comment nullable=true
> Slot 2: offset=12 type=BIGINT path=tpch.lineitem.l_orderkey nullable=true
> {noformat}
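
The proposal above (print only materialised slots, ordered by offset) could be sketched roughly as follows. This is a minimal illustration, not Impala's implementation: SlotInfo and TupleLayoutSketch are hypothetical stand-ins invented for this example and are not the frontend's real descriptor classes, and the slot values are taken from the sample output in the description, assuming the BIGINT slot refers to l_orderkey.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a slot descriptor; not an Impala class.
final class SlotInfo {
  final int id;
  final int byteOffset;
  final String type;
  final String path;
  final boolean nullable;
  final boolean materialized;

  SlotInfo(int id, int byteOffset, String type, String path,
      boolean nullable, boolean materialized) {
    this.id = id;
    this.byteOffset = byteOffset;
    this.type = type;
    this.path = path;
    this.nullable = nullable;
    this.materialized = materialized;
  }
}

// Hypothetical helper that formats a tuple layout for the explain plan.
final class TupleLayoutSketch {
  // Returns a header line for the tuple followed by one line per
  // materialized slot, ordered by byte offset.
  static String explainTuple(int tupleId, List<SlotInfo> slots) {
    List<SlotInfo> materialized = new ArrayList<>();
    for (SlotInfo s : slots) {
      if (s.materialized) materialized.add(s);  // skip unmaterialized slots
    }
    materialized.sort(Comparator.comparingInt((SlotInfo s) -> s.byteOffset));

    StringBuilder out = new StringBuilder();
    out.append("Tuple ").append(tupleId).append(":\n");
    for (SlotInfo s : materialized) {
      out.append("Slot ").append(s.id)
          .append(": offset=").append(s.byteOffset)
          .append(" type=").append(s.type)
          .append(" path=").append(s.path)
          .append(" nullable=").append(s.nullable)
          .append('\n');
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // Slots are deliberately listed out of offset order; the third one is
    // unmaterialized and should not appear in the output.
    List<SlotInfo> slots = Arrays.asList(
        new SlotInfo(2, 12, "BIGINT", "tpch.lineitem.l_orderkey", true, true),
        new SlotInfo(1, 0, "STRING", "tpch.lineitem.l_comment", true, true),
        new SlotInfo(3, 20, "INT", "tpch.lineitem.l_partkey", true, false));
    System.out.print(explainTuple(1, slots));
  }
}
```

Sorting by offset rather than by slot id keeps the printed lines in the same order as the in-memory layout, which is the property the description asks for.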



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org