Posted to issues@drill.apache.org by "Aman Sinha (JIRA)" <ji...@apache.org> on 2014/07/16 18:58:04 UTC

[jira] [Commented] (DRILL-1150) Sub-optimal expression pushdown for slightly modified version of Tpch 19

    [ https://issues.apache.org/jira/browse/DRILL-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063712#comment-14063712 ] 

Aman Sinha commented on DRILL-1150:
-----------------------------------

Correction: the 2 Projects are above the Part scan, not the Lineitem scan. The expression pushdown occurs on both sides of the join, for Lineitem as well as Part.
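
To illustrate what merging the two stacked Projects would mean, here is a minimal sketch in Python. It is not Drill's planner code: the list-of-pairs representation and the merge_projects helper are hypothetical, and the operator contents are abbreviated from 00-06 and 00-08 in the quoted plan. It assumes the outer Project only references inner outputs by ordinal, which is the case in that plan.

```python
# Hypothetical model: a Project is a list of (output_name, expression) pairs.
# In the outer Project every expression is a bare input ordinal ("$n") that
# refers to the n-th output of the inner Project.

def merge_projects(outer, inner):
    """Compose two stacked Projects into a single equivalent Project."""
    merged = []
    for name, ref in outer:
        ordinal = int(ref.lstrip("$"))     # "$2" -> 2
        _, inner_expr = inner[ordinal]     # expression that ordinal points at
        merged.append((name, inner_expr))  # keep outer name, inline inner expr
    return merged

# Abbreviated version of operator 00-08 (the inner Project).
inner = [
    ("p_partkey", "$2"),
    ("p_container", "$3"),
    ("$f5", "CAST($1):CHAR(8)"),
    ("$f6", ">=($0, 1)"),
    ("$f7", "<=($0, 5)"),
]

# Abbreviated version of operator 00-06 (the outer Project).
outer = [
    ("p_partkey", "$0"),
    ("p_container", "$1"),
    ("$f5", "$2"),
    ("$f6", "$3"),
    ("$f70", "$4"),
]

merged = merge_projects(outer, inner)
for name, expr in merged:
    print(f"{name}=[{expr}]")
```

The merged Project keeps the outer output names (for example $f70) and evaluates the inner expressions directly against the scan columns, so one operator does the work of two.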

> Sub-optimal expression pushdown for slightly modified version of Tpch 19
> ------------------------------------------------------------------------
>
>                 Key: DRILL-1150
>                 URL: https://issues.apache.org/jira/browse/DRILL-1150
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Aman Sinha
>            Assignee: Jinfeng Ni
>
> A slightly modified version of TPCH 19, called 19_1 in the TestTpchDistributed JUnit test suite, produces the following plan on the latest master (version 699851b). The plan shows several expressions pushed into the Project just above the Lineitem scan, whereas these expressions should ideally be evaluated after the join, since there is no need to evaluate an expression for a row that does not qualify for the join. Also notice that there are 2 Projects above the Lineitem scan; these should have been merged into one.
> 00-00    Screen
> 00-01      StreamAgg(group=[{}], revenue=[SUM($0)])
> 00-02        Project($f0=[*($2, -(1, $3))])
> 00-03          SelectionVectorRemover
> 00-04            Filter(condition=[OR(AND(=($15, 'Brand#41'), OR(=($14, 'SM CASE'), =($14, 'SM BOX'), =($14, 'SM PACK'), =($14, 'SM PKG')), $4, $5, $16, $17, OR(=($0, 'AIR'), =($0, 'AIR REG')), =($6, 'DELIVER IN PERSON')), AND(=($18, 'Brand#13'), OR(=($14, 'MED BAG'), =($14, 'MED BOX'), =($14, 'MED PKG'), =($14, 'MED PACK')), $7, $8, $19, $20, OR(=($0, 'AIR'), =($0, 'AIR REG')), =($9, 'DELIVER IN PERSON')), AND(=($21, 'Brand#55'), OR(=($14, 'LG CASE'), =($14, 'LG BOX'), =($14, 'LG PACK'), =($14, 'LG PKG')), $10, $11, $22, $23, OR(=($0, 'AIR'), =($0, 'AIR REG')), =($12, 'DELIVER IN PERSON')))])
> 00-05              HashJoin(condition=[=($1, $13)], joinType=[inner])
> 00-07                Project(l_shipmode=[$5], l_partkey=[$4], l_extendedprice=[$3], l_discount=[$1], $f7=[>=($2, 2)], $f8=[<=($2, +(2, 10))], $f9=[CAST($0):CHAR(17) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f10=[>=($2, 14)], $f11=[<=($2, +(14, 10))], $f12=[CAST($0):CHAR(17) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f13=[>=($2, 23)], $f14=[<=($2, +(23, 10))], $f15=[CAST($0):CHAR(17) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"])
> 00-09                  ProducerConsumer
> 00-11                    Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/lineitem.parquet]], selectionRoot=/tpch/lineitem.parquet, columns=[SchemaPath [`l_shipmode`], SchemaPath [`l_partkey`], SchemaPath [`l_extendedprice`], SchemaPath [`l_discount`], SchemaPath [`l_quantity`], SchemaPath [`l_shipinstruct`]]]])
> 00-06                Project(p_partkey=[$0], p_container=[$1], $f5=[$2], $f6=[$3], $f70=[$4], $f80=[$5], $f90=[$6], $f100=[$7], $f110=[$8], $f120=[$9], $f130=[$10])
> 00-08                  Project(p_partkey=[$2], p_container=[$3], $f5=[CAST($1):CHAR(8) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f6=[>=($0, 1)], $f7=[<=($0, 5)], $f8=[CAST($1):CHAR(8) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f9=[>=($0, 1)], $f10=[<=($0, 10)], $f11=[CAST($1):CHAR(8) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f12=[>=($0, 1)], $f13=[<=($0, 15)])
> 00-10                    ProducerConsumer
> 00-12                      Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/part.parquet]], selectionRoot=/tpch/part.parquet, columns=[SchemaPath [`p_partkey`], SchemaPath [`p_container`], SchemaPath [`p_brand`], SchemaPath [`p_size`]]]])
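
The cost argument in the description (no need to evaluate an expression for a row that does not qualify for the join) can be sketched with a toy example in Python. The rows, the count_evals helper, and the range check are hypothetical stand-ins chosen for illustration; the check only resembles the $f7/$f8 pair in the plan, and the data is not from TPC-H.

```python
def count_evals(lineitem, part_keys, push_down):
    """Return (number of expression evaluations, joined rows)."""
    evals = 0

    def expr(qty):
        # Stand-in for a pushed-down boolean expression on a row.
        nonlocal evals
        evals += 1
        return 2 <= qty <= 12

    if push_down:
        # Project below the join: the expression runs for every scanned row.
        projected = [(key, expr(qty)) for key, qty in lineitem]
        joined = [(key, flag) for key, flag in projected if key in part_keys]
    else:
        # Expression evaluated after the join: only surviving rows pay for it.
        joined = [(key, expr(qty)) for key, qty in lineitem if key in part_keys]
    return evals, joined

# Hypothetical (partkey, quantity) rows and the set of matching part keys.
lineitem = [(1, 5), (2, 20), (3, 7), (4, 1)]
part_keys = {1, 3}

pushed_evals, pushed_rows = count_evals(lineitem, part_keys, push_down=True)
late_evals, late_rows = count_evals(lineitem, part_keys, push_down=False)
print(pushed_evals, late_evals)
```

Both variants produce the same joined rows; the difference is only in how many times the expression is evaluated. The saving depends on how selective the join is: if nearly every row finds a join partner, evaluating after the join gains little.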


