Posted to issues@drill.apache.org by "Aman Sinha (JIRA)" <ji...@apache.org> on 2014/07/16 18:54:05 UTC

[jira] [Created] (DRILL-1150) Sub-optimal expression pushdown for slightly modified version of Tpch 19

Aman Sinha created DRILL-1150:
---------------------------------

             Summary: Sub-optimal expression pushdown for slightly modified version of Tpch 19
                 Key: DRILL-1150
                 URL: https://issues.apache.org/jira/browse/DRILL-1150
             Project: Apache Drill
          Issue Type: Bug
            Reporter: Aman Sinha
            Assignee: Jinfeng Ni


A slightly modified version of TPCH 19, called 19_1 in the TestTpchDistributed JUnit test suite, produces the following plan on the latest master (commit 699851b). The plan shows several expressions pushed into the Project just above the Lineitem scan (00-07). Ideally these expressions would be evaluated after the join, since there is no need to evaluate an expression for a row that does not qualify for the join. Also notice that there are 2 Projects (00-06 and 00-08) stacked above the Part scan; these should have been merged into one.


00-00    Screen
00-01      StreamAgg(group=[{}], revenue=[SUM($0)])
00-02        Project($f0=[*($2, -(1, $3))])
00-03          SelectionVectorRemover
00-04            Filter(condition=[OR(AND(=($15, 'Brand#41'), OR(=($14, 'SM CASE'), =($14, 'SM BOX'), =($14, 'SM PACK'), =($14, 'SM PKG')), $4, $5, $16, $17, OR(=($0, 'AIR'), =($0, 'AIR REG')), =($6, 'DELIVER IN PERSON')), AND(=($18, 'Brand#13'), OR(=($14, 'MED BAG'), =($14, 'MED BOX'), =($14, 'MED PKG'), =($14, 'MED PACK')), $7, $8, $19, $20, OR(=($0, 'AIR'), =($0, 'AIR REG')), =($9, 'DELIVER IN PERSON')), AND(=($21, 'Brand#55'), OR(=($14, 'LG CASE'), =($14, 'LG BOX'), =($14, 'LG PACK'), =($14, 'LG PKG')), $10, $11, $22, $23, OR(=($0, 'AIR'), =($0, 'AIR REG')), =($12, 'DELIVER IN PERSON')))])
00-05              HashJoin(condition=[=($1, $13)], joinType=[inner])
00-07                Project(l_shipmode=[$5], l_partkey=[$4], l_extendedprice=[$3], l_discount=[$1], $f7=[>=($2, 2)], $f8=[<=($2, +(2, 10))], $f9=[CAST($0):CHAR(17) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f10=[>=($2, 14)], $f11=[<=($2, +(14, 10))], $f12=[CAST($0):CHAR(17) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f13=[>=($2, 23)], $f14=[<=($2, +(23, 10))], $f15=[CAST($0):CHAR(17) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"])
00-09                  ProducerConsumer
00-11                    Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/lineitem.parquet]], selectionRoot=/tpch/lineitem.parquet, columns=[SchemaPath [`l_shipmode`], SchemaPath [`l_partkey`], SchemaPath [`l_extendedprice`], SchemaPath [`l_discount`], SchemaPath [`l_quantity`], SchemaPath [`l_shipinstruct`]]]])
00-06                Project(p_partkey=[$0], p_container=[$1], $f5=[$2], $f6=[$3], $f70=[$4], $f80=[$5], $f90=[$6], $f100=[$7], $f110=[$8], $f120=[$9], $f130=[$10])
00-08                  Project(p_partkey=[$2], p_container=[$3], $f5=[CAST($1):CHAR(8) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f6=[>=($0, 1)], $f7=[<=($0, 5)], $f8=[CAST($1):CHAR(8) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f9=[>=($0, 1)], $f10=[<=($0, 10)], $f11=[CAST($1):CHAR(8) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], $f12=[>=($0, 1)], $f13=[<=($0, 15)])
00-10                    ProducerConsumer
00-12                      Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=/tpch/part.parquet]], selectionRoot=/tpch/part.parquet, columns=[SchemaPath [`p_partkey`], SchemaPath [`p_container`], SchemaPath [`p_brand`], SchemaPath [`p_size`]]]])
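
To illustrate the cost argument, here is a toy Python sketch (not Drill code). It counts how many times the six l_quantity comparisons from Project 00-07 are evaluated when they are computed below the join versus after it. The row counts, the `part_keys` set, and the function names are assumptions made purely for illustration.

```python
# Toy model, not Drill internals: it only counts expression evaluations.

# The six l_quantity comparisons that appear in Project 00-07.
EXPRS = [
    lambda q: q >= 2,  lambda q: q <= 2 + 10,
    lambda q: q >= 14, lambda q: q <= 14 + 10,
    lambda q: q >= 23, lambda q: q <= 23 + 10,
]

def evals_below_join(lineitem, part_keys):
    """Evaluate every expression for every scanned row, then join."""
    evals = 0
    projected = []
    for partkey, quantity in lineitem:
        flags = [e(quantity) for e in EXPRS]
        evals += len(EXPRS)
        projected.append((partkey, flags))
    joined = [row for row in projected if row[0] in part_keys]
    return evals, joined

def evals_after_join(lineitem, part_keys):
    """Join first, then evaluate expressions only for surviving rows."""
    evals = 0
    joined = []
    for partkey, quantity in lineitem:
        if partkey not in part_keys:
            continue
        joined.append((partkey, [e(quantity) for e in EXPRS]))
        evals += len(EXPRS)
    return evals, joined

# Hypothetical data: 1000 lineitem rows, only 10 part keys qualify.
lineitem = [(k, k % 50) for k in range(1000)]
part_keys = set(range(10))

below, rows_below = evals_below_join(lineitem, part_keys)
after, rows_after = evals_after_join(lineitem, part_keys)
assert rows_below == rows_after  # both orders produce the same rows
print(below, after)  # prints: 6000 60
```

In this made-up data set the join keeps 1% of the Lineitem rows, so deferring the expressions cuts the evaluation count by the same factor. The real saving depends on the join selectivity.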

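The Project merge mentioned in the description can be sketched in the same toy style. Project 00-06 only renames the outputs of Project 00-08, so the two could collapse into one by substituting each `$n` reference in the top Project with the n-th expression of the bottom one. The `merge_projects` helper and the abbreviated expression strings are hypothetical; this is not how Drill's planner represents or rewrites expressions.

```python
import re

def merge_projects(top, bottom):
    """Compose two stacked projections into one.

    top and bottom are lists of (output_name, expression) pairs. A
    reference like $2 inside a top expression means "output 2 of bottom".
    """
    bottom_exprs = [expr for _, expr in bottom]

    def substitute(expr):
        # Replace each $n with the bottom expression it points at.
        return re.sub(r"\$(\d+)", lambda m: bottom_exprs[int(m.group(1))], expr)

    return [(name, substitute(expr)) for name, expr in top]

# First five entries of Project 00-08 (bottom) and Project 00-06 (top),
# with the CAST expression abbreviated for readability.
bottom = [("p_partkey", "$2"), ("p_container", "$3"),
          ("$f5", "CAST($1):CHAR(8)"), ("$f6", ">=($0, 1)"), ("$f7", "<=($0, 5)")]
top = [("p_partkey", "$0"), ("p_container", "$1"),
       ("$f5", "$2"), ("$f6", "$3"), ("$f70", "$4")]

merged = merge_projects(top, bottom)
for name, expr in merged:
    print(name, "=", expr)
```

The merged list keeps the output names of the top Project and the expressions of the bottom one, which is the single Project the plan would ideally contain.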


--
This message was sent by Atlassian JIRA
(v6.2#6252)