Posted to issues@spark.apache.org by "Vishesh Garg (JIRA)" <ji...@apache.org> on 2015/10/29 04:58:27 UTC

[jira] [Created] (SPARK-11390) Query plan with/without filterPushdown indistinguishable

Vishesh Garg created SPARK-11390:
------------------------------------

             Summary: Query plan with/without filterPushdown indistinguishable
                 Key: SPARK-11390
                 URL: https://issues.apache.org/jira/browse/SPARK-11390
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.5.1
         Environment: All
            Reporter: Vishesh Garg
            Priority: Minor


The physical plan of a query is identical whether the spark.sql.orc.filterPushdown flag is set to "true" or "false", as the following session shows:
======
scala> sqlContext.setConf("spark.sql.orc.filterPushdown", "false")
scala> sqlContext.sql("SELECT name FROM people WHERE age = 15").explain()
== Physical Plan ==
Project [name#6]
 Filter (age#7 = 15)
  Scan OrcRelation[hdfs://localhost:9000/user/spec/people][name#6,age#7]
scala> sqlContext.setConf("spark.sql.orc.filterPushdown", "true")
scala> sqlContext.sql("SELECT name FROM people WHERE age = 15").explain()
== Physical Plan ==
Project [name#6]
 Filter (age#7 = 15)
  Scan OrcRelation[hdfs://localhost:9000/user/spec/people][name#6,age#7]
======
Ideally, when the filterPushdown flag is set to "true", the scan and filter nodes should be merged so that the plan makes it clear the filtering is being done by the data source itself.
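
The comparison can also be made programmatically rather than by reading the two plans by eye. The following is a minimal sketch for spark-shell, assuming the predefined sqlContext and the "people" table from the session in this report. The helper name planWith is hypothetical, and queryExecution is a developer API whose shape may differ between Spark versions.

```scala
import org.apache.spark.sql.SQLContext

// Hypothetical helper (not part of Spark): set the ORC pushdown flag,
// then return the physical plan of `query` as text.
def planWith(sqlContext: SQLContext, flag: String, query: String): String = {
  sqlContext.setConf("spark.sql.orc.filterPushdown", flag)
  // queryExecution is a developer API; executedPlan is the physical plan
  // that explain() prints.
  sqlContext.sql(query).queryExecution.executedPlan.toString
}

// Assumes the "people" table from the report is already registered.
val query = "SELECT name FROM people WHERE age = 15"
val planOff = planWith(sqlContext, "false", query)
val planOn  = planWith(sqlContext, "true", query)

// If the two strings are equal, the plan text gives no indication of
// whether the filter was pushed down to the ORC reader.
println(s"plans identical: ${planOff == planOn}")
```

Comparing the plan strings only shows what the plan reports, not whether pushdown actually happened at the data source, which is the distinction this issue is about.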