Posted to issues@spark.apache.org by "Rajkishore Hembram (JIRA)" <ji...@apache.org> on 2017/11/21 11:25:00 UTC

[jira] [Created] (SPARK-22573) SQL Planner is including unnecessary columns in the projection

Rajkishore Hembram created SPARK-22573:
------------------------------------------

             Summary: SQL Planner is including unnecessary columns in the projection
                 Key: SPARK-22573
                 URL: https://issues.apache.org/jira/browse/SPARK-22573
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.2.0
            Reporter: Rajkishore Hembram


While running TPC-H query 18 for benchmarking, I observed that the query plan produced by Apache Spark 2.2.0 is less efficient than the plans produced by other versions. Apache Spark 2.0.2 and 2.1.2 include only the required columns in the projections, but the query planner in Apache Spark 2.2.0 includes unnecessary columns in the projections for some queries, which needlessly increases I/O. As a result, Apache Spark 2.2.0 takes more time to run these queries.
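
To illustrate the kind of check involved, here is a minimal Python sketch that compares the columns a plan scans against the columns the query actually references. The required-column sets are read off the standard TPC-H query 18 text. The helper name `unnecessary_columns` and the `example_scan` value are hypothetical, chosen only for illustration; `example_scan` is not taken from a real Spark 2.2.0 plan and would have to be replaced with the columns listed in the scan nodes of each version's plan.

```python
# Columns referenced by the standard TPC-H query 18, grouped by table.
Q18_REQUIRED = {
    "customer": {"c_name", "c_custkey"},
    "orders": {"o_orderkey", "o_custkey", "o_orderdate", "o_totalprice"},
    "lineitem": {"l_orderkey", "l_quantity"},
}


def unnecessary_columns(scanned, required):
    """Return, per table, the scanned columns that the query never references."""
    extra = {}
    for table, cols in scanned.items():
        surplus = sorted(set(cols) - required.get(table, set()))
        if surplus:
            extra[table] = surplus
    return extra


# HYPOTHETICAL scan columns, for illustration only (not real planner output).
example_scan = {
    "customer": ["c_custkey", "c_name"],
    "orders": ["o_orderkey", "o_custkey", "o_totalprice", "o_orderdate", "o_comment"],
    "lineitem": ["l_orderkey", "l_quantity"],
}

print(unnecessary_columns(example_scan, Q18_REQUIRED))
```

Running the same comparison on the scan columns from each Spark version would show which tables, if any, are read with extra columns.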
