Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:35:58 UTC

[jira] [Resolved] (SPARK-14820) Reduce shuffle data by pushing filter toward storage

     [ https://issues.apache.org/jira/browse/SPARK-14820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-14820.
----------------------------------
    Resolution: Incomplete

> Reduce shuffle data by pushing filter toward storage
> ----------------------------------------------------
>
>                 Key: SPARK-14820
>                 URL: https://issues.apache.org/jira/browse/SPARK-14820
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.6.1
>            Reporter: Ali Tootoonchian
>            Priority: Trivial
>              Labels: bulk-closed
>         Attachments: Reduce Shuffle Data by pushing filter toward storage.pdf
>
>
> The SQL query planner could be made smart enough to push filter operations down toward the storage layer. If the planner is optimized so that I/O to storage is reduced at the cost of running multiple filters (i.e., extra compute), this should be desirable when the system is I/O bound.
> A supporting analysis and an example are attached.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
