Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:23:45 UTC

[jira] [Updated] (SPARK-14715) Provide a way to mask partitions of a Dataset/Dataframe

     [ https://issues.apache.org/jira/browse/SPARK-14715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-14715:
---------------------------------
    Labels: bulk-closed  (was: )

> Provide a way to mask partitions of a Dataset/Dataframe
> -------------------------------------------------------
>
>                 Key: SPARK-14715
>                 URL: https://issues.apache.org/jira/browse/SPARK-14715
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Anderson de Andrade
>            Priority: Major
>              Labels: bulk-closed
>
> If a Dataset/Dataframe has a custom partitioning by key(s), it would be very efficient to simply mask partitions when filtering by the same key(s), so that only the partitions that can contain matching rows are scanned. This feature is already provided by PartitionPruningRDD on RDDs. We need something similar in the Dataset/Dataframe space.
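
To illustrate the idea in the description, here is a minimal, library-free sketch of partition masking. It does not use Spark or PartitionPruningRDD; the function names (partition_of, partition_by_key, filter_with_mask) and the modulo partitioner over integer keys are assumptions made only for this example.

```python
def partition_of(key, num_partitions):
    # Toy partitioner (assumption): integer keys, assigned by modulo.
    return key % num_partitions


def partition_by_key(rows, num_partitions):
    # Distribute (key, value) rows into partitions using the toy partitioner.
    partitions = [[] for _ in range(num_partitions)]
    for key, value in rows:
        partitions[partition_of(key, num_partitions)].append((key, value))
    return partitions


def filter_with_mask(partitions, wanted_keys):
    # Work out which partitions can hold the wanted keys, then scan only those.
    num_partitions = len(partitions)
    mask = {partition_of(key, num_partitions) for key in wanted_keys}
    result = []
    for index in sorted(mask):
        result.extend(row for row in partitions[index] if row[0] in wanted_keys)
    # Return the matching rows and how many partitions were actually scanned.
    return result, len(mask)


rows = [(i, "v%d" % i) for i in range(20)]
parts = partition_by_key(rows, 4)
matches, scanned = filter_with_mask(parts, {5})
print(matches, "scanned", scanned, "of", len(parts), "partitions")
```

Because the filter keys and the partitioning keys are the same, the mask can be computed from the keys alone, and the remaining partitions are never read.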


