Posted to issues@spark.apache.org by "Patrick Woody (JIRA)" <ji...@apache.org> on 2017/07/05 13:50:00 UTC

[jira] [Created] (SPARK-21317) Avoid unnecessary sort in FileFormatWriter if data is already bucketed

Patrick Woody created SPARK-21317:
-------------------------------------

             Summary: Avoid unnecessary sort in FileFormatWriter if data is already bucketed
                 Key: SPARK-21317
                 URL: https://issues.apache.org/jira/browse/SPARK-21317
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.1.1
            Reporter: Patrick Woody


When writing bucketed output, FileFormatWriter always sorts each partition on bucketIdExpression, the partition id produced by the hash bucketing. If the incoming data is already bucketed in that same format, this expression is constant within each partition, so the sort does nothing useful and can be skipped.
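
To illustrate the reasoning, here is a minimal sketch in plain Python (not Spark code). The names bucket_id, bucket_id_is_constant and prepare_for_write are hypothetical, and CRC32 stands in for Spark's Murmur3-based hash. It only shows that when every row in a partition maps to the same bucket id, sorting on that id changes nothing. An actual fix would presumably decide this from the plan's output partitioning rather than by scanning rows.

```python
import zlib
from typing import List, Tuple

# Hypothetical row shape: (bucket column value, payload).
Row = Tuple[str, int]


def bucket_id(key: str, num_buckets: int) -> int:
    """Map a key to a bucket.

    Assumption: CRC32 is a stand-in for Spark's Murmur3-based hash,
    used only to keep this sketch deterministic and dependency-free.
    """
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def bucket_id_is_constant(rows: List[Row], num_buckets: int) -> bool:
    """True if every row in the partition falls into the same bucket."""
    ids = {bucket_id(key, num_buckets) for key, _ in rows}
    return len(ids) <= 1


def prepare_for_write(rows: List[Row], num_buckets: int) -> Tuple[List[Row], bool]:
    """Return the rows in write order and whether a sort was performed."""
    if bucket_id_is_constant(rows, num_buckets):
        # Already bucketed: the sort key is constant, so sorting is a no-op.
        return rows, False
    # Mixed buckets: group rows by bucket id (sorted() is stable).
    return sorted(rows, key=lambda r: bucket_id(r[0], num_buckets)), True
```

Calling prepare_for_write on a partition that was produced by the same bucketing returns the rows untouched and reports that no sort happened. A partition with mixed bucket ids still gets sorted.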



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
