Posted to issues@spark.apache.org by "Maciej Bryński (JIRA)" <ji...@apache.org> on 2016/10/05 11:52:20 UTC
[jira] [Created] (SPARK-17786) [SPARK 2.0] Sorting algorithm gives higher skewness of output
Maciej Bryński created SPARK-17786:
--------------------------------------
Summary: [SPARK 2.0] Sorting algorithm gives higher skewness of output
Key: SPARK-17786
URL: https://issues.apache.org/jira/browse/SPARK-17786
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.0.1
Reporter: Maciej Bryński
Hi,
I'm using df.sort("column") to sort my data before saving it to Parquet.
With Spark 1.6.2, all partitions were similar in size.
With Spark 2.0.0, three of the partitions are much bigger than the rest.
Can I go back to the previous sorting behaviour?
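
To make "much bigger than the rest" measurable, here is a minimal sketch that flags oversized partitions from a list of per-partition row counts. It assumes the counts have already been collected from the sorted DataFrame (in PySpark, for example, by grouping on spark_partition_id()). The helper names partition_skew and oversized_partitions, the factor threshold, and the sample counts are all hypothetical and are not taken from the job described above.

```python
from statistics import median


def partition_skew(row_counts):
    """Return (largest, median, largest/median) for per-partition row counts."""
    if not row_counts:
        raise ValueError("row_counts must not be empty")
    largest = max(row_counts)
    mid = median(row_counts)
    # Guard against a zero median (more than half of the partitions empty).
    ratio = float("inf") if mid == 0 else largest / mid
    return largest, mid, ratio


def oversized_partitions(row_counts, factor=2.0):
    """Indices of partitions holding more than `factor` times the median count."""
    mid = median(row_counts)
    return [i for i, n in enumerate(row_counts) if n > factor * mid]


# Hypothetical counts for illustration only. With PySpark they could be
# gathered with something like (assumption, not from the original report):
#   from pyspark.sql.functions import spark_partition_id
#   df.sort("column").groupBy(spark_partition_id()).count().collect()
counts = [1000, 1020, 980, 5000, 4800, 5200, 1010, 990]

largest, mid, ratio = partition_skew(counts)
print("largest:", largest, "median:", mid, "ratio:", round(ratio, 2))
print("oversized partition indices:", oversized_partitions(counts))
```

Running the same check against the output of both Spark versions would give comparable numbers for the two behaviours.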
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)