Posted to issues@spark.apache.org by "Robert Kruszewski (JIRA)" <ji...@apache.org> on 2016/08/10 01:28:20 UTC
[jira] [Created] (SPARK-16984) executeTake tries all partitions if first partition is empty
Robert Kruszewski created SPARK-16984:
-----------------------------------------
Summary: executeTake tries all partitions if first partition is empty
Key: SPARK-16984
URL: https://issues.apache.org/jira/browse/SPARK-16984
Project: Spark
Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Robert Kruszewski
In executeTake, if the first partition returns 0 rows, we try all remaining partitions on the next iteration. This leads to pathological cases where the first partition is empty and the rest have data, which unfortunately can happen with skewed data. Empirically, it is better to make a few extra round trips than to risk killing the driver with one big collect.
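
To make the cost difference concrete, here is a minimal simulation of the two scan strategies in plain Python. It is a sketch under stated assumptions, not Spark's actual code: the function name, the `scale_up` factor of 4, and the interpolation branch are illustrative choices, and each partition is modeled only by its row count.

```python
def partitions_scanned_per_round(rows_per_partition, n, grow_all_when_empty, scale_up=4):
    """Simulate a take(n)-style scan; return how many partitions each round reads.

    rows_per_partition: row count of each partition, in scan order.
    grow_all_when_empty: True models the behavior described in this report
        (an empty first round makes the next round read every remaining partition).
    scale_up: hypothetical growth factor for the incremental alternative.
    """
    total = len(rows_per_partition)
    collected = 0   # rows gathered so far
    scanned = 0     # partitions read so far
    rounds = []
    while collected < n and scanned < total:
        if scanned == 0:
            to_try = 1                      # always start with one partition
        elif collected == 0:
            if grow_all_when_empty:
                to_try = total - scanned    # jump straight to everything left
            else:
                to_try = scanned * scale_up # grow geometrically instead
        else:
            # Assumed interpolation: estimate partitions needed, overshoot by 50%.
            to_try = int(1.5 * n * scanned / collected) - scanned
        to_try = max(1, min(to_try, total - scanned))
        batch = rows_per_partition[scanned:scanned + to_try]
        # Each partition contributes at most n rows to the driver.
        collected += sum(min(rows, n) for rows in batch)
        scanned += to_try
        rounds.append(to_try)
    return rounds


# Skewed input: the first partition is empty, the other 99 hold 5 rows each.
skewed = [0] + [5] * 99
print(partitions_scanned_per_round(skewed, 1, grow_all_when_empty=True))   # [1, 99]
print(partitions_scanned_per_round(skewed, 1, grow_all_when_empty=False))  # [1, 4]
```

For a request of a single row, the first strategy reads all 99 remaining partitions in its second round, while the incremental one reads 4. The incremental strategy can need more rounds when many leading partitions are empty, which is the trade-off the report argues is worth making.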