Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2016/08/25 22:43:20 UTC

[jira] [Assigned] (SPARK-17247) when fall back to hdfs is enabled for stats calculation, the hdfs listing and size calculation should be terminated as soon as total size > broadcast threshold

     [ https://issues.apache.org/jira/browse/SPARK-17247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-17247:
------------------------------------

    Assignee:     (was: Apache Spark)

> when fall back to hdfs is enabled for stats calculation, the hdfs listing and size calculation should be terminated as soon as total size > broadcast threshold
> --------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-17247
>                 URL: https://issues.apache.org/jira/browse/SPARK-17247
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Parth Brahmbhatt
>
> Currently, when a user enables spark.sql.statistics.fallBackToHdfs and no stats are available from the metastore, we fall back to HDFS to compute the table size. This is a useful join optimization, but listing the files and summing their sizes can slow things down. To speed up the operation, we could stop the size calculation as soon as the running total exceeds the broadcast threshold, since the exact size does not matter once the table is known to be too large to broadcast.
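
To illustrate the proposal, here is a minimal sketch of an early-terminating size calculation. It is not Spark code: it walks a local directory with os.walk as a stand-in for an HDFS listing, and the names size_exceeds_threshold and threshold_bytes are hypothetical, chosen only for this example.

```python
import os


def size_exceeds_threshold(root, threshold_bytes):
    """Sum file sizes under root, stopping once the total passes the threshold.

    Returns (exceeded, bytes_seen). When exceeded is True, bytes_seen is only
    a lower bound on the real size, because the walk was cut short.
    Assumption: a local directory walk stands in for the HDFS listing.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
            if total > threshold_bytes:
                # Already too large to broadcast: skip the remaining files.
                return True, total
    return False, total
```

The caller only needs the boolean to decide against a broadcast join, so the partial total is enough; for a small table the walk completes and the total is exact.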


