Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2016/03/23 06:54:25 UTC

[jira] [Commented] (SPARK-14091) Consider improving performance of SparkContext.getCallSite()

    [ https://issues.apache.org/jira/browse/SPARK-14091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15207914#comment-15207914 ] 

Apache Spark commented on SPARK-14091:
--------------------------------------

User 'rajeshbalamohan' has created a pull request for this issue:
https://github.com/apache/spark/pull/11911

> Consider improving performance of SparkContext.getCallSite()
> ------------------------------------------------------------
>
>                 Key: SPARK-14091
>                 URL: https://issues.apache.org/jira/browse/SPARK-14091
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: Rajesh Balamohan
>
> Currently SparkContext.getCallSite() makes a call to Utils.getCallSite().
> {noformat}
>   private[spark] def getCallSite(): CallSite = {
>     val callSite = Utils.getCallSite()
>     CallSite(
>       Option(getLocalProperty(CallSite.SHORT_FORM)).getOrElse(callSite.shortForm),
>       Option(getLocalProperty(CallSite.LONG_FORM)).getOrElse(callSite.longForm)
>     )
>   }
> {noformat}
> However, in some places Utils.withDummyCallSite(sc) is invoked to avoid the expensive thread dumps inside getCallSite().  But Utils.getCallSite() is evaluated eagerly regardless, so the thread dumps are still computed.  This hurts when lots of RDDs are created (e.g. close to 3-7 seconds are spent when 1000+ RDDs are present, which is significant when the entire query runtime is on the order of 10-20 seconds).
> Creating this JIRA to consider evaluating Utils.getCallSite() only when needed.
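The deferred evaluation the reporter is asking for can be sketched with a Scala `lazy val`: the expensive stack walk runs at most once, and only if neither local property is set. This is a hypothetical illustration with stand-in names (CallSite, expensiveGetCallSite, getCallSite), not the actual patch from the linked pull request:

```scala
// Sketch of lazy call-site resolution. `stackWalks` counts how often the
// expensive path (standing in for Utils.getCallSite()'s thread dump) runs.
case class CallSite(shortForm: String, longForm: String)

object LazyCallSiteDemo {
  var stackWalks = 0

  // Stand-in for Utils.getCallSite(), which walks the current thread's stack.
  def expensiveGetCallSite(): CallSite = {
    stackWalks += 1
    CallSite("demo at <console>:1", "demo full stack trace")
  }

  // The two Options stand in for the CallSite.SHORT_FORM / LONG_FORM
  // local properties that Utils.withDummyCallSite(sc) pre-populates.
  def getCallSite(localShort: Option[String], localLong: Option[String]): CallSite = {
    // `lazy val` instead of `val`: evaluated only if some getOrElse needs it,
    // and shared between the short and long forms so it runs at most once.
    lazy val callSite = expensiveGetCallSite()
    CallSite(
      localShort.getOrElse(callSite.shortForm),
      localLong.getOrElse(callSite.longForm)
    )
  }

  def main(args: Array[String]): Unit = {
    // Dummy call site set: no stack walk happens at all.
    getCallSite(Some("short"), Some("long"))
    // No local properties: exactly one stack walk serves both forms.
    getCallSite(None, None)
    println(stackWalks)
  }
}
```

With `val` (as in the quoted snippet), the stack walk would run even when both local properties are set; the one-word change to `lazy val` skips it entirely in that case.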



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org