Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:13:00 UTC

[jira] [Resolved] (SPARK-18404) RPC call from executor to driver blocks when getting map output locations (Netty Only)

     [ https://issues.apache.org/jira/browse/SPARK-18404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-18404.
----------------------------------
    Resolution: Incomplete

> RPC call from executor to driver blocks when getting map output locations (Netty Only)
> --------------------------------------------------------------------------------------
>
>                 Key: SPARK-18404
>                 URL: https://issues.apache.org/jira/browse/SPARK-18404
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.6.0
>            Reporter: Jeffrey Shmain
>            Priority: Major
>              Labels: bulk-closed
>
> I compared runs of an identical application on Spark 1.5 and Spark 1.6 and noticed that jobs became slower on 1.6. On closer inspection, about 75% of tasks finished in the same time or faster, while 25% had significant delays (unrelated to data skew or GC).
> After more debugging, I noticed that the executors block for a few seconds (sometimes as long as 25) on this call:
> https://github.com/apache/spark/blob/39e2bad6a866d27c3ca594d15e574a1da3ee84cc/core/src/main/scala/org/apache/spark/MapOutputTracker.scala#L199
>         logInfo("Doing the fetch; tracker endpoint = " + trackerEndpoint)
>         // This try-finally prevents hangs due to timeouts:
>         try {
>           val fetchedBytes = askTracker[Array[Byte]](GetMapOutputStatuses(shuffleId))
>           fetchedStatuses = MapOutputTracker.deserializeMapStatuses(fetchedBytes)
>           logInfo("Got the output locations")
> So the regression seems to be related to changing the default RPC implementation from Akka to Netty.
> The application works with RDDs and submits 10 concurrent queries at a time.
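
One way to confirm how long a blocking call such as the fetch quoted above takes is to wrap it in a small timing helper. The following is only a minimal sketch, not code from Spark: the names TimedCall and timed are hypothetical, and Thread.sleep stands in for the real askTracker call, which is not reproduced here.

```scala
object TimedCall {
  /** Runs `body` and returns its result together with the elapsed
    * wall-clock time in milliseconds. Hypothetical helper, not part of Spark. */
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    val elapsedMs = (System.nanoTime() - start) / 1000000L
    (result, elapsedMs)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: Thread.sleep is a placeholder for the blocking RPC call.
    val (_, ms) = timed { Thread.sleep(50) }
    println(s"blocking call took $ms ms")
  }
}
```

Logging the elapsed time per task in this way would make it possible to compare the distribution of wait times between the two Spark versions.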


