You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by Victsm <gi...@git.apache.org> on 2018/05/22 21:56:09 UTC

[GitHub] spark pull request #21402: SPARK-24335 Spark external shuffle server improve...

GitHub user Victsm opened a pull request:

    https://github.com/apache/spark/pull/21402

    SPARK-24335 Spark external shuffle server improvement to better handle block fetch requests.

    
    
    ## What changes were proposed in this pull request?
    
    Right now, the default server side netty handler threads is 2 * # cores, and can be further configured with parameter spark.shuffle.io.serverThreads.
    In order to process a client request, it would require one available server netty handler thread.
    However, when the server netty handler threads start to process ChunkFetchRequests, they will be blocked on disk I/O, mostly due to disk contentions from the random read operations initiated by all the ChunkFetchRequests received from clients.
    As a result, when the shuffle server is serving many concurrent ChunkFetchRequests, the server side netty handler threads could all be blocked on reading shuffle files, thus leaving no handler thread available to process other types of requests which should all be very quick to process.
    
    This issue could potentially be fixed by limiting the number of netty handler threads that could get blocked when processing ChunkFetchRequest. We have a patch to do this by using a separate EventLoopGroup with a dedicated ChannelHandler to process ChunkFetchRequest. This enables shuffle server to reserve netty handler threads for non-ChunkFetchRequest, thus enabling consistent processing time for these requests which are fast to process. After deploying the patch in our infrastructure, we no longer see timeout issues with either executor registration with local shuffle server or shuffle client establishing connection with remote shuffle server.
    
    ## How was this patch tested?
    
    Added unit test for this patch.
    In addition, we built an internal tool to stress test Spark shuffle service and have observed significant improvement after applying this patch.
    
    Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Victsm/spark SPARK-24355

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21402.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21402
    
----
commit d862a2dec30f1fefe03d26e0b8a07d10eac7dbf5
Author: Min Shen <ms...@...>
Date:   2017-09-11T03:59:25Z

    SPARK-24335 Spark external shuffle server improvement to better handle block fetch requests.

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91593/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    > StreamRequest will not block the server netty handler thread.
    
    Hmm, I'm not so sure that's accurate. I think the main difference is that I don't think there is currently any code path that sends a `StreamRequest` to the shuffle service. But for example if many executors request files from the driver simultaneously, you could potentially end up in the same situation. It's a less serious issue since I think it's a lot less common for large files to be transferred that way, at least after startup.
    
    I took a look at the code and it seems the actual change that avoids the disk thrashing is the synchronization done in the chunk fetch handler; it only allows a certain number of threads to actually do disk reads simultaneously. That's an improvement already, but a couple of questions popped in my head when I read your comment:
    
    - how does that relate to maxChunksBeingTransferred()? Aren't both settings effectively a limit on the number of requests being serviced, making the existing one a little redundant?
    
    - would there be benefits by trying to add some sort of disk affinity to these threads? e.g. send fetch requests hitting different disks to different queues.



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    shall we close it since #22173 is merged?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    @redsanket you should close it by yourself.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    ok to test


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    while we are here, could we also add or at least propose some metrics around this, such as number of open block failure, or even number of block threads?
    
    we have suffer a lot from the lack of visibility


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by Victsm <gi...@git.apache.org>.
Github user Victsm commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    @felixcheung 
    Adding these metrics are indeed things we have been working on recently.
    I'd prefer to propose it in a separate ticket.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24335 Spark external shuffle server improvement to...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91444/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by Victsm <gi...@git.apache.org>.
Github user Victsm commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    Using a percentage to configure the number of threads to handle chunk fetch requests does make sense. Will update the PR for this change.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    I think we should still respect `spark.shuffle.io.serverThreads`, can we set a percentage of the server threads as the upper bound to handle chunk fetch requests?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    what's the strategy here? if the number of block fetch requests has reached the limitation, shall we fail following block fetch requests immediately?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    I'm fine of adding metrics in another PR, please add a TODO in the code comment.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by redsanket <gi...@git.apache.org>.
Github user redsanket commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    @cloud-fan yes we can close this


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    cc @zsxwing @jiangxb1987 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    **[Test build #91444 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91444/testReport)** for PR 21402 at commit [`d862a2d`](https://github.com/apache/spark/commit/d862a2dec30f1fefe03d26e0b8a07d10eac7dbf5).
     * This patch **fails RAT tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `public class ChunkFetchRequestHandler extends SimpleChannelInboundHandler<ChunkFetchRequest> `
      * `public class TransportChannelHandler extends SimpleChannelInboundHandler<Message> `


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    **[Test build #91593 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91593/testReport)** for PR 21402 at commit [`d862a2d`](https://github.com/apache/spark/commit/d862a2dec30f1fefe03d26e0b8a07d10eac7dbf5).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    **[Test build #91593 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91593/testReport)** for PR 21402 at commit [`d862a2d`](https://github.com/apache/spark/commit/d862a2dec30f1fefe03d26e0b8a07d10eac7dbf5).
     * This patch **fails RAT tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `public class ChunkFetchRequestHandler extends SimpleChannelInboundHandler<ChunkFetchRequest> `
      * `public class TransportChannelHandler extends SimpleChannelInboundHandler<Message> `


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by redsanket <gi...@git.apache.org>.
Github user redsanket commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    @Victsm @vanzin i want to get this going, is it better if I have a PR up with the requested changes and concerns?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by Victsm <gi...@git.apache.org>.
Github user Victsm commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    `StreamRequest` will not block the server netty handler thread. Only `ChunkFetchRequest` does, as this request is the one that actually triggers seeking the disk to return the shuffle block to the remote shuffle client, and could block server nettty handler threads due to the concurrent random reads if the shuffle files are written to limited number of hard disk drives.
    
    The strategy in this patch is to use a separate `EventLoopGroup` associated with a dedicated Netty channel handler to handle `ChunkFetchRequest`. The config introduced in this PR, `spark.shuffle.server.chunkFetchHandlerThreads`, specifies the number of threads this dedicated `EventLoopGroup` has. Assume shuffle server is configured with `spark.shuffle.io.serverThreads = 45` and `spark.shuffle.server.chunkFetchHandlerThreads = 40`. This will create 2 separate Netty thread pools in shuffle server, one with 45 threads (main thread pool) and one with 40 threads (chunk fetch thread pool). All request except `ChunkFetchRequest` will be processed by one of the threads in the main thread pool. When a `ChunkFetchRequest` comes in, it will be processed by one of the threads in the chunk fetch thread pool.
    
    Processing the `ChunkFetchRequest` itself does not block a Netty thread, it's writing the response in the underlying `Channel` that blocks a Netty thread, since shuffle files are being read here. Although `ChunkFetchRequest` is handled in the separate chunk fetch thread pool, writing response on the underlying `Channel` will always be processed by the thread (`EventLoop`) that the `Channel` is registered with. Since the `Channel` is always registered with a `EventLoop` (thread) from the main thread pool, so the blocking I/O operation blocks a thread in the main thread pool while the `ChunkFetchRequest` is processed by a thread in the chunk fetch thread pool. These 2 threads are like a pair of producer and consumer, while the thread in the chunk fetch thread pool produces a blocking I/O task and the the thread in the main thread pool consumes this task and gets blocked.
    
    In this PR, the `ChunkFetchRequestHandler` performs a blocking operation with `channel.writeAndFlush(result).sync()`, thus synchronizing between the producer and the consumer threads. This way, when the `ChunkFetchRequestHandler` processes 40 `ChunkFetchRequest` and blocks 40 threads from the main thread pool, the 40 threads in the chunk fetch thread pool are also blocked. Thus when a new `ChunkFetchRequest` is received by the shuffle server, it has to wait for one of the threads in chunk fetch thread pool to become available before it can be processed. In the meantime, there are still 5 threads in the main thread pool that's not blocked by `ChunkFetchRequest`, and they will be available to handle all other types of requests which are much quicker to process.
    
    If the number of `ChunkFetchRequest` has reached the configured limit, it will not fail following `ChunkFetchRequest` immediately. These requests will just wait for availability of threads in the chunk fetch thread pool. This is the same behavior before this PR. Difference is that these `ChunkFetchRequest` could block all threads in main thread pool, causing both `ChunkFetchRequest` and other types of requests to wait for availability of threads in the main thread pool, thus timing out on executor registration or Sasl bootstrap.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    to confirm, `StreamRequest` will not block the server netty handler thread, right?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    **[Test build #91444 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91444/testReport)** for PR 21402 at commit [`d862a2d`](https://github.com/apache/spark/commit/d862a2dec30f1fefe03d26e0b8a07d10eac7dbf5).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    Another question before I forget about it again: with the current code, if `chunkFetchHandlerThreads()` is equal or greater the size of the main thread pool won't that allow the current bad behavior to still exist?
    
    I think Wenchen suggested that that value should be a percentage of the number of threads for the other pool, which would avoid that problem (as long as the percentage is < 100). But a check that results in an error or at least a noisy warning log could at least help users detect a misconfiguration.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #21402: SPARK-24355 Spark external shuffle server improve...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/21402


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24335 Spark external shuffle server improvement to...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    ok to test


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #21402: SPARK-24355 Spark external shuffle server improvement to...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/21402
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org