Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/11/05 12:51:56 UTC

[GitHub] [spark] HeartSaVioR commented on pull request #30139: [SPARK-31069][CORE] high cpu caused by chunksBeingTransferred in external shuffle service

HeartSaVioR commented on pull request #30139:
URL: https://github.com/apache/spark/pull/30139#issuecomment-722358480


   The root issue here is that `numChunksBeingTransferred` is called quite often.
   
   If I understand correctly, the per-stream counter is only incremented or decremented via `processFetchRequest` and `processStreamRequest`, each of which also calls `chunksBeingTransferred` beforehand. That means a shared counter would see only about 2x as many operations as there are calls to `chunksBeingTransferred`, while the cost per operation differs significantly: incrementing or decrementing one atomic integer versus iterating over all stream entries, reading each atomic integer, and summing them up. The cost of the latter grows linearly with the number of entries in `streams`, so the problem appears when that number is very large.
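
To make the cost difference concrete, here is a minimal sketch of the two ways of counting. `ChunkCounterSketch` and its method names are hypothetical stand-ins for illustration, not the actual stream manager code in Spark.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a stream manager; not Spark's actual class.
class ChunkCounterSketch {
  // One in-flight chunk counter per stream, keyed by stream id.
  private final Map<Long, AtomicLong> streams = new ConcurrentHashMap<>();
  // Single shared counter kept in step with the per-stream counters.
  private final AtomicLong total = new AtomicLong();

  void registerStream(long streamId) {
    streams.putIfAbsent(streamId, new AtomicLong());
  }

  // Called when a chunk starts being transferred.
  void chunkBeingSent(long streamId) {
    AtomicLong counter = streams.get(streamId);
    if (counter != null) {
      counter.incrementAndGet();
      total.incrementAndGet();
    }
  }

  // Called when a chunk transfer has finished.
  void chunkSent(long streamId) {
    AtomicLong counter = streams.get(streamId);
    if (counter != null) {
      counter.decrementAndGet();
      total.decrementAndGet();
    }
  }

  // O(number of streams): visits every entry on each call.
  long sumOverStreams() {
    long sum = 0L;
    for (AtomicLong counter : streams.values()) {
      sum += counter.get();
    }
    return sum;
  }

  // O(1): a single atomic read, at the price of one extra atomic update per chunk event.
  long aggregate() {
    return total.get();
  }
}
```

Both reads return the same number when no updates are in flight; the shared counter simply moves the work from every read to every increment and decrement.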
   
   So this sounds like a trade-off. Leaving it as it is would probably be beneficial if we assume the number of stream entries stays small enough, but if we assume the number of stream entries can be quite large, the cost of synchronizing numChunksBeingTransferred is going to be smaller.
   
   A possible alternative would be to compute numChunksBeingTransferred less often: use a cached value and refresh it on a condition such as a rate (once every 5 calls) or an interval (refresh if the cached value was calculated more than XX seconds ago). We agreed that the value of numChunksBeingTransferred doesn't need to be strictly accurate, so this might be acceptable.
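
A rough sketch of the interval-based variant could look like the following. The class `CachedChunkCount`, its constructor parameters, and the injected clock are all assumptions made for illustration; the expensive computation is passed in as a supplier rather than tied to any real Spark method.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical helper: caches an expensive count and refreshes it at most once per interval.
class CachedChunkCount {
  private final LongSupplier expensiveCount; // e.g. the sum over all streams
  private final LongSupplier nanoClock;      // injected so the interval logic is testable
  private final long maxAgeNanos;
  private final AtomicLong lastRefreshNanos;
  private volatile long cachedValue;

  CachedChunkCount(LongSupplier expensiveCount, LongSupplier nanoClock, long maxAgeNanos) {
    this.expensiveCount = expensiveCount;
    this.nanoClock = nanoClock;
    this.maxAgeNanos = maxAgeNanos;
    this.cachedValue = expensiveCount.getAsLong();
    this.lastRefreshNanos = new AtomicLong(nanoClock.getAsLong());
  }

  long get() {
    long now = nanoClock.getAsLong();
    long last = lastRefreshNanos.get();
    // The compare-and-set lets only one caller recompute once the value is stale;
    // other callers return the previous (slightly outdated) value.
    if (now - last >= maxAgeNanos && lastRefreshNanos.compareAndSet(last, now)) {
      cachedValue = expensiveCount.getAsLong();
    }
    return cachedValue;
  }
}
```

In real use the clock would presumably be `System::nanoTime`. Callers may see a value up to one interval old, which fits the point above that the count does not need to be strictly accurate.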
   
   I would like to hear what others think. Thanks!

