You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "TakawaAkirayo (via GitHub)" <gi...@apache.org> on 2024/03/03 05:06:27 UTC

[PR] [SPARK-47253][CORE][3.5] Allow LiveEventBus to stop without the completely draining of event queue [spark]

TakawaAkirayo opened a new pull request, #45364:
URL: https://github.com/apache/spark/pull/45364

   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
     2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
     4. Be sure to keep the PR description updated to reflect all changes.
     5. Please write your PR title to summarize what this PR proposes.
     6. If possible, provide a concise example to reproduce the issue for a faster review.
     7. If you want to add a new configuration, please read the guideline first for naming configurations in
        'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
     8. If you want to add or modify an error type or message, please read the guideline first in
        'common/utils/src/main/resources/error/README.md'.
   -->
   
   ### What changes were proposed in this pull request?
   Add config spark.scheduler.listenerbus.eventqueue.waitForEventDispatchExitOnStop(default true).
   Before this PR: The event queue will wait for the event to drain completely on stop.
   After this PR: Allow user to control this behavior(wait for event completely drained or not) by spark config.
   
   
   ### Why are the changes needed?
   ####Problem statement:
   The SparkContext.stop() hung a long time on LiveEventBus.stop() when listeners slow
   
   ####User scenarios:
   We have a centralized service with multiple instances to regularly execute user's scheduled tasks.
   For each user task within one service instance, the process is as follows:
   
   1.Create a Spark session directly within the service process with an account defined in the task.
   2.Instantiate listeners by class names and register them with the SparkContext. The JARs containing the listener classes are uploaded to the service by the user.
   3.Prepare resources.
   4.Run user logic (Spark SQL).
   5.Stop the Spark session by invoking SparkSession.stop().
   
   In step 5, it will wait for the LiveEventBus to stop, which requires the remaining events to be completely drained by each listener.
   
   Since the listener is implemented by users and we cannot prevent some heavy stuffs within the listener on each event, there are cases where a single heavy job has over 30,000 tasks,
   and it could take 30 minutes for the listener to process all the remaining events, because within the listener, it requires a coarse-grained global lock and update the internal status to the remote database.
   
   This kind of delay affects other user tasks in the queue. Therefore, from the server side perspective, we need the guarantee that the stop operation finishes quickly.
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   Add config spark.scheduler.listenerbus.eventqueue.waitForEventDispatchExitOnStop.
   Default is true, it will wait for the event to drain completely. If set to false, the LivenEventBus will stop without waitting for the event to drain on each queue.
   
   
   ### How was this patch tested?
   By UT and verified the feature in out production environment
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47253][CORE][3.5] Allow LiveEventBus to stop without the completely draining of event queue [spark]

Posted by "TakawaAkirayo (via GitHub)" <gi...@apache.org>.
TakawaAkirayo closed pull request #45364: [SPARK-47253][CORE][3.5] Allow LiveEventBus to stop without the completely draining of event queue
URL: https://github.com/apache/spark/pull/45364


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org