You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/12/04 15:12:03 UTC

[GitHub] [spark] tedyu opened a new pull request, #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config

tedyu opened a new pull request, #38902:
URL: https://github.com/apache/spark/pull/38902

   ### What changes were proposed in this pull request?
   This PR adjusts graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config `hadoop.service.shutdown.timeout`.
   
   ### Why are the changes needed?
   This is a follow on for commit 9ab8ba73a384b207bd7bbea5db40eeccdc76ffa4, `Shorten graceful shutdown time of ExecutorPodsSnapshotsStoreImpl to prevent blocking shutdown process`.
   
   Since the value of `hadoop.service.shutdown.timeout` may be different from the default value, we should adjust the value for `KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD` accordingly.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Existing test suite.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tedyu commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config

Posted by GitBox <gi...@apache.org>.
tedyu commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1341283614

   bq. when they increases 20s to hadoop.service.shutdown.timeout
   
   Here is my proposed change:
   ```
           if (hadoopTimeout-GRACE_PERIOD_FOR_HADOOP <= awaitSeconds) {
               awaitSeconds = hadoopTimeout-GRACE_PERIOD_FOR_HADOOP
   ```
   where `GRACE_PERIOD_FOR_HADOOP` is 8. This means the user cannot increase beyond 22s (for the default value of `hadoop.service.shutdown.timeout`).
   
   @dongjoon-hyun 
   Can you take a closer look at this PR since @pan3793 doesn't object ?
   
   Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tedyu closed pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config

Posted by GitBox <gi...@apache.org>.
tedyu closed pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config
URL: https://github.com/apache/spark/pull/38902


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tedyu commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config

Posted by GitBox <gi...@apache.org>.
tedyu commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1336436558

   cc @holdenk @pan3793 @dongjoon-hyun 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tedyu commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config

Posted by GitBox <gi...@apache.org>.
tedyu commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1338693017

   Reading @pan3793 's comment again:
   ```
   it's mostly because listeners cost much time during the shutdown phase, automatically increasing KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD is not what I want.
   ```
   With higher `hadoop.service.shutdown.timeout` value (than the default), `KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD` should be higher as well (to accommodate the listeners).
   
   @pan3793 
   Can you elaborate on your last sentence ?
   
   Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] pan3793 commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config

Posted by GitBox <gi...@apache.org>.
pan3793 commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1338013789

   Usually, it's a good idea to make the default value of configurations adaptive, but I'm not sure about this one. (choosing the 20s as the default value is because I don't want to change the behavior a lot from the original 30s, actually, I set this value to 10s in our internal).
   
   In our practice, as all shutdown procedures in SparkShutdownHookManager shares `hadoop.service.shutdown.timeout`, the procedures are usually blocked by custom listeners(they are provided by different teams or spark users, and some of them may do a final flush when accepting `SparkListenerApplicationEnd` that may cost dozens of seconds or even minutes), we recommend all of them to set a timeout to 10s or shorter, so they can works fine w/ default 30s timeout limits in most case. So in my case, if I enlarge `hadoop.service.shutdown.timeout`, it's mostly because listeners cost much time during the shutdown phase, automatically increasing `KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD` is not what I want.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tedyu closed pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config

Posted by GitBox <gi...@apache.org>.
tedyu closed pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config
URL: https://github.com/apache/spark/pull/38902


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tedyu commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config

Posted by GitBox <gi...@apache.org>.
tedyu commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1337558157

   We can define `KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD` as the percentage based on the value of `hadoop.service.shutdown.timeout`. In @pan3793 's PR, the percentage is 30%.
   This way, when the value of `hadoop.service.shutdown.timeout` is bigger than 30s, we can utilize the bigger timeout.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tedyu commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config

Posted by GitBox <gi...@apache.org>.
tedyu commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1338070280

   @pan3793 
   Thanks for sharing the background.
   The current formation of this PR goes along with your change.
   
   If `hadoop.service.shutdown.timeout` is lowered, we want to use smaller value for `KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD` config so that the intention for the shutdown timeout is satisfied.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] pan3793 commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config

Posted by GitBox <gi...@apache.org>.
pan3793 commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1341265075

   @tedyu sorry if my words is ambiguous, the "listener" means `org.apache.spark.scheduler.SparkLisenter`, not the subscribers of "kubernetes executor snapshots".
   
   During shutdown process, they share the "hadoop.service.shutdown.timeout", increasing KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD means decreasing grace period of `listenerBus.stop()`.
   
   https://github.com/apache/spark/blob/1d2159f888139e65c80db2003d521b0f684df83a/core/src/main/scala/org/apache/spark/SparkContext.scala#L2125-L2136
   
   For our cases, the authors of listener may not know much about the Spark internal, when they increases 20s to `hadoop.service.shutdown.timeout`, they may expect that the listener will get another 20s grace period.
   
   To be clear, I am not oppose to your idea, just left my concerns.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] tedyu commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config

Posted by GitBox <gi...@apache.org>.
tedyu commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1336495422

   Sure.
   Can I gather some feedback before creating the JIRA ?
   
   e.g. One option is to redefine `KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD` to have the meaning of `GRACE_PERIOD_FOR_HADOOP` shown in this PR (assuming `KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD` hasn't appeared in any Spark release yet).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1336494905

   Could you use a new JIRA ID? The proposal has a little different goal and sideeffect.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org