You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/12/04 15:12:03 UTC
[GitHub] [spark] tedyu opened a new pull request, #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config
tedyu opened a new pull request, #38902:
URL: https://github.com/apache/spark/pull/38902
### What changes were proposed in this pull request?
This PR adjusts graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config `hadoop.service.shutdown.timeout`.
### Why are the changes needed?
This is a follow on for commit 9ab8ba73a384b207bd7bbea5db40eeccdc76ffa4, `Shorten graceful shutdown time of ExecutorPodsSnapshotsStoreImpl to prevent blocking shutdown process`.
Since the value of `hadoop.service.shutdown.timeout` may be different from the default value, we should adjust the value for `KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD` accordingly.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Existing test suite.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] tedyu commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config
Posted by GitBox <gi...@apache.org>.
tedyu commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1341283614
bq. when they increases 20s to hadoop.service.shutdown.timeout
Here is my proposed change:
```
if (hadoopTimeout-GRACE_PERIOD_FOR_HADOOP <= awaitSeconds) {
awaitSeconds = hadoopTimeout-GRACE_PERIOD_FOR_HADOOP
```
where `GRACE_PERIOD_FOR_HADOOP` is 8. This means the user cannot increase beyond 22s (for the default value of `hadoop.service.shutdown.timeout`).
@dongjoon-hyun
Can you take a closer look at this PR since @pan3793 doesn't object ?
Thanks
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] tedyu closed pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config
Posted by GitBox <gi...@apache.org>.
tedyu closed pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config
URL: https://github.com/apache/spark/pull/38902
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] tedyu commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config
Posted by GitBox <gi...@apache.org>.
tedyu commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1336436558
cc @holdenk @pan3793 @dongjoon-hyun
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] tedyu commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config
Posted by GitBox <gi...@apache.org>.
tedyu commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1338693017
Reading @pan3793 's comment again:
```
it's mostly because listeners cost much time during the shutdown phase, automatically increasing KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD is not what I want.
```
With higher `hadoop.service.shutdown.timeout` value (than the default), `KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD` should be higher as well (to accommodate the listeners).
@pan3793
Can you elaborate on your last sentence ?
Thanks
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] pan3793 commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config
Posted by GitBox <gi...@apache.org>.
pan3793 commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1338013789
Usually, it's a good idea to make the default value of configurations adaptive, but I'm not sure about this one. (choosing the 20s as the default value is because I don't want to change the behavior a lot from the original 30s, actually, I set this value to 10s in our internal).
In our practice, as all shutdown procedures in SparkShutdownHookManager shares `hadoop.service.shutdown.timeout`, the procedures are usually blocked by custom listeners(they are provided by different teams or spark users, and some of them may do a final flush when accepting `SparkListenerApplicationEnd` that may cost dozens of seconds or even minutes), we recommend all of them to set a timeout to 10s or shorter, so they can works fine w/ default 30s timeout limits in most case. So in my case, if I enlarge `hadoop.service.shutdown.timeout`, it's mostly because listeners cost much time during the shutdown phase, automatically increasing `KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD` is not what I want.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] tedyu closed pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config
Posted by GitBox <gi...@apache.org>.
tedyu closed pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config
URL: https://github.com/apache/spark/pull/38902
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] tedyu commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config
Posted by GitBox <gi...@apache.org>.
tedyu commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1337558157
We can define `KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD` as the percentage based on the value of `hadoop.service.shutdown.timeout`. In @pan3793 's PR, the percentage is 30%.
This way, when the value of `hadoop.service.shutdown.timeout` is bigger than 30s, we can utilize the bigger timeout.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] tedyu commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config
Posted by GitBox <gi...@apache.org>.
tedyu commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1338070280
@pan3793
Thanks for sharing the background.
The current formation of this PR goes along with your change.
If `hadoop.service.shutdown.timeout` is lowered, we want to use smaller value for `KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD` config so that the intention for the shutdown timeout is satisfied.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] pan3793 commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config
Posted by GitBox <gi...@apache.org>.
pan3793 commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1341265075
@tedyu sorry if my words is ambiguous, the "listener" means `org.apache.spark.scheduler.SparkLisenter`, not the subscribers of "kubernetes executor snapshots".
During shutdown process, they share the "hadoop.service.shutdown.timeout", increasing KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD means decreasing grace period of `listenerBus.stop()`.
https://github.com/apache/spark/blob/1d2159f888139e65c80db2003d521b0f684df83a/core/src/main/scala/org/apache/spark/SparkContext.scala#L2125-L2136
For our cases, the authors of listener may not know much about the Spark internal, when they increases 20s to `hadoop.service.shutdown.timeout`, they may expect that the listener will get another 20s grace period.
To be clear, I am not oppose to your idea, just left my concerns.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] tedyu commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config
Posted by GitBox <gi...@apache.org>.
tedyu commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1336495422
Sure.
Can I gather some feedback before creating the JIRA ?
e.g. One option is to redefine `KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD` to have the meaning of `GRACE_PERIOD_FOR_HADOOP` shown in this PR (assuming `KUBERNETES_EXECUTOR_SNAPSHOTS_SUBSCRIBERS_GRACE_PERIOD` hasn't appeared in any Spark release yet).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] dongjoon-hyun commented on pull request #38902: [SPARK-41136][K8S][FOLLOW-ON] Adjust graceful shutdown time of ExecutorPodsSnapshotsStoreImpl according to hadoop config
Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #38902:
URL: https://github.com/apache/spark/pull/38902#issuecomment-1336494905
Could you use a new JIRA ID? The proposal has a little different goal and sideeffect.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org