Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/07/23 21:34:09 UTC

[GitHub] [spark] vanzin opened a new pull request #25236: [SPARK-28487][k8s] More responsive dynamic allocation with K8S.

URL: https://github.com/apache/spark/pull/25236
 
 
   This PR makes a few changes to the k8s pod allocator so that it
   behaves better when dynamic allocation is on.
   
   (i) Allow the application to ramp up immediately when there's a
   change in the target number of executors. Without this change,
   scaling would only trigger when a change happened in the state of
   the cluster, e.g. an executor going down, or when the periodic
   snapshot was taken (default every 30s).
   
   (ii) Get rid of pending pod requests, both acknowledged (i.e. Spark
   knows that a pod is pending resource allocation) and unacknowledged
   (i.e. Spark has requested the pod but the API server hasn't created it
   yet), when they're not needed anymore. This avoids starting those
   executors only to remove them after the idle timeout, wasting
   resources in the meantime.
   
   (iii) Re-work some of the code to avoid unnecessary logging. While not
   bad without dynamic allocation, the existing logging was very chatty
   when dynamic allocation was on. With the changes, all the useful
   information is still there, but only when interesting changes happen.
   
   (iv) Gracefully shut down executors when they become idle. Just deleting
   the pod causes a lot of ugly logs to show up, so it's better to ask pods
   to exit nicely. That also allows Spark to respect the "don't delete
   pods" option when dynamic allocation is on.
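   
   To illustrate points (i) to (iii), here is a minimal toy model in
   Python. It is a sketch under stated assumptions, not the code in this
   PR: the class PodAllocatorSketch and its methods are hypothetical
   names, and cancelling the newest pending requests first is an assumed
   policy. It shows a target change triggering reconciliation right away,
   excess pending requests being dropped, and a log entry being recorded
   only when something actually changes.

```python
from dataclasses import dataclass, field


@dataclass
class PodAllocatorSketch:
    """Toy model of an executor pod allocator (hypothetical, not Spark's class)."""

    target: int = 0
    running: set = field(default_factory=set)
    pending: list = field(default_factory=list)  # oldest request first
    next_id: int = 1
    log: list = field(default_factory=list)

    def set_target(self, target: int) -> None:
        # (i) React to a target change immediately instead of waiting
        # for the next cluster event or periodic snapshot.
        if target != self.target:
            self.target = target
            self.reconcile()

    def reconcile(self) -> None:
        known = len(self.running) + len(self.pending)
        if known < self.target:
            missing = self.target - known
            for _ in range(missing):
                self.pending.append(self.next_id)
                self.next_id += 1
            # (iii) Log only when the allocator actually does something.
            self.log.append(f"requested {missing} pod(s)")
        elif known > self.target and self.pending:
            # (ii) Drop pending requests that are no longer needed.
            # Assumption: the newest requests are cancelled first.
            excess = min(known - self.target, len(self.pending))
            cancelled = [self.pending.pop() for _ in range(excess)]
            self.log.append(f"cancelled pending pod(s) {sorted(cancelled)}")

    def pod_started(self, pod_id: int) -> None:
        # A pending pod has been scheduled and its executor is now running.
        self.pending.remove(pod_id)
        self.running.add(pod_id)


alloc = PodAllocatorSketch()
alloc.set_target(4)   # four pod requests are issued right away
alloc.pod_started(1)  # one executor comes up, three stay pending
alloc.set_target(2)   # two pending requests are cancelled
alloc.set_target(2)   # no change, so nothing happens and nothing is logged
```

   Running executors beyond the target are deliberately left alone in
   this sketch; in the description above they are handled by the idle
   timeout and the graceful shutdown in (iv).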
   
   Tested on a small k8s cluster running different TPC-DS workloads.
