Posted to reviews@spark.apache.org by jose-torres <gi...@git.apache.org> on 2018/05/17 14:50:45 UTC

[GitHub] spark issue #21353: [SPARK-24304][SS] Scheduler changes for continuous proce...

Github user jose-torres commented on the issue:

    https://github.com/apache/spark/pull/21353
  
    As I've mentioned elsewhere, stages are currently submitted sequentially. That is, for a stage X, all the stage dependencies of X are completed before the tasks within X start. This change proposes to violate that invariant, and it's not obvious that this is a safe approach. The questions we need to answer are:
    * How can we validate that this change is indeed safe, and will not break the scheduler or things dependent on it in subtle ways?
    * What benefits do we gain by taking on the risk of a scheduler change, rather than handling continuous shuffles entirely at the RDD layer?
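
    To make the invariant concrete, here is a toy sketch of the property described above. It is not Spark's actual scheduler code: the function name, the event format, and the stage names are all hypothetical, chosen only for illustration. It checks whether a sequence of stage start/finish events respects "all parents of X finish before X starts".

```python
from typing import Dict, List, Tuple

# Hypothetical event format: ("start", stage) or ("finish", stage).
Event = Tuple[str, str]


def respects_invariant(deps: Dict[str, List[str]], events: List[Event]) -> bool:
    """Return True if every stage starts only after all of its parent stages finished.

    `deps` maps a stage name to the names of the stages it depends on.
    This is a toy model of the invariant, not Spark's scheduler logic.
    """
    finished = set()
    for kind, stage in events:
        if kind == "start":
            # The invariant: no parent of this stage may still be unfinished.
            if any(parent not in finished for parent in deps.get(stage, [])):
                return False
        elif kind == "finish":
            finished.add(stage)
    return True


# A two-stage job: "reduce" depends on "map" (illustrative names).
deps = {"map": [], "reduce": ["map"]}

# Sequential submission: "map" finishes before "reduce" starts.
sequential = [("start", "map"), ("finish", "map"),
              ("start", "reduce"), ("finish", "reduce")]

# Concurrent submission, as a continuous shuffle would need:
# "reduce" starts while "map" is still running.
concurrent = [("start", "map"), ("start", "reduce"),
              ("finish", "map"), ("finish", "reduce")]

print(respects_invariant(deps, sequential))
print(respects_invariant(deps, concurrent))
```

    In this model the sequential schedule satisfies the check and the concurrent one does not, which is the property any code relying on the current behavior may be assuming.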

