Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/10/14 20:10:42 UTC

[GitHub] [spark] squito commented on issue #25577: [WIP][CORE][SPARK-28867] InMemoryStore checkpoint to speed up replay log file in HistoryServer

URL: https://github.com/apache/spark/pull/25577#issuecomment-541894190
 
 
   On taking the snapshot in the driver -- I have nothing against that, if we can get it to work, but I wouldn't ignore making this work in the SHS.  (a) The driver is already managing a lot -- it's a single point of failure for Spark applications, it can easily get overwhelmed with other things, and all sorts of things will go wrong if we take too long to make that snapshot.  (b) While I think @zsxwing has a good idea on how to make it work, note that it's a lot more complicated than doing it in the SHS; as @HeartSaVioR has already pointed out, the same concerns do not apply there.
   
   Of course, the SHS also easily gets overwhelmed -- but there are other things we can do to improve that, without putting pressure on the driver.  You could create multiple instances of the SHS (with a master serving the application listing, but sharding the individual app UIs among the slaves); you could generate the "pre-parsed" version in a standalone process which doesn't even serve a UI at all; you could enable faster re-parsing for the UI while still leveraging in-memory state, which is much simpler (SPARK-20656).
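
   To make the sharding option concrete, here is a minimal sketch of how a front-end instance might route an application's UI to one of several SHS instances. It is only an illustration under stated assumptions: the function names, the shard hostnames, and the choice of a SHA-256 based hash are all hypothetical and are not part of Spark or of this PR; the `/history/<appId>/` path is assumed to match the SHS UI layout.

```python
import hashlib
from typing import Sequence

# Hypothetical SHS instances that each serve a subset of the app UIs.
SHARD_URLS = ["http://shs-0:18080", "http://shs-1:18080", "http://shs-2:18080"]


def shard_for_app(app_id: str, num_shards: int) -> int:
    """Map an application ID to a shard index using a stable hash.

    A cryptographic digest is used (rather than Python's built-in hash())
    so the mapping stays the same across processes and restarts.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(app_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(app_id: str, shard_urls: Sequence[str]) -> str:
    """Return the URL of the (hypothetical) instance serving this app's UI."""
    base = shard_urls[shard_for_app(app_id, len(shard_urls))]
    # Assumed UI path layout; adjust if the deployment differs.
    return f"{base}/history/{app_id}/"
```

   With this kind of routing, the instance serving the application listing would only need to redirect, e.g. `route("app-20191014201042-0001", SHARD_URLS)`, and each shard would replay and cache only the event logs for its own applications. Note that plain modulo hashing remaps most apps when the number of shards changes; a consistent-hashing scheme would reduce that, at the cost of more complexity.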
