Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/10/16 14:04:15 UTC
[GitHub] [spark] Ngone51 commented on issue #25577: [WIP][CORE][SPARK-28867] InMemoryStore checkpoint to speed up replay log file in HistoryServer

Ngone51 commented on issue #25577: [WIP][CORE][SPARK-28867] InMemoryStore checkpoint to speed up replay log file in HistoryServer
URL: https://github.com/apache/spark/pull/25577#issuecomment-542717490
 
 
   > As long as you have the SHS configured to use local disk, after it parses the logs once, it'll just read the leveldb kvstore which will be very fast.
   
   Yeah, but this has a prerequisite: the SHS must be running the whole time those in-progress applications are running. If the SHS is not running (e.g. not started, or crashed) while the applications are in progress, we end up with completed event log files, and an SHS started later has to parse those completed event log files from start to end.
   
   To be honest, I think this is a rare case, since a user who really wants to use the SHS would usually have it running alongside the in-progress applications. But whenever the SHS shuts down, or the user directly provides completed event log files from somewhere else, the slow replay problem still exists.
   
   And I agree that, with SPARK-28594, the first parse (or say, replay) in the SHS will be much better than it is now, since currently we always parse the files from start to end whenever an in-progress file is updated. Of course, since SPARK-28594 only supports multiple rolled files so far, we should either extend it or start a new task to optimize the single-file case later.
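
To make the cost of "parse from start to end on every update" concrete, here is a toy count of how many events get parsed in total under each strategy. The update sizes are made up for illustration, and this is not a model of what SPARK-28594 actually implements.

```python
def events_parsed_full_replay(update_sizes):
    """Total events parsed if every update re-reads the whole file."""
    total = 0
    file_len = 0
    for appended in update_sizes:
        file_len += appended   # the file grows by this many events
        total += file_len      # and is then re-parsed from the start
    return total


def events_parsed_incremental(update_sizes):
    """Total events parsed if each update only reads the appended events."""
    return sum(update_sizes)


# Hypothetical workload: 10 updates of 1000 events each.
updates = [1000] * 10
print(events_parsed_full_replay(updates))   # 55000
print(events_parsed_incremental(updates))   # 10000
```

Full replay grows quadratically with the number of updates, while the incremental count stays equal to the file length.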
   
   All in all, if we could ignore that rare case, SPARK-28867 wouldn't really be necessary once SPARK-28594 is done.
   
   > Is your goal to avoid having the SHS even parse the file one time?
   
   If an application finished normally, then we could avoid parsing the file. If not (e.g. it crashed), we'd need to do an incremental parse based on the last snapshot taken before the crash.
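
A minimal sketch of those two paths, only to illustrate the idea. `Snapshot`, `apply_event` and `load_app_state` are hypothetical names, the event log is modelled as a plain list of made-up event labels, and none of this is the PR's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    # Hypothetical snapshot: the store contents plus how many log events
    # had already been applied when the snapshot was taken.
    state: dict = field(default_factory=dict)
    events_applied: int = 0
    app_finished: bool = False


def apply_event(state, event):
    # Stand-in for a listener updating the store: count events by label.
    state[event] = state.get(event, 0) + 1


def load_app_state(snapshot, event_log):
    """Return (state, number_of_events_parsed)."""
    state = dict(snapshot.state)
    if snapshot.app_finished:
        # Normal finish: the snapshot is complete, so nothing is parsed.
        return state, 0
    # Crash: replay only the events the snapshot has not seen yet.
    remaining = event_log[snapshot.events_applied:]
    for event in remaining:
        apply_event(state, event)
    return state, len(remaining)


log = ["JobStart", "TaskEnd", "TaskEnd", "JobEnd"]
crashed = Snapshot(state={"JobStart": 1, "TaskEnd": 1}, events_applied=2)
state, parsed = load_app_state(crashed, log)
print(state, parsed)
```

In the crashed case only the last two events are parsed. With `app_finished=True` the function returns the snapshot's state and parses nothing.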
   
   >  If you really wanted to do that, I'd have the driver just write out the leveldb kvstore when the application terminates. 
   
   Actually, this was the original plan when we started on this. But, as @gengliangwang pointed out, the application may crash unexpectedly and the LevelDB store could be corrupted. So we came up with the current plan, which snapshots periodically and does an incremental parse if the application crashes.
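
For illustration, here is one common way to take periodic snapshots so that a crash in the middle of a write cannot corrupt the last good one: write to a temporary file, then rename it over the old snapshot. This is an assumption about how such a scheme could look, not a description of the PR. The JSON layout, the `every` parameter and all function names are hypothetical.

```python
import json
import os
import tempfile


def write_snapshot(path, state, events_applied):
    """Write the snapshot to a temp file, then atomically replace the old one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump({"state": state, "events_applied": events_applied}, f)
            f.flush()
            os.fsync(f.fileno())
        # A crash before this line leaves the previous snapshot untouched.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def read_snapshot(path):
    """Return the last complete snapshot, or an empty one if none exists."""
    if not os.path.exists(path):
        return {"state": {}, "events_applied": 0}
    with open(path) as f:
        return json.load(f)


def run_with_snapshots(events, path, every=2):
    """Apply events and snapshot after every `every` events."""
    state = {}
    for i, event in enumerate(events, start=1):
        state[event] = state.get(event, 0) + 1
        if i % every == 0:
            write_snapshot(path, state, i)
    return state


with tempfile.TemporaryDirectory() as d:
    snap_path = os.path.join(d, "app.snapshot")
    run_with_snapshots(["JobStart", "TaskEnd", "TaskEnd", "TaskEnd", "JobEnd"], snap_path)
    print(read_snapshot(snap_path))
```

With five events and a snapshot every two events, the last snapshot covers the first four. A reader that finds the application did not finish would resume from `events_applied`.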
   
   > the pr description seems to mostly focus on in-progress, so I'm surprised you're saying this is primarily for complete applications.
   
   Sorry for the confusion. As mentioned above, if an application finished normally, we don't need to parse the file, so that path is simpler. But if the application crashes, we need to do an incremental parse, which requires more work (e.g. recovering live entities, deciding where to continue from). So I put more effort into explaining how we handle in-progress applications.
   
