You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by vanzin <gi...@git.apache.org> on 2017/10/26 22:37:39 UTC

[GitHub] spark pull request #19582: [SPARK-20644][core] Initial ground work for kvsto...

GitHub user vanzin opened a pull request:

    https://github.com/apache/spark/pull/19582

    [SPARK-20644][core] Initial ground work for kvstore UI backend.

    There are two somewhat unrelated things going on in this patch, but
    both are meant to make integration of individual UI pages later on
    much easier.
    
    The first part is some tweaking of the code in the listener so that
    it does less updates of the kvstore for data that changes fast; for
    example, it avoids writing changes down to the store for every
    task-related event, since those can arrive very quickly at times.
    Instead, for these kinds of events, it chooses to only flush things
    if a certain interval has passed. The interval is based on how often
    the current spark-shell code updates the progress bar for jobs, so
    that users can get reasonably accurate data.
    
    The code also delays as much as possible hitting the underlying kvstore
    when replaying apps in the history server. This is to avoid unnecessary
    writes to disk.
    
    The second set of changes prepare the history server and SparkUI for
    integrating with the kvstore. A new class, AppStatusStore, is used
    for translating between the stored data and the types used in the
    UI / API. The SHS now populates a kvstore with data loaded from
    event logs when an application UI is requested.
    
    Because this store can hold references to disk-based resources, the
    code was modified to retrieve data from the store under a read lock.
    This allows the SHS to detect when the store is still being used, and
    only update it (e.g. because an updated event log was detected) when
    there is no other thread using the store.
    
    This changed ended up creating a lot of churn in the ApplicationCache
    code, which was cleaned up a lot in the process. I also removed some
    metrics which don't make too much sense with the new code.
    
    Tested with existing and added unit tests, and by making sure the SHS
    still works on a real cluster.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vanzin/spark SPARK-20644

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19582.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19582
    
----
commit f73af34cabb8f4e7e993e6c6d88d4de603776b8e
Author: Marcelo Vanzin <va...@cloudera.com>
Date:   2016-11-23T21:59:35Z

    [SPARK-20644][core] Initial ground work for kvstore UI backend.
    
    There are two somewhat unrelated things going on in this patch, but
    both are meant to make integration of individual UI pages later on
    much easier.
    
    The first part is some tweaking of the code in the listener so that
    it does less updates of the kvstore for data that changes fast; for
    example, it avoids writing changes down to the store for every
    task-related event, since those can arrive very quickly at times.
    Instead, for these kinds of events, it chooses to only flush things
    if a certain interval has passed. The interval is based on how often
    the current spark-shell code updates the progress bar for jobs, so
    that users can get reasonably accurate data.
    
    The code also delays as much as possible hitting the underlying kvstore
    when replaying apps in the history server. This is to avoid unnecessary
    writes to disk.
    
    The second set of changes prepare the history server and SparkUI for
    integrating with the kvstore. A new class, AppStatusStore, is used
    for translating between the stored data and the types used in the
    UI / API. The SHS now populates a kvstore with data loaded from
    event logs when an application UI is requested.
    
    Because this store can hold references to disk-based resources, the
    code was modified to retrieve data from the store under a read lock.
    This allows the SHS to detect when the store is still being used, and
    only update it (e.g. because an updated event log was detected) when
    there is no other thread using the store.
    
    This changed ended up creating a lot of churn in the ApplicationCache
    code, which was cleaned up a lot in the process. I also removed some
    metrics which don't make too much sense with the new code.
    
    Tested with existing and added unit tests, and by making sure the SHS
    still works on a real cluster.

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19582: [SPARK-20644][core] Initial ground work for kvsto...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19582#discussion_r148713957
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -528,4 +544,21 @@ private class AppStatusListener(kvstore: KVStore) extends SparkListener with Log
         entity.write(kvstore)
       }
     
    +  /** Update a live entity only if it hasn't been updated in the last configured period. */
    +  private def maybeUpdate(entity: LiveEntity): Unit = {
    +    if (liveUpdatePeriodNs >= 0) {
    +      val now = System.nanoTime()
    --- End diff --
    
    `System.nanoTime()` can be somewhat expensive, right?  there are a few places you are calling this repeatedly, might as well call `nanoTime()` once outside of this method and pass it in.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19582: [SPARK-20644][core] Initial ground work for kvsto...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19582#discussion_r148836744
  
    --- Diff: core/src/main/scala/org/apache/spark/status/LiveEntity.scala ---
    @@ -37,8 +37,11 @@ import org.apache.spark.util.kvstore.KVStore
      */
     private[spark] abstract class LiveEntity {
     
    +  var lastWriteTime = 0L
    +
       def write(store: KVStore): Unit = {
         store.write(doUpdate())
    +    lastWriteTime = System.nanoTime()
    --- End diff --
    
    you could also pass the nanoTime down into this, to avoid calling it again


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19582: [SPARK-20644][core] Initial ground work for kvstore UI b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19582
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19582: [SPARK-20644][core] Initial ground work for kvsto...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19582#discussion_r148913864
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
    @@ -486,8 +580,21 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
         bus.addListener(listener)
     
         replay(fileStatus, isApplicationCompleted(fileStatus), bus, eventsFilter)
    -    listener.applicationInfo.foreach(addListing)
    -    listing.write(LogInfo(logPath.toString(), fileStatus.getLen()))
    +    listener.applicationInfo.foreach { app =>
    +      // Invalidate the existing UI for the reloaded app attempt, if any. Note that this does
    +      // not remove the UI from the active list; that has to be done in onUIDetached, so that
    +      // cleanup of files can be done in a thread-safe manner. It does mean the UI will remain
    +      // in memory for longer than it should.
    --- End diff --
    
    Done. Also updated a bunch of other stale comments.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19582: [SPARK-20644][core] Initial ground work for kvstore UI b...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/19582
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19582: [SPARK-20644][core] Initial ground work for kvsto...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19582


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19582: [SPARK-20644][core] Initial ground work for kvstore UI b...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/19582
  
    For context:
    
    * Project link: https://issues.apache.org/jira/browse/SPARK-18085
    * Upcoming PRs that build on this code: https://github.com/vanzin/spark/pulls
    
    A special note about this PR: this marks a sort of "point of no return" for the UI. Once this is in, the UI will be in a weird franken-state until https://github.com/vanzin/spark/pull/51 / SPARK-20653 is committed. Until then, there will be duplicate listeners collecting data, which can slow things down a bit in the event bus, and also increase memory usage.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19582: [SPARK-20644][core] Initial ground work for kvstore UI b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19582
  
    **[Test build #83424 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83424/testReport)** for PR 19582 at commit [`537c7b4`](https://github.com/apache/spark/commit/537c7b4bf27e1392fd4868db688f7d2f48baa3a5).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19582: [SPARK-20644][core] Initial ground work for kvstore UI b...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the issue:

    https://github.com/apache/spark/pull/19582
  
    merged to master


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19582: [SPARK-20644][core] Initial ground work for kvstore UI b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19582
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83096/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19582: [SPARK-20644][core] Initial ground work for kvstore UI b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19582
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19582: [SPARK-20644][core] Initial ground work for kvstore UI b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19582
  
    **[Test build #83096 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83096/testReport)** for PR 19582 at commit [`f73af34`](https://github.com/apache/spark/commit/f73af34cabb8f4e7e993e6c6d88d4de603776b8e).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19582: [SPARK-20644][core] Initial ground work for kvsto...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19582#discussion_r148842854
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -528,4 +544,21 @@ private class AppStatusListener(kvstore: KVStore) extends SparkListener with Log
         entity.write(kvstore)
       }
     
    +  /** Update a live entity only if it hasn't been updated in the last configured period. */
    +  private def maybeUpdate(entity: LiveEntity): Unit = {
    +    if (liveUpdatePeriodNs >= 0) {
    +      val now = System.nanoTime()
    --- End diff --
    
    I didn't notice that method ever showing up when profiling; my guess is it's just a read from some CPU register (TSC?) and so reasonably cheap.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19582: [SPARK-20644][core] Initial ground work for kvstore UI b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19582
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19582: [SPARK-20644][core] Initial ground work for kvstore UI b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19582
  
    **[Test build #83096 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83096/testReport)** for PR 19582 at commit [`f73af34`](https://github.com/apache/spark/commit/f73af34cabb8f4e7e993e6c6d88d4de603776b8e).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19582: [SPARK-20644][core] Initial ground work for kvstore UI b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19582
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83424/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19582: [SPARK-20644][core] Initial ground work for kvstore UI b...

Posted by vanzin <gi...@git.apache.org>.
Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/19582
  
    @squito 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19582: [SPARK-20644][core] Initial ground work for kvstore UI b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19582
  
    **[Test build #83424 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83424/testReport)** for PR 19582 at commit [`537c7b4`](https://github.com/apache/spark/commit/537c7b4bf27e1392fd4868db688f7d2f48baa3a5).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19582: [SPARK-20644][core] Initial ground work for kvsto...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19582#discussion_r148847694
  
    --- Diff: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -528,4 +544,21 @@ private class AppStatusListener(kvstore: KVStore) extends SparkListener with Log
         entity.write(kvstore)
       }
     
    +  /** Update a live entity only if it hasn't been updated in the last configured period. */
    +  private def maybeUpdate(entity: LiveEntity): Unit = {
    +    if (liveUpdatePeriodNs >= 0) {
    +      val now = System.nanoTime()
    --- End diff --
    
    I think it can vary a lot with OS (and maybe the hardware?)  Unfortunately when searching now, most of the references are really dated, I have no idea what info is obsolete.  But there is enough evidence it seems prudent to avoid calling it a lot in case its slow in some situations.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19582: [SPARK-20644][core] Initial ground work for kvstore UI b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19582
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83420/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19582: [SPARK-20644][core] Initial ground work for kvsto...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19582#discussion_r148901343
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
    @@ -486,8 +580,21 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
         bus.addListener(listener)
     
         replay(fileStatus, isApplicationCompleted(fileStatus), bus, eventsFilter)
    -    listener.applicationInfo.foreach(addListing)
    -    listing.write(LogInfo(logPath.toString(), fileStatus.getLen()))
    +    listener.applicationInfo.foreach { app =>
    +      // Invalidate the existing UI for the reloaded app attempt, if any. Note that this does
    +      // not remove the UI from the active list; that has to be done in onUIDetached, so that
    +      // cleanup of files can be done in a thread-safe manner. It does mean the UI will remain
    +      // in memory for longer than it should.
    --- End diff --
    
    it took me some time to figure how the cache & invalidation worked, mostly because I wasn't looking in the right places.  I don't think you've made this any more confusing than it was before (in fact its probably better), but seems like a good opportunity to improve commenting a little.  I think it might help to have one comment in the code where the entire sequence is described ( here on `mergeApplicationListing`, or on `AppCache`, or `ApplicationCacheCheckFilter`, doesn't really matter, but they could all reference the longer comment).  if I understand correctly, it would be something like:
    
    Logs of incomplete apps are regularly polled to see if they have been updated (based on an increase in file size).  If they have, the existing data for that app is marked as invalid in LoadedAppUI.  However, no memory is freed, no files are cleaned up at this time, nor is a new UI built.  On each request for one app's UI, the application cache is checked  to see if it has a valid `LoadedAppUI` in the cache.  If there is data in the cache and its valid, then its served.  If there is data in the cache but it is invalid, then the UI is rebuilt from the raw event logs.  If there is nothing in the cache, then the UI is built from the raw event logs and added to the cache.  This may kick another entry out of the cache -- if its for an incomplete app, then any KVStore data written to disk is deleted (as the KVStore for an incomplete app is always regenerated from scratch anyway). 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19582: [SPARK-20644][core] Initial ground work for kvsto...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19582#discussion_r148879455
  
    --- Diff: core/src/main/scala/org/apache/spark/status/storeTypes.scala ---
    @@ -17,12 +17,17 @@
     
     package org.apache.spark.status
     
    +import java.lang.{Integer => JInteger, Long => JLong}
    +
     import com.fasterxml.jackson.annotation.JsonIgnore
     
     import org.apache.spark.status.KVUtils._
     import org.apache.spark.status.api.v1._
     import org.apache.spark.util.kvstore.KVIndex
     
    +private[spark] case class AppStatusStoreMetadata(
    +    val version: Long)
    --- End diff --
    
    nit: `val` is unnecessary


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org