You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by squito <gi...@git.apache.org> on 2016/02/08 18:50:30 UTC

[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

GitHub user squito opened a pull request:

    https://github.com/apache/spark/pull/11118

    [WebUI][SPARK-7889] HistoryServer updates UI for incomplete apps

    When the HistoryServer is showing an incomplete app, it needs to check if there is a newer version of the app available.  It does this by checking if a version of the app has been loaded with a larger *filesize*.  If so, it detaches the current UI, attaches the new one, and redirects back to the same URL to show the new UI.
    
    https://issues.apache.org/jira/browse/SPARK-7889

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/squito/spark SPARK-7889-alternate

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11118.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11118
    
----
commit ae0b8b223698546f2e0bf89da52f3c8883586870
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-06-22T12:54:25Z

    SPARK-7889 cache with initial unit tests

commit 72945e22df00c4ca016aeb60a0697d33ec08dabe
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-06-22T13:46:50Z

    SPARK-7889 cache tests of refresh interval working.

commit 6764f499177d0ce0c422cb5206848ef0270bd0e3
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-06-22T14:36:01Z

    SPARK-7889 switch to explicit scan of SparkUI attempts to determine completion state; use mock spark UI

commit bd47afb6f8b37426df1434a1bc2a43abdd81d65e
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-06-22T14:45:00Z

    SPARK-7889 use spark.history.cache.interval as config option; set default = 60; document

commit aaadf645d7ca82ae111cc3a61a0bf2d75cdb1d99
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-11-23T14:39:51Z

    SPARK-7889 resync with trunk & slightly improve code layout

commit fe6d9e0e74e7de1ba9db938b34343f412e883bec
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-11-23T15:31:14Z

    SPARK-7889 scalastyle

commit f69d391fc6bcd52fd9247d6a65b849dc805e0024
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-11-23T15:43:21Z

    SPARK-7889 scalastyle on ApplicationCacheSuite

commit 334fc91ebd6b7c3d934733adcef93bd5ff9a1325
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-11-23T18:24:42Z

    SPARK-7889 cleanup of comments and imports & the like

commit c19fee20cd74b866bc110b0b78bb53827f9ac1b6
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-11-24T15:16:41Z

    SPARK-7889 intermin checkpoint on changes; about to move cache map from string to case class

commit ec5b65238639c7852d1ae0ff7dd25f4e70118079
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-11-24T20:29:57Z

    SPARK-7889 Cache designed to ask history server/provider for updates; metrics used to track load & time —and for testing

commit feda232876ab5c466d0e009fe9318ed1fbbe5d49
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-11-24T22:10:02Z

    SPARK-7889 playing losing battles with a test

commit 04d8c64bbda500f0fe55fae262a3485da8118c5c
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-11-24T22:58:49Z

    SPARK-7889 test working, verified other tests in package work, review comments & stylecheck

commit 9a7ca9f2860500f7833e649803dd637a0e3fb9c1
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-11-24T23:11:59Z

    SPARK-7889 - stylecheck on tests apparently skipped on mvn

commit a1024aa2e7a57028da34fa8b226438af58ddbbda
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-11-25T15:05:17Z

    SPARK-7889 style and javadoc only

commit 6e0e26dc9c5e960bf2b3369788d5d5ee180f41da
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-11-25T18:11:05Z

    SPARK-7889 pull out metrics into own class, make visible for testing

commit b3c70697175e1971e185b880b41b80183b6d66ea
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-11-25T18:12:05Z

    history server web UI update test from @squito

commit ea2afbbf01827215cfbdd713b22a8d8f52d42210
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-11-25T18:12:48Z

    SPARK-7889 more on the cache suite tests: there is no evidence that eviction is taking place

commit d113c5b5f8b2dec57c3fa6a84969817f66a45cba
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-11-25T19:34:09Z

    SPARK-7889: tests to make sure app eviction is taking place and that this triggers callbacs in the cache

commit a128d8c2141d394a2cee82d4eae542762a51a10d
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-11-25T20:20:11Z

    SPARK-7889 code style check

commit f1c7fe5ebca5cefebd2fadf70e52c85e5f865be5
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-11-25T21:00:44Z

    SPARK-7889: adding return type of tests embedded inside a test method to keep scalastyle happy

commit 07f1af464b2784350045295f0803c32c38e504f1
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-12-01T15:09:45Z

    SPARK-7889 starting to add web filters

commit a33bdd76f1f14df89a98e04ea6f7eeb5d790deae
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-12-01T18:18:37Z

    SPARK-7889 ongoing test dev. Looks like the history provider isn't picking up any changes

commit 9166ba65d89b445a5f6482e66bc3f51246d0845f
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-12-01T18:52:20Z

    SPARK-7889: looks like the test is failing because a new history isn't being saved

commit 523390a0d8f974f75d784fb17091364ba6d186cd
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-12-01T20:27:36Z

    SPARK-7889 address scalastyle warnings

commit 9831ad46ea86c59e21d4fadaa94d52a668afffc5
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-12-01T21:49:34Z

    SPARK-7889 : we aren't getting an updated log file on the second parallelize().count() call, so the FS history provider isn't seeing an update, etc, etc.

commit 6fdaab1e6a6c47b69a1cc9320994f9c3964dd4ea
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-12-02T15:54:53Z

    SPARK-7889: filesize update time included in probe; update thread also scans through modified files to verify this takes.

commit 163e218d812aa5448803db1e460c705a56df310f
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-12-02T15:55:42Z

    SPARK-7889: not all filesystems update modtime on a rename; the EventLoggingListener attempts to do so afterwards, swallowing exceptions raised

commit bc3a2e35f0b32c3d908207cf8f3c700f1cfe06bf
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-12-03T16:16:28Z

    SPARK-7889 cache update works in tests -unreliably. Traces imply that its a race condition between probe time and the scanner thread -if the initial load is after the file update but before the scanner thread has looked @ the file, the file isn't detected as updated. The provider has to return the actual file timestamp of its choice for use in update checks, not the time that the initial load took place

commit f81cfe13fc7ae2fb967a8ce0e556bb85ca38066b
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-12-03T19:04:22Z

    SPARK-7889 still looking at a race condition in the test. This adds more time details, but I'm about to move the fshistory off time and into a generic "attempt version" counter which will be compared on the probe. If an update has happened, this will know

commit 78a463edf747e6df321b1e06f1960c5cc3456acf
Author: Steve Loughran <st...@hortonworks.com>
Date:   2015-12-03T19:16:10Z

    SPARK-7889 moved off time differences to a simple generational counter and equality check

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-182980915
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11118#discussion_r52461110
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
    @@ -511,6 +545,14 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
           bus: ReplayListenerBus): Option[FsApplicationAttemptInfo] = {
         val logPath = eventLog.getPath()
         logInfo(s"Replaying log path: $logPath")
    +    // Note that the eventLog may have *increased* in size since when we grabbed the filestatus,
    +    // and when we read the file here.  That is OK -- it may result in an unnecessary refresh
    +    // when there is no update, but will not result in missing an update.  We *must* prevent
    +    // an error the other way -- if we report a size bigger (ie later) than the file that is
    +    // actually read, we may never refresh the app
    +    // we expect FileStatus to return the file size when it was initially created, but the api
    +    // is not explicit about this so lets be extra-safe.
    +    val eventLogLength = eventLog.getLen()
    --- End diff --
    
    this is usually just another call to getFileStatus().length; {{FileStatus}} is required to be static once created. (see http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/filesystem.html, though it skimps on concurrency issues)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181508356
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/50929/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11118#discussion_r52461352
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
    @@ -551,6 +597,8 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
        *
        * Note that DistributedFileSystem is a `@LimitedPrivate` class, which for all practical reasons
        * makes it more public than not.
    +   * HADOOP-12614 proposes adding a public version of this probe, one which other filesystems could
    --- End diff --
    
    let's just cut these two comment lines; the don't add anything


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11118#discussion_r52614670
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala ---
    @@ -511,6 +545,14 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
           bus: ReplayListenerBus): Option[FsApplicationAttemptInfo] = {
         val logPath = eventLog.getPath()
         logInfo(s"Replaying log path: $logPath")
    +    // Note that the eventLog may have *increased* in size since when we grabbed the filestatus,
    +    // and when we read the file here.  That is OK -- it may result in an unnecessary refresh
    +    // when there is no update, but will not result in missing an update.  We *must* prevent
    +    // an error the other way -- if we report a size bigger (ie later) than the file that is
    +    // actually read, we may never refresh the app
    +    // we expect FileStatus to return the file size when it was initially created, but the api
    +    // is not explicit about this so lets be extra-safe.
    +    val eventLogLength = eventLog.getLen()
    --- End diff --
    
    ah I see, I expected it to behave that way but couldn't find any documentation which really made that explicit.  I guess you're saying its guaranteed by the post-conditions for getFileStatus()?  I've updated the comment now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181613600
  
    **[Test build #2526 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2526/consoleFull)** for PR 11118 at commit [`488da80`](https://github.com/apache/spark/commit/488da804320bcb4ab9b09526ac234f8436ce2b6a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181601313
  
    **[Test build #50936 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/50936/consoleFull)** for PR 11118 at commit [`488da80`](https://github.com/apache/spark/commit/488da804320bcb4ab9b09526ac234f8436ce2b6a).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-182918247
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/11118


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181500981
  
    **[Test build #2524 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2524/consoleFull)** for PR 11118 at commit [`bfbf348`](https://github.com/apache/spark/commit/bfbf348020c92232624f8009f1c4312c21bd964b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181601687
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-183004244
  
    **[Test build #51119 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51119/consoleFull)** for PR 11118 at commit [`04f5385`](https://github.com/apache/spark/commit/04f538546eab58ca67ee9ca30a8fdcc2460ea09b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-183262779
  
    Good Q. We thought it'd be simple at first too.
    
    1. We need a notion of "out-of-dateness" which (a) supports different back ends, and (b) works reliably for files stored in hdfs:// and other filesystems (not handled against S3 or other object stores, but that's because they only save their data on a close(), that is: the end of a successful application.
    1. The google cache class is, well, limited. Essentially what we are doing is adding a probe to the cache entries which is triggered on retrieval, which can then cause a new web UI to be loaded.
    1. the current probe comes from the fs provider. Initially the patch looked at modification timestamps, but that proved unreliable (modtime granularity and issues about when it actually becomes visible in the namenode). Hence, a move to file length.
    1. The timeline provider, which I'm no working on elsewhere, does a GET of the timeline server metadata for that instance, looks at an event count pushed up there. That one is going to add a bit of a window on checks too, (somehow), to keep load on Yarn timeline server down.
    1. We need to trigger an update check on GETs all the way down the UI. The way the servlet API works, something that still expects to be configured by `web.xml`, that's hard to do without singletons, hence the singleton at the bottom.
    1. Finally there's some metrics of what's going on. SPARK-11373 adds metrics to the history server, of which this becomes a part.
    1. Oh, and then there's the tests. They actually use the metrics as the grey-box view into the cache, ensures that the metrics actually get written, and that they'll remain stable over time. Break the metrics and the tests fail, so you find out before ops teams come after you.
    
    There's actually two other bigger things which would be possible to do on this chain
    
    1. incremental playback of changes. Rather than replay an entire app's history, start from where you left off. (i.e. file.length()+1). Maybe I'll look at that sometime, as it would really benefit streaming work.
    1. something that works on object stores. There'd I'd go for spark application instances to write to HDFS, with a copy to S3 on completion —and the history provider to be able to (a) scan both dirs, (b) do the copy if the app is no longer running (i.e. fails while declared incomplete). That's not on my todo list.
    
    Oh, and faster boot time with a summary file alongside the full history, with main details (finished: Boolean, spark-version, ...) so that the boot time goes from O(apps*events) to O(apps)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-182943275
  
    **[Test build #51117 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51117/consoleFull)** for PR 11118 at commit [`57e937b`](https://github.com/apache/spark/commit/57e937bd6261f66e60b5b58a3fc3062b2d90f785).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181586315
  
    **[Test build #2525 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2525/consoleFull)** for PR 11118 at commit [`d4740bc`](https://github.com/apache/spark/commit/d4740bc20b464ede8ae64be7b1a3bc47d06e3ecd).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-182951344
  
    **[Test build #51119 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51119/consoleFull)** for PR 11118 at commit [`04f5385`](https://github.com/apache/spark/commit/04f538546eab58ca67ee9ca30a8fdcc2460ea09b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-183001005
  
    **[Test build #2536 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2536/consoleFull)** for PR 11118 at commit [`04f5385`](https://github.com/apache/spark/commit/04f538546eab58ca67ee9ca30a8fdcc2460ea09b).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11118#discussion_r52203370
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/ApplicationCache.scala ---
    @@ -0,0 +1,669 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.deploy.history
    +
    +import java.util.NoSuchElementException
    +import javax.servlet.{DispatcherType, Filter, FilterChain, FilterConfig, ServletException, ServletRequest, ServletResponse}
    +import javax.servlet.http.{HttpServletRequest, HttpServletResponse}
    +
    +import scala.collection.JavaConverters._
    +import scala.util.control.NonFatal
    +
    +import com.codahale.metrics.{Counter, MetricRegistry, Timer}
    +import com.google.common.cache.{CacheBuilder, CacheLoader, LoadingCache, RemovalListener, RemovalNotification}
    +import org.eclipse.jetty.servlet.FilterHolder
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.metrics.source.Source
    +import org.apache.spark.ui.SparkUI
    +import org.apache.spark.util.Clock
    +
    +/**
    + * Cache for applications.
    + *
    + * Completed applications are cached for as long as there is capacity for them.
    + * Incompleted applications have their update time checked on every
    + * retrieval; if the cached entry is out of date, it is refreshed.
    + *
    + * @note there must be only one instance of [[ApplicationCache]] in a
    + * JVM at a time. This is because a static field in [[ApplicationCacheCheckFilterRelay]]
    + * keeps a reference to the cache so that HTTP requests on the attempt-specific web UIs
    + * can probe the current cache to see if the attempts have changed.
    + *
    + * Creating multiple instances will break this routing.
    + * @param operations implementation of record access operations
    + * @param refreshInterval interval between refreshes in milliseconds.
    + * @param retainedApplications number of retained applications
    + * @param clock time source
    + */
    +private[history] class ApplicationCache(
    +    val operations: ApplicationCacheOperations,
    +    val refreshInterval: Long,
    +    val retainedApplications: Int,
    +    val clock: Clock) extends Logging {
    +
    +  /**
    +   * Services the load request from the cache.
    +   */
    +  private val appLoader = new CacheLoader[CacheKey, CacheEntry] {
    +
    +    /** the cache key doesn't match a cached entry, or the entry is out-of-date, so load it. */
    +    override def load(key: CacheKey): CacheEntry = {
    +      loadApplicationEntry(key.appId, key.attemptId)
    +    }
    +
    +  }
    +
    +  /**
    +   * Handler for callbacks from the cache of entry removal.
    +   */
    +  private val removalListener = new RemovalListener[CacheKey, CacheEntry] {
    +
    +    /**
    +     * Removal event notifies the provider to detach the UI.
    +     * @param rm removal notification
    +     */
    +    override def onRemoval(rm: RemovalNotification[CacheKey, CacheEntry]): Unit = {
    +      metrics.evictionCount.inc()
    +      val key = rm.getKey
    +      logDebug(s"Evicting entry ${key}")
    +      operations.detachSparkUI(key.appId, key.attemptId, rm.getValue().ui)
    +    }
    +  }
    +
    +  /**
    +   * The cache of applications.
    +   *
    +   * Tagged as `protected` so as to allow subclasses in tests to accesss it directly
    +   */
    +  protected val appCache: LoadingCache[CacheKey, CacheEntry] = {
    +    CacheBuilder.newBuilder()
    +        .maximumSize(retainedApplications)
    +        .removalListener(removalListener)
    +        .build(appLoader)
    +  }
    +
    +  /**
    +   * The metrics which are updated as the cache is used.
    +   */
    +  val metrics = new CacheMetrics("history.cache")
    +
    +  init()
    +
    +  /**
    +   * Perform any startup operations.
    +   *
    +   * This includes declaring this instance as the cache to use in the
    +   * [[ApplicationCacheCheckFilterRelay]].
    +   */
    +  private def init(): Unit = {
    +    ApplicationCacheCheckFilterRelay.setApplicationCache(this)
    +  }
    +
    +  /**
    +   * Stop the cache.
    +   * This will reset the relay in [[ApplicationCacheCheckFilterRelay]].
    +   */
    +  def stop(): Unit = {
    +    ApplicationCacheCheckFilterRelay.resetApplicationCache()
    +  }
    +
    +  /**
    +   * Get an entry.
    +   *
    +   * Cache fetch/refresh will have taken place by the time this method returns.
    +   * @param appAndAttempt application to look up in the format needed by the history server web UI,
    +   *                      `appId/attemptId` or `appId`.
    +   * @return the entry
    +   */
    +  def get(appAndAttempt: String): SparkUI = {
    +    val parts = splitAppAndAttemptKey(appAndAttempt)
    +    get(parts._1, parts._2)
    +  }
    +
    +  /**
    +   * Get the Spark UI, converting a lookup failure from an exception to `None`.
    +   * @param appAndAttempt application to look up in the format needed by the history server web UI,
    +   *                      `appId/attemptId` or `appId`.
    +   * @return the entry
    +   */
    +  def getSparkUI(appAndAttempt: String): Option[SparkUI] = {
    +    try {
    +      val ui = get(appAndAttempt)
    +      Some(ui)
    +    } catch {
    +      case NonFatal(e) => e.getCause() match {
    +        case nsee: NoSuchElementException =>
    +          None
    +        case cause: Exception => throw cause
    +      }
    +    }
    +  }
    +
    +  /**
    +   * Get the associated spark UI.
    +   *
    +   * Cache fetch/refresh will have taken place by the time this method returns.
    +   * @param appId application ID
    +   * @param attemptId optional attempt ID
    +   * @return the entry
    +   */
    +  def get(appId: String, attemptId: Option[String]): SparkUI = {
    +    lookupAndUpdate(appId, attemptId)._1.ui
    +  }
    +
    +  /**
    +   * Look up the entry; update it if needed.
    +   * @param appId application ID
    +   * @param attemptId optional attempt ID
    +   * @return the underlying cache entry -which can have its timestamp changed, and a flag to
    +   *         indicate that the entry has changed
    +   */
    +  private def lookupAndUpdate(appId: String, attemptId: Option[String]): (CacheEntry, Boolean) = {
    +    metrics.lookupCount.inc()
    +    val cacheKey = CacheKey(appId, attemptId)
    +    var entry = appCache.getIfPresent(cacheKey)
    +    var updated = false
    +    if (entry == null) {
    +      // no entry, so fetch without any post-fetch probes for out-of-dateness
    +      // this will trigger a callback to loadApplicationEntry()
    +      entry = appCache.get(cacheKey)
    +    } else if (!entry.completed) {
    +      val now = clock.getTimeMillis()
    +      if (now - entry.probeTime > refreshInterval) {
    +        log.debug(s"Probing at time $now for updated application $cacheKey -> $entry")
    +        metrics.updateProbeCount.inc()
    +        updated = time(metrics.updateProbeTimer) {
    +          entry.updateProbe()
    --- End diff --
    
    Note that this check is now extremely cheap (at least with the `FSHistoryProvider`).  Actually checking for an update to the logs happens on its own schedule, as that scans logs looking for both new apps and updates to existing ones.  That suggests that we could either drop this extra interval completely, and just do this check on every request, or if we want to leave it for other `HistoryProvider`s, we could at least make the default very rapid.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-182943471
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51110/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-182980939
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51117/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-182935190
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51105/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by shaneknapp <gi...@git.apache.org>.
Github user shaneknapp commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-182945362
  
    jenkins, test this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-182403322
  
    LGTM; unifying the different probes for new-ness makes sense.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181535668
  
    **[Test build #2525 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2525/consoleFull)** for PR 11118 at commit [`d4740bc`](https://github.com/apache/spark/commit/d4740bc20b464ede8ae64be7b1a3bc47d06e3ecd).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-182951382
  
    **[Test build #2536 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2536/consoleFull)** for PR 11118 at commit [`04f5385`](https://github.com/apache/spark/commit/04f538546eab58ca67ee9ca30a8fdcc2460ea09b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-182916241
  
    **[Test build #51105 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51105/consoleFull)** for PR 11118 at commit [`2286aa8`](https://github.com/apache/spark/commit/2286aa895677aee66f42ccef3f12a5c3d5c3938e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181507545
  
    **[Test build #2524 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2524/consoleFull)** for PR 11118 at commit [`bfbf348`](https://github.com/apache/spark/commit/bfbf348020c92232624f8009f1c4312c21bd964b).
     * This patch **fails MiMa tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181502638
  
    **[Test build #50929 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/50929/consoleFull)** for PR 11118 at commit [`bfbf348`](https://github.com/apache/spark/commit/bfbf348020c92232624f8009f1c4312c21bd964b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181499494
  
    Reviewers: note this was done primarily by @steveloughran , for now just posting this as a potential simplification to consider vs. https://github.com/apache/spark/pull/6935.
    
    Steve -- the main change here is just in `FsHistoryProvider`.  I noticed that `checkForLogs` was going to reload the incomplete apps *anyway* on the regular scans.  So it seemed like there was no point in adding special logic to do a separate set of reloading.  Maybe that reloading is dumb, and we should change that to separate out finding new apps vs. reloading existing ones, but I think we might be better off addressing that separately.
    
    It could be that there is some hole in the logic here that I'm not seeing, but tests pass, so that means we should probably update the tests to address the hole.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181562007
  
    **[Test build #50936 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/50936/consoleFull)** for PR 11118 at commit [`488da80`](https://github.com/apache/spark/commit/488da804320bcb4ab9b09526ac234f8436ce2b6a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181601692
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/50936/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11118#discussion_r52264218
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/ApplicationCache.scala ---
    @@ -0,0 +1,669 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.deploy.history
    +
    +import java.util.NoSuchElementException
    +import javax.servlet.{DispatcherType, Filter, FilterChain, FilterConfig, ServletException, ServletRequest, ServletResponse}
    +import javax.servlet.http.{HttpServletRequest, HttpServletResponse}
    +
    +import scala.collection.JavaConverters._
    +import scala.util.control.NonFatal
    +
    +import com.codahale.metrics.{Counter, MetricRegistry, Timer}
    +import com.google.common.cache.{CacheBuilder, CacheLoader, LoadingCache, RemovalListener, RemovalNotification}
    +import org.eclipse.jetty.servlet.FilterHolder
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.metrics.source.Source
    +import org.apache.spark.ui.SparkUI
    +import org.apache.spark.util.Clock
    +
    +/**
    + * Cache for applications.
    + *
    + * Completed applications are cached for as long as there is capacity for them.
    + * Incompleted applications have their update time checked on every
    + * retrieval; if the cached entry is out of date, it is refreshed.
    + *
    + * @note there must be only one instance of [[ApplicationCache]] in a
    + * JVM at a time. This is because a static field in [[ApplicationCacheCheckFilterRelay]]
    + * keeps a reference to the cache so that HTTP requests on the attempt-specific web UIs
    + * can probe the current cache to see if the attempts have changed.
    + *
    + * Creating multiple instances will break this routing.
    + * @param operations implementation of record access operations
    + * @param refreshInterval interval between refreshes in milliseconds.
    + * @param retainedApplications number of retained applications
    + * @param clock time source
    + */
    +private[history] class ApplicationCache(
    +    val operations: ApplicationCacheOperations,
    +    val refreshInterval: Long,
    +    val retainedApplications: Int,
    +    val clock: Clock) extends Logging {
    +
    +  /**
    +   * Services the load request from the cache.
    +   */
    +  private val appLoader = new CacheLoader[CacheKey, CacheEntry] {
    +
    +    /** the cache key doesn't match a cached entry, or the entry is out-of-date, so load it. */
    +    override def load(key: CacheKey): CacheEntry = {
    +      loadApplicationEntry(key.appId, key.attemptId)
    +    }
    +
    +  }
    +
    +  /**
    +   * Handler for callbacks from the cache of entry removal.
    +   */
    +  private val removalListener = new RemovalListener[CacheKey, CacheEntry] {
    +
    +    /**
    +     * Removal event notifies the provider to detach the UI.
    +     * @param rm removal notification
    +     */
    +    override def onRemoval(rm: RemovalNotification[CacheKey, CacheEntry]): Unit = {
    +      metrics.evictionCount.inc()
    +      val key = rm.getKey
    +      logDebug(s"Evicting entry ${key}")
    +      operations.detachSparkUI(key.appId, key.attemptId, rm.getValue().ui)
    +    }
    +  }
    +
    +  /**
    +   * The cache of applications.
    +   *
    +   * Tagged as `protected` so as to allow subclasses in tests to accesss it directly
    +   */
    +  protected val appCache: LoadingCache[CacheKey, CacheEntry] = {
    +    CacheBuilder.newBuilder()
    +        .maximumSize(retainedApplications)
    +        .removalListener(removalListener)
    +        .build(appLoader)
    +  }
    +
    +  /**
    +   * The metrics which are updated as the cache is used.
    +   */
    +  val metrics = new CacheMetrics("history.cache")
    +
    +  init()
    +
    +  /**
    +   * Perform any startup operations.
    +   *
    +   * This includes declaring this instance as the cache to use in the
    +   * [[ApplicationCacheCheckFilterRelay]].
    +   */
    +  private def init(): Unit = {
    +    ApplicationCacheCheckFilterRelay.setApplicationCache(this)
    +  }
    +
    +  /**
    +   * Stop the cache.
    +   * This will reset the relay in [[ApplicationCacheCheckFilterRelay]].
    +   */
    +  def stop(): Unit = {
    +    ApplicationCacheCheckFilterRelay.resetApplicationCache()
    +  }
    +
    +  /**
    +   * Get an entry.
    +   *
    +   * Cache fetch/refresh will have taken place by the time this method returns.
    +   * @param appAndAttempt application to look up in the format needed by the history server web UI,
    +   *                      `appId/attemptId` or `appId`.
    +   * @return the entry
    +   */
    +  def get(appAndAttempt: String): SparkUI = {
    +    val parts = splitAppAndAttemptKey(appAndAttempt)
    +    get(parts._1, parts._2)
    +  }
    +
    +  /**
    +   * Get the Spark UI, converting a lookup failure from an exception to `None`.
    +   * @param appAndAttempt application to look up in the format needed by the history server web UI,
    +   *                      `appId/attemptId` or `appId`.
    +   * @return the entry
    +   */
    +  def getSparkUI(appAndAttempt: String): Option[SparkUI] = {
    +    try {
    +      val ui = get(appAndAttempt)
    +      Some(ui)
    +    } catch {
    +      case NonFatal(e) => e.getCause() match {
    +        case nsee: NoSuchElementException =>
    +          None
    +        case cause: Exception => throw cause
    +      }
    +    }
    +  }
    +
    +  /**
    +   * Get the associated spark UI.
    +   *
    +   * Cache fetch/refresh will have taken place by the time this method returns.
    +   * @param appId application ID
    +   * @param attemptId optional attempt ID
    +   * @return the entry
    +   */
    +  def get(appId: String, attemptId: Option[String]): SparkUI = {
    +    lookupAndUpdate(appId, attemptId)._1.ui
    +  }
    +
    +  /**
    +   * Look up the entry; update it if needed.
    +   * @param appId application ID
    +   * @param attemptId optional attempt ID
    +   * @return the underlying cache entry -which can have its timestamp changed, and a flag to
    +   *         indicate that the entry has changed
    +   */
    +  private def lookupAndUpdate(appId: String, attemptId: Option[String]): (CacheEntry, Boolean) = {
    +    metrics.lookupCount.inc()
    +    val cacheKey = CacheKey(appId, attemptId)
    +    var entry = appCache.getIfPresent(cacheKey)
    +    var updated = false
    +    if (entry == null) {
    +      // no entry, so fetch without any post-fetch probes for out-of-dateness
    +      // this will trigger a callback to loadApplicationEntry()
    +      entry = appCache.get(cacheKey)
    +    } else if (!entry.completed) {
    +      val now = clock.getTimeMillis()
    +      if (now - entry.probeTime > refreshInterval) {
    +        log.debug(s"Probing at time $now for updated application $cacheKey -> $entry")
    +        metrics.updateProbeCount.inc()
    +        updated = time(metrics.updateProbeTimer) {
    +          entry.updateProbe()
    --- End diff --
    
    right, I guess I just wanted to point out that with the changes here, this probe is entirely independent from replay.  Replay happens with normal log-checking -- that frequency is controlled by `spark.history.fs.update.interval`.   Here, we're just checking whether that regular log scanning has already loaded an updated UI for this attempt, and that is it.  Since `spark.history.fs.update.interval` is entirely controlling the expensive part, we may not need any other interval.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11118#discussion_r52465521
  
    --- Diff: core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala ---
    @@ -256,6 +269,215 @@ class HistoryServerSuite extends SparkFunSuite with BeforeAndAfter with Matchers
         all (siteRelativeLinks) should startWith (uiRoot)
       }
     
    +  test("incomplete apps get refreshed") {
    +
    +    implicit val webDriver: WebDriver = new HtmlUnitDriver
    +    implicit val formats = org.json4s.DefaultFormats
    +
    +    // this test dir is explictly deleted on successful runs; retained for diagnostics when
    +    // not
    +    val logDir = Utils.createDirectory(System.getProperty("java.io.tmpdir", "logs"))
    +
    +    // a new conf is used with the background thread set and running at its fastest
    +    // alllowed refresh rate (1Hz)
    +    val myConf = new SparkConf()
    +      .set("spark.history.fs.logDirectory", logDir.getAbsolutePath)
    +      .set("spark.eventLog.dir", logDir.getAbsolutePath)
    +      .set("spark.history.fs.update.interval", "1s")
    +      .set("spark.eventLog.enabled", "true")
    +      .set("spark.history.cache.window", "250ms")
    +      .remove("spark.testing")
    +    val provider = new FsHistoryProvider(myConf)
    +    val securityManager = new SecurityManager(myConf)
    +
    +    sc = new SparkContext("local", "test", myConf)
    +    val logDirUri = logDir.toURI
    +    val logDirPath = new Path(logDirUri)
    +    val fs = FileSystem.get(logDirUri, sc.hadoopConfiguration)
    +
    +    def listDir(dir: Path): Seq[FileStatus] = {
    +      val statuses = fs.listStatus(dir)
    +      statuses.flatMap(
    +        stat => if (stat.isDirectory) listDir(stat.getPath) else Seq(stat))
    +    }
    +
    +    def dumpLogDir(msg: String = ""): Unit = {
    +      if (log.isDebugEnabled) {
    +        logDebug(msg)
    +        listDir(logDirPath).foreach { status =>
    +          val s = status.toString
    +          logDebug(s)
    +        }
    +      }
    +    }
    +
    +    // stop the server with the old config, and start the new one
    +    server.stop()
    +    server = new HistoryServer(myConf, provider, securityManager, 18080)
    +    server.initialize()
    +    server.bind()
    +    val port = server.boundPort
    +    val metrics = server.cacheMetrics
    +
    +    // assert that a metric has a value; if not dump the whole metrics instance
    +    def assertMetric(name: String, counter: Counter, expected: Long): Unit = {
    +      val actual = counter.getCount
    +      if (actual != expected) {
    +        // this is here because Scalatest loses stack depth
    +        fail(s"Wrong $name value - expected $expected but got $actual" +
    +            s" in metrics\n$metrics")
    +      }
    +    }
    +
    +    // build a URL for an app or app/attempt plus a page underneath
    +    def buildURL(appId: String, suffix: String): URL = {
    +      new URL(s"http://localhost:$port/history/$appId$suffix")
    +    }
    +
    +    // build a rest URL for the application and suffix.
    +    def applications(appId: String, suffix: String): URL = {
    +      new URL(s"http://localhost:$port/api/v1/applications/$appId$suffix")
    +    }
    +
    +    val historyServerRoot = new URL(s"http://localhost:$port/")
    +    val historyServerIncompleted = new URL(historyServerRoot, "?page=1&showIncomplete=true")
    +
    +    // assert the body of a URL contains a string; return that body
    +    def assertUrlContains(url: URL, str: String): String = {
    --- End diff --
    
    this method is no longer used —you can cut it


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181508283
  
    **[Test build #50929 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/50929/consoleFull)** for PR 11118 at commit [`bfbf348`](https://github.com/apache/spark/commit/bfbf348020c92232624f8009f1c4312c21bd964b).
     * This patch **fails MiMa tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-182943466
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181537586
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/50933/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11118#discussion_r52220649
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/ApplicationCache.scala ---
    @@ -0,0 +1,669 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.deploy.history
    +
    +import java.util.NoSuchElementException
    +import javax.servlet.{DispatcherType, Filter, FilterChain, FilterConfig, ServletException, ServletRequest, ServletResponse}
    +import javax.servlet.http.{HttpServletRequest, HttpServletResponse}
    +
    +import scala.collection.JavaConverters._
    +import scala.util.control.NonFatal
    +
    +import com.codahale.metrics.{Counter, MetricRegistry, Timer}
    +import com.google.common.cache.{CacheBuilder, CacheLoader, LoadingCache, RemovalListener, RemovalNotification}
    +import org.eclipse.jetty.servlet.FilterHolder
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.metrics.source.Source
    +import org.apache.spark.ui.SparkUI
    +import org.apache.spark.util.Clock
    +
    +/**
    + * Cache for applications.
    + *
    + * Completed applications are cached for as long as there is capacity for them.
    + * Incompleted applications have their update time checked on every
    + * retrieval; if the cached entry is out of date, it is refreshed.
    + *
    + * @note there must be only one instance of [[ApplicationCache]] in a
    + * JVM at a time. This is because a static field in [[ApplicationCacheCheckFilterRelay]]
    + * keeps a reference to the cache so that HTTP requests on the attempt-specific web UIs
    + * can probe the current cache to see if the attempts have changed.
    + *
    + * Creating multiple instances will break this routing.
    + * @param operations implementation of record access operations
    + * @param refreshInterval interval between refreshes in milliseconds.
    + * @param retainedApplications number of retained applications
    + * @param clock time source
    + */
    +private[history] class ApplicationCache(
    +    val operations: ApplicationCacheOperations,
    +    val refreshInterval: Long,
    +    val retainedApplications: Int,
    +    val clock: Clock) extends Logging {
    +
    +  /**
    +   * Services the load request from the cache.
    +   */
    +  private val appLoader = new CacheLoader[CacheKey, CacheEntry] {
    +
    +    /** the cache key doesn't match a cached entry, or the entry is out-of-date, so load it. */
    +    override def load(key: CacheKey): CacheEntry = {
    +      loadApplicationEntry(key.appId, key.attemptId)
    +    }
    +
    +  }
    +
    +  /**
    +   * Handler for callbacks from the cache of entry removal.
    +   */
    +  private val removalListener = new RemovalListener[CacheKey, CacheEntry] {
    +
    +    /**
    +     * Removal event notifies the provider to detach the UI.
    +     * @param rm removal notification
    +     */
    +    override def onRemoval(rm: RemovalNotification[CacheKey, CacheEntry]): Unit = {
    +      metrics.evictionCount.inc()
    +      val key = rm.getKey
    +      logDebug(s"Evicting entry ${key}")
    +      operations.detachSparkUI(key.appId, key.attemptId, rm.getValue().ui)
    +    }
    +  }
    +
    +  /**
    +   * The cache of applications.
    +   *
    +   * Tagged as `protected` so as to allow subclasses in tests to accesss it directly
    +   */
    +  protected val appCache: LoadingCache[CacheKey, CacheEntry] = {
    +    CacheBuilder.newBuilder()
    +        .maximumSize(retainedApplications)
    +        .removalListener(removalListener)
    +        .build(appLoader)
    +  }
    +
    +  /**
    +   * The metrics which are updated as the cache is used.
    +   */
    +  val metrics = new CacheMetrics("history.cache")
    +
    +  init()
    +
    +  /**
    +   * Perform any startup operations.
    +   *
    +   * This includes declaring this instance as the cache to use in the
    +   * [[ApplicationCacheCheckFilterRelay]].
    +   */
    +  private def init(): Unit = {
    +    ApplicationCacheCheckFilterRelay.setApplicationCache(this)
    +  }
    +
    +  /**
    +   * Stop the cache.
    +   * This will reset the relay in [[ApplicationCacheCheckFilterRelay]].
    +   */
    +  def stop(): Unit = {
    +    ApplicationCacheCheckFilterRelay.resetApplicationCache()
    +  }
    +
    +  /**
    +   * Get an entry.
    +   *
    +   * Cache fetch/refresh will have taken place by the time this method returns.
    +   * @param appAndAttempt application to look up in the format needed by the history server web UI,
    +   *                      `appId/attemptId` or `appId`.
    +   * @return the entry
    +   */
    +  def get(appAndAttempt: String): SparkUI = {
    +    val parts = splitAppAndAttemptKey(appAndAttempt)
    +    get(parts._1, parts._2)
    +  }
    +
    +  /**
    +   * Get the Spark UI, converting a lookup failure from an exception to `None`.
    +   * @param appAndAttempt application to look up in the format needed by the history server web UI,
    +   *                      `appId/attemptId` or `appId`.
    +   * @return the entry
    +   */
    +  def getSparkUI(appAndAttempt: String): Option[SparkUI] = {
    +    try {
    +      val ui = get(appAndAttempt)
    +      Some(ui)
    +    } catch {
    +      case NonFatal(e) => e.getCause() match {
    +        case nsee: NoSuchElementException =>
    +          None
    +        case cause: Exception => throw cause
    +      }
    +    }
    +  }
    +
    +  /**
    +   * Get the associated spark UI.
    +   *
    +   * Cache fetch/refresh will have taken place by the time this method returns.
    +   * @param appId application ID
    +   * @param attemptId optional attempt ID
    +   * @return the entry
    +   */
    +  def get(appId: String, attemptId: Option[String]): SparkUI = {
    +    lookupAndUpdate(appId, attemptId)._1.ui
    +  }
    +
    +  /**
    +   * Look up the entry; update it if needed.
    +   * @param appId application ID
    +   * @param attemptId optional attempt ID
    +   * @return the underlying cache entry -which can have its timestamp changed, and a flag to
    +   *         indicate that the entry has changed
    +   */
    +  private def lookupAndUpdate(appId: String, attemptId: Option[String]): (CacheEntry, Boolean) = {
    +    metrics.lookupCount.inc()
    +    val cacheKey = CacheKey(appId, attemptId)
    +    var entry = appCache.getIfPresent(cacheKey)
    +    var updated = false
    +    if (entry == null) {
    +      // no entry, so fetch without any post-fetch probes for out-of-dateness
    +      // this will trigger a callback to loadApplicationEntry()
    +      entry = appCache.get(cacheKey)
    +    } else if (!entry.completed) {
    +      val now = clock.getTimeMillis()
    +      if (now - entry.probeTime > refreshInterval) {
    +        log.debug(s"Probing at time $now for updated application $cacheKey -> $entry")
    +        metrics.updateProbeCount.inc()
    +        updated = time(metrics.updateProbeTimer) {
    +          entry.updateProbe()
    --- End diff --
    
    my real concern was not cost of probe, but what if there was an app updating rapidly, with a lot of user requests coming in; it'd trigger replay too often. it's the cost of replay which I worried about


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181508354
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by shaneknapp <gi...@git.apache.org>.
Github user shaneknapp commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-182936245
  
    jenkins, test this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-183005159
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51119/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-182949387
  
    Plan to merge this a little later (assuming tests pass), any other comments?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-183005153
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-182935184
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-183168840
  
    merged to master, thanks @steveloughran!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by steveloughran <gi...@git.apache.org>.
Github user steveloughran commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11118#discussion_r52465565
  
    --- Diff: core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala ---
    @@ -256,6 +269,215 @@ class HistoryServerSuite extends SparkFunSuite with BeforeAndAfter with Matchers
         all (siteRelativeLinks) should startWith (uiRoot)
       }
     
    +  test("incomplete apps get refreshed") {
    +
    +    implicit val webDriver: WebDriver = new HtmlUnitDriver
    +    implicit val formats = org.json4s.DefaultFormats
    +
    +    // this test dir is explictly deleted on successful runs; retained for diagnostics when
    +    // not
    +    val logDir = Utils.createDirectory(System.getProperty("java.io.tmpdir", "logs"))
    +
    +    // a new conf is used with the background thread set and running at its fastest
    +    // alllowed refresh rate (1Hz)
    +    val myConf = new SparkConf()
    +      .set("spark.history.fs.logDirectory", logDir.getAbsolutePath)
    +      .set("spark.eventLog.dir", logDir.getAbsolutePath)
    +      .set("spark.history.fs.update.interval", "1s")
    +      .set("spark.eventLog.enabled", "true")
    +      .set("spark.history.cache.window", "250ms")
    +      .remove("spark.testing")
    +    val provider = new FsHistoryProvider(myConf)
    +    val securityManager = new SecurityManager(myConf)
    +
    +    sc = new SparkContext("local", "test", myConf)
    +    val logDirUri = logDir.toURI
    +    val logDirPath = new Path(logDirUri)
    +    val fs = FileSystem.get(logDirUri, sc.hadoopConfiguration)
    +
    +    def listDir(dir: Path): Seq[FileStatus] = {
    +      val statuses = fs.listStatus(dir)
    +      statuses.flatMap(
    +        stat => if (stat.isDirectory) listDir(stat.getPath) else Seq(stat))
    +    }
    +
    +    def dumpLogDir(msg: String = ""): Unit = {
    +      if (log.isDebugEnabled) {
    +        logDebug(msg)
    +        listDir(logDirPath).foreach { status =>
    +          val s = status.toString
    +          logDebug(s)
    +        }
    +      }
    +    }
    +
    +    // stop the server with the old config, and start the new one
    +    server.stop()
    +    server = new HistoryServer(myConf, provider, securityManager, 18080)
    +    server.initialize()
    +    server.bind()
    +    val port = server.boundPort
    +    val metrics = server.cacheMetrics
    +
    +    // assert that a metric has a value; if not dump the whole metrics instance
    +    def assertMetric(name: String, counter: Counter, expected: Long): Unit = {
    +      val actual = counter.getCount
    +      if (actual != expected) {
    +        // this is here because Scalatest loses stack depth
    +        fail(s"Wrong $name value - expected $expected but got $actual" +
    +            s" in metrics\n$metrics")
    +      }
    +    }
    +
    +    // build a URL for an app or app/attempt plus a page underneath
    +    def buildURL(appId: String, suffix: String): URL = {
    +      new URL(s"http://localhost:$port/history/$appId$suffix")
    +    }
    +
    +    // build a rest URL for the application and suffix.
    +    def applications(appId: String, suffix: String): URL = {
    +      new URL(s"http://localhost:$port/api/v1/applications/$appId$suffix")
    +    }
    +
    +    val historyServerRoot = new URL(s"http://localhost:$port/")
    +    val historyServerIncompleted = new URL(historyServerRoot, "?page=1&showIncomplete=true")
    +
    +    // assert the body of a URL contains a string; return that body
    +    def assertUrlContains(url: URL, str: String): String = {
    +      val body = HistoryServerSuite.getUrl(url)
    +      assert(body.contains(str), s"did not find $str at $url : $body")
    +      body
    +    }
    +
    +    // start initial job
    +    val d = sc.parallelize(1 to 10)
    +    d.count()
    +    val stdInterval = interval(100 milliseconds)
    +    val appId = eventually(timeout(20 seconds), stdInterval) {
    +      val json = getContentAndCode("applications", port)._2.get
    +      val apps = parse(json).asInstanceOf[JArray].arr
    +      apps should have size 1
    +      (apps.head \ "id").extract[String]
    +    }
    +
    +    // which lists as incomplete
    +//    assertUrlContains(historyServerIncompleted, appId)
    --- End diff --
    
    and cut this line...we can't check the root HTML page any more


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-183198008
  
    Just saw this got merged. I'm probably missing some context, but can somebody explain to me why something so conceptually simple leads to such a big patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-182980532
  
    **[Test build #51117 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/51117/consoleFull)** for PR 11118 at commit [`57e937b`](https://github.com/apache/spark/commit/57e937bd6261f66e60b5b58a3fc3062b2d90f785).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by squito <gi...@git.apache.org>.
Github user squito commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181502916
  
    I realize that this doesn't address the fileSize / mod time issue that @steveloughran had pointed out earlier -- if we go with this approach, `checkForLogs` should be updated to use file size.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181655431
  
    **[Test build #2526 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2526/consoleFull)** for PR 11118 at commit [`488da80`](https://github.com/apache/spark/commit/488da804320bcb4ab9b09526ac234f8436ce2b6a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-182918248
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/51102/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [WebUI][SPARK-7889] HistoryServer updates UI f...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11118#issuecomment-181537583
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org