Posted to reviews@spark.apache.org by tdas <gi...@git.apache.org> on 2014/04/01 22:58:44 UTC

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

GitHub user tdas opened a pull request:

    https://github.com/apache/spark/pull/290

    [SPARK-1386] Web UI for Spark Streaming [WIP]

    When debugging Spark Streaming applications, it is necessary to monitor certain metrics that are not shown in the Spark application UI. For example, what is the average processing time of batches? What is the scheduling delay? Is the system able to process data as fast as it receives it? How many records am I receiving through my receivers?
    
    While the StreamingListener interface introduced in 0.9 provided some of this information, it could only be accessed programmatically. A UI that shows information specific to streaming applications is necessary for easier debugging. This PR introduces such a UI. It shows various statistics related to the streaming application. Here is a screenshot of the UI running on my local machine.
    http://i.imgur.com/Sf9TnG5.png
    
    This is still a WIP. The UI currently runs on a different port (6060). We do not want to make everyone open a new port in their firewalls, so we would like to integrate this UI into the Spark UI running at 4040. This is still to be done.
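
The metrics this description asks about reduce to simple arithmetic over per-batch timestamps. The following is a minimal sketch, not code from this PR: `BatchTimes` and `BatchStats` are hypothetical names introduced for illustration, and the sketch assumes each batch records when it was submitted, when processing started, and when processing finished, all in milliseconds.

```scala
// Hypothetical record of one batch's timestamps (milliseconds); not a Spark class.
final case class BatchTimes(submitted: Long, started: Long, finished: Long) {
  // Time the batch waited in the queue before processing began.
  def schedulingDelay: Long = started - submitted
  // Time spent actually processing the batch.
  def processingTime: Long = finished - started
}

object BatchStats {
  // Mean of a sequence of durations; None when there are no batches yet.
  def mean(values: Seq[Long]): Option[Double] =
    if (values.isEmpty) None else Some(values.sum.toDouble / values.size)

  def avgProcessingTime(batches: Seq[BatchTimes]): Option[Double] =
    mean(batches.map(_.processingTime))

  def avgSchedulingDelay(batches: Seq[BatchTimes]): Option[Double] =
    mean(batches.map(_.schedulingDelay))

  // Assumed criterion: the system keeps up if the average processing time
  // fits within one batch interval. With no batches this is vacuously true.
  def keepsUp(batches: Seq[BatchTimes], batchIntervalMs: Long): Boolean =
    avgProcessingTime(batches).forall(_ <= batchIntervalMs.toDouble)
}
```

The "keeps up" criterion here is only one plausible reading of "process as fast as it is receiving data"; a steadily growing scheduling delay would be another signal worth checking.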
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tdas/spark streaming-web-ui

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/290.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #290
    
----
commit 56cc7fbcaf04a5aab88296d20da2cfc5b84a7651
Author: Tathagata Das <ta...@gmail.com>
Date:   2014-03-28T21:45:46Z

    First cut implementation of Streaming UI.

commit 93f1c69e067fb02bcbb1dcab93d1dff4905c2e17
Author: Tathagata Das <ta...@gmail.com>
Date:   2014-03-31T23:31:48Z

    Added network receiver information to the Streaming UI.

commit 4d86e985cb7bbc7f4f125e52d72f4e4bd560677e
Author: Tathagata Das <ta...@gmail.com>
Date:   2014-04-01T18:02:23Z

    Added basic stats to the StreamingUI and refactored the UI to a Page to make it easier to transition to using SparkUI later.

commit db27bad1c781345a4bd0de6003e1a8a10508e024
Author: Tathagata Das <ta...@gmail.com>
Date:   2014-04-01T20:23:29Z

    Added last batch processing time to StreamingUI.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40130246
  
     Build triggered. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40134386
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14011/



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40128595
  
    Build finished. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11544206
  
    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingPage.scala ---
    @@ -0,0 +1,180 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.streaming.ui
    +
    +import java.util.Calendar
    +import javax.servlet.http.HttpServletRequest
    +
    +import scala.xml.Node
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.ui._
    +import org.apache.spark.ui.UIUtils._
    +import org.apache.spark.util.Distribution
    +
    +/** Page for Spark Web UI that shows statistics of a streaming job */
    +private[ui] class StreamingPage(parent: StreamingTab)
    +  extends WebUIPage("") with Logging {
    +
    +  private val listener = parent.listener
    +  private val startTime = Calendar.getInstance().getTime()
    +  private val emptyCellTest = "-"
    +
    +  /** Render the page */
    +  override def render(request: HttpServletRequest): Seq[Node] = {
    +    val content =
    +      generateBasicStats() ++
    +      <br></br><h4>Statistics over last {listener.completedBatches.size} processed batches</h4> ++
    +      generateNetworkStatsTable() ++
    +      generateBatchStatsTable()
    +    UIUtils.headerSparkPage(
    +      content, parent.basePath, parent.appName, "Streaming", parent.headerTabs, parent, Some(5000))
    +  }
    +
    +  /** Generate basic stats of the streaming program */
    +  private def generateBasicStats(): Seq[Node] = {
    --- End diff --
    
    Just a minor point (and you may not agree), but I think you could drop the `generate` prefix from all these functions and not lose anything. E.g. for a function called `networkStatsTable` that returns `Seq[Node]`, I think it's self-evident that it's generating the table.



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40272639
  
    Thanks @andrewor14 
    @pwendell See if it merges now.



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40045602
  
    Merged build started. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40173436
  
    This is a really nice refactoring! Unfortunately I only had time to look at some of it because I ran into a weird performance issue. For some reason the pages were taking a long time to load (e.g. several seconds). I did some profiling, and it seemed that the code path serving the static data looks for gzipped versions of the static files. I don't think this lookup was cached, so it was having to walk through the jar.
    
    I fixed that issue by just disabling that feature for the static servlets:
    
    ```
    def createStaticHandler(resourceBase: String, path: String): ServletContextHandler = {
         val contextHandler = new ServletContextHandler
    +    contextHandler.setInitParameter("org.eclipse.jetty.servlet.Default.gzip", "false")
         val staticHandler = new DefaultServlet
         val holder = new ServletHolder(staticHandler)
         Option(getClass.getClassLoader.getResource(resourceBase)) match {
    ```



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11546746
  
    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingJobProgressListener.scala ---
    @@ -0,0 +1,148 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.streaming.ui
    +
    +import org.apache.spark.streaming.{Time, StreamingContext}
    +import org.apache.spark.streaming.scheduler._
    +import scala.collection.mutable.{Queue, HashMap}
    +import org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
    +import org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
    +import org.apache.spark.streaming.scheduler.BatchInfo
    +import org.apache.spark.streaming.scheduler.ReceiverInfo
    +import org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
    +import org.apache.spark.util.Distribution
    +
    +
    +private[ui] class StreamingJobProgressListener(ssc: StreamingContext) extends StreamingListener {
    +
    +  private val waitingBatchInfos = new HashMap[Time, BatchInfo]
    +  private val runningBatchInfos = new HashMap[Time, BatchInfo]
    +  private val completedaBatchInfos = new Queue[BatchInfo]
    +  private val batchInfoLimit = ssc.conf.getInt("spark.steaming.ui.maxBatches", 100)
    --- End diff --
    
    Done. 
    Steaming ... heh!
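
The diff above reads a `batchInfoLimit` (default 100) next to a queue of completed batches, which suggests the listener retains only the most recent batches. The following is a minimal sketch of that idea under that assumption, not the PR's code: `BoundedHistory` is a hypothetical helper that evicts the oldest entry once the limit is exceeded.

```scala
import scala.collection.mutable

// Hypothetical helper (not part of Spark): keeps only the most recent `limit` items.
final class BoundedHistory[A](limit: Int) {
  require(limit > 0, "limit must be positive")
  private val items = mutable.Queue.empty[A]

  // Append an item, evicting the oldest ones once the limit is exceeded.
  def add(item: A): Unit = {
    items += item
    while (items.size > limit) items.dequeue()
  }

  // Immutable snapshot of the retained items, oldest first.
  def snapshot: List[A] = items.toList
}
```

Bounding the history this way keeps memory use constant for a long-running streaming application, at the cost of statistics that cover only the retained window.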



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11499492
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/WebUI.scala ---
    @@ -17,34 +17,122 @@
     
     package org.apache.spark.ui
     
    -import java.text.SimpleDateFormat
    -import java.util.Date
    +import javax.servlet.http.HttpServletRequest
    +
    +import scala.collection.mutable.ArrayBuffer
    +import scala.xml.Node
    +
    +import org.eclipse.jetty.servlet.ServletContextHandler
    +import org.json4s.JsonAST.{JNothing, JValue}
    +
    +import org.apache.spark.SecurityManager
    +import org.apache.spark.scheduler.SparkListener
    +import org.apache.spark.ui.JettyUtils._
    +import org.apache.spark.util.Utils
     
     /**
    - * Utilities used throughout the web UI.
    + * The top level component of the UI hierarchy that contains the server.
    + *
    + * Each WebUI represents a collection of tabs, each of which in turn represents a collection of
    + * pages. The use of tabs is optional, however; a WebUI may choose to include pages directly.
      */
    -private[spark] object WebUI {
    -  // SimpleDateFormat is not thread-safe. Don't expose it to avoid improper use.
    -  private val dateFormat = new ThreadLocal[SimpleDateFormat]() {
    -    override def initialValue(): SimpleDateFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss")
    -  }
    +private[spark] abstract class WebUI(securityManager: SecurityManager, basePath: String = "") {
    +  protected val tabs = ArrayBuffer[UITab]()
    +  protected val handlers = ArrayBuffer[ServletContextHandler]()
    +  protected var serverInfo: Option[ServerInfo] = None
     
    -  def formatDate(date: Date): String = dateFormat.get.format(date)
    +  def getTabs: Seq[UITab] = tabs.toSeq
    +  def getHandlers: Seq[ServletContextHandler] = handlers.toSeq
    +  def getListeners: Seq[SparkListener] = tabs.flatMap(_.listener)
     
    -  def formatDate(timestamp: Long): String = dateFormat.get.format(new Date(timestamp))
    +  /** Attach a tab to this UI, along with all of its attached pages. */
    +  def attachTab(tab: UITab) {
    +    tab.start()
    +    tab.pages.foreach(attachPage)
    +    tabs += tab
    +  }
     
    -  def formatDuration(milliseconds: Long): String = {
    -    val seconds = milliseconds.toDouble / 1000
    -    if (seconds < 60) {
    -      return "%.0f s".format(seconds)
    +  /** Attach a page to this UI. */
    +  def attachPage(page: UIPage) {
    +    val pagePath = "/" + page.prefix
    +    attachHandler(createServletHandler(pagePath,
    +      (request: HttpServletRequest) => page.render(request), securityManager, basePath))
    +    if (page.includeJson) {
    +      attachHandler(createServletHandler(pagePath.stripSuffix("/") + "/json",
    +        (request: HttpServletRequest) => page.renderJson(request), securityManager, basePath))
         }
    -    val minutes = seconds / 60
    -    if (minutes < 10) {
    -      return "%.1f min".format(minutes)
    -    } else if (minutes < 60) {
    -      return "%.0f min".format(minutes)
    +  }
    +
    +  /** Attach a handler to this UI. */
    +  def attachHandler(handler: ServletContextHandler) {
    +    handlers += handler
    +    serverInfo.foreach { info =>
    +      info.rootHandler.addHandler(handler)
    +      if (!handler.isStarted) {
    +        handler.start()
    +      }
         }
    -    val hours = minutes / 60
    -    return "%.1f h".format(hours)
       }
    +
    +  /** Detach a handler from this UI. */
    +  def detachHandler(handler: ServletContextHandler) {
    +    handlers -= handler
    +    serverInfo.foreach { info =>
    +      info.rootHandler.removeHandler(handler)
    +      if (handler.isStarted) {
    +        handler.stop()
    +      }
    +    }
    +  }
    +
    +  /** Initialize all components of the server. */
    +  def start()
    +
    +  /**
    +   * Bind to the HTTP server behind this web interface.
    +   * Overridden implementation should set serverInfo.
    +   */
    +  def bind()
    +
    +  /** Return the actual port to which this server is bound. Only valid after bind(). */
    +  def boundPort: Int = serverInfo.map(_.boundPort).getOrElse(-1)
    +
    +  /** Stop the server behind this web interface. Only valid after bind(). */
    +  def stop() {
    +    assert(serverInfo.isDefined,
    +      "Attempted to stop %s before binding to a server!".format(Utils.getFormattedClassName(this)))
    +    serverInfo.get.server.stop()
    +  }
    +}
    +
    +
    +/**
    + * A tab that represents a collection of pages and a unit of listening for Spark events.
    + * Associating each tab with a listener is arbitrary and need not be the case.
    + */
    +private[spark] abstract class UITab(val prefix: String) {
    +  val pages = ArrayBuffer[UIPage]()
    +  var listener: Option[SparkListener] = None
    --- End diff --
    
    I added this, but in retrospect SparkListener should really not be associated with the UITab at all. I will put in a PR with this change.



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40270748
  
    @andrewor14 Hey Andrew, the merge with master is a little involved. There is an extra handler being added to the JobProgressUI (which is now JobProgressTab) that does not translate cleanly through the refactoring. So I leave it up to you to do the merging (I don't want to break anything accidentally). Also, a new suite, SparkUISuite, was added, and it's best to merge UISuite and SparkUISuite to prevent further confusion. Can you take care of these?



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40270373
  
    Build finished. All automated tests passed.



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-39514855
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13738/



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11520596
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/WebUI.scala ---
    @@ -17,53 +17,134 @@
     
     package org.apache.spark.ui
     
    -import java.text.SimpleDateFormat
    -import java.util.Date
    +import javax.servlet.http.HttpServletRequest
     
    -private[spark] abstract class WebUI(name: String) {
    +import scala.collection.mutable.ArrayBuffer
    +import scala.xml.Node
    +
    +import org.eclipse.jetty.servlet.ServletContextHandler
    +import org.json4s.JsonAST.{JNothing, JValue}
    +
    +import org.apache.spark.{Logging, SecurityManager, SparkConf}
    +import org.apache.spark.ui.JettyUtils._
    +import org.apache.spark.util.Utils
    +
    +/**
    + * The top level component of the UI hierarchy that contains the server.
    + *
    + * Each WebUI represents a collection of tabs, each of which in turn represents a collection of
    + * pages. The use of tabs is optional, however; a WebUI may choose to include pages directly.
    + */
    +private[spark] abstract class WebUI(
    +    securityManager: SecurityManager,
    +    port: Int,
    +    conf: SparkConf,
    +    basePath: String = "")
    +  extends Logging {
    +
    +  protected val tabs = ArrayBuffer[WebUITab]()
    +  protected val handlers = ArrayBuffer[ServletContextHandler]()
       protected var serverInfo: Option[ServerInfo] = None
    +  protected val localHostName = Utils.localHostName()
    +  protected val publicHostName = Option(System.getenv("SPARK_PUBLIC_DNS")).getOrElse(localHostName)
    +  private val className = Utils.getFormattedClassName(this)
    +
    +  def getTabs: Seq[WebUITab] = tabs.toSeq
    +  def getHandlers: Seq[ServletContextHandler] = handlers.toSeq
     
    -  /**
    -   * Bind to the HTTP server behind this web interface.
    -   * Overridden implementation should set serverInfo.
    -   */
    -  def bind() { }
    +  /** Attach a tab to this UI, along with all of its attached pages. */
    +  def attachTab(tab: WebUITab) {
    +    tab.pages.foreach(attachPage)
    +    tabs += tab
    +  }
    +
    +  /** Attach a page to this UI. */
    +  def attachPage(page: WebUIPage) {
    +    val pagePath = "/" + page.prefix
    +    attachHandler(createServletHandler(pagePath,
    +      (request: HttpServletRequest) => page.render(request), securityManager, basePath))
    +    if (page.includeJson) {
    +      attachHandler(createServletHandler(pagePath.stripSuffix("/") + "/json",
    +        (request: HttpServletRequest) => page.renderJson(request), securityManager, basePath))
    +    }
    +  }
    +
    +  /** Attach a handler to this UI. */
    +  def attachHandler(handler: ServletContextHandler) {
    +    handlers += handler
    +    serverInfo.foreach { info =>
    +      info.rootHandler.addHandler(handler)
    +      if (!handler.isStarted) {
    +        handler.start()
    +      }
    +    }
    +  }
    +
    +  /** Detach a handler from this UI. */
    +  def detachHandler(handler: ServletContextHandler) {
    +    handlers -= handler
    +    serverInfo.foreach { info =>
    +      info.rootHandler.removeHandler(handler)
    +      if (handler.isStarted) {
    +        handler.stop()
    +      }
    +    }
    +  }
    +
    +  /** Initialize all components of the server. */
    +  def initialize()
    +
    +  /** Bind to the HTTP server behind this web interface. */
    +  def bind() {
    +    assert(!serverInfo.isDefined, "Attempted to bind %s more than once!".format(className))
    +    try {
    +      serverInfo = Some(startJettyServer("0.0.0.0", port, handlers, conf))
    +      logInfo("Started %s at http://%s:%d".format(className, publicHostName, boundPort))
    +    } catch {
    +      case e: Exception =>
    +        logError("Failed to bind %s".format(className), e)
    +        System.exit(1)
    +    }
    +  }
     
       /** Return the actual port to which this server is bound. Only valid after bind(). */
       def boundPort: Int = serverInfo.map(_.boundPort).getOrElse(-1)
     
       /** Stop the server behind this web interface. Only valid after bind(). */
       def stop() {
    -    assert(serverInfo.isDefined, "Attempted to stop %s before binding to a server!".format(name))
    +    assert(serverInfo.isDefined,
    +      "Attempted to stop %s before binding to a server!".format(className))
         serverInfo.get.server.stop()
       }
     }
     
    +
     /**
    - * Utilities used throughout the web UI.
    + * A tab that represents a collection of pages.
      */
    -private[spark] object WebUI {
    -  // SimpleDateFormat is not thread-safe. Don't expose it to avoid improper use.
    -  private val dateFormat = new ThreadLocal[SimpleDateFormat]() {
    -    override def initialValue(): SimpleDateFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss")
    +private[spark] abstract class WebUITab(parent: WebUI, val prefix: String) {
    +  val pages = ArrayBuffer[WebUIPage]()
    +  val name = prefix.capitalize
    +
    +  /** Attach a page to this tab. This prepends the page's prefix with the tab's own prefix. */
    +  def attachPage(page: WebUIPage) {
    +    page.prefix = (prefix + "/" + page.prefix).stripSuffix("/")
    +    pages += page
       }
     
    -  def formatDate(date: Date): String = dateFormat.get.format(date)
    +  /** Get a list of header tabs from the parent UI. */
    +  def headerTabs: Seq[WebUITab] = parent.getTabs
    +}
     
    -  def formatDate(timestamp: Long): String = dateFormat.get.format(new Date(timestamp))
     
    -  def formatDuration(milliseconds: Long): String = {
    -    val seconds = milliseconds.toDouble / 1000
    -    if (seconds < 60) {
    -      return "%.0f s".format(seconds)
    -    }
    -    val minutes = seconds / 60
    -    if (minutes < 10) {
    -      return "%.1f min".format(minutes)
    -    } else if (minutes < 60) {
    -      return "%.0f min".format(minutes)
    -    }
    -    val hours = minutes / 60
    -    "%.1f h".format(hours)
    -  }
    +/**
    + * A page that represents the leaf node in the UI hierarchy.
    + *
    + * The direct parent of a WebUIPage is not specified as it can be either a WebUI or a WebUITab.
    + * If includeJson is true, the parent WebUI (direct or indirect) creates handlers for both the
    + * HTML and the JSON content, rather than just the former.
    + */
    --- End diff --
    
    I was thinking of prefix more in the network-prefix sense. The hierarchy is analogous in that the parent prefix (e.g. `/stage`) contains the child prefix (e.g. `/stage/stages`), so if a request comes in we do a longest-prefix match (LPM) on the URL to decide which page to serve.
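
Longest-prefix matching (the "LPM" mentioned above) over URL paths can be sketched as follows. This is an illustration only, not the PR's routing code: `PrefixRouter` and `route` are hypothetical names, and the sketch assumes prefixes are matched on whole path segments.

```scala
object PrefixRouter {
  // True when `prefix` matches `path` on whole path segments,
  // so "/stage" matches "/stage/stages" but not "/stagecoach".
  private def matches(prefix: String, path: String): Boolean =
    path == prefix || path.startsWith(prefix.stripSuffix("/") + "/")

  // Return the longest registered prefix that matches the request path, if any.
  def route(prefixes: Seq[String], path: String): Option[String] = {
    val candidates = prefixes.filter(p => matches(p, path))
    if (candidates.isEmpty) None else Some(candidates.maxBy(_.length))
  }
}
```

With prefixes `/`, `/stage`, and `/stage/stages` registered, a request for `/stage/stages/json` resolves to `/stage/stages`, while `/stage/other` falls back to the parent prefix `/stage`.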



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40273163
  
    Merged build finished. All automated tests passed.



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40174456
  
    @tdas there are some binary compatibility checks here that need to be fixed.



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40156532
  
    Build started. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40152499
  
    Merged build started. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40160473
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14028/



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11517950
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/WebUI.scala ---
    @@ -17,53 +17,134 @@
     
     package org.apache.spark.ui
     
    -import java.text.SimpleDateFormat
    -import java.util.Date
    +import javax.servlet.http.HttpServletRequest
     
    -private[spark] abstract class WebUI(name: String) {
    +import scala.collection.mutable.ArrayBuffer
    +import scala.xml.Node
    +
    +import org.eclipse.jetty.servlet.ServletContextHandler
    +import org.json4s.JsonAST.{JNothing, JValue}
    +
    +import org.apache.spark.{Logging, SecurityManager, SparkConf}
    +import org.apache.spark.ui.JettyUtils._
    +import org.apache.spark.util.Utils
    +
    +/**
    + * The top level component of the UI hierarchy that contains the server.
    + *
    + * Each WebUI represents a collection of tabs, each of which in turn represents a collection of
    + * pages. The use of tabs is optional, however; a WebUI may choose to include pages directly.
    + */
    +private[spark] abstract class WebUI(
    +    securityManager: SecurityManager,
    +    port: Int,
    +    conf: SparkConf,
    +    basePath: String = "")
    +  extends Logging {
    +
    +  protected val tabs = ArrayBuffer[WebUITab]()
    +  protected val handlers = ArrayBuffer[ServletContextHandler]()
       protected var serverInfo: Option[ServerInfo] = None
    +  protected val localHostName = Utils.localHostName()
    +  protected val publicHostName = Option(System.getenv("SPARK_PUBLIC_DNS")).getOrElse(localHostName)
    +  private val className = Utils.getFormattedClassName(this)
    +
    +  def getTabs: Seq[WebUITab] = tabs.toSeq
    +  def getHandlers: Seq[ServletContextHandler] = handlers.toSeq
     
    -  /**
    -   * Bind to the HTTP server behind this web interface.
    -   * Overridden implementation should set serverInfo.
    -   */
    -  def bind() { }
    +  /** Attach a tab to this UI, along with all of its attached pages. */
    +  def attachTab(tab: WebUITab) {
    +    tab.pages.foreach(attachPage)
    +    tabs += tab
    +  }
    +
    +  /** Attach a page to this UI. */
    +  def attachPage(page: WebUIPage) {
    +    val pagePath = "/" + page.prefix
    +    attachHandler(createServletHandler(pagePath,
    +      (request: HttpServletRequest) => page.render(request), securityManager, basePath))
    +    if (page.includeJson) {
    +      attachHandler(createServletHandler(pagePath.stripSuffix("/") + "/json",
    +        (request: HttpServletRequest) => page.renderJson(request), securityManager, basePath))
    +    }
    +  }
    +
    +  /** Attach a handler to this UI. */
    +  def attachHandler(handler: ServletContextHandler) {
    +    handlers += handler
    +    serverInfo.foreach { info =>
    +      info.rootHandler.addHandler(handler)
    +      if (!handler.isStarted) {
    +        handler.start()
    +      }
    +    }
    +  }
    +
    +  /** Detach a handler from this UI. */
    +  def detachHandler(handler: ServletContextHandler) {
    +    handlers -= handler
    +    serverInfo.foreach { info =>
    +      info.rootHandler.removeHandler(handler)
    +      if (handler.isStarted) {
    +        handler.stop()
    +      }
    +    }
    +  }
    +
    +  /** Initialize all components of the server. */
    +  def initialize()
    +
    +  /** Bind to the HTTP server behind this web interface. */
    +  def bind() {
    +    assert(!serverInfo.isDefined, "Attempted to bind %s more than once!".format(className))
    +    try {
    +      serverInfo = Some(startJettyServer("0.0.0.0", port, handlers, conf))
    +      logInfo("Started %s at http://%s:%d".format(className, publicHostName, boundPort))
    +    } catch {
    +      case e: Exception =>
    +        logError("Failed to bind %s".format(className), e)
    +        System.exit(1)
    +    }
    +  }
     
       /** Return the actual port to which this server is bound. Only valid after bind(). */
       def boundPort: Int = serverInfo.map(_.boundPort).getOrElse(-1)
     
       /** Stop the server behind this web interface. Only valid after bind(). */
       def stop() {
    -    assert(serverInfo.isDefined, "Attempted to stop %s before binding to a server!".format(name))
    +    assert(serverInfo.isDefined,
    +      "Attempted to stop %s before binding to a server!".format(className))
         serverInfo.get.server.stop()
       }
     }
     
    +
     /**
    - * Utilities used throughout the web UI.
    + * A tab that represents a collection of pages.
      */
    -private[spark] object WebUI {
    -  // SimpleDateFormat is not thread-safe. Don't expose it to avoid improper use.
    -  private val dateFormat = new ThreadLocal[SimpleDateFormat]() {
    -    override def initialValue(): SimpleDateFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss")
    +private[spark] abstract class WebUITab(parent: WebUI, val prefix: String) {
    +  val pages = ArrayBuffer[WebUIPage]()
    +  val name = prefix.capitalize
    +
    +  /** Attach a page to this tab. This prepends the page's prefix with the tab's own prefix. */
    +  def attachPage(page: WebUIPage) {
    +    page.prefix = (prefix + "/" + page.prefix).stripSuffix("/")
    +    pages += page
       }
     
    -  def formatDate(date: Date): String = dateFormat.get.format(date)
    +  /** Get a list of header tabs from the parent UI. */
    +  def headerTabs: Seq[WebUITab] = parent.getTabs
    +}
     
    -  def formatDate(timestamp: Long): String = dateFormat.get.format(new Date(timestamp))
     
    -  def formatDuration(milliseconds: Long): String = {
    -    val seconds = milliseconds.toDouble / 1000
    -    if (seconds < 60) {
    -      return "%.0f s".format(seconds)
    -    }
    -    val minutes = seconds / 60
    -    if (minutes < 10) {
    -      return "%.1f min".format(minutes)
    -    } else if (minutes < 60) {
    -      return "%.0f min".format(minutes)
    -    }
    -    val hours = minutes / 60
    -    "%.1f h".format(hours)
    -  }
    +/**
    + * A page that represents the leaf node in the UI hierarchy.
    + *
    + * The direct parent of a WebUIPage is not specified as it can be either a WebUI or a WebUITab.
    + * If includeJson is true, the parent WebUI (direct or indirect) creates handlers for both the
    + * HTML and the JSON content, rather than just the former.
    + */
    +private[spark] abstract class WebUIPage(var prefix: String, val includeJson: Boolean = false) {
    +  def render(request: HttpServletRequest): Seq[Node] = Seq[Node]()
    --- End diff --
    
    Is it possible to have a valid `WebUIPage` that doesn't override `render`? Rather than having a default implementation that returns an empty XML sequence, what about just leaving it abstract? That way, anyone who extends `WebUIPage` is forced to implement `render`. (Maybe there is a use case I haven't seen yet...)
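
A minimal sketch of the suggestion above, assuming simplified stand-ins: `Request`, `Html`, `Page`, and `HelloPage` are hypothetical names used only for illustration, where the real `WebUIPage.render` takes an `HttpServletRequest` and returns `Seq[Node]`. It shows that a method declared without a body must be implemented by every concrete subclass.

```scala
object AbstractRenderSketch {
  // Hypothetical stand-ins for HttpServletRequest and Seq[Node].
  final case class Request(path: String)
  type Html = String

  abstract class Page(val prefix: String) {
    // No default body: a concrete subclass that omits render does not compile.
    def render(request: Request): Html
  }

  // A concrete page has to supply its own render.
  class HelloPage extends Page("hello") {
    override def render(request: Request): Html =
      s"<p>Hello from ${request.path}</p>"
  }
}
```

With a default body returning an empty sequence, a subclass that forgets `render` would still compile and serve a blank page; leaving the method abstract turns that mistake into a compile error.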



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40274906
  
    Thanks. You and @tdas too.



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11518377
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
    @@ -213,6 +192,18 @@ class HistoryServer(
         fileSystem.close()
       }
     
    +  /** Attach a reconstructed UI to this server. Only valid after bind(). */
    +  private def attachUI(ui: SparkUI) {
    --- End diff --
    
    SparkAppUI is confusing, since there's already a `SparkApp`. We can call this `attachSparkUI` to be more precise. I think `SparkUI` is a good name, because it's like *the* UI for Spark.



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11498612
  
    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingPage.scala ---
    @@ -0,0 +1,289 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.streaming.ui
    +
    +import java.util.{Calendar, Locale}
    +import javax.servlet.http.HttpServletRequest
    +
    +import scala.xml.Node
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.ui._
    +import org.apache.spark.util.Distribution
    +
    +/** Page for Spark Web UI that shows statistics of a streaming job */
    +private[ui] class StreamingPage(parent: StreamingTab)
    +  extends UIPage("") with Logging {
    +
    +  private val ssc = parent.ssc
    +  private val sc = ssc.sparkContext
    +  private val sparkUI = sc.ui
    +  private val listener = new StreamingProgressListener(ssc)
    +  private val calendar = Calendar.getInstance()
    +  private val startTime = calendar.getTime()
    +  private val emptyCellTest = "-"
    +
    +  ssc.addStreamingListener(listener)
    +  parent.attachPage(this)
    +
    +  /** Render the page */
    +  override def render(request: HttpServletRequest): Seq[Node] = {
    +    val content =
    +      generateBasicStats() ++
    +      <br></br><h4>Statistics over last {listener.completedBatches.size} processed batches</h4> ++
    +      generateNetworkStatsTable() ++
    +      generateBatchStatsTable()
    +    UIUtils.headerSparkPage(
    +      content, sparkUI.basePath, sc.appName, "Streaming", sparkUI.getTabs, parent, Some(5000))
    +  }
    +
    +  /** Generate basic stats of the streaming program */
    +  private def generateBasicStats(): Seq[Node] = {
    +    val timeSinceStart = System.currentTimeMillis() - startTime.getTime
    +    <ul class ="unstyled">
    +      <li>
    +        <strong>Started at: </strong> {startTime.toString}
    +      </li>
    +      <li>
    +        <strong>Time since start: </strong>{msDurationToString(timeSinceStart)}
    +      </li>
    +      <li>
    +        <strong>Network receivers: </strong>{listener.numNetworkReceivers}
    +      </li>
    +      <li>
    +        <strong>Batch interval: </strong>{msDurationToString(listener.batchDuration)}
    +      </li>
    +      <li>
    +        <strong>Processed batches: </strong>{listener.numTotalCompletedBatches}
    +      </li>
    +      <li>
    +        <strong>Waiting batches: </strong>{listener.numUnprocessedBatches}
    +      </li>
    +    </ul>
    +  }
    +
    +  /** Generate stats of data received over the network the streaming program */
    +  private def generateNetworkStatsTable(): Seq[Node] = {
    +    val receivedRecordDistributions = listener.receivedRecordsDistributions
    +    val lastBatchReceivedRecord = listener.lastReceivedBatchRecords
    +    val table = if (receivedRecordDistributions.size > 0) {
    +      val headerRow = Seq(
    +        "Receiver",
    +        "Location",
    +        s"Records in last batch",
    +        "Minimum rate\n[records/sec]",
    +        "25th percentile rate\n[records/sec]",
    +        "Median rate\n[records/sec]",
    +        "75th percentile rate\n[records/sec]",
    +        "Maximum rate\n[records/sec]"
    +      )
    +      val dataRows = (0 until listener.numNetworkReceivers).map { receiverId =>
    +        val receiverInfo = listener.receiverInfo(receiverId)
    +        val receiverName = receiverInfo.map(_.toString).getOrElse(s"Receiver-$receiverId")
    +        val receiverLocation = receiverInfo.map(_.location).getOrElse(emptyCellTest)
    +        val receiverLastBatchRecords = numberToString(lastBatchReceivedRecord(receiverId))
    +        val receivedRecordStats = receivedRecordDistributions(receiverId).map { d =>
    +          d.getQuantiles().map(r => numberToString(r.toLong))
    +        }.getOrElse {
    +          Seq(emptyCellTest, emptyCellTest, emptyCellTest, emptyCellTest, emptyCellTest)
    +        }
    +        Seq(receiverName, receiverLocation, receiverLastBatchRecords) ++
    +          receivedRecordStats
    +      }
    +      Some(listingTable(headerRow, dataRows, fixedWidth = true))
    +    } else {
    +      None
    +    }
    +
    +    val content =
    +      <h5>Network Input Statistics</h5> ++
    +      <div>{table.getOrElse("No network receivers")}</div>
    +
    +    content
    +  }
    +
    +  /** Generate stats of batch jobs of the streaming program */
    +  private def generateBatchStatsTable(): Seq[Node] = {
    +    val numBatches = listener.completedBatches.size
    +    val lastCompletedBatch = listener.lastCompletedBatch
    +    val table = if (numBatches > 0) {
    +      val processingDelayQuantilesRow = {
    +        Seq(
    +          "Processing Time",
    +          msDurationToString(lastCompletedBatch.flatMap(_.processingDelay))
    +        ) ++ getQuantiles(listener.processingDelayDistribution)
    +      }
    +      val schedulingDelayQuantilesRow = {
    +        Seq(
    +          "Scheduling Delay",
    +          msDurationToString(lastCompletedBatch.flatMap(_.schedulingDelay))
    +        ) ++ getQuantiles(listener.schedulingDelayDistribution)
    +      }
    +      val totalDelayQuantilesRow = {
    +        Seq(
    +          "Total Delay",
    +          msDurationToString(lastCompletedBatch.flatMap(_.totalDelay))
    +        ) ++ getQuantiles(listener.totalDelayDistribution)
    +      }
    +      val headerRow = Seq("Metric", "Last batch", "Minimum", "25th percentile",
    +        "Median", "75th percentile", "Maximum")
    +      val dataRows: Seq[Seq[String]] = Seq(
    +        processingDelayQuantilesRow,
    +        schedulingDelayQuantilesRow,
    +        totalDelayQuantilesRow
    +      )
    +      Some(listingTable(headerRow, dataRows, fixedWidth = true))
    +    } else {
    +      None
    +    }
    +
    +    val content =
    +      <h5>Batch Processing Statistics</h5> ++
    +      <div>
    +        <ul class="unstyled">
    +          {table.getOrElse("No statistics have been generated yet.")}
    +        </ul>
    +      </div>
    +
    +    content
    +  }
    +
    +  /**
    +   * Returns a human-readable string representing a number
    +   */
    +  private def numberToString(records: Double): String = {
    +    val trillion = 1e12
    +    val billion = 1e9
    +    val million = 1e6
    +    val thousand = 1e3
    +
    +    val (value, unit) = {
    +      if (records >= 2*trillion) {
    +        (records / trillion, " T")
    +      } else if (records >= 2*billion) {
    +        (records / billion, " B")
    +      } else if (records >= 2*million) {
    +        (records / million, " M")
    +      } else if (records >= 2*thousand) {
    +        (records / thousand, " K")
    +      } else {
    +        (records, "")
    +      }
    +    }
    +    "%.1f%s".formatLocal(Locale.US, value, unit)
    +  }
    +
    +  /**
    +   * Returns a human-readable string representing a duration such as "5 second 35 ms"
    +   */
    +  private def msDurationToString(ms: Long): String = {
    +    try {
    +      val second = 1000L
    +      val minute = 60 * second
    +      val hour = 60 * minute
    +      val day = 24 * hour
    +      val week = 7 * day
    +      val year = 365 * day
    +
    +      def toString(num: Long, unit: String): String = {
    +        if (num == 0) {
    +          ""
    +        } else if (num == 1) {
    +          s"$num $unit"
    +        } else {
    +          s"$num ${unit}s"
    +        }
    +      }
    +
    +      val millisecondsString = if (ms >= second && ms % second == 0) "" else s"${ms % second} ms"
    +      val secondString = toString((ms % minute) / second, "second")
    +      val minuteString = toString((ms % hour) / minute, "minute")
    +      val hourString = toString((ms % day) / hour, "hour")
    +      val dayString = toString((ms % week) / day, "day")
    +      val weekString = toString((ms % year) / week, "week")
    +      val yearString = toString(ms / year, "year")
    +
    +      Seq(
    +        second -> millisecondsString,
    +        minute -> s"$secondString $millisecondsString",
    +        hour -> s"$minuteString $secondString",
    +        day -> s"$hourString $minuteString $secondString",
    +        week -> s"$dayString $hourString $minuteString",
    +        year -> s"$weekString $dayString $hourString"
    +      ).foreach {
    +        case (durationLimit, durationString) if (ms < durationLimit) =>
    +          return durationString
    +        case e: Any => // matcherror is thrown without this
    --- End diff --
    
    A `MatchError` is thrown only because you placed the `if` guard on the `case` itself, so any tuple that fails the guard matches no case at all. I think a better way to write this is
    
    ```
    .foreach { case (durationLimit, durationString) =>
      if (ms < durationLimit) {
        return durationString
      }
    }
    ```
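
The difference between the two forms can be sketched in isolation. This is an illustrative example under stated assumptions, not the PR's code: `GuardedMatchSketch`, `limits`, and the labels are hypothetical stand-ins for the `(durationLimit, durationString)` pairs in `msDurationToString`.

```scala
object GuardedMatchSketch {
  // Hypothetical (limit -> label) pairs standing in for the duration table.
  val limits: Seq[(Long, String)] =
    Seq(1000L -> "under a second", 60000L -> "under a minute")

  // Guard on the case: a tuple that fails the guard matches no case,
  // so the pattern-matching function throws scala.MatchError.
  def withGuard(ms: Long): String = {
    limits.foreach {
      case (limit, label) if ms < limit => return label
    }
    "a minute or more"
  }

  // Guard moved into the body: every tuple matches the pattern,
  // so no catch-all case is needed.
  def withIf(ms: Long): String = {
    limits.foreach { case (limit, label) =>
      if (ms < limit) {
        return label
      }
    }
    "a minute or more"
  }

  // An alternative (not from the review) that avoids the non-local return.
  def withFind(ms: Long): String =
    limits.find { case (limit, _) => ms < limit }
      .map(_._2)
      .getOrElse("a minute or more")
}
```

Calling `withGuard` with a value that fails the first guard raises a `MatchError`, while `withIf` and `withFind` simply move on to the next pair.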



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40247441
  
    Build finished. All automated tests passed.



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40117894
  
     Build triggered. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40153239
  
    Merged build finished. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-39514854
  
    Merged build finished. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40155919
  
    Jenkins, test this please.



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40242857
  
     Build triggered. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40134383
  
    Build finished. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40166583
  
    Build finished. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11545501
  
    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingJobProgressListener.scala ---
    @@ -0,0 +1,148 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.streaming.ui
    +
    +import org.apache.spark.streaming.{Time, StreamingContext}
    +import org.apache.spark.streaming.scheduler._
    +import scala.collection.mutable.{Queue, HashMap}
    +import org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
    +import org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
    +import org.apache.spark.streaming.scheduler.BatchInfo
    +import org.apache.spark.streaming.scheduler.ReceiverInfo
    +import org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
    +import org.apache.spark.util.Distribution
    +
    +
    +private[ui] class StreamingJobProgressListener(ssc: StreamingContext) extends StreamingListener {
    +
    +  private val waitingBatchInfos = new HashMap[Time, BatchInfo]
    +  private val runningBatchInfos = new HashMap[Time, BatchInfo]
    +  private val completedaBatchInfos = new Queue[BatchInfo]
    +  private val batchInfoLimit = ssc.conf.getInt("spark.steaming.ui.maxBatches", 100)
    --- End diff --
    
    (a) Typo: this says `spark.steaming` :) (b) To be consistent with the similar configs in other UI settings, it would be good to call this `spark.streaming.ui.retainedBatches`.
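
Why the typo matters can be sketched as follows. This is a hedged illustration, not Spark's implementation: `getInt` here is a hypothetical helper over a plain `Map` that mimics a get-with-default lookup, and `BoundedHistory` is a hypothetical container showing how a retention limit might be applied to completed batches.

```scala
import scala.collection.mutable

object RetainedBatchesSketch {
  // Hypothetical stand-in for a config lookup with a default value.
  def getInt(conf: Map[String, String], key: String, default: Int): Int =
    conf.get(key).map(_.toInt).getOrElse(default)

  // Keeps only the most recent `limit` items, dropping the oldest first.
  class BoundedHistory[A](limit: Int) {
    private val items = mutable.Queue.empty[A]

    def add(item: A): Unit = {
      items.enqueue(item)
      while (items.size > limit) {
        items.dequeue()
      }
    }

    def toSeq: Seq[A] = items.toList
  }
}
```

If a user sets `spark.streaming.ui.retainedBatches` but the code reads a misspelled key, the lookup silently falls back to the default, so the setting appears to have no effect.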



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40153240
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14022/



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40050337
  
    Merged build finished. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-39275599
  
    Merged build finished. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40045592
  
     Merged build triggered. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40242487
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14057/



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11518271
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/WebUI.scala ---
    @@ -17,53 +17,134 @@
     
     package org.apache.spark.ui
     
    -import java.text.SimpleDateFormat
    -import java.util.Date
    +import javax.servlet.http.HttpServletRequest
     
    -private[spark] abstract class WebUI(name: String) {
    +import scala.collection.mutable.ArrayBuffer
    +import scala.xml.Node
    +
    +import org.eclipse.jetty.servlet.ServletContextHandler
    +import org.json4s.JsonAST.{JNothing, JValue}
    +
    +import org.apache.spark.{Logging, SecurityManager, SparkConf}
    +import org.apache.spark.ui.JettyUtils._
    +import org.apache.spark.util.Utils
    +
    +/**
    + * The top level component of the UI hierarchy that contains the server.
    + *
    + * Each WebUI represents a collection of tabs, each of which in turn represents a collection of
    + * pages. The use of tabs is optional, however; a WebUI may choose to include pages directly.
    + */
    +private[spark] abstract class WebUI(
    +    securityManager: SecurityManager,
    +    port: Int,
    +    conf: SparkConf,
    +    basePath: String = "")
    +  extends Logging {
    +
    +  protected val tabs = ArrayBuffer[WebUITab]()
    +  protected val handlers = ArrayBuffer[ServletContextHandler]()
       protected var serverInfo: Option[ServerInfo] = None
    +  protected val localHostName = Utils.localHostName()
    +  protected val publicHostName = Option(System.getenv("SPARK_PUBLIC_DNS")).getOrElse(localHostName)
    +  private val className = Utils.getFormattedClassName(this)
    +
    +  def getTabs: Seq[WebUITab] = tabs.toSeq
    +  def getHandlers: Seq[ServletContextHandler] = handlers.toSeq
     
    -  /**
    -   * Bind to the HTTP server behind this web interface.
    -   * Overridden implementation should set serverInfo.
    -   */
    -  def bind() { }
    +  /** Attach a tab to this UI, along with all of its attached pages. */
    +  def attachTab(tab: WebUITab) {
    +    tab.pages.foreach(attachPage)
    +    tabs += tab
    +  }
    +
    +  /** Attach a page to this UI. */
    +  def attachPage(page: WebUIPage) {
    +    val pagePath = "/" + page.prefix
    +    attachHandler(createServletHandler(pagePath,
    +      (request: HttpServletRequest) => page.render(request), securityManager, basePath))
    +    if (page.includeJson) {
    +      attachHandler(createServletHandler(pagePath.stripSuffix("/") + "/json",
    +        (request: HttpServletRequest) => page.renderJson(request), securityManager, basePath))
    +    }
    +  }
    +
    +  /** Attach a handler to this UI. */
    +  def attachHandler(handler: ServletContextHandler) {
    +    handlers += handler
    +    serverInfo.foreach { info =>
    +      info.rootHandler.addHandler(handler)
    +      if (!handler.isStarted) {
    +        handler.start()
    +      }
    +    }
    +  }
    +
    +  /** Detach a handler from this UI. */
    +  def detachHandler(handler: ServletContextHandler) {
    +    handlers -= handler
    +    serverInfo.foreach { info =>
    +      info.rootHandler.removeHandler(handler)
    +      if (handler.isStarted) {
    +        handler.stop()
    +      }
    +    }
    +  }
    +
    +  /** Initialize all components of the server. */
    +  def initialize()
    +
    +  /** Bind to the HTTP server behind this web interface. */
    +  def bind() {
    +    assert(!serverInfo.isDefined, "Attempted to bind %s more than once!".format(className))
    +    try {
    +      serverInfo = Some(startJettyServer("0.0.0.0", port, handlers, conf))
    +      logInfo("Started %s at http://%s:%d".format(className, publicHostName, boundPort))
    +    } catch {
    +      case e: Exception =>
    +        logError("Failed to bind %s".format(className), e)
    +        System.exit(1)
    +    }
    +  }
     
       /** Return the actual port to which this server is bound. Only valid after bind(). */
       def boundPort: Int = serverInfo.map(_.boundPort).getOrElse(-1)
     
       /** Stop the server behind this web interface. Only valid after bind(). */
       def stop() {
    -    assert(serverInfo.isDefined, "Attempted to stop %s before binding to a server!".format(name))
    +    assert(serverInfo.isDefined,
    +      "Attempted to stop %s before binding to a server!".format(className))
         serverInfo.get.server.stop()
       }
     }
     
    +
     /**
    - * Utilities used throughout the web UI.
    + * A tab that represents a collection of pages.
      */
    -private[spark] object WebUI {
    -  // SimpleDateFormat is not thread-safe. Don't expose it to avoid improper use.
    -  private val dateFormat = new ThreadLocal[SimpleDateFormat]() {
    -    override def initialValue(): SimpleDateFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss")
    +private[spark] abstract class WebUITab(parent: WebUI, val prefix: String) {
    +  val pages = ArrayBuffer[WebUIPage]()
    +  val name = prefix.capitalize
    +
    +  /** Attach a page to this tab. This prepends the page's prefix with the tab's own prefix. */
    +  def attachPage(page: WebUIPage) {
    +    page.prefix = (prefix + "/" + page.prefix).stripSuffix("/")
    +    pages += page
       }
     
    -  def formatDate(date: Date): String = dateFormat.get.format(date)
    +  /** Get a list of header tabs from the parent UI. */
    +  def headerTabs: Seq[WebUITab] = parent.getTabs
    +}
     
    -  def formatDate(timestamp: Long): String = dateFormat.get.format(new Date(timestamp))
     
    -  def formatDuration(milliseconds: Long): String = {
    -    val seconds = milliseconds.toDouble / 1000
    -    if (seconds < 60) {
    -      return "%.0f s".format(seconds)
    -    }
    -    val minutes = seconds / 60
    -    if (minutes < 10) {
    -      return "%.1f min".format(minutes)
    -    } else if (minutes < 60) {
    -      return "%.0f min".format(minutes)
    -    }
    -    val hours = minutes / 60
    -    "%.1f h".format(hours)
    -  }
    +/**
    + * A page that represents the leaf node in the UI hierarchy.
    + *
    + * The direct parent of a WebUIPage is not specified as it can be either a WebUI or a WebUITab.
    + * If includeJson is true, the parent WebUI (direct or indirect) creates handlers for both the
    + * HTML and the JSON content, rather than just the former.
    + */
    +private[spark] abstract class WebUIPage(var prefix: String, val includeJson: Boolean = false) {
    --- End diff --
    
    Sounds good. I don't like the extra boolean flag either.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40045150
  
     Merged build triggered. 


[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40242398
  
     Build triggered. 


[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40168263
  
     Build triggered. 


[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40045153
  
    Merged build started. 


[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-39275458
  
    Merged build started. 


[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-39275452
  
     Merged build triggered. 


[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40166505
  
     Build triggered. 


[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11545457
  
    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingPage.scala ---
    @@ -0,0 +1,180 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.streaming.ui
    +
    +import java.util.Calendar
    +import javax.servlet.http.HttpServletRequest
    +
    +import scala.xml.Node
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.ui._
    +import org.apache.spark.ui.UIUtils._
    +import org.apache.spark.util.Distribution
    +
    +/** Page for Spark Web UI that shows statistics of a streaming job */
    +private[ui] class StreamingPage(parent: StreamingTab)
    +  extends WebUIPage("") with Logging {
    +
    +  private val listener = parent.listener
    +  private val startTime = Calendar.getInstance().getTime()
    +  private val emptyCellTest = "-"
    +
    +  /** Render the page */
    +  override def render(request: HttpServletRequest): Seq[Node] = {
    +    val content =
    +      generateBasicStats() ++
    +      <br></br><h4>Statistics over last {listener.completedBatches.size} processed batches</h4> ++
    +      generateNetworkStatsTable() ++
    +      generateBatchStatsTable()
    +    UIUtils.headerSparkPage(
    +      content, parent.basePath, parent.appName, "Streaming", parent.headerTabs, parent, Some(5000))
    +  }
    +
    +  /** Generate basic stats of the streaming program */
    +  private def generateBasicStats(): Seq[Node] = {
    +    val timeSinceStart = System.currentTimeMillis() - startTime.getTime
    +    <ul class ="unstyled">
    +      <li>
    +        <strong>Started at: </strong> {startTime.toString}
    +      </li>
    +      <li>
    +        <strong>Time since start: </strong>{formatDurationVerbose(timeSinceStart)}
    +      </li>
    +      <li>
    +        <strong>Network receivers: </strong>{listener.numNetworkReceivers}
    +      </li>
    +      <li>
    +        <strong>Batch interval: </strong>{formatDurationVerbose(listener.batchDuration)}
    +      </li>
    +      <li>
    +        <strong>Processed batches: </strong>{listener.numTotalCompletedBatches}
    +      </li>
    +      <li>
    +        <strong>Waiting batches: </strong>{listener.numUnprocessedBatches}
    +      </li>
    +    </ul>
    +  }
    +
    +  /** Generate stats of data received over the network the streaming program */
    +  private def generateNetworkStatsTable(): Seq[Node] = {
    +    val receivedRecordDistributions = listener.receivedRecordsDistributions
    +    val lastBatchReceivedRecord = listener.lastReceivedBatchRecords
    +    val table = if (receivedRecordDistributions.size > 0) {
    +      val headerRow = Seq(
    +        "Receiver",
    +        "Location",
    +        "Records in last batch\n[" + formatDate(Calendar.getInstance().getTime()) + "]",
    +        "Minimum rate\n[records/sec]",
    +        "25th percentile rate\n[records/sec]",
    +        "Median rate\n[records/sec]",
    +        "75th percentile rate\n[records/sec]",
    +        "Maximum rate\n[records/sec]"
    +      )
    +      val dataRows = (0 until listener.numNetworkReceivers).map { receiverId =>
    +        val receiverInfo = listener.receiverInfo(receiverId)
    +        val receiverName = receiverInfo.map(_.toString).getOrElse(s"Receiver-$receiverId")
    +        val receiverLocation = receiverInfo.map(_.location).getOrElse(emptyCellTest)
    +        val receiverLastBatchRecords = formatDurationVerbose(lastBatchReceivedRecord(receiverId))
    +        val receivedRecordStats = receivedRecordDistributions(receiverId).map { d =>
    +          d.getQuantiles().map(r => formatDurationVerbose(r.toLong))
    +        }.getOrElse {
    +          Seq(emptyCellTest, emptyCellTest, emptyCellTest, emptyCellTest, emptyCellTest)
    +        }
    +        Seq(receiverName, receiverLocation, receiverLastBatchRecords) ++ receivedRecordStats
    +      }
    +      Some(listingTable(headerRow, dataRows))
    +    } else {
    +      None
    +    }
    +
    +    val content =
    +      <h5>Network Input Statistics</h5> ++
    +      <div>{table.getOrElse("No network receivers")}</div>
    +
    +    content
    +  }
    +
    +  /** Generate stats of batch jobs of the streaming program */
    +  private def generateBatchStatsTable(): Seq[Node] = {
    +    val numBatches = listener.completedBatches.size
    +    val lastCompletedBatch = listener.lastCompletedBatch
    +    val table = if (numBatches > 0) {
    +      val processingDelayQuantilesRow = {
    +        Seq(
    +          "Processing Time",
    +          formatDurationOption(lastCompletedBatch.flatMap(_.processingDelay))
    +        ) ++ getQuantiles(listener.processingDelayDistribution)
    +      }
    +      val schedulingDelayQuantilesRow = {
    +        Seq(
    +          "Scheduling Delay",
    +          formatDurationOption(lastCompletedBatch.flatMap(_.schedulingDelay))
    +        ) ++ getQuantiles(listener.schedulingDelayDistribution)
    +      }
    +      val totalDelayQuantilesRow = {
    +        Seq(
    +          "Total Delay",
    +          formatDurationOption(lastCompletedBatch.flatMap(_.totalDelay))
    +        ) ++ getQuantiles(listener.totalDelayDistribution)
    +      }
    +      val headerRow = Seq("Metric", "Last batch", "Minimum", "25th percentile",
    +        "Median", "75th percentile", "Maximum")
    +      val dataRows: Seq[Seq[String]] = Seq(
    +        processingDelayQuantilesRow,
    +        schedulingDelayQuantilesRow,
    +        totalDelayQuantilesRow
    +      )
    +      Some(listingTable(headerRow, dataRows))
    +    } else {
    +      None
    +    }
    +
    +    val content =
    +      <h5>Batch Processing Statistics</h5> ++
    --- End diff --
    
    I noticed that by default this only keeps track of 100 batches. It might be good to state somewhere how many batches the UI statistics currently cover. Otherwise I'd assume (as a user) that all processed batches are included in the statistics. So here it might be good to say "Batch Processing Statistics (last XX batches)" or something similar in the title.
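
One way to read the suggestion above, as a minimal sketch rather than the PR's actual code: build the section title from the number of batches the statistics really cover. The object name, method name, and the `retentionLimit` parameter are hypothetical; the default of 100 is taken only from the remark in the comment.

```scala
// Hypothetical helper for illustration; not part of the pull request.
object BatchTitleSketch {
  /**
   * Build a section title that states how many batches the statistics cover.
   * retentionLimit is an assumed name for the number of batches kept in memory.
   */
  def batchStatsTitle(numCompleted: Int, retentionLimit: Int = 100): String = {
    // Only the retained batches can contribute to the displayed statistics.
    val covered = math.min(numCompleted, retentionLimit)
    s"Batch Processing Statistics (last $covered batches)"
  }
}
```

With this shape, a user who has processed 250 batches would see that only the retained ones feed the quantiles, which is the ambiguity the comment raises.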


[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11517823
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala ---
    @@ -213,6 +192,18 @@ class HistoryServer(
         fileSystem.close()
       }
     
    +  /** Attach a reconstructed UI to this server. Only valid after bind(). */
    +  private def attachUI(ui: SparkUI) {
    --- End diff --
    
    minor: `attachUI` is pretty generic - what about calling this `attachApplicationUI` or even just `attachSparkUI`? Separately, I'm wondering if `SparkUI` should be renamed to something more specific, e.g. `SparkAppUI`.


[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40167488
  
    Build started. 


[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11518384
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/SparkUI.scala ---
    @@ -17,107 +17,78 @@
     
     package org.apache.spark.ui
     
    -import org.eclipse.jetty.servlet.ServletContextHandler
    -
    -import org.apache.spark.{Logging, SecurityManager, SparkConf, SparkContext, SparkEnv}
    +import org.apache.spark.{Logging, SecurityManager, SparkConf, SparkContext}
     import org.apache.spark.scheduler._
     import org.apache.spark.storage.StorageStatusListener
     import org.apache.spark.ui.JettyUtils._
    -import org.apache.spark.ui.env.EnvironmentUI
    -import org.apache.spark.ui.exec.ExecutorsUI
    -import org.apache.spark.ui.jobs.JobProgressUI
    -import org.apache.spark.ui.storage.BlockManagerUI
    -import org.apache.spark.util.Utils
    +import org.apache.spark.ui.env.EnvironmentTab
    +import org.apache.spark.ui.exec.ExecutorsTab
    +import org.apache.spark.ui.jobs.JobProgressTab
    +import org.apache.spark.ui.storage.StorageTab
     
    -/** Top level user interface for Spark */
    +/**
    + * Top level user interface for Spark.
    --- End diff --
    
    Ok.


[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-39275601
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13657/


[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-39258687
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13646/


[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40242411
  
    Build started. 


[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11518109
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/UIUtils.scala ---
    @@ -129,21 +220,36 @@ private[spark] object UIUtils {
       /** Returns an HTML table constructed by generating a row for each object in a sequence. */
       def listingTable[T](
           headers: Seq[String],
    -      makeRow: T => Seq[Node],
    -      rows: Seq[T],
    +      generateDataRow: T => Seq[Node],
    +      data: Seq[T],
           fixedWidth: Boolean = false): Seq[Node] = {
     
    -    val colWidth = 100.toDouble / headers.size
    -    val colWidthAttr = if (fixedWidth) colWidth + "%" else ""
         var tableClass = "table table-bordered table-striped table-condensed sortable"
         if (fixedWidth) {
           tableClass += " table-fixed"
         }
    -
    +    val colWidth = 100.toDouble / headers.size
    +    val colWidthAttr =if (fixedWidth) colWidth + "%" else ""
    --- End diff --
    
    `= if`


[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11520055
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/WebUI.scala ---
    @@ -17,53 +17,134 @@
     
     package org.apache.spark.ui
     
    -import java.text.SimpleDateFormat
    -import java.util.Date
    +import javax.servlet.http.HttpServletRequest
     
    -private[spark] abstract class WebUI(name: String) {
    +import scala.collection.mutable.ArrayBuffer
    +import scala.xml.Node
    +
    +import org.eclipse.jetty.servlet.ServletContextHandler
    +import org.json4s.JsonAST.{JNothing, JValue}
    +
    +import org.apache.spark.{Logging, SecurityManager, SparkConf}
    +import org.apache.spark.ui.JettyUtils._
    +import org.apache.spark.util.Utils
    +
    +/**
    + * The top level component of the UI hierarchy that contains the server.
    + *
    + * Each WebUI represents a collection of tabs, each of which in turn represents a collection of
    + * pages. The use of tabs is optional, however; a WebUI may choose to include pages directly.
    + */
    +private[spark] abstract class WebUI(
    +    securityManager: SecurityManager,
    +    port: Int,
    +    conf: SparkConf,
    +    basePath: String = "")
    +  extends Logging {
    +
    +  protected val tabs = ArrayBuffer[WebUITab]()
    +  protected val handlers = ArrayBuffer[ServletContextHandler]()
       protected var serverInfo: Option[ServerInfo] = None
    +  protected val localHostName = Utils.localHostName()
    +  protected val publicHostName = Option(System.getenv("SPARK_PUBLIC_DNS")).getOrElse(localHostName)
    +  private val className = Utils.getFormattedClassName(this)
    +
    +  def getTabs: Seq[WebUITab] = tabs.toSeq
    +  def getHandlers: Seq[ServletContextHandler] = handlers.toSeq
     
    -  /**
    -   * Bind to the HTTP server behind this web interface.
    -   * Overridden implementation should set serverInfo.
    -   */
    -  def bind() { }
    +  /** Attach a tab to this UI, along with all of its attached pages. */
    +  def attachTab(tab: WebUITab) {
    +    tab.pages.foreach(attachPage)
    +    tabs += tab
    +  }
    +
    +  /** Attach a page to this UI. */
    +  def attachPage(page: WebUIPage) {
    +    val pagePath = "/" + page.prefix
    +    attachHandler(createServletHandler(pagePath,
    +      (request: HttpServletRequest) => page.render(request), securityManager, basePath))
    +    if (page.includeJson) {
    +      attachHandler(createServletHandler(pagePath.stripSuffix("/") + "/json",
    +        (request: HttpServletRequest) => page.renderJson(request), securityManager, basePath))
    +    }
    +  }
    +
    +  /** Attach a handler to this UI. */
    +  def attachHandler(handler: ServletContextHandler) {
    +    handlers += handler
    +    serverInfo.foreach { info =>
    +      info.rootHandler.addHandler(handler)
    +      if (!handler.isStarted) {
    +        handler.start()
    +      }
    +    }
    +  }
    +
    +  /** Detach a handler from this UI. */
    +  def detachHandler(handler: ServletContextHandler) {
    +    handlers -= handler
    +    serverInfo.foreach { info =>
    +      info.rootHandler.removeHandler(handler)
    +      if (handler.isStarted) {
    +        handler.stop()
    +      }
    +    }
    +  }
    +
    +  /** Initialize all components of the server. */
    +  def initialize()
    +
    +  /** Bind to the HTTP server behind this web interface. */
    +  def bind() {
    +    assert(!serverInfo.isDefined, "Attempted to bind %s more than once!".format(className))
    +    try {
    +      serverInfo = Some(startJettyServer("0.0.0.0", port, handlers, conf))
    +      logInfo("Started %s at http://%s:%d".format(className, publicHostName, boundPort))
    +    } catch {
    +      case e: Exception =>
    +        logError("Failed to bind %s".format(className), e)
    +        System.exit(1)
    +    }
    +  }
     
       /** Return the actual port to which this server is bound. Only valid after bind(). */
       def boundPort: Int = serverInfo.map(_.boundPort).getOrElse(-1)
     
       /** Stop the server behind this web interface. Only valid after bind(). */
       def stop() {
    -    assert(serverInfo.isDefined, "Attempted to stop %s before binding to a server!".format(name))
    +    assert(serverInfo.isDefined,
    +      "Attempted to stop %s before binding to a server!".format(className))
         serverInfo.get.server.stop()
       }
     }
     
    +
     /**
    - * Utilities used throughout the web UI.
    + * A tab that represents a collection of pages.
      */
    -private[spark] object WebUI {
    -  // SimpleDateFormat is not thread-safe. Don't expose it to avoid improper use.
    -  private val dateFormat = new ThreadLocal[SimpleDateFormat]() {
    -    override def initialValue(): SimpleDateFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss")
    +private[spark] abstract class WebUITab(parent: WebUI, val prefix: String) {
    +  val pages = ArrayBuffer[WebUIPage]()
    +  val name = prefix.capitalize
    +
    +  /** Attach a page to this tab. This prepends the page's prefix with the tab's own prefix. */
    +  def attachPage(page: WebUIPage) {
    +    page.prefix = (prefix + "/" + page.prefix).stripSuffix("/")
    +    pages += page
       }
     
    -  def formatDate(date: Date): String = dateFormat.get.format(date)
    +  /** Get a list of header tabs from the parent UI. */
    +  def headerTabs: Seq[WebUITab] = parent.getTabs
    +}
     
    -  def formatDate(timestamp: Long): String = dateFormat.get.format(new Date(timestamp))
     
    -  def formatDuration(milliseconds: Long): String = {
    -    val seconds = milliseconds.toDouble / 1000
    -    if (seconds < 60) {
    -      return "%.0f s".format(seconds)
    -    }
    -    val minutes = seconds / 60
    -    if (minutes < 10) {
    -      return "%.1f min".format(minutes)
    -    } else if (minutes < 60) {
    -      return "%.0f min".format(minutes)
    -    }
    -    val hours = minutes / 60
    -    "%.1f h".format(hours)
    -  }
    +/**
    + * A page that represents the leaf node in the UI hierarchy.
    + *
    + * The direct parent of a WebUIPage is not specified as it can be either a WebUI or a WebUITab.
    + * If includeJson is true, the parent WebUI (direct or indirect) creates handlers for both the
    + * HTML and the JSON content, rather than just the former.
    + */
    --- End diff --
    
    For the `prefix` variable here, it might be good to explain that it should not include a leading or trailing slash. Also, it wasn't totally clear to me whether this needed to be the entire path to the WebUI (e.g. `/stages/stage`) or just the last component in the path (`/stage`). In this light the name `prefix` was a bit confusing because, to me, it seemed more like it would mean the former.
    
    It would be good to say it has no slashes and it is the path relative to the parent container. There might also be a more accurate name for this (e.g. `pathComponent`, though I don't really like that one). 
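
To make the question concrete, here is a small sketch of how the prefixes appear to compose in the diff above: `WebUITab.attachPage` joins the tab prefix and the page prefix, and `WebUI.attachPage` then adds the leading slash. The object and its method names are hypothetical, and `normalize` is an added assumption showing one way the "no leading or trailing slash" rule could be enforced; it is not in the PR.

```scala
// Hypothetical helper for illustration; not part of the pull request.
object PrefixSketch {
  /** Strip one leading and one trailing slash so a prefix is a bare relative component. */
  def normalize(prefix: String): String = prefix.stripPrefix("/").stripSuffix("/")

  /** Join a tab prefix and a page prefix, mirroring WebUITab.attachPage in the diff. */
  def join(tabPrefix: String, pagePrefix: String): String =
    (normalize(tabPrefix) + "/" + normalize(pagePrefix)).stripSuffix("/")

  /** Path the handler would be registered under, mirroring WebUI.attachPage in the diff. */
  def handlerPath(tabPrefix: String, pagePrefix: String): String =
    "/" + join(tabPrefix, pagePrefix)
}
```

Under this reading, a page declared with prefix `stage` inside a tab with prefix `stages` ends up served at `/stages/stage`, and a page with an empty prefix is served at the tab's own path, so each prefix is relative to its parent container.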


[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40050339
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13985/


[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40166585
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14041/



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40272669
  
    Merged build started. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11498513
  
    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingPage.scala ---
    @@ -0,0 +1,289 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.streaming.ui
    +
    +import java.util.{Calendar, Locale}
    +import javax.servlet.http.HttpServletRequest
    +
    +import scala.xml.Node
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.ui._
    +import org.apache.spark.util.Distribution
    +
    +/** Page for Spark Web UI that shows statistics of a streaming job */
    +private[ui] class StreamingPage(parent: StreamingTab)
    +  extends UIPage("") with Logging {
    +
    +  private val ssc = parent.ssc
    +  private val sc = ssc.sparkContext
    +  private val sparkUI = sc.ui
    +  private val listener = new StreamingProgressListener(ssc)
    +  private val calendar = Calendar.getInstance()
    +  private val startTime = calendar.getTime()
    +  private val emptyCellTest = "-"
    +
    +  ssc.addStreamingListener(listener)
    +  parent.attachPage(this)
    +
    +  /** Render the page */
    +  override def render(request: HttpServletRequest): Seq[Node] = {
    +    val content =
    +      generateBasicStats() ++
    +      <br></br><h4>Statistics over last {listener.completedBatches.size} processed batches</h4> ++
    +      generateNetworkStatsTable() ++
    +      generateBatchStatsTable()
    +    UIUtils.headerSparkPage(
    +      content, sparkUI.basePath, sc.appName, "Streaming", sparkUI.getTabs, parent, Some(5000))
    +  }
    +
    +  /** Generate basic stats of the streaming program */
    +  private def generateBasicStats(): Seq[Node] = {
    +    val timeSinceStart = System.currentTimeMillis() - startTime.getTime
    +    <ul class ="unstyled">
    +      <li>
    +        <strong>Started at: </strong> {startTime.toString}
    +      </li>
    +      <li>
    +        <strong>Time since start: </strong>{msDurationToString(timeSinceStart)}
    +      </li>
    +      <li>
    +        <strong>Network receivers: </strong>{listener.numNetworkReceivers}
    +      </li>
    +      <li>
    +        <strong>Batch interval: </strong>{msDurationToString(listener.batchDuration)}
    +      </li>
    +      <li>
    +        <strong>Processed batches: </strong>{listener.numTotalCompletedBatches}
    +      </li>
    +      <li>
    +        <strong>Waiting batches: </strong>{listener.numUnprocessedBatches}
    +      </li>
    +    </ul>
    +  }
    +
    +  /** Generate stats of data received over the network the streaming program */
    +  private def generateNetworkStatsTable(): Seq[Node] = {
    +    val receivedRecordDistributions = listener.receivedRecordsDistributions
    +    val lastBatchReceivedRecord = listener.lastReceivedBatchRecords
    +    val table = if (receivedRecordDistributions.size > 0) {
    +      val headerRow = Seq(
    +        "Receiver",
    +        "Location",
    +        s"Records in last batch",
    +        "Minimum rate\n[records/sec]",
    +        "25th percentile rate\n[records/sec]",
    +        "Median rate\n[records/sec]",
    +        "75th percentile rate\n[records/sec]",
    +        "Maximum rate\n[records/sec]"
    +      )
    +      val dataRows = (0 until listener.numNetworkReceivers).map { receiverId =>
    +        val receiverInfo = listener.receiverInfo(receiverId)
    +        val receiverName = receiverInfo.map(_.toString).getOrElse(s"Receiver-$receiverId")
    +        val receiverLocation = receiverInfo.map(_.location).getOrElse(emptyCellTest)
    +        val receiverLastBatchRecords = numberToString(lastBatchReceivedRecord(receiverId))
    +        val receivedRecordStats = receivedRecordDistributions(receiverId).map { d =>
    +          d.getQuantiles().map(r => numberToString(r.toLong))
    +        }.getOrElse {
    +          Seq(emptyCellTest, emptyCellTest, emptyCellTest, emptyCellTest, emptyCellTest)
    +        }
    +        Seq(receiverName, receiverLocation, receiverLastBatchRecords) ++
    +          receivedRecordStats
    +      }
    +      Some(listingTable(headerRow, dataRows, fixedWidth = true))
    +    } else {
    +      None
    +    }
    +
    +    val content =
    +      <h5>Network Input Statistics</h5> ++
    +      <div>{table.getOrElse("No network receivers")}</div>
    +
    +    content
    +  }
    +
    +  /** Generate stats of batch jobs of the streaming program */
    +  private def generateBatchStatsTable(): Seq[Node] = {
    +    val numBatches = listener.completedBatches.size
    +    val lastCompletedBatch = listener.lastCompletedBatch
    +    val table = if (numBatches > 0) {
    +      val processingDelayQuantilesRow = {
    +        Seq(
    +          "Processing Time",
    +          msDurationToString(lastCompletedBatch.flatMap(_.processingDelay))
    +        ) ++ getQuantiles(listener.processingDelayDistribution)
    +      }
    +      val schedulingDelayQuantilesRow = {
    +        Seq(
    +          "Scheduling Delay",
    +          msDurationToString(lastCompletedBatch.flatMap(_.schedulingDelay))
    +        ) ++ getQuantiles(listener.schedulingDelayDistribution)
    +      }
    +      val totalDelayQuantilesRow = {
    +        Seq(
    +          "Total Delay",
    +          msDurationToString(lastCompletedBatch.flatMap(_.totalDelay))
    +        ) ++ getQuantiles(listener.totalDelayDistribution)
    +      }
    +      val headerRow = Seq("Metric", "Last batch", "Minimum", "25th percentile",
    +        "Median", "75th percentile", "Maximum")
    +      val dataRows: Seq[Seq[String]] = Seq(
    +        processingDelayQuantilesRow,
    +        schedulingDelayQuantilesRow,
    +        totalDelayQuantilesRow
    +      )
    +      Some(listingTable(headerRow, dataRows, fixedWidth = true))
    +    } else {
    +      None
    +    }
    +
    +    val content =
    +      <h5>Batch Processing Statistics</h5> ++
    +      <div>
    +        <ul class="unstyled">
    +          {table.getOrElse("No statistics have been generated yet.")}
    +        </ul>
    +      </div>
    +
    +    content
    +  }
    +
    +  /**
    +   * Returns a human-readable string representing a number
    +   */
    +  private def numberToString(records: Double): String = {
    +    val trillion = 1e12
    +    val billion = 1e9
    +    val million = 1e6
    +    val thousand = 1e3
    +
    +    val (value, unit) = {
    +      if (records >= 2*trillion) {
    +        (records / trillion, " T")
    +      } else if (records >= 2*billion) {
    +        (records / billion, " B")
    +      } else if (records >= 2*million) {
    +        (records / million, " M")
    +      } else if (records >= 2*thousand) {
    +        (records / thousand, " K")
    +      } else {
    +        (records, "")
    +      }
    +    }
    +    "%.1f%s".formatLocal(Locale.US, value, unit)
    +  }
    +
    +  /**
    +   * Returns a human-readable string representing a duration such as "5 second 35 ms"
    +   */
    +  private def msDurationToString(ms: Long): String = {
    +    try {
    +      val second = 1000L
    +      val minute = 60 * second
    +      val hour = 60 * minute
    +      val day = 24 * hour
    +      val week = 7 * day
    +      val year = 365 * day
    +
    +      def toString(num: Long, unit: String): String = {
    +        if (num == 0) {
    +          ""
    +        } else if (num == 1) {
    +          s"$num $unit"
    +        } else {
    +          s"$num ${unit}s"
    +        }
    +      }
    +
    +      val millisecondsString = if (ms >= second && ms % second == 0) "" else s"${ms % second} ms"
    +      val secondString = toString((ms % minute) / second, "second")
    +      val minuteString = toString((ms % hour) / minute, "minute")
    +      val hourString = toString((ms % day) / hour, "hour")
    +      val dayString = toString((ms % week) / day, "day")
    +      val weekString = toString((ms % year) / week, "week")
    +      val yearString = toString(ms / year, "year")
    +
    +      Seq(
    +        second -> millisecondsString,
    +        minute -> s"$secondString $millisecondsString",
    +        hour -> s"$minuteString $secondString",
    +        day -> s"$hourString $minuteString $secondString",
    +        week -> s"$dayString $hourString $minuteString",
    +        year -> s"$weekString $dayString $hourString"
    +      ).foreach {
    +        case (durationLimit, durationString) if (ms < durationLimit) =>
    +          return durationString
    +        case e: Any => // matcherror is thrown without this
    +      }
    +      return s"$yearString $weekString $dayString"
    +    } catch {
    +      case e: Exception =>
    +        logError("Error converting time to string", e)
    +        return ""
    --- End diff --
    
    Super duper minor nit: `return` is not needed here, or on L233.
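
    For illustration, a minimal sketch of the same shape without any explicit `return`. In Scala the last expression of a `try` or `catch` block is its value, and `collectFirst` can replace the `foreach` with an early (non-local) `return`. `DurationSketch.describe` is a hypothetical, much simplified stand-in for `msDurationToString`, not the code from the PR.

    ```scala
    object DurationSketch {
      // Hypothetical, simplified stand-in for msDurationToString: picks the text
      // of the first bucket whose limit exceeds ms, with no explicit `return`.
      def describe(ms: Long): String = {
        try {
          val second = 1000L
          val minute = 60 * second
          val buckets = Seq(
            second -> s"$ms ms",
            minute -> s"${ms / second} s ${ms % second} ms"
          )
          // collectFirst stops at the first matching bucket; the fallback
          // covers durations of a minute or more.
          buckets
            .collectFirst { case (limit, text) if ms < limit => text }
            .getOrElse(s"${ms / minute} min")
        } catch {
          // The last expression of the catch clause is the method's result.
          case _: Exception => ""
        }
      }
    }
    ```

    This also removes the need for the catch-all `case e: Any` branch, since `collectFirst` takes a partial function and simply skips buckets that do not match.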



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40167726
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14037/



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11517882
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/WebUI.scala ---
    @@ -17,53 +17,134 @@
     
     package org.apache.spark.ui
     
    -import java.text.SimpleDateFormat
    -import java.util.Date
    +import javax.servlet.http.HttpServletRequest
     
    -private[spark] abstract class WebUI(name: String) {
    +import scala.collection.mutable.ArrayBuffer
    +import scala.xml.Node
    +
    +import org.eclipse.jetty.servlet.ServletContextHandler
    +import org.json4s.JsonAST.{JNothing, JValue}
    +
    +import org.apache.spark.{Logging, SecurityManager, SparkConf}
    +import org.apache.spark.ui.JettyUtils._
    +import org.apache.spark.util.Utils
    +
    +/**
    + * The top level component of the UI hierarchy that contains the server.
    + *
    + * Each WebUI represents a collection of tabs, each of which in turn represents a collection of
    + * pages. The use of tabs is optional, however; a WebUI may choose to include pages directly.
    + */
    +private[spark] abstract class WebUI(
    +    securityManager: SecurityManager,
    +    port: Int,
    +    conf: SparkConf,
    +    basePath: String = "")
    +  extends Logging {
    +
    +  protected val tabs = ArrayBuffer[WebUITab]()
    +  protected val handlers = ArrayBuffer[ServletContextHandler]()
       protected var serverInfo: Option[ServerInfo] = None
    +  protected val localHostName = Utils.localHostName()
    +  protected val publicHostName = Option(System.getenv("SPARK_PUBLIC_DNS")).getOrElse(localHostName)
    +  private val className = Utils.getFormattedClassName(this)
    +
    +  def getTabs: Seq[WebUITab] = tabs.toSeq
    +  def getHandlers: Seq[ServletContextHandler] = handlers.toSeq
     
    -  /**
    -   * Bind to the HTTP server behind this web interface.
    -   * Overridden implementation should set serverInfo.
    -   */
    -  def bind() { }
    +  /** Attach a tab to this UI, along with all of its attached pages. */
    +  def attachTab(tab: WebUITab) {
    +    tab.pages.foreach(attachPage)
    +    tabs += tab
    +  }
    +
    +  /** Attach a page to this UI. */
    +  def attachPage(page: WebUIPage) {
    +    val pagePath = "/" + page.prefix
    +    attachHandler(createServletHandler(pagePath,
    +      (request: HttpServletRequest) => page.render(request), securityManager, basePath))
    +    if (page.includeJson) {
    +      attachHandler(createServletHandler(pagePath.stripSuffix("/") + "/json",
    +        (request: HttpServletRequest) => page.renderJson(request), securityManager, basePath))
    +    }
    +  }
    +
    +  /** Attach a handler to this UI. */
    +  def attachHandler(handler: ServletContextHandler) {
    +    handlers += handler
    +    serverInfo.foreach { info =>
    +      info.rootHandler.addHandler(handler)
    +      if (!handler.isStarted) {
    +        handler.start()
    +      }
    +    }
    +  }
    +
    +  /** Detach a handler from this UI. */
    +  def detachHandler(handler: ServletContextHandler) {
    +    handlers -= handler
    +    serverInfo.foreach { info =>
    +      info.rootHandler.removeHandler(handler)
    +      if (handler.isStarted) {
    +        handler.stop()
    +      }
    +    }
    +  }
    +
    +  /** Initialize all components of the server. */
    +  def initialize()
    +
    +  /** Bind to the HTTP server behind this web interface. */
    +  def bind() {
    +    assert(!serverInfo.isDefined, "Attempted to bind %s more than once!".format(className))
    +    try {
    +      serverInfo = Some(startJettyServer("0.0.0.0", port, handlers, conf))
    +      logInfo("Started %s at http://%s:%d".format(className, publicHostName, boundPort))
    +    } catch {
    +      case e: Exception =>
    +        logError("Failed to bind %s".format(className), e)
    +        System.exit(1)
    +    }
    +  }
     
       /** Return the actual port to which this server is bound. Only valid after bind(). */
       def boundPort: Int = serverInfo.map(_.boundPort).getOrElse(-1)
     
       /** Stop the server behind this web interface. Only valid after bind(). */
       def stop() {
    -    assert(serverInfo.isDefined, "Attempted to stop %s before binding to a server!".format(name))
    +    assert(serverInfo.isDefined,
    +      "Attempted to stop %s before binding to a server!".format(className))
         serverInfo.get.server.stop()
       }
     }
     
    +
     /**
    - * Utilities used throughout the web UI.
    + * A tab that represents a collection of pages.
      */
    -private[spark] object WebUI {
    -  // SimpleDateFormat is not thread-safe. Don't expose it to avoid improper use.
    -  private val dateFormat = new ThreadLocal[SimpleDateFormat]() {
    -    override def initialValue(): SimpleDateFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss")
    +private[spark] abstract class WebUITab(parent: WebUI, val prefix: String) {
    +  val pages = ArrayBuffer[WebUIPage]()
    +  val name = prefix.capitalize
    +
    +  /** Attach a page to this tab. This prepends the page's prefix with the tab's own prefix. */
    +  def attachPage(page: WebUIPage) {
    +    page.prefix = (prefix + "/" + page.prefix).stripSuffix("/")
    +    pages += page
       }
     
    -  def formatDate(date: Date): String = dateFormat.get.format(date)
    +  /** Get a list of header tabs from the parent UI. */
    +  def headerTabs: Seq[WebUITab] = parent.getTabs
    +}
     
    -  def formatDate(timestamp: Long): String = dateFormat.get.format(new Date(timestamp))
     
    -  def formatDuration(milliseconds: Long): String = {
    -    val seconds = milliseconds.toDouble / 1000
    -    if (seconds < 60) {
    -      return "%.0f s".format(seconds)
    -    }
    -    val minutes = seconds / 60
    -    if (minutes < 10) {
    -      return "%.1f min".format(minutes)
    -    } else if (minutes < 60) {
    -      return "%.0f min".format(minutes)
    -    }
    -    val hours = minutes / 60
    -    "%.1f h".format(hours)
    -  }
    +/**
    + * A page that represents the leaf node in the UI hierarchy.
    + *
    + * The direct parent of a WebUIPage is not specified as it can be either a WebUI or a WebUITab.
    + * If includeJson is true, the parent WebUI (direct or indirect) creates handlers for both the
    + * HTML and the JSON content, rather than just the former.
    + */
    +private[spark] abstract class WebUIPage(var prefix: String, val includeJson: Boolean = false) {
    --- End diff --
    
    I noticed that `includeJson` is only used to determine whether or not to add a handler at the `/json` path. What about always adding a `/json` handler and having it return `{}` for the UIs that choose not to implement it? For example, it could be an optionally overridden method.
    
    Anyways, not a strong preference, just wondering if you considered this.
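
    A minimal sketch of what that alternative could look like, under some loudly stated assumptions: the `Page` class below is a hypothetical simplification of `WebUIPage`, JSON is represented as a plain `String` rather than a json4s `JValue`, and `handlerPaths` only mimics the path arithmetic from `WebUI.attachPage` in this diff without creating any Jetty handlers.

    ```scala
    object JsonDefaultSketch {
      // Hypothetical simplification of WebUIPage with no includeJson flag.
      abstract class Page(val prefix: String) {
        def render(): String
        // Default for pages that do not implement JSON output.
        def renderJson(): String = "{}"
      }

      // A page that keeps the default JSON rendering.
      class PlainPage extends Page("plain") {
        def render(): String = "<h4>plain</h4>"
      }

      // A page that opts in by overriding renderJson.
      class StatsPage extends Page("stats") {
        def render(): String = "<h4>stats</h4>"
        override def renderJson(): String = """{"batches": 3}"""
      }

      // Both paths would be registered for every page, unconditionally.
      def handlerPaths(page: Page): Seq[String] = {
        val pagePath = "/" + page.prefix
        Seq(pagePath, pagePath.stripSuffix("/") + "/json")
      }
    }
    ```

    The trade-off is one extra handler per page in exchange for dropping a constructor parameter; pages that never override `renderJson` would simply serve `{}`.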



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11498474
  
    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingPage.scala ---
    @@ -0,0 +1,289 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.streaming.ui
    +
    +import java.util.{Calendar, Locale}
    +import javax.servlet.http.HttpServletRequest
    +
    +import scala.xml.Node
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.ui._
    +import org.apache.spark.util.Distribution
    +
    +/** Page for Spark Web UI that shows statistics of a streaming job */
    +private[ui] class StreamingPage(parent: StreamingTab)
    --- End diff --
    
    Or we could rename all the landing pages to follow this convention, e.g. `EnvironmentPage`, `JobProgressPage`, `BlockManagerPage`. On that note, maybe we should rename `BlockManagerTab` to `StorageTab`, hmm.



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40166514
  
    Build started. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40117920
  
    Build started. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40040834
  
    Build finished. All automated tests passed.



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40040835
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13975/



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11546163
  
    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingPage.scala ---
    @@ -0,0 +1,180 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.streaming.ui
    +
    +import java.util.Calendar
    +import javax.servlet.http.HttpServletRequest
    +
    +import scala.xml.Node
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.ui._
    +import org.apache.spark.ui.UIUtils._
    +import org.apache.spark.util.Distribution
    +
    +/** Page for Spark Web UI that shows statistics of a streaming job */
    +private[ui] class StreamingPage(parent: StreamingTab)
    +  extends WebUIPage("") with Logging {
    +
    +  private val listener = parent.listener
    +  private val startTime = Calendar.getInstance().getTime()
    +  private val emptyCellTest = "-"
    --- End diff --
    
    Don't know how that `Test` suffix got there. Removed.



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40156512
  
     Build triggered. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40167526
  
    Build finished. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40243023
  
    Build finished. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-39258499
  
     Merged build triggered. 



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40130178
  
    Jenkins, retest this please



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11498393
  
    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingPage.scala ---
    @@ -0,0 +1,289 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.streaming.ui
    +
    +import java.util.{Calendar, Locale}
    +import javax.servlet.http.HttpServletRequest
    +
    +import scala.xml.Node
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.ui._
    +import org.apache.spark.util.Distribution
    +
    +/** Page for Spark Web UI that shows statistics of a streaming job */
    +private[ui] class StreamingPage(parent: StreamingTab)
    --- End diff --
    
    Should we call this IndexPage to be consistent with the other pages? This would mean each UITab has an index page, which is the default for that tab.



[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40242484
  
    Build finished. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40038264
  
    Build started. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40160471
  
    Build finished. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11546175
  
    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingPage.scala ---
    @@ -0,0 +1,180 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.streaming.ui
    +
    +import java.util.Calendar
    +import javax.servlet.http.HttpServletRequest
    +
    +import scala.xml.Node
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.ui._
    +import org.apache.spark.ui.UIUtils._
    +import org.apache.spark.util.Distribution
    +
    +/** Page for Spark Web UI that shows statistics of a streaming job */
    +private[ui] class StreamingPage(parent: StreamingTab)
    +  extends WebUIPage("") with Logging {
    +
    +  private val listener = parent.listener
    +  private val startTime = Calendar.getInstance().getTime()
    +  private val emptyCellTest = "-"
    +
    +  /** Render the page */
    +  override def render(request: HttpServletRequest): Seq[Node] = {
    +    val content =
    +      generateBasicStats() ++
    +      <br></br><h4>Statistics over last {listener.completedBatches.size} processed batches</h4> ++
    +      generateNetworkStatsTable() ++
    +      generateBatchStatsTable()
    +    UIUtils.headerSparkPage(
    +      content, parent.basePath, parent.appName, "Streaming", parent.headerTabs, parent, Some(5000))
    +  }
    +
    +  /** Generate basic stats of the streaming program */
    +  private def generateBasicStats(): Seq[Node] = {
    +    val timeSinceStart = System.currentTimeMillis() - startTime.getTime
    +    <ul class ="unstyled">
    +      <li>
    +        <strong>Started at: </strong> {startTime.toString}
    +      </li>
    +      <li>
    +        <strong>Time since start: </strong>{formatDurationVerbose(timeSinceStart)}
    +      </li>
    +      <li>
    +        <strong>Network receivers: </strong>{listener.numNetworkReceivers}
    +      </li>
    +      <li>
    +        <strong>Batch interval: </strong>{formatDurationVerbose(listener.batchDuration)}
    +      </li>
    +      <li>
    +        <strong>Processed batches: </strong>{listener.numTotalCompletedBatches}
    +      </li>
    +      <li>
    +        <strong>Waiting batches: </strong>{listener.numUnprocessedBatches}
    +      </li>
    +    </ul>
    +  }
    +
    +  /** Generate stats of data received over the network by the streaming program */
    +  private def generateNetworkStatsTable(): Seq[Node] = {
    +    val receivedRecordDistributions = listener.receivedRecordsDistributions
    +    val lastBatchReceivedRecord = listener.lastReceivedBatchRecords
    +    val table = if (receivedRecordDistributions.size > 0) {
    +      val headerRow = Seq(
    +        "Receiver",
    +        "Location",
    +        "Records in last batch\n[" + formatDate(Calendar.getInstance().getTime()) + "]",
    +        "Minimum rate\n[records/sec]",
    +        "25th percentile rate\n[records/sec]",
    +        "Median rate\n[records/sec]",
    +        "75th percentile rate\n[records/sec]",
    +        "Maximum rate\n[records/sec]"
    +      )
    +      val dataRows = (0 until listener.numNetworkReceivers).map { receiverId =>
    +        val receiverInfo = listener.receiverInfo(receiverId)
    +        val receiverName = receiverInfo.map(_.toString).getOrElse(s"Receiver-$receiverId")
    +        val receiverLocation = receiverInfo.map(_.location).getOrElse(emptyCellTest)
    +        val receiverLastBatchRecords = formatDurationVerbose(lastBatchReceivedRecord(receiverId))
    +        val receivedRecordStats = receivedRecordDistributions(receiverId).map { d =>
    +          d.getQuantiles().map(r => formatDurationVerbose(r.toLong))
    +        }.getOrElse {
    +          Seq(emptyCellTest, emptyCellTest, emptyCellTest, emptyCellTest, emptyCellTest)
    +        }
    +        Seq(receiverName, receiverLocation, receiverLastBatchRecords) ++ receivedRecordStats
    +      }
    +      Some(listingTable(headerRow, dataRows))
    +    } else {
    +      None
    +    }
    +
    +    val content =
    +      <h5>Network Input Statistics</h5> ++
    +      <div>{table.getOrElse("No network receivers")}</div>
    +
    +    content
    +  }
    +
    +  /** Generate stats of batch jobs of the streaming program */
    +  private def generateBatchStatsTable(): Seq[Node] = {
    +    val numBatches = listener.completedBatches.size
    +    val lastCompletedBatch = listener.lastCompletedBatch
    +    val table = if (numBatches > 0) {
    +      val processingDelayQuantilesRow = {
    +        Seq(
    +          "Processing Time",
    +          formatDurationOption(lastCompletedBatch.flatMap(_.processingDelay))
    +        ) ++ getQuantiles(listener.processingDelayDistribution)
    +      }
    +      val schedulingDelayQuantilesRow = {
    +        Seq(
    +          "Scheduling Delay",
    +          formatDurationOption(lastCompletedBatch.flatMap(_.schedulingDelay))
    +        ) ++ getQuantiles(listener.schedulingDelayDistribution)
    +      }
    +      val totalDelayQuantilesRow = {
    +        Seq(
    +          "Total Delay",
    +          formatDurationOption(lastCompletedBatch.flatMap(_.totalDelay))
    +        ) ++ getQuantiles(listener.totalDelayDistribution)
    +      }
    +      val headerRow = Seq("Metric", "Last batch", "Minimum", "25th percentile",
    +        "Median", "75th percentile", "Maximum")
    +      val dataRows: Seq[Seq[String]] = Seq(
    +        processingDelayQuantilesRow,
    +        schedulingDelayQuantilesRow,
    +        totalDelayQuantilesRow
    +      )
    +      Some(listingTable(headerRow, dataRows))
    +    } else {
    +      None
    +    }
    +
    +    val content =
    +      <h5>Batch Processing Statistics</h5> ++
    --- End diff --
    
    It does say so in the UI. Here is how it looks:
    http://i.imgur.com/1ooDGhm.png


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40130259
  
    Build started. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11546882
  
    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingJobProgressListener.scala ---
    @@ -0,0 +1,148 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.streaming.ui
    +
    +import org.apache.spark.streaming.{Time, StreamingContext}
    +import org.apache.spark.streaming.scheduler._
    +import scala.collection.mutable.{Queue, HashMap}
    +import org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
    +import org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
    +import org.apache.spark.streaming.scheduler.BatchInfo
    +import org.apache.spark.streaming.scheduler.ReceiverInfo
    +import org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
    +import org.apache.spark.util.Distribution
    +
    +
    +private[ui] class StreamingJobProgressListener(ssc: StreamingContext) extends StreamingListener {
    +
    +  private val waitingBatchInfos = new HashMap[Time, BatchInfo]
    +  private val runningBatchInfos = new HashMap[Time, BatchInfo]
    +  private val completedaBatchInfos = new Queue[BatchInfo]
    +  private val batchInfoLimit = ssc.conf.getInt("spark.streaming.ui.maxBatches", 100)
    +  private var totalCompletedBatches = 0L
    +  private val receiverInfos = new HashMap[Int, ReceiverInfo]
    +
    +  val batchDuration = ssc.graph.batchDuration.milliseconds
    +
    +  override def onReceiverStarted(receiverStarted: StreamingListenerReceiverStarted) = {
    +    synchronized {
    +      receiverInfos.put(receiverStarted.receiverInfo.streamId, receiverStarted.receiverInfo)
    +    }
    +  }
    +
    +  override def onBatchSubmitted(batchSubmitted: StreamingListenerBatchSubmitted) = synchronized {
    +    waitingBatchInfos(batchSubmitted.batchInfo.batchTime) = batchSubmitted.batchInfo
    +  }
    +
    +  override def onBatchStarted(batchStarted: StreamingListenerBatchStarted) = synchronized {
    +    runningBatchInfos(batchStarted.batchInfo.batchTime) = batchStarted.batchInfo
    +    waitingBatchInfos.remove(batchStarted.batchInfo.batchTime)
    +  }
    +
    +  override def onBatchCompleted(batchCompleted: StreamingListenerBatchCompleted) = synchronized {
    +    waitingBatchInfos.remove(batchCompleted.batchInfo.batchTime)
    +    runningBatchInfos.remove(batchCompleted.batchInfo.batchTime)
    +    completedaBatchInfos.enqueue(batchCompleted.batchInfo)
    +    if (completedaBatchInfos.size > batchInfoLimit) completedaBatchInfos.dequeue()
    +    totalCompletedBatches += 1L
    +  }
    +
    +  def numNetworkReceivers = synchronized {
    +    ssc.graph.getNetworkInputStreams().size
    +  }
    +
    +  def numTotalCompletedBatches: Long = synchronized {
    +    totalCompletedBatches
    +  }
    +
    +  def numUnprocessedBatches: Long = synchronized {
    +    waitingBatchInfos.size + runningBatchInfos.size
    +  }
    +
    +  def waitingBatches: Seq[BatchInfo] = synchronized {
    +    waitingBatchInfos.values.toSeq
    +  }
    +
    +  def runningBatches: Seq[BatchInfo] = synchronized {
    +    runningBatchInfos.values.toSeq
    +  }
    +
    +  def completedBatches: Seq[BatchInfo] = synchronized {
    +    completedaBatchInfos.toSeq
    +  }
    +
    +  def processingDelayDistribution: Option[Distribution] = synchronized {
    +    extractDistribution(_.processingDelay)
    +  }
    +
    +  def schedulingDelayDistribution: Option[Distribution] = synchronized {
    +    extractDistribution(_.schedulingDelay)
    +  }
    +
    +  def totalDelayDistribution: Option[Distribution] = synchronized {
    +    extractDistribution(_.totalDelay)
    +  }
    +
    +  def receivedRecordsDistributions: Map[Int, Option[Distribution]] = synchronized {
    +    val latestBatchInfos = allBatches.reverse.take(batchInfoLimit)
    +    val latestBlockInfos = latestBatchInfos.map(_.receivedBlockInfo)
    +    (0 until numNetworkReceivers).map { receiverId =>
    +      val blockInfoOfParticularReceiver = latestBlockInfos.map { batchInfo =>
    +        batchInfo.get(receiverId).getOrElse(Array.empty)
    +      }
    +      val recordsOfParticularReceiver = blockInfoOfParticularReceiver.map { blockInfo =>
    +      // calculate records per second for each batch
    +        blockInfo.map(_.numRecords).sum.toDouble * 1000 / batchDuration
    +      }
    +      val distributionOption = Distribution(recordsOfParticularReceiver)
    +      (receiverId, distributionOption)
    +    }.toMap
    +  }
    +
    +  def lastReceivedBatchRecords: Map[Int, Long] = {
    +    val lastReceivedBlockInfoOption = lastReceivedBatch.map(_.receivedBlockInfo)
    +    lastReceivedBlockInfoOption.map { lastReceivedBlockInfo =>
    +      (0 until numNetworkReceivers).map { receiverId =>
    +        (receiverId, lastReceivedBlockInfo(receiverId).map(_.numRecords).sum)
    +      }.toMap
    +    }.getOrElse {
    +      (0 until numNetworkReceivers).map(receiverId => (receiverId, 0L)).toMap
    +    }
    +  }
    +
    +  def receiverInfo(receiverId: Int): Option[ReceiverInfo] = {
    +    receiverInfos.get(receiverId)
    +  }
    +
    +  def lastCompletedBatch: Option[BatchInfo] = {
    +    completedaBatchInfos.sortBy(_.batchTime)(Time.ordering).lastOption
    +  }
    +
    +  def lastReceivedBatch: Option[BatchInfo] = {
    +    allBatches.lastOption
    +  }
    +
    +  private def allBatches: Seq[BatchInfo] = synchronized {
    --- End diff --
    
    Yeah. I also renamed completedBatches to retainedCompletedBatches for semantic consistency. The "completed" is needed to differentiate it from runningBatches and waitingBatches.


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40167723
  
    Build finished. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40242872
  
    Build started. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11546692
  
    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingPage.scala ---
    @@ -0,0 +1,180 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.streaming.ui
    +
    +import java.util.Calendar
    +import javax.servlet.http.HttpServletRequest
    +
    +import scala.xml.Node
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.ui._
    +import org.apache.spark.ui.UIUtils._
    +import org.apache.spark.util.Distribution
    +
    +/** Page for Spark Web UI that shows statistics of a streaming job */
    +private[ui] class StreamingPage(parent: StreamingTab)
    +  extends WebUIPage("") with Logging {
    +
    +  private val listener = parent.listener
    +  private val startTime = Calendar.getInstance().getTime()
    +  private val emptyCellTest = "-"
    +
    +  /** Render the page */
    +  override def render(request: HttpServletRequest): Seq[Node] = {
    +    val content =
    +      generateBasicStats() ++
    +      <br></br><h4>Statistics over last {listener.completedBatches.size} processed batches</h4> ++
    +      generateNetworkStatsTable() ++
    +      generateBatchStatsTable()
    +    UIUtils.headerSparkPage(
    +      content, parent.basePath, parent.appName, "Streaming", parent.headerTabs, parent, Some(5000))
    +  }
    +
    +  /** Generate basic stats of the streaming program */
    +  private def generateBasicStats(): Seq[Node] = {
    --- End diff --
    
    Yeah, this is something of a philosophy of mine. When a function generates and returns something substantial (like this one, which generates a reasonably complex table), I usually add a verb. If a function returns something trivial that could easily have been a field rather than a method, I usually omit the verb.
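
As an illustration of that naming convention, here is a small sketch. The class `BatchStats` and its members are hypothetical and exist only for this example; they are not part of the PR. The trivial accessor has no verb, while the method that builds table rows gets one.

```scala
// Hypothetical example class, invented to illustrate the naming convention.
class BatchStats(delaysMs: Seq[Long]) {

  // Trivial: could just as well have been a field, so no verb in the name.
  def numBatches: Int = delaysMs.size

  // Substantial: sorts the data and builds table rows, so the name carries a verb.
  def generateSummaryRows(): Seq[Seq[String]] = {
    if (delaysMs.isEmpty) {
      Seq.empty
    } else {
      val sorted = delaysMs.sorted
      Seq(
        Seq("Minimum", s"${sorted.head} ms"),
        // Middle element of the sorted values (the upper one for even sizes).
        Seq("Median", s"${sorted(sorted.size / 2)} ms"),
        Seq("Maximum", s"${sorted.last} ms")
      )
    }
  }
}
```

A caller reads `stats.numBatches` like a property, while the parentheses and verb in `stats.generateSummaryRows()` signal that real work happens on each call.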


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-39514049
  
     Merged build triggered. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40165469
  
    Build started. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40269684
  
     Build triggered. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40168270
  
    Build started. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/290


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40243869
  
    Build started. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40271639
  
    Sure


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40128598
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14005/


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40168298
  
    @pwendell Good to go.


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11545788
  
    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingJobProgressListener.scala ---
    @@ -0,0 +1,148 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.streaming.ui
    +
    +import org.apache.spark.streaming.{Time, StreamingContext}
    +import org.apache.spark.streaming.scheduler._
    +import scala.collection.mutable.{Queue, HashMap}
    +import org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
    +import org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
    +import org.apache.spark.streaming.scheduler.BatchInfo
    +import org.apache.spark.streaming.scheduler.ReceiverInfo
    +import org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
    +import org.apache.spark.util.Distribution
    +
    +
    +private[ui] class StreamingJobProgressListener(ssc: StreamingContext) extends StreamingListener {
    +
    +  private val waitingBatchInfos = new HashMap[Time, BatchInfo]
    +  private val runningBatchInfos = new HashMap[Time, BatchInfo]
    +  private val completedaBatchInfos = new Queue[BatchInfo]
    +  private val batchInfoLimit = ssc.conf.getInt("spark.streaming.ui.maxBatches", 100)
    +  private var totalCompletedBatches = 0L
    +  private val receiverInfos = new HashMap[Int, ReceiverInfo]
    +
    +  val batchDuration = ssc.graph.batchDuration.milliseconds
    +
    +  override def onReceiverStarted(receiverStarted: StreamingListenerReceiverStarted) = {
    +    synchronized {
    +      receiverInfos.put(receiverStarted.receiverInfo.streamId, receiverStarted.receiverInfo)
    +    }
    +  }
    +
    +  override def onBatchSubmitted(batchSubmitted: StreamingListenerBatchSubmitted) = synchronized {
    +    waitingBatchInfos(batchSubmitted.batchInfo.batchTime) = batchSubmitted.batchInfo
    +  }
    +
    +  override def onBatchStarted(batchStarted: StreamingListenerBatchStarted) = synchronized {
    +    runningBatchInfos(batchStarted.batchInfo.batchTime) = batchStarted.batchInfo
    +    waitingBatchInfos.remove(batchStarted.batchInfo.batchTime)
    +  }
    +
    +  override def onBatchCompleted(batchCompleted: StreamingListenerBatchCompleted) = synchronized {
    +    waitingBatchInfos.remove(batchCompleted.batchInfo.batchTime)
    +    runningBatchInfos.remove(batchCompleted.batchInfo.batchTime)
    +    completedaBatchInfos.enqueue(batchCompleted.batchInfo)
    +    if (completedaBatchInfos.size > batchInfoLimit) completedaBatchInfos.dequeue()
    +    totalCompletedBatches += 1L
    +  }
    +
    +  def numNetworkReceivers = synchronized {
    +    ssc.graph.getNetworkInputStreams().size
    +  }
    +
    +  def numTotalCompletedBatches: Long = synchronized {
    +    totalCompletedBatches
    +  }
    +
    +  def numUnprocessedBatches: Long = synchronized {
    +    waitingBatchInfos.size + runningBatchInfos.size
    +  }
    +
    +  def waitingBatches: Seq[BatchInfo] = synchronized {
    +    waitingBatchInfos.values.toSeq
    +  }
    +
    +  def runningBatches: Seq[BatchInfo] = synchronized {
    +    runningBatchInfos.values.toSeq
    +  }
    +
    +  def completedBatches: Seq[BatchInfo] = synchronized {
    +    completedaBatchInfos.toSeq
    +  }
    +
    +  def processingDelayDistribution: Option[Distribution] = synchronized {
    +    extractDistribution(_.processingDelay)
    +  }
    +
    +  def schedulingDelayDistribution: Option[Distribution] = synchronized {
    +    extractDistribution(_.schedulingDelay)
    +  }
    +
    +  def totalDelayDistribution: Option[Distribution] = synchronized {
    +    extractDistribution(_.totalDelay)
    +  }
    +
    +  def receivedRecordsDistributions: Map[Int, Option[Distribution]] = synchronized {
    +    val latestBatchInfos = allBatches.reverse.take(batchInfoLimit)
    +    val latestBlockInfos = latestBatchInfos.map(_.receivedBlockInfo)
    +    (0 until numNetworkReceivers).map { receiverId =>
    +      val blockInfoOfParticularReceiver = latestBlockInfos.map { batchInfo =>
    +        batchInfo.get(receiverId).getOrElse(Array.empty)
    +      }
    +      val recordsOfParticularReceiver = blockInfoOfParticularReceiver.map { blockInfo =>
    +      // calculate records per second for each batch
    +        blockInfo.map(_.numRecords).sum.toDouble * 1000 / batchDuration
    +      }
    +      val distributionOption = Distribution(recordsOfParticularReceiver)
    +      (receiverId, distributionOption)
    +    }.toMap
    +  }
    +
    +  def lastReceivedBatchRecords: Map[Int, Long] = {
    +    val lastReceivedBlockInfoOption = lastReceivedBatch.map(_.receivedBlockInfo)
    +    lastReceivedBlockInfoOption.map { lastReceivedBlockInfo =>
    +      (0 until numNetworkReceivers).map { receiverId =>
    +        (receiverId, lastReceivedBlockInfo(receiverId).map(_.numRecords).sum)
    +      }.toMap
    +    }.getOrElse {
    +      (0 until numNetworkReceivers).map(receiverId => (receiverId, 0L)).toMap
    +    }
    +  }
    +
    +  def receiverInfo(receiverId: Int): Option[ReceiverInfo] = {
    +    receiverInfos.get(receiverId)
    +  }
    +
    +  def lastCompletedBatch: Option[BatchInfo] = {
    +    completedaBatchInfos.sortBy(_.batchTime)(Time.ordering).lastOption
    +  }
    +
    +  def lastReceivedBatch: Option[BatchInfo] = {
    +    allBatches.lastOption
    +  }
    +
    +  private def allBatches: Seq[BatchInfo] = synchronized {
    --- End diff --
    
    Maybe this should be called `retainedBatches`? In some sense this is not "all" of the batches, because some have been evicted, right?
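
To illustrate the naming point, here is a minimal sketch of a tracker that keeps only a bounded number of completed batches. It is not the PR's implementation: `Batch`, `BatchTracker`, and `limit` are hypothetical stand-ins (the diff itself uses `BatchInfo` and `batchInfoLimit`), and the eviction policy shown (drop the oldest first) is an assumption.

```scala
import scala.collection.mutable

object RetainedBatchesSketch {
  // Hypothetical stand-in for BatchInfo; only the batch time is modelled.
  final case class Batch(batchTime: Long)

  // Keeps at most `limit` completed batches, evicting the oldest first (assumed policy).
  final class BatchTracker(limit: Int) {
    private val completed = mutable.Queue[Batch]()
    private var totalCompleted = 0L

    def onBatchCompleted(batch: Batch): Unit = synchronized {
      completed.enqueue(batch)
      if (completed.size > limit) completed.dequeue()
      totalCompleted += 1
    }

    // Only the batches still held in memory, hence "retained" rather than "all".
    def retainedBatches: Seq[Batch] = synchronized { completed.toList }

    // Counts every completed batch, including those already evicted.
    def numTotalCompletedBatches: Long = synchronized { totalCompleted }
  }
}
```

With a limit of 2 and five completed batches, `retainedBatches` holds only the last two while `numTotalCompletedBatches` still reports five, which is the distinction the proposed name makes explicit.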


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-39258686
  
    Merged build finished. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-39258514
  
    Merged build started. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40273164
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14072/


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40272667
  
     Merged build triggered. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40167478
  
     Build triggered. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40117879
  
    Jenkins, test this please.


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40045216
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13984/


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11546093
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/UIUtils.scala ---
    @@ -67,19 +154,23 @@ private[spark] object UIUtils {
                   type="text/css" />
             <script src={prependBaseUri("/static/sorttable.js")} ></script>
             <title>{appName} - {title}</title>
    +        <script type="text/JavaScript">
    +          <!--
    +          function timedRefresh(timeoutPeriod) {
    +            if (timeoutPeriod > 0) {
    +              setTimeout("location.reload(true);",timeoutPeriod);
    --- End diff --
    
    Add a space after the ",".


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40270374
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14068/


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40247442
  
    All automated tests passed.
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14059/


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11518334
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/WebUI.scala ---
    @@ -17,53 +17,134 @@
     
     package org.apache.spark.ui
     
    -import java.text.SimpleDateFormat
    -import java.util.Date
    +import javax.servlet.http.HttpServletRequest
     
    -private[spark] abstract class WebUI(name: String) {
    +import scala.collection.mutable.ArrayBuffer
    +import scala.xml.Node
    +
    +import org.eclipse.jetty.servlet.ServletContextHandler
    +import org.json4s.JsonAST.{JNothing, JValue}
    +
    +import org.apache.spark.{Logging, SecurityManager, SparkConf}
    +import org.apache.spark.ui.JettyUtils._
    +import org.apache.spark.util.Utils
    +
    +/**
    + * The top level component of the UI hierarchy that contains the server.
    + *
    + * Each WebUI represents a collection of tabs, each of which in turn represents a collection of
    + * pages. The use of tabs is optional, however; a WebUI may choose to include pages directly.
    + */
    +private[spark] abstract class WebUI(
    +    securityManager: SecurityManager,
    +    port: Int,
    +    conf: SparkConf,
    +    basePath: String = "")
    +  extends Logging {
    +
    +  protected val tabs = ArrayBuffer[WebUITab]()
    +  protected val handlers = ArrayBuffer[ServletContextHandler]()
       protected var serverInfo: Option[ServerInfo] = None
    +  protected val localHostName = Utils.localHostName()
    +  protected val publicHostName = Option(System.getenv("SPARK_PUBLIC_DNS")).getOrElse(localHostName)
    +  private val className = Utils.getFormattedClassName(this)
    +
    +  def getTabs: Seq[WebUITab] = tabs.toSeq
    +  def getHandlers: Seq[ServletContextHandler] = handlers.toSeq
     
    -  /**
    -   * Bind to the HTTP server behind this web interface.
    -   * Overridden implementation should set serverInfo.
    -   */
    -  def bind() { }
    +  /** Attach a tab to this UI, along with all of its attached pages. */
    +  def attachTab(tab: WebUITab) {
    +    tab.pages.foreach(attachPage)
    +    tabs += tab
    +  }
    +
    +  /** Attach a page to this UI. */
    +  def attachPage(page: WebUIPage) {
    +    val pagePath = "/" + page.prefix
    +    attachHandler(createServletHandler(pagePath,
    +      (request: HttpServletRequest) => page.render(request), securityManager, basePath))
    +    if (page.includeJson) {
    +      attachHandler(createServletHandler(pagePath.stripSuffix("/") + "/json",
    +        (request: HttpServletRequest) => page.renderJson(request), securityManager, basePath))
    +    }
    +  }
    +
    +  /** Attach a handler to this UI. */
    +  def attachHandler(handler: ServletContextHandler) {
    +    handlers += handler
    +    serverInfo.foreach { info =>
    +      info.rootHandler.addHandler(handler)
    +      if (!handler.isStarted) {
    +        handler.start()
    +      }
    +    }
    +  }
    +
    +  /** Detach a handler from this UI. */
    +  def detachHandler(handler: ServletContextHandler) {
    +    handlers -= handler
    +    serverInfo.foreach { info =>
    +      info.rootHandler.removeHandler(handler)
    +      if (handler.isStarted) {
    +        handler.stop()
    +      }
    +    }
    +  }
    +
    +  /** Initialize all components of the server. */
    +  def initialize()
    +
    +  /** Bind to the HTTP server behind this web interface. */
    +  def bind() {
    +    assert(!serverInfo.isDefined, "Attempted to bind %s more than once!".format(className))
    +    try {
    +      serverInfo = Some(startJettyServer("0.0.0.0", port, handlers, conf))
    +      logInfo("Started %s at http://%s:%d".format(className, publicHostName, boundPort))
    +    } catch {
    +      case e: Exception =>
    +        logError("Failed to bind %s".format(className), e)
    +        System.exit(1)
    +    }
    +  }
     
       /** Return the actual port to which this server is bound. Only valid after bind(). */
       def boundPort: Int = serverInfo.map(_.boundPort).getOrElse(-1)
     
       /** Stop the server behind this web interface. Only valid after bind(). */
       def stop() {
    -    assert(serverInfo.isDefined, "Attempted to stop %s before binding to a server!".format(name))
    +    assert(serverInfo.isDefined,
    +      "Attempted to stop %s before binding to a server!".format(className))
         serverInfo.get.server.stop()
       }
     }
     
    +
     /**
    - * Utilities used throughout the web UI.
    + * A tab that represents a collection of pages.
      */
    -private[spark] object WebUI {
    -  // SimpleDateFormat is not thread-safe. Don't expose it to avoid improper use.
    -  private val dateFormat = new ThreadLocal[SimpleDateFormat]() {
    -    override def initialValue(): SimpleDateFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss")
    +private[spark] abstract class WebUITab(parent: WebUI, val prefix: String) {
    +  val pages = ArrayBuffer[WebUIPage]()
    +  val name = prefix.capitalize
    +
    +  /** Attach a page to this tab. This prepends the page's prefix with the tab's own prefix. */
    +  def attachPage(page: WebUIPage) {
    +    page.prefix = (prefix + "/" + page.prefix).stripSuffix("/")
    +    pages += page
       }
     
    -  def formatDate(date: Date): String = dateFormat.get.format(date)
    +  /** Get a list of header tabs from the parent UI. */
    +  def headerTabs: Seq[WebUITab] = parent.getTabs
    +}
     
    -  def formatDate(timestamp: Long): String = dateFormat.get.format(new Date(timestamp))
     
    -  def formatDuration(milliseconds: Long): String = {
    -    val seconds = milliseconds.toDouble / 1000
    -    if (seconds < 60) {
    -      return "%.0f s".format(seconds)
    -    }
    -    val minutes = seconds / 60
    -    if (minutes < 10) {
    -      return "%.1f min".format(minutes)
    -    } else if (minutes < 60) {
    -      return "%.0f min".format(minutes)
    -    }
    -    val hours = minutes / 60
    -    "%.1f h".format(hours)
    -  }
    +/**
    + * A page that represents the leaf node in the UI hierarchy.
    + *
    + * The direct parent of a WebUIPage is not specified as it can be either a WebUI or a WebUITab.
    + * If includeJson is true, the parent WebUI (direct or indirect) creates handlers for both the
    + * HTML and the JSON content, rather than just the former.
    + */
    +private[spark] abstract class WebUIPage(var prefix: String, val includeJson: Boolean = false) {
    +  def render(request: HttpServletRequest): Seq[Node] = Seq[Node]()
    --- End diff --
    
    Makes sense. Initially it was purely abstract, but I added the JSON support and wanted `render` and `renderJson` to be consistent. I'll change `render` back to abstract, but keep a default return value for `renderJson`.
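
A minimal sketch of the shape described above, with `render` abstract and `renderJson` carrying a default. The types here are simplified, hypothetical stand-ins: `Request`, `Html`, and `Json` replace `HttpServletRequest`, `Seq[Node]`, and `JValue` so the example has no servlet or json4s dependency. The default of `None` is likewise an assumption; the diff's import of `JNothing` suggests what the real default might be, but the thread does not show it.

```scala
object PageSketch {
  // Hypothetical simplified stand-ins for HttpServletRequest, Seq[Node] and JValue.
  final case class Request(path: String)
  type Html = String
  type Json = Option[String]

  abstract class Page(val prefix: String, val includeJson: Boolean = false) {
    // Abstract: every concrete page must say how it renders HTML.
    def render(request: Request): Html

    // Default: pages that do not serve JSON need not override this.
    def renderJson(request: Request): Json = None
  }

  // A page that only implements the mandatory HTML rendering.
  class HelloPage extends Page("hello") {
    override def render(request: Request): Html = s"<p>hello from ${request.path}</p>"
  }
}
```

Leaving `render` abstract means a page that forgets to implement it fails at compile time instead of silently serving an empty body, while the default on `renderJson` keeps HTML-only pages free of boilerplate.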


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40272670
  
    @tdas yep, merges now. I'll let it run tests and then it's good to go. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40243853
  
     Build triggered. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40150589
  
     Merged build triggered. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11517812
  
    --- Diff: core/src/main/scala/org/apache/spark/ui/SparkUI.scala ---
    @@ -17,107 +17,78 @@
     
     package org.apache.spark.ui
     
    -import org.eclipse.jetty.servlet.ServletContextHandler
    -
    -import org.apache.spark.{Logging, SecurityManager, SparkConf, SparkContext, SparkEnv}
    +import org.apache.spark.{Logging, SecurityManager, SparkConf, SparkContext}
     import org.apache.spark.scheduler._
     import org.apache.spark.storage.StorageStatusListener
     import org.apache.spark.ui.JettyUtils._
    -import org.apache.spark.ui.env.EnvironmentUI
    -import org.apache.spark.ui.exec.ExecutorsUI
    -import org.apache.spark.ui.jobs.JobProgressUI
    -import org.apache.spark.ui.storage.BlockManagerUI
    -import org.apache.spark.util.Utils
    +import org.apache.spark.ui.env.EnvironmentTab
    +import org.apache.spark.ui.exec.ExecutorsTab
    +import org.apache.spark.ui.jobs.JobProgressTab
    +import org.apache.spark.ui.storage.StorageTab
     
    -/** Top level user interface for Spark */
    +/**
    + * Top level user interface for Spark.
    --- End diff --
    
    Not your change, but I realize this isn't the most helpful doc. What about saying `Top level user interface for a Spark application.`? Then in other places, e.g. where these get passed around, it's clearer.


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-39514059
  
    Merged build started. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11498951
  
    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingPage.scala ---
    @@ -0,0 +1,289 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.streaming.ui
    +
    +import java.util.{Calendar, Locale}
    +import javax.servlet.http.HttpServletRequest
    +
    +import scala.xml.Node
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.ui._
    +import org.apache.spark.util.Distribution
    +
    +/** Page for Spark Web UI that shows statistics of a streaming job */
    +private[ui] class StreamingPage(parent: StreamingTab)
    +  extends UIPage("") with Logging {
    +
    +  private val ssc = parent.ssc
    +  private val sc = ssc.sparkContext
    +  private val sparkUI = sc.ui
    +  private val listener = new StreamingProgressListener(ssc)
    +  private val calendar = Calendar.getInstance()
    +  private val startTime = calendar.getTime()
    +  private val emptyCellTest = "-"
    +
    +  ssc.addStreamingListener(listener)
    +  parent.attachPage(this)
    +
    +  /** Render the page */
    +  override def render(request: HttpServletRequest): Seq[Node] = {
    +    val content =
    +      generateBasicStats() ++
    +      <br></br><h4>Statistics over last {listener.completedBatches.size} processed batches</h4> ++
    +      generateNetworkStatsTable() ++
    +      generateBatchStatsTable()
    +    UIUtils.headerSparkPage(
    +      content, sparkUI.basePath, sc.appName, "Streaming", sparkUI.getTabs, parent, Some(5000))
    +  }
    +
    +  /** Generate basic stats of the streaming program */
    +  private def generateBasicStats(): Seq[Node] = {
    +    val timeSinceStart = System.currentTimeMillis() - startTime.getTime
    +    <ul class ="unstyled">
    +      <li>
    +        <strong>Started at: </strong> {startTime.toString}
    +      </li>
    +      <li>
    +        <strong>Time since start: </strong>{msDurationToString(timeSinceStart)}
    +      </li>
    +      <li>
    +        <strong>Network receivers: </strong>{listener.numNetworkReceivers}
    +      </li>
    +      <li>
    +        <strong>Batch interval: </strong>{msDurationToString(listener.batchDuration)}
    +      </li>
    +      <li>
    +        <strong>Processed batches: </strong>{listener.numTotalCompletedBatches}
    +      </li>
    +      <li>
    +        <strong>Waiting batches: </strong>{listener.numUnprocessedBatches}
    +      </li>
    +    </ul>
    +  }
    +
    +  /** Generate stats of data received over the network the streaming program */
    +  private def generateNetworkStatsTable(): Seq[Node] = {
    +    val receivedRecordDistributions = listener.receivedRecordsDistributions
    +    val lastBatchReceivedRecord = listener.lastReceivedBatchRecords
    +    val table = if (receivedRecordDistributions.size > 0) {
    +      val headerRow = Seq(
    +        "Receiver",
    +        "Location",
    +        s"Records in last batch",
    +        "Minimum rate\n[records/sec]",
    +        "25th percentile rate\n[records/sec]",
    +        "Median rate\n[records/sec]",
    +        "75th percentile rate\n[records/sec]",
    +        "Maximum rate\n[records/sec]"
    +      )
    +      val dataRows = (0 until listener.numNetworkReceivers).map { receiverId =>
    +        val receiverInfo = listener.receiverInfo(receiverId)
    +        val receiverName = receiverInfo.map(_.toString).getOrElse(s"Receiver-$receiverId")
    +        val receiverLocation = receiverInfo.map(_.location).getOrElse(emptyCellTest)
    +        val receiverLastBatchRecords = numberToString(lastBatchReceivedRecord(receiverId))
    +        val receivedRecordStats = receivedRecordDistributions(receiverId).map { d =>
    +          d.getQuantiles().map(r => numberToString(r.toLong))
    +        }.getOrElse {
    +          Seq(emptyCellTest, emptyCellTest, emptyCellTest, emptyCellTest, emptyCellTest)
    +        }
    +        Seq(receiverName, receiverLocation, receiverLastBatchRecords) ++
    +          receivedRecordStats
    --- End diff --
    
    I think this can be bumped up to the previous line.


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11543991
  
    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingPage.scala ---
    @@ -0,0 +1,180 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.streaming.ui
    +
    +import java.util.Calendar
    +import javax.servlet.http.HttpServletRequest
    +
    +import scala.xml.Node
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.ui._
    +import org.apache.spark.ui.UIUtils._
    +import org.apache.spark.util.Distribution
    +
    +/** Page for Spark Web UI that shows statistics of a streaming job */
    +private[ui] class StreamingPage(parent: StreamingTab)
    +  extends WebUIPage("") with Logging {
    +
    +  private val listener = parent.listener
    +  private val startTime = Calendar.getInstance().getTime()
    +  private val emptyCellTest = "-"
    --- End diff --
    
    Could this just be called `emptyCell`? It's not clear to me how this is a test of something.


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40169633
  
    Build finished. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40167527
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14043/


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40169634
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14045/


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40243025
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14058/


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40045215
  
    Merged build finished. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40269688
  
    Build started. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40269587
  
    Jenkins, test this please.


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40165466
  
     Build triggered. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11181569
  
    --- Diff: streaming/src/test/scala/org/apache/spark/streaming/UISuite.scala ---
    @@ -0,0 +1,52 @@
    +package org.apache.spark.streaming
    --- End diff --
    
    Please ignore this file for now. This is WIP and more UI unit tests need to be added.


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming [WIP]

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40038258
  
     Build triggered. 


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/290#discussion_r11498363
  
    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingPage.scala ---
    @@ -0,0 +1,289 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.streaming.ui
    +
    +import java.util.{Calendar, Locale}
    +import javax.servlet.http.HttpServletRequest
    +
    +import scala.xml.Node
    +
    +import org.apache.spark.Logging
    +import org.apache.spark.ui._
    +import org.apache.spark.util.Distribution
    +
    +/** Page for Spark Web UI that shows statistics of a streaming job */
    +private[ui] class StreamingPage(parent: StreamingTab)
    +  extends UIPage("") with Logging {
    +
    +  private val ssc = parent.ssc
    +  private val sc = ssc.sparkContext
    +  private val sparkUI = sc.ui
    +  private val listener = new StreamingProgressListener(ssc)
    +  private val calendar = Calendar.getInstance()
    +  private val startTime = calendar.getTime()
    +  private val emptyCellTest = "-"
    +
    +  ssc.addStreamingListener(listener)
    +  parent.attachPage(this)
    +
    +  /** Render the page */
    +  override def render(request: HttpServletRequest): Seq[Node] = {
    +    val content =
    +      generateBasicStats() ++
    +      <br></br><h4>Statistics over last {listener.completedBatches.size} processed batches</h4> ++
    +      generateNetworkStatsTable() ++
    +      generateBatchStatsTable()
    +    UIUtils.headerSparkPage(
    +      content, sparkUI.basePath, sc.appName, "Streaming", sparkUI.getTabs, parent, Some(5000))
    +  }
    +
    +  /** Generate basic stats of the streaming program */
    +  private def generateBasicStats(): Seq[Node] = {
    +    val timeSinceStart = System.currentTimeMillis() - startTime.getTime
    +    <ul class ="unstyled">
    +      <li>
    +        <strong>Started at: </strong> {startTime.toString}
    +      </li>
    +      <li>
    +        <strong>Time since start: </strong>{msDurationToString(timeSinceStart)}
    +      </li>
    +      <li>
    +        <strong>Network receivers: </strong>{listener.numNetworkReceivers}
    +      </li>
    +      <li>
    +        <strong>Batch interval: </strong>{msDurationToString(listener.batchDuration)}
    +      </li>
    +      <li>
    +        <strong>Processed batches: </strong>{listener.numTotalCompletedBatches}
    +      </li>
    +      <li>
    +        <strong>Waiting batches: </strong>{listener.numUnprocessedBatches}
    +      </li>
    +    </ul>
    +  }
    +
    +  /** Generate stats of data received over the network by the streaming program */
    +  private def generateNetworkStatsTable(): Seq[Node] = {
    +    val receivedRecordDistributions = listener.receivedRecordsDistributions
    +    val lastBatchReceivedRecord = listener.lastReceivedBatchRecords
    +    val table = if (receivedRecordDistributions.size > 0) {
    +      val headerRow = Seq(
    +        "Receiver",
    +        "Location",
    +        s"Records in last batch",
    --- End diff --
    
    The `s` prefix is not needed here, since the string contains no interpolated values.
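
A minimal sketch of the point above, assuming plain Scala 2.10 or later with no Spark dependency. The object name `InterpolatorDemo` and all value names are hypothetical and do not appear in the patch; the second pair of strings borrows the wording of the `<h4>` heading in the diff only as sample text.

```scala
object InterpolatorDemo {
  // No placeholders: the `s` interpolator has nothing to substitute,
  // so the plain literal is equivalent and slightly simpler.
  val interpolated: String = s"Records in last batch"
  val plain: String = "Records in last batch"

  // With a placeholder, the `s` prefix is what triggers substitution.
  val batches: Int = 10
  val withValue: String = s"Statistics over last $batches processed batches"

  // Without the prefix, `$batches` stays in the string as literal text.
  val withoutPrefix: String = "Statistics over last $batches processed batches"

  def main(args: Array[String]): Unit = {
    println(plain == interpolated) // the two header strings are identical
    println(withValue)
    println(withoutPrefix)
  }
}
```

The header row only needs the interpolator on entries that embed a value; constant column names can stay as plain literals.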


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40118057
  
    @tdas Let me know when you're trying to merge with master. #204 was just merged in and there will be plenty of UI conflicts. :)


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by pwendell <gi...@git.apache.org>.
Github user pwendell commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40273215
  
    Thanks - merged this and picked it into 1.0. @andrewor14: get some sleep.


---

[GitHub] spark pull request: [SPARK-1386] Web UI for Spark Streaming

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/290#issuecomment-40045530
  
    @pwendell This is ready for review.


---