Posted to reviews@spark.apache.org by "jasonli-db (via GitHub)" <gi...@apache.org> on 2023/07/12 17:21:19 UTC

[GitHub] [spark] jasonli-db opened a new pull request, #41964: [WIP][SPARK-44394] Add a Spark UI page for Spark Connect

jasonli-db opened a new pull request, #41964:
URL: https://github.com/apache/spark/pull/41964

   
   ## What changes were proposed in this pull request?
   Add a new Spark UI page to display session and execution information for Spark Connect. This builds on the work in SPARK-43923 (https://github.com/apache/spark/pull/41443), which adds the relevant SparkListenerEvents, and mirrors the ThriftServerPage in the Spark UI for JDBC/ODBC.
   
   ### Why are the changes needed?
   This gives users a way to access session and execution information for Spark Connect via the UI and provides the frontend interface for the related SparkListenerEvents.
   
   ### Does this PR introduce _any_ user-facing change?
   Yes, it adds a new tab/page to the Spark UI.
   
   
   ### How was this patch tested?
   Unit tests
   




[GitHub] [spark] juliuszsompolski commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "juliuszsompolski (via GitHub)" <gi...@apache.org>.
juliuszsompolski commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1274726055


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerListener.scala:
##########
@@ -0,0 +1,369 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import scala.collection.mutable
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.{SparkConf, SparkContext, SparkEnv}
+import org.apache.spark.internal.Logging
+import org.apache.spark.internal.config.Status.LIVE_ENTITY_UPDATE_PERIOD
+import org.apache.spark.scheduler._
+import org.apache.spark.sql.connect.config.Connect.{CONNECT_UI_SESSION_LIMIT, CONNECT_UI_STATEMENT_LIMIT}
+import org.apache.spark.sql.connect.service._
+import org.apache.spark.sql.execution.SQLExecution
+import org.apache.spark.status.{ElementTrackingStore, KVUtils, LiveEntity}
+
+private[connect] class SparkConnectServerListener(
+    kvstore: ElementTrackingStore,
+    sparkConf: SparkConf,
+    live: Boolean = true)
+    extends SparkListener
+    with Logging {
+
+  private val sessionList = new mutable.LinkedHashMap[String, LiveSessionData]
+  private val executionList = new mutable.LinkedHashMap[String, LiveExecutionData]
+
+  private val (retainedStatements: Int, retainedSessions: Int) = {
+    (
+      SparkEnv.get.conf.get(CONNECT_UI_STATEMENT_LIMIT),
+      SparkEnv.get.conf.get(CONNECT_UI_SESSION_LIMIT))
+  }
+
+  // How often to update live entities. -1 means "never update" when replaying applications,
+  // meaning only the last write will happen. For live applications, this avoids a few
+  // operations that we can live without when rapidly processing incoming events.
+  private val liveUpdatePeriodNs = if (live) sparkConf.get(LIVE_ENTITY_UPDATE_PERIOD) else -1L
+
+  // Returns true if this listener has no live data. Exposed for tests only.
+  private[connect] def noLiveData(): Boolean = synchronized {
+    sessionList.isEmpty && executionList.isEmpty
+  }
+
+  kvstore.addTrigger(classOf[SessionInfo], retainedSessions) { count =>
+    cleanupSession(count)
+  }
+
+  kvstore.addTrigger(classOf[ExecutionInfo], retainedStatements) { count =>
+    cleanupExecutions(count)
+  }
+
+  override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
+    val executionIdOpt: Option[String] = Option(jobStart.properties)
+      .flatMap { p => Option(p.getProperty(SQLExecution.EXECUTION_ID_KEY)) }

Review Comment:
   Discussed offline with @jasonli-db and @jdesjean: executions that don't start a Spark job (e.g. SELECT 1) will not be recorded. For that reason, as a followup, it would be best if jobTags could also be added to the SparkListenerSQLExecutionStart event.
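   For reference, a minimal sketch of what that followup could look like (the event's existing fields are abbreviated here, and the exact shape of the change is an assumption, not the final API):
   
   ```
   import org.apache.spark.scheduler.SparkListenerEvent
   
   // Hypothetical sketch: carry job tags on the SQL execution start event so that
   // executions which never launch a Spark job (e.g. SELECT 1) can still be
   // attributed to a Connect operation by the listener.
   case class SparkListenerSQLExecutionStart(
       executionId: Long,
       description: String,
       details: String,
       physicalPlanDescription: String,
       time: Long,
       jobTags: Set[String] = Set.empty) // proposed addition
     extends SparkListenerEvent
   ```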





[GitHub] [spark] jasonli-db commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "jasonli-db (via GitHub)" <gi...@apache.org>.
jasonli-db commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1654798569

   > OperationId could indeed be more useful instead. If a user uses a Connect API like SparkSession.interruptOperation, they pass just the operationId. The only time a user would want to use jobTag is if they wanted to call SparkContext.interruptJobsWithTag directly, which is a rather internal use. We could consider displaying SparkListenerConnectOperationStarted.sparkSessionTags, because these are user-settable (via Connect SparkSession.addTag / removeTag / clearTags), and they are used by users to call Connect SparkSession.interruptTag.
   
   @juliuszsompolski @gengliangwang I've updated the screenshots. I added the operationID and sparkSessionTags. For now, I kept the jobTag since it seems it could still be used, if only internally. I'll remove it if/when we come to a final consensus to do so. Please take a look!
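   For context, a hedged usage sketch of the tag/interrupt APIs discussed above, as exposed on the Spark Connect client SparkSession (the connection string and tag names are illustrative):
   
   ```
   import org.apache.spark.sql.SparkSession
   
   val spark = SparkSession.builder().remote("sc://localhost").getOrCreate()
   
   spark.addTag("nightly-etl")        // tag executions started after this point
   // ... launch long-running queries from another thread ...
   spark.interruptTag("nightly-etl")  // interrupt everything carrying the tag
   // or, using an operationId surfaced in the new UI page:
   // spark.interruptOperation("<operation-id>")
   spark.clearTags()
   ```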




[GitHub] [spark] juliuszsompolski commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "juliuszsompolski (via GitHub)" <gi...@apache.org>.
juliuszsompolski commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1266396999


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectServer.scala:
##########
@@ -49,6 +49,7 @@ object SparkConnectServer extends Logging {
       SparkConnectService.server.awaitTermination()
     } finally {
       session.stop()
+      SparkConnectService.uiTab.foreach(_.detach())

Review Comment:
   wouldn't SparkConnectService.stop() already have been called somewhere and have detached it?



##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala:
##########
@@ -179,6 +182,9 @@ object SparkConnectService {
 
   private[connect] var server: Server = _
 
+  var uiTab: Option[SparkConnectServerTab] = None
+  var listener: SparkConnectServerListener = _

Review Comment:
   could both be private, or `private[connect]`?



##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala:
##########
@@ -259,13 +281,15 @@ object SparkConnectService {
     if (debugMode) {
       sb.addService(ProtoReflectionService.newInstance())
     }
+
     server = sb.build
     server.start()
+    createListenerAndUI(sc)

Review Comment:
   nit: maybe move `createListenerAndUI` to `start`, just like `detach` is in `stop`. Then we wouldn't need to pass `sc` to `startGRPCService`.





[GitHub] [spark] jasonli-db commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "jasonli-db (via GitHub)" <gi...@apache.org>.
jasonli-db commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1648332877

   > > Since Spark Connect is mostly about SQL, shall we show the SQL execution links on the UI?
   > 
   > @jasonli-db What is your opinion of this one?
   
   I agree that adding the SQL execution links would be ideal and useful for the user. @juliuszsompolski Can you give your input on this and confirm whether or not this is currently possible? Do we have a link between SQL execution IDs and Connect requests?




[GitHub] [spark] jasonli-db commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "jasonli-db (via GitHub)" <gi...@apache.org>.
jasonli-db commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1654793631

   > @jasonli-db Please try more tests. I did a quick test and found that the SQL IDs need deduplication. <img alt="image" width="2000" src="https://user-images.githubusercontent.com/1097932/256668978-975760a1-c60c-451c-b74d-2a36e7b4a7d4.png">
   
   Thanks for catching this. It's been fixed.




[GitHub] [spark] rednaxelafx commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "rednaxelafx (via GitHub)" <gi...@apache.org>.
rednaxelafx commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1276799942


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerHistoryServerPlugin.scala:
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import org.apache.spark.SparkConf
+import org.apache.spark.scheduler.SparkListener
+import org.apache.spark.status.{AppHistoryServerPlugin, ElementTrackingStore}
+import org.apache.spark.ui.SparkUI
+
+class SparkConnectServerHistoryServerPlugin extends AppHistoryServerPlugin {
+
+  override def createListeners(
+      conf: SparkConf,
+      store: ElementTrackingStore): Seq[SparkListener] = {
+    Seq(new SparkConnectServerListener(store, conf))
+  }
+
+  override def setupUI(ui: SparkUI): Unit = {
+    val store = new SparkConnectServerAppStatusStore(ui.store.store)
+    if (store.getSessionCount > 0) {
+      new SparkConnectServerTab(store, ui)
+    }
+  }
+
+  override def displayOrder: Int = 1

Review Comment:
   I prefer doing the wholesale update in a separate PR, but please check with the Apache Spark committers in this thread to confirm 🙏 






[GitHub] [spark] juliuszsompolski commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "juliuszsompolski (via GitHub)" <gi...@apache.org>.
juliuszsompolski commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1656786225

   Thanks @gengliangwang. Since the conflict was trivial (imports), I resolved it and raised https://github.com/apache/spark/pull/42224 for branch-3.5.




[GitHub] [spark] gengliangwang commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1648322890

   > Since Spark Connect is mostly about SQL, shall we show the SQL execution links on the UI?
   
   @jasonli-db What is your opinion of this one?
   
   




[GitHub] [spark] gengliangwang commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1276817771


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerPage.scala:
##########
@@ -0,0 +1,529 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import java.net.URLEncoder
+import java.nio.charset.StandardCharsets.UTF_8
+import javax.servlet.http.HttpServletRequest
+
+import scala.xml.Node
+
+import org.apache.commons.text.StringEscapeUtils
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.connect.ui.ToolTips._
+import org.apache.spark.ui._
+import org.apache.spark.ui.UIUtils._
+import org.apache.spark.util.Utils
+
+/** Page for Spark UI that shows statistics for a Spark Connect Server. */
+private[ui] class SparkConnectServerPage(parent: SparkConnectServerTab)
+    extends WebUIPage("")
+    with Logging {
+  private val store = parent.store
+  private val startTime = parent.startTime
+
+  /** Render the page */
+  def render(request: HttpServletRequest): Seq[Node] = {
+    val content = store.synchronized { // make sure all parts in this page are consistent
+      generateBasicStats() ++
+        <br/> ++
+        <h4>
+          {store.getOnlineSessionNum}
+          session(s) are online,
+          running
+          {store.getTotalRunning}
+          Request(s)
+        </h4> ++
+        generateSessionStatsTable(request) ++
+        generateSQLStatsTable(request)
+    }
+    UIUtils.headerSparkPage(request, "Spark Connect", content, parent)
+  }
+
+  /** Generate basic stats of the Spark Connect server */
+  private def generateBasicStats(): Seq[Node] = {
+    val timeSinceStart = System.currentTimeMillis() - startTime.getTime
+    <ul class ="list-unstyled">
+      <li>
+        <strong>Started at: </strong> {formatDate(startTime)}
+      </li>
+      <li>
+        <strong>Time since start: </strong>{formatDurationVerbose(timeSinceStart)}
+      </li>
+    </ul>
+  }
+
+  /** Generate stats of batch statements of the Spark Connect program */
+  private def generateSQLStatsTable(request: HttpServletRequest): Seq[Node] = {
+
+    val numStatement = store.getExecutionList.size
+
+    val table = if (numStatement > 0) {
+
+      val sqlTableTag = "sqlstat"
+
+      val sqlTablePage =
+        Option(request.getParameter(s"$sqlTableTag.page")).map(_.toInt).getOrElse(1)
+
+      try {
+        Some(
+          new SqlStatsPagedTable(
+            request,
+            parent,
+            store.getExecutionList,
+            "connect",
+            UIUtils.prependBaseUri(request, parent.basePath),
+            sqlTableTag,
+            showSessionLink = true).table(sqlTablePage))
+      } catch {
+        case e @ (_: IllegalArgumentException | _: IndexOutOfBoundsException) =>
+          Some(<div class="alert alert-error">
+            <p>Error while rendering job table:</p>
+            <pre>
+              {Utils.exceptionString(e)}
+            </pre>
+          </div>)
+      }
+    } else {
+      None
+    }
+    val content =
+      <span id="sqlstat" class="collapse-aggregated-sqlstat collapse-table"
+            onClick="collapseTable('collapse-aggregated-sqlstat',
+                'aggregated-sqlstat')">
+        <h4>
+          <span class="collapse-table-arrow arrow-open"></span>
+          <a>Request Statistics ({numStatement})</a>
+        </h4>
+      </span> ++
+        <div class="aggregated-sqlstat collapsible-table">
+          {table.getOrElse("No statistics have been generated yet.")}
+        </div>
+    content
+  }
+
+  /** Generate stats of batch sessions of the Spark Connect server */
+  private def generateSessionStatsTable(request: HttpServletRequest): Seq[Node] = {
+    val numSessions = store.getSessionList.size
+    val table = if (numSessions > 0) {
+
+      val sessionTableTag = "sessionstat"
+
+      val sessionTablePage =
+        Option(request.getParameter(s"$sessionTableTag.page")).map(_.toInt).getOrElse(1)
+
+      try {
+        Some(
+          new SessionStatsPagedTable(
+            request,
+            parent,
+            store.getSessionList,
+            "connect",
+            UIUtils.prependBaseUri(request, parent.basePath),
+            sessionTableTag).table(sessionTablePage))
+      } catch {
+        case e @ (_: IllegalArgumentException | _: IndexOutOfBoundsException) =>
+          Some(<div class="alert alert-error">
+            <p>Error while rendering job table:</p>
+            <pre>
+              {Utils.exceptionString(e)}
+            </pre>
+          </div>)
+      }
+    } else {
+      None
+    }
+
+    val content =
+      <span id="sessionstat" class="collapse-aggregated-sessionstat collapse-table"
+            onClick="collapseTable('collapse-aggregated-sessionstat',
+                'aggregated-sessionstat')">
+        <h4>
+          <span class="collapse-table-arrow arrow-open"></span>
+          <a>Session Statistics ({numSessions})</a>
+        </h4>
+      </span> ++
+        <div class="aggregated-sessionstat collapsible-table">
+          {table.getOrElse("No statistics have been generated yet.")}
+        </div>
+
+    content
+  }
+}
+
+private[ui] class SqlStatsPagedTable(
+    request: HttpServletRequest,
+    parent: SparkConnectServerTab,
+    data: Seq[ExecutionInfo],
+    subPath: String,
+    basePath: String,
+    sqlStatsTableTag: String,
+    showSessionLink: Boolean)
+    extends PagedTable[SqlStatsTableRow] {
+  private val (sortColumn, desc, pageSize) =
+    getTableParameters(request, sqlStatsTableTag, "Start Time")
+
+  private val encodedSortColumn = URLEncoder.encode(sortColumn, UTF_8.name())
+
+  private val parameterPath =
+    s"$basePath/$subPath/?${getParameterOtherTable(request, sqlStatsTableTag)}"
+
+  override val dataSource = new SqlStatsTableDataSource(data, pageSize, sortColumn, desc)
+
+  override def tableId: String = sqlStatsTableTag
+
+  override def tableCssClass: String =
+    "table table-bordered table-sm table-striped table-head-clickable table-cell-width-limited"
+
+  override def pageLink(page: Int): String = {
+    parameterPath +
+      s"&$pageNumberFormField=$page" +
+      s"&$sqlStatsTableTag.sort=$encodedSortColumn" +
+      s"&$sqlStatsTableTag.desc=$desc" +
+      s"&$pageSizeFormField=$pageSize" +
+      s"#$sqlStatsTableTag"
+  }
+
+  override def pageSizeFormField: String = s"$sqlStatsTableTag.pageSize"
+
+  override def pageNumberFormField: String = s"$sqlStatsTableTag.page"
+
+  override def goButtonFormPath: String =
+    s"$parameterPath&$sqlStatsTableTag.sort=$encodedSortColumn" +
+      s"&$sqlStatsTableTag.desc=$desc#$sqlStatsTableTag"
+
+  override def headers: Seq[Node] = {
+    val sqlTableHeadersAndTooltips: Seq[(String, Boolean, Option[String])] =
+      if (showSessionLink) {
+        Seq(
+          ("User", true, None),
+          ("Job Tag", true, None),
+          ("Job ID", true, None),
+          ("SQL Query ID", true, None),
+          ("Session ID", true, None),
+          ("Start Time", true, None),
+          ("Finish Time", true, Some(SPARK_CONNECT_SERVER_FINISH_TIME)),
+          ("Close Time", true, Some(SPARK_CONNECT_SERVER_CLOSE_TIME)),
+          ("Execution Time", true, Some(SPARK_CONNECT_SERVER_EXECUTION)),
+          ("Duration", true, Some(SPARK_CONNECT_SERVER_DURATION)),
+          ("Statement", true, None),
+          ("State", true, None),
+          ("Detail", true, None))
+      } else {
+        Seq(
+          ("User", true, None),
+          ("Job Tag", true, None),
+          ("Job ID", true, None),
+          ("SQL Query ID", true, None),
+          ("Start Time", true, None),
+          ("Finish Time", true, Some(SPARK_CONNECT_SERVER_FINISH_TIME)),
+          ("Close Time", true, Some(SPARK_CONNECT_SERVER_CLOSE_TIME)),
+          ("Execution Time", true, Some(SPARK_CONNECT_SERVER_EXECUTION)),
+          ("Duration", true, Some(SPARK_CONNECT_SERVER_DURATION)),
+          ("Statement", true, None),
+          ("State", true, None),
+          ("Detail", true, None))
+      }
+
+    isSortColumnValid(sqlTableHeadersAndTooltips, sortColumn)
+
+    headerRow(
+      sqlTableHeadersAndTooltips,
+      desc,
+      pageSize,
+      sortColumn,
+      parameterPath,
+      sqlStatsTableTag,
+      sqlStatsTableTag)
+  }
+
+  override def row(sqlStatsTableRow: SqlStatsTableRow): Seq[Node] = {
+    val info = sqlStatsTableRow.executionInfo
+    val startTime = info.startTimestamp
+    val executionTime = sqlStatsTableRow.executionTime
+    val duration = sqlStatsTableRow.duration
+
+    def jobLinks(jobData: Seq[String]): Seq[Node] = {
+      jobData.map { jobId =>
+        <a href={jobURL(request, jobId)}>[{jobId}]</a>
+      }
+    }
+    def sqlLinks(sqlData: Seq[String]): Seq[Node] = {
+      sqlData.map { sqlExecId =>
+        <a href={sqlURL(request, sqlExecId)}>[{sqlExecId}]</a>
+      }
+    }
+    val sessionLink = "%s/%s/session/?id=%s".format(
+      UIUtils.prependBaseUri(request, parent.basePath),
+      parent.prefix,
+      info.sessionId)
+
+    <tr>
+      <td>
+        {info.userId}
+      </td>
+      <td>
+        {info.jobTag}
+      </td>
+      <td>
+        {jobLinks(sqlStatsTableRow.jobId)}
+      </td>
+      <td>
+        {sqlLinks({ info.sqlExecId })}
+      </td>
+      {
+      if (showSessionLink) {
+        <td>
+          <a href={sessionLink}>{info.sessionId}</a>
+        </td>
+      }
+    }
+      <td>
+        {UIUtils.formatDate(startTime)}
+      </td>
+      <td>
+        {if (info.finishTimestamp > 0) formatDate(info.finishTimestamp)}
+      </td>
+      <td>
+        {if (info.closeTimestamp > 0) formatDate(info.closeTimestamp)}
+      </td>
+      <!-- Returns a human-readable string representing a duration such as "5 second 35 ms"-->
+      <td >
+        {formatDurationVerbose(executionTime)}
+      </td>
+      <td >
+        {formatDurationVerbose(duration)}
+      </td>
+      <td>
+        <span class="description-input">
+          {info.statement}
+        </span>
+      </td>
+      <td>
+        {info.state}
+      </td>
+      {errorMessageCell(Option(sqlStatsTableRow.detail))}
+    </tr>
+  }
+
+  private def errorMessageCell(errorMessageOption: Option[String]): Seq[Node] = {
+    val errorMessage = errorMessageOption.getOrElse("")
+    val isMultiline = errorMessage.indexOf('\n') >= 0
+    val errorSummary = StringEscapeUtils.escapeHtml4(if (isMultiline) {
+      errorMessage.substring(0, errorMessage.indexOf('\n'))
+    } else {
+      errorMessage
+    })
+    val details = detailsUINode(isMultiline, errorMessage)
+    <td>
+      {errorSummary}{details}
+    </td>
+  }
+
+  private def jobURL(request: HttpServletRequest, jobId: String): String =
+    "%s/jobs/job/?id=%s".format(UIUtils.prependBaseUri(request, parent.basePath), jobId)
+
+  private def sqlURL(request: HttpServletRequest, sqlExecId: String): String =
+    "%s/SQL/execution/?id=%s".format(UIUtils.prependBaseUri(request, parent.basePath), sqlExecId)
+}
+
+private[ui] class SessionStatsPagedTable(
+    request: HttpServletRequest,
+    parent: SparkConnectServerTab,
+    data: Seq[SessionInfo],
+    subPath: String,
+    basePath: String,
+    sessionStatsTableTag: String)
+    extends PagedTable[SessionInfo] {
+
+  private val (sortColumn, desc, pageSize) =
+    getTableParameters(request, sessionStatsTableTag, "Start Time")
+
+  private val encodedSortColumn = URLEncoder.encode(sortColumn, UTF_8.name())
+
+  private val parameterPath =
+    s"$basePath/$subPath/?${getParameterOtherTable(request, sessionStatsTableTag)}"
+
+  override val dataSource = new SessionStatsTableDataSource(data, pageSize, sortColumn, desc)
+
+  override def tableId: String = sessionStatsTableTag
+
+  override def tableCssClass: String =
+    "table table-bordered table-sm table-striped table-head-clickable table-cell-width-limited"
+
+  override def pageLink(page: Int): String = {
+    parameterPath +
+      s"&$pageNumberFormField=$page" +
+      s"&$sessionStatsTableTag.sort=$encodedSortColumn" +
+      s"&$sessionStatsTableTag.desc=$desc" +
+      s"&$pageSizeFormField=$pageSize" +
+      s"#$sessionStatsTableTag"
+  }
+
+  override def pageSizeFormField: String = s"$sessionStatsTableTag.pageSize"
+
+  override def pageNumberFormField: String = s"$sessionStatsTableTag.page"
+
+  override def goButtonFormPath: String =
+    s"$parameterPath&$sessionStatsTableTag.sort=$encodedSortColumn" +
+      s"&$sessionStatsTableTag.desc=$desc#$sessionStatsTableTag"
+
+  override def headers: Seq[Node] = {
+    val sessionTableHeadersAndTooltips: Seq[(String, Boolean, Option[String])] =
+      Seq(
+        ("User", true, None),
+        ("Session ID", true, None),
+        ("Start Time", true, None),
+        ("Finish Time", true, None),
+        ("Duration", true, Some(SPARK_CONNECT_SESSION_DURATION)),
+        ("Total Execute", true, Some(SPARK_CONNECT_SESSION_TOTAL_EXECUTE)))
+
+    isSortColumnValid(sessionTableHeadersAndTooltips, sortColumn)
+
+    headerRow(
+      sessionTableHeadersAndTooltips,
+      desc,
+      pageSize,
+      sortColumn,
+      parameterPath,
+      sessionStatsTableTag,
+      sessionStatsTableTag)
+  }
+
+  override def row(session: SessionInfo): Seq[Node] = {
+    val sessionLink = "%s/%s/session/?id=%s".format(
+      UIUtils.prependBaseUri(request, parent.basePath),
+      parent.prefix,
+      session.sessionId)
+    <tr>
+      <td> {session.userId} </td>
+      <td> <a href={sessionLink}> {session.sessionId} </a> </td>
+      <td> {formatDate(session.startTimestamp)} </td>
+      <td> {if (session.finishTimestamp > 0) formatDate(session.finishTimestamp)} </td>
+      <td> {formatDurationVerbose(session.totalTime)} </td>
+      <td> {session.totalExecution.toString} </td>
+    </tr>
+  }
+}
+
+private[ui] class SqlStatsTableRow(
+    val jobTag: String,
+    val jobId: Seq[String],
+    val sqlExecId: Seq[String],
+    val duration: Long,
+    val executionTime: Long,
+    val executionInfo: ExecutionInfo,
+    val detail: String)
+
+private[ui] class SqlStatsTableDataSource(
+    info: Seq[ExecutionInfo],
+    pageSize: Int,
+    sortColumn: String,
+    desc: Boolean)
+    extends PagedDataSource[SqlStatsTableRow](pageSize) {
+
+  // Convert ExecutionInfo to SqlStatsTableRow which contains the final contents to show in
+  // the table so that we can avoid creating duplicate contents during sorting the data
+  private val data = info.map(sqlStatsTableRow).sorted(ordering(sortColumn, desc))
+
+  override def dataSize: Int = data.size
+
+  override def sliceData(from: Int, to: Int): Seq[SqlStatsTableRow] = data.slice(from, to)
+
+  private def sqlStatsTableRow(executionInfo: ExecutionInfo): SqlStatsTableRow = {
+    val duration = executionInfo.totalTime(executionInfo.closeTimestamp)
+    val executionTime = executionInfo.totalTime(executionInfo.finishTimestamp)
+    val detail = Option(executionInfo.detail)
+      .filter(!_.isEmpty)
+      .getOrElse(executionInfo.executePlan)
+    val jobId = executionInfo.jobId.toSeq.sorted
+    val sqlExecId = executionInfo.sqlExecId.toSeq.sorted
+
+    new SqlStatsTableRow(
+      executionInfo.jobTag,
+      jobId,
+      sqlExecId,
+      duration,
+      executionTime,
+      executionInfo,
+      detail)
+  }
+
+  /**
+   * Return Ordering according to sortColumn and desc.
+   */
+  private def ordering(sortColumn: String, desc: Boolean): Ordering[SqlStatsTableRow] = {
+    val ordering: Ordering[SqlStatsTableRow] = sortColumn match {
+      case "User" => Ordering.by(_.executionInfo.userId)
+      case "Job Tag" => Ordering.by(_.executionInfo.jobTag)
+      case "Job ID" => Ordering by (_.jobId.headOption)

Review Comment:
   <img width="354" alt="image" src="https://github.com/apache/spark/assets/1097932/62c47d17-56b2-4187-ab59-f96a5af1b240">
   Is the ordering working?
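   A possible explanation (an assumption from the screenshot): `Ordering.by(_.jobId.headOption)` compares the IDs as strings, so "10" sorts before "2". A hedged sketch of a numeric comparison for this match arm, assuming job IDs always parse as integers:
   
   ```
   case "Job ID" => Ordering.by(_.jobId.headOption.map(_.toInt))
   ```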





[GitHub] [spark] jasonli-db commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "jasonli-db (via GitHub)" <gi...@apache.org>.
jasonli-db commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1654794159

   > Nit: we can have a follow-up to improve the page when a session doesn't exist (e.g. evicted from the KV store): <img alt="image" width="750" src="https://user-images.githubusercontent.com/1097932/256671033-1171802f-942e-424e-8d2c-154da05ece1e.png">
   > 
   > For example, a non-existent ID is handled in the SQL page: <img alt="image" width="1041" src="https://user-images.githubusercontent.com/1097932/256671131-8adea10a-e653-40b7-8190-a175301ff41f.png">
   
   Addressed to match the behavior of the SQL page.




[GitHub] [spark] jasonli-db commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "jasonli-db (via GitHub)" <gi...@apache.org>.
jasonli-db commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1276744354


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerHistoryServerPlugin.scala:
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import org.apache.spark.SparkConf
+import org.apache.spark.scheduler.SparkListener
+import org.apache.spark.status.{AppHistoryServerPlugin, ElementTrackingStore}
+import org.apache.spark.ui.SparkUI
+
+class SparkConnectServerHistoryServerPlugin extends AppHistoryServerPlugin {
+
+  override def createListeners(
+      conf: SparkConf,
+      store: ElementTrackingStore): Seq[SparkListener] = {
+    Seq(new SparkConnectServerListener(store, conf))
+  }
+
+  override def setupUI(ui: SparkUI): Unit = {
+    val store = new SparkConnectServerAppStatusStore(ui.store.store)
+    if (store.getSessionCount > 0) {
+      new SparkConnectServerTab(store, ui)
+    }
+  }
+
+  override def displayOrder: Int = 1

Review Comment:
   Agreed, we should update them all at once. Should we do it in a separate PR or just update all of them in this one?





[GitHub] [spark] jasonli-db commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "jasonli-db (via GitHub)" <gi...@apache.org>.
jasonli-db commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1654629558

   > Do we expect users to set the job tags? The generated tags are long and meaningless. <img alt="image" width="1996" src="https://user-images.githubusercontent.com/1097932/256669420-88d105a3-4bb8-4241-a9b4-987f78520b0a.png">
   
   @juliuszsompolski @jdesjean Is there any use in including the job tags? Now that I think about it, since the job tag is constructed from the sessionID, userID, and operationID, I think we can instead just add a column for the operationID and remove the jobTag column. What do you think?
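   For illustration, a sketch of the assumed tag composition (the exact format string and extractor are assumptions, not the real `ExecuteJobTag` implementation):
   
   ```
   // Hypothetical sketch of how the Connect job tag is built from the three IDs,
   // which is why a separate operationID column would carry the same information.
   object ExecuteJobTag {
     private val prefix = "SparkConnect_OperationTag"
   
     def apply(sessionId: String, userId: String, operationId: String): String =
       s"${prefix}_User_${userId}_Session_${sessionId}_Operation_${operationId}"
   
     def unapply(jobTag: String): Option[String] =
       if (jobTag.startsWith(prefix)) Some(jobTag) else None
   }
   ```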




[GitHub] [spark] jasonli-db commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "jasonli-db (via GitHub)" <gi...@apache.org>.
jasonli-db commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1267386309


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectServer.scala:
##########
@@ -49,6 +49,7 @@ object SparkConnectServer extends Logging {
       SparkConnectService.server.awaitTermination()
     } finally {
       session.stop()
+      SparkConnectService.uiTab.foreach(_.detach())

Review Comment:
   Does it get called on this path? I only saw `SparkConnectService.server.awaitTermination()` on line 49. When I searched for `SparkConnectService.stop()`, I only saw two occurrences, in [SimpleSparkConnectService](https://github.com/apache/spark/blob/master/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/SimpleSparkConnectService.scala#L55) and [SparkConnectPlugin](https://github.com/apache/spark/blob/master/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/SparkConnectPlugin.scala).





[GitHub] [spark] juliuszsompolski commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "juliuszsompolski (via GitHub)" <gi...@apache.org>.
juliuszsompolski commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1273856506


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerListener.scala:
##########
@@ -0,0 +1,361 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import scala.collection.mutable
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.spark.{SparkConf, SparkContext, SparkEnv}
+import org.apache.spark.internal.Logging
+import org.apache.spark.internal.config.Status.LIVE_ENTITY_UPDATE_PERIOD
+import org.apache.spark.scheduler._
+import org.apache.spark.sql.connect.config.Connect.{CONNECT_UI_SESSION_LIMIT, CONNECT_UI_STATEMENT_LIMIT}
+import org.apache.spark.sql.connect.service._
+import org.apache.spark.sql.execution.SQLExecution
+import org.apache.spark.status.{ElementTrackingStore, KVUtils, LiveEntity}
+
+private[connect] class SparkConnectServerListener(
+    kvstore: ElementTrackingStore,
+    sparkConf: SparkConf,
+    live: Boolean = true)
+    extends SparkListener
+    with Logging {
+
+  private val sessionList = new mutable.LinkedHashMap[String, LiveSessionData]
+  private val executionList = new mutable.LinkedHashMap[String, LiveExecutionData]
+
+  private val (retainedStatements: Int, retainedSessions: Int) = {
+    (
+      SparkEnv.get.conf.get(CONNECT_UI_STATEMENT_LIMIT),
+      SparkEnv.get.conf.get(CONNECT_UI_SESSION_LIMIT))
+  }
+
+  // How often to update live entities. -1 means "never update" when replaying applications,
+  // meaning only the last write will happen. For live applications, this avoids a few
+  // operations that we can live without when rapidly processing incoming events.
+  private val liveUpdatePeriodNs = if (live) sparkConf.get(LIVE_ENTITY_UPDATE_PERIOD) else -1L
+
+  // Returns true if this listener has no live data. Exposed for tests only.
+  private[connect] def noLiveData(): Boolean = synchronized {
+    sessionList.isEmpty && executionList.isEmpty
+  }
+
+  kvstore.addTrigger(classOf[SessionInfo], retainedSessions) { count =>
+    cleanupSession(count)
+  }
+
+  kvstore.addTrigger(classOf[ExecutionInfo], retainedStatements) { count =>
+    cleanupExecutions(count)
+  }
+
+  override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
+    val executionIdOpt: Option[String] = Option(jobStart.properties)
+      .flatMap { p => Option(p.getProperty(SQLExecution.EXECUTION_ID_KEY)) }
+    val jobTags = Option(jobStart.properties)
+      .flatMap { p => Option(p.getProperty(SparkContext.SPARK_JOB_TAGS)) }
+      .map(_.split(SparkContext.SPARK_JOB_TAGS_SEP).toSet)
+      .getOrElse(Set())
+      .toSeq
+      .filter(!_.isEmpty)
+      .sorted
+    val execList = executionList.values.filter(exec => jobTags.contains(exec.jobTag)).toSeq

Review Comment:
   @jasonli-db could you try to use https://github.com/juliuszsompolski/apache-spark/commit/639d515d629d78825489510eb98b3648e4122b04
   to make it
   ```
   val jobTagOpt = jobTags.collectFirst { case tag @ ExecuteJobTag(_) => tag }
   val execOpt = jobTagOpt.flatMap(executionList.get)
   ```





[GitHub] [spark] gengliangwang commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1276823216


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerTab.scala:
##########
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import java.util.Date
+
+import org.apache.spark.SparkContext
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.errors.QueryExecutionErrors
+import org.apache.spark.ui.{SparkUI, SparkUITab}
+
+private[connect] class SparkConnectServerTab(
+    val store: SparkConnectServerAppStatusStore,
+    sparkUI: SparkUI)
+    extends SparkUITab(sparkUI, "connect")

Review Comment:
   nit: two spaces indent





[GitHub] [spark] jasonli-db commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "jasonli-db (via GitHub)" <gi...@apache.org>.
jasonli-db commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1264242195


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala:
##########
@@ -115,4 +115,17 @@ object Connect {
       .version("3.5.0")
       .booleanConf
       .createWithDefault(false)
+
+  val CONNECT_UI_STATEMENT_LIMIT =
+    ConfigBuilder("spark.sql.connect.ui.retainedStatements")
+      .doc("The number of statements kept in the Spark Connect UI history.")
+      .version("3.5.0")
+      .intConf
+      .createWithDefault(200)

Review Comment:
   This is what the Thrift Server has for its defaults: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L1414. It can be changed though. Is 1000 preferred?
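   For reference, a hedged sketch of raising these limits (the config keys come from the diff above; the values are illustrative):
   
   ```
   import org.apache.spark.SparkConf
   
   // Retain more statements/sessions in the Spark Connect UI history.
   val conf = new SparkConf()
     .set("spark.sql.connect.ui.retainedStatements", "1000")
     .set("spark.sql.connect.ui.retainedSessions", "200")
   ```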





[GitHub] [spark] gengliangwang commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1636454958

   Since Spark Connect is mostly about SQL, shall we show the SQL execution links on the UI?




[GitHub] [spark] gengliangwang commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1654613199

   Do we expect users to set the job tags? The generated tags are long and meaningless.
   <img width="1996" alt="image" src="https://github.com/apache/spark/assets/1097932/88d105a3-4bb8-4241-a9b4-987f78520b0a">
   




[GitHub] [spark] jasonli-db commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "jasonli-db (via GitHub)" <gi...@apache.org>.
jasonli-db commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1276861454


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerTab.scala:
##########
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import java.util.Date
+
+import org.apache.spark.SparkContext
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.errors.QueryExecutionErrors
+import org.apache.spark.ui.{SparkUI, SparkUITab}
+
+private[connect] class SparkConnectServerTab(
+    val store: SparkConnectServerAppStatusStore,
+    sparkUI: SparkUI)
+    extends SparkUITab(sparkUI, "connect")

Review Comment:
   This seems to be set by the linter (it reverted to this form after I ran the lint command).





[GitHub] [spark] gengliangwang commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1654907931

   @jasonli-db I have to say the current table is quite wide.
   <img width="1720" alt="image" src="https://github.com/apache/spark/assets/1097932/db421554-d98f-497f-a913-56b1a511e17c">
   
   Can we at least move the operationID column after the "State" column, since it is not commonly useful?
   
   A good way to resolve this is to have checkboxes and make operationID/sparkSessionTags/jobTag additional metrics:
   <img width="383" alt="image" src="https://github.com/apache/spark/assets/1097932/0195d334-70d5-4fee-88ed-1951242a3692">
   This can be done in a followup.
   
   
   




[GitHub] [spark] gengliangwang commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1656308135

   @jasonli-db Thanks for the work. I will merge the PR once the tests pass. 




[GitHub] [spark] juliuszsompolski commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "juliuszsompolski (via GitHub)" <gi...@apache.org>.
juliuszsompolski commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1267671107


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectServer.scala:
##########
@@ -49,6 +49,7 @@ object SparkConnectServer extends Logging {
       SparkConnectService.server.awaitTermination()
     } finally {
       session.stop()
+      SparkConnectService.uiTab.foreach(_.detach())

Review Comment:
   Yeah... it seems that in this path the server is killed by a signal (via `sbin/stop-connect-server.sh`), so `SparkConnectService.stop()` is never called.
   It would be nice to call `SparkConnectService.stop()` here, but then it would try to shut down the server again... 
   In any case, it's unrelated to this PR, so let's leave it as it is now.
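   
   (For future reference — a minimal sketch of one way to cover the signal path, assuming a JVM shutdown hook would be acceptable there; not something this PR needs to do:)
   
       // hypothetical addition in SparkConnectServer.main, registered before awaitTermination()
       sys.addShutdownHook {
         // best-effort cleanup when the process is killed via sbin/stop-connect-server.sh
         SparkConnectService.uiTab.foreach(_.detach())
       }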





[GitHub] [spark] jasonli-db commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "jasonli-db (via GitHub)" <gi...@apache.org>.
jasonli-db commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1267380881


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala:
##########
@@ -179,6 +182,9 @@ object SparkConnectService {
 
   private[connect] var server: Server = _
 
+  var uiTab: Option[SparkConnectServerTab] = None
+  var listener: SparkConnectServerListener = _

Review Comment:
   Updated to `private[connect]`.





[GitHub] [spark] gengliangwang commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1654623126

   Nit: we can have a follow-up to improve the page when a session doesn't exist (e.g. evicted from the KV store):
   <img width="750" alt="image" src="https://github.com/apache/spark/assets/1097932/1171802f-942e-424e-8d2c-154da05ece1e">
   
   For example, a nonexistent ID is already handled in the SQL page:
   <img width="1041" alt="image" src="https://github.com/apache/spark/assets/1097932/8adea10a-e653-40b7-8190-a175301ff41f">
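   
   A minimal sketch of that fallback, reusing the PR's `store.getSession` (the page title and the `renderSessionDetails` helper are illustrative assumptions):
   
       // in the session page's render(), with sessionId read from the "id" parameter
       store.getSession(sessionId) match {
         case Some(sessionInfo) =>
           renderSessionDetails(request, sessionInfo) // hypothetical normal path
         case None =>
           val content = <div>No session with id {sessionId} found.</div>
           UIUtils.headerSparkPage(request, "Spark Connect Session", content, parent)
       }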
   
   




[GitHub] [spark] juliuszsompolski commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "juliuszsompolski (via GitHub)" <gi...@apache.org>.
juliuszsompolski commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1277729520


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerPage.scala:
##########
@@ -0,0 +1,541 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import java.net.URLEncoder
+import java.nio.charset.StandardCharsets.UTF_8
+import javax.servlet.http.HttpServletRequest
+
+import scala.xml.Node
+
+import org.apache.commons.text.StringEscapeUtils
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.connect.ui.ToolTips._
+import org.apache.spark.ui._
+import org.apache.spark.ui.UIUtils._
+import org.apache.spark.util.Utils
+
+/** Page for Spark UI that shows statistics for a Spark Connect Server. */
+private[ui] class SparkConnectServerPage(parent: SparkConnectServerTab)
+    extends WebUIPage("")
+    with Logging {
+
+  private val store = parent.store
+  private val startTime = parent.startTime
+
+  /** Render the page */
+  def render(request: HttpServletRequest): Seq[Node] = {
+    val content = store.synchronized { // make sure all parts in this page are consistent
+      generateBasicStats() ++
+        <br/> ++
+        <h4>
+          {store.getOnlineSessionNum}
+          session(s) are online,
+          running
+          {store.getTotalRunning}
+          Request(s)
+        </h4> ++
+        generateSessionStatsTable(request) ++
+        generateSQLStatsTable(request)
+    }
+    UIUtils.headerSparkPage(request, "Spark Connect", content, parent)
+  }
+
+  /** Generate basic stats of the Spark Connect server */
+  private def generateBasicStats(): Seq[Node] = {
+    val timeSinceStart = System.currentTimeMillis() - startTime.getTime
+    <ul class ="list-unstyled">
+      <li>
+        <strong>Started at: </strong> {formatDate(startTime)}
+      </li>
+      <li>
+        <strong>Time since start: </strong>{formatDurationVerbose(timeSinceStart)}
+      </li>
+    </ul>
+  }
+
+  /** Generate stats of batch statements of the Spark Connect program */
+  private def generateSQLStatsTable(request: HttpServletRequest): Seq[Node] = {
+
+    val numStatement = store.getExecutionList.size
+
+    val table = if (numStatement > 0) {
+
+      val sqlTableTag = "sqlstat"
+
+      val sqlTablePage =
+        Option(request.getParameter(s"$sqlTableTag.page")).map(_.toInt).getOrElse(1)
+
+      try {
+        Some(
+          new SqlStatsPagedTable(
+            request,
+            parent,
+            store.getExecutionList,
+            "connect",
+            UIUtils.prependBaseUri(request, parent.basePath),
+            sqlTableTag,
+            showSessionLink = true).table(sqlTablePage))
+      } catch {
+        case e @ (_: IllegalArgumentException | _: IndexOutOfBoundsException) =>
+          Some(<div class="alert alert-error">
+            <p>Error while rendering job table:</p>
+            <pre>
+              {Utils.exceptionString(e)}
+            </pre>
+          </div>)
+      }
+    } else {
+      None
+    }
+    val content =
+      <span id="sqlstat" class="collapse-aggregated-sqlstat collapse-table"
+            onClick="collapseTable('collapse-aggregated-sqlstat',
+                'aggregated-sqlstat')">
+        <h4>
+          <span class="collapse-table-arrow arrow-open"></span>
+          <a>Request Statistics ({numStatement})</a>
+        </h4>
+      </span> ++
+        <div class="aggregated-sqlstat collapsible-table">
+          {table.getOrElse("No statistics have been generated yet.")}
+        </div>
+    content
+  }
+
+  /** Generate stats of batch sessions of the Spark Connect server */
+  private def generateSessionStatsTable(request: HttpServletRequest): Seq[Node] = {
+    val numSessions = store.getSessionList.size
+    val table = if (numSessions > 0) {
+
+      val sessionTableTag = "sessionstat"
+
+      val sessionTablePage =
+        Option(request.getParameter(s"$sessionTableTag.page")).map(_.toInt).getOrElse(1)
+
+      try {
+        Some(
+          new SessionStatsPagedTable(
+            request,
+            parent,
+            store.getSessionList,
+            "connect",
+            UIUtils.prependBaseUri(request, parent.basePath),
+            sessionTableTag).table(sessionTablePage))
+      } catch {
+        case e @ (_: IllegalArgumentException | _: IndexOutOfBoundsException) =>
+          Some(<div class="alert alert-error">
+            <p>Error while rendering job table:</p>
+            <pre>
+              {Utils.exceptionString(e)}
+            </pre>
+          </div>)
+      }
+    } else {
+      None
+    }
+
+    val content =
+      <span id="sessionstat" class="collapse-aggregated-sessionstat collapse-table"
+            onClick="collapseTable('collapse-aggregated-sessionstat',
+                'aggregated-sessionstat')">
+        <h4>
+          <span class="collapse-table-arrow arrow-open"></span>
+          <a>Session Statistics ({numSessions})</a>
+        </h4>
+      </span> ++
+        <div class="aggregated-sessionstat collapsible-table">
+          {table.getOrElse("No statistics have been generated yet.")}
+        </div>
+
+    content
+  }
+}
+
+private[ui] class SqlStatsPagedTable(
+    request: HttpServletRequest,
+    parent: SparkConnectServerTab,
+    data: Seq[ExecutionInfo],
+    subPath: String,
+    basePath: String,
+    sqlStatsTableTag: String,
+    showSessionLink: Boolean)
+    extends PagedTable[SqlStatsTableRow] {
+
+  private val (sortColumn, desc, pageSize) =
+    getTableParameters(request, sqlStatsTableTag, "Start Time")
+
+  private val encodedSortColumn = URLEncoder.encode(sortColumn, UTF_8.name())
+
+  private val parameterPath =
+    s"$basePath/$subPath/?${getParameterOtherTable(request, sqlStatsTableTag)}"
+
+  override val dataSource = new SqlStatsTableDataSource(data, pageSize, sortColumn, desc)
+
+  override def tableId: String = sqlStatsTableTag
+
+  override def tableCssClass: String =
+    "table table-bordered table-sm table-striped table-head-clickable table-cell-width-limited"
+
+  override def pageLink(page: Int): String = {
+    parameterPath +
+      s"&$pageNumberFormField=$page" +
+      s"&$sqlStatsTableTag.sort=$encodedSortColumn" +
+      s"&$sqlStatsTableTag.desc=$desc" +
+      s"&$pageSizeFormField=$pageSize" +
+      s"#$sqlStatsTableTag"
+  }
+
+  override def pageSizeFormField: String = s"$sqlStatsTableTag.pageSize"
+
+  override def pageNumberFormField: String = s"$sqlStatsTableTag.page"
+
+  override def goButtonFormPath: String =
+    s"$parameterPath&$sqlStatsTableTag.sort=$encodedSortColumn" +
+      s"&$sqlStatsTableTag.desc=$desc#$sqlStatsTableTag"
+
+  override def headers: Seq[Node] = {
+    val sqlTableHeadersAndTooltips: Seq[(String, Boolean, Option[String])] =
+      if (showSessionLink) {
+        Seq(
+          ("User", true, None),
+          ("Job ID", true, None),
+          ("SQL Query ID", true, None),
+          ("Session ID", true, None),
+          ("Start Time", true, None),
+          ("Finish Time", true, Some(SPARK_CONNECT_SERVER_FINISH_TIME)),
+          ("Close Time", true, Some(SPARK_CONNECT_SERVER_CLOSE_TIME)),
+          ("Execution Time", true, Some(SPARK_CONNECT_SERVER_EXECUTION)),
+          ("Duration", true, Some(SPARK_CONNECT_SERVER_DURATION)),
+          ("Statement", true, None),
+          ("State", true, None),
+          ("Operation ID", true, None),
+          ("Job Tag", true, None),
+          ("Spark Session Tags", true, None),
+          ("Detail", true, None))
+      } else {
+        Seq(
+          ("User", true, None),
+          ("Job ID", true, None),
+          ("SQL Query ID", true, None),
+          ("Start Time", true, None),
+          ("Finish Time", true, Some(SPARK_CONNECT_SERVER_FINISH_TIME)),
+          ("Close Time", true, Some(SPARK_CONNECT_SERVER_CLOSE_TIME)),
+          ("Execution Time", true, Some(SPARK_CONNECT_SERVER_EXECUTION)),
+          ("Duration", true, Some(SPARK_CONNECT_SERVER_DURATION)),
+          ("Statement", true, None),
+          ("State", true, None),
+          ("Operation ID", true, None),
+          ("Job Tag", true, None),
+          ("Spark Session Tags", true, None),
+          ("Detail", true, None))
+      }
+
+    isSortColumnValid(sqlTableHeadersAndTooltips, sortColumn)
+
+    headerRow(
+      sqlTableHeadersAndTooltips,
+      desc,
+      pageSize,
+      sortColumn,
+      parameterPath,
+      sqlStatsTableTag,
+      sqlStatsTableTag)
+  }
+
+  override def row(sqlStatsTableRow: SqlStatsTableRow): Seq[Node] = {
+    val info = sqlStatsTableRow.executionInfo
+    val startTime = info.startTimestamp
+    val executionTime = sqlStatsTableRow.executionTime
+    val duration = sqlStatsTableRow.duration
+
+    def jobLinks(jobData: Seq[String]): Seq[Node] = {
+      jobData.map { jobId =>
+        <a href={jobURL(request, jobId)}>[{jobId}]</a>
+      }
+    }
+    def sqlLinks(sqlData: Seq[String]): Seq[Node] = {
+      sqlData.map { sqlExecId =>
+        <a href={sqlURL(request, sqlExecId)}>[{sqlExecId}]</a>
+      }
+    }
+    val sessionLink = "%s/%s/session/?id=%s".format(
+      UIUtils.prependBaseUri(request, parent.basePath),
+      parent.prefix,
+      info.sessionId)
+
+    <tr>
+      <td>
+        {info.userId}
+      </td>
+      <td>
+        {jobLinks(sqlStatsTableRow.jobId)}
+      </td>
+      <td>
+        {sqlLinks(sqlStatsTableRow.sqlExecId)}
+      </td>
+      {
+      if (showSessionLink) {
+        <td>
+          <a href={sessionLink}>{info.sessionId}</a>
+        </td>
+      }
+    }
+      <td>
+        {UIUtils.formatDate(startTime)}
+      </td>
+      <td>
+        {if (info.finishTimestamp > 0) formatDate(info.finishTimestamp)}
+      </td>
+      <td>
+        {if (info.closeTimestamp > 0) formatDate(info.closeTimestamp)}
+      </td>
+      <!-- Returns a human-readable string representing a duration such as "5 second 35 ms"-->
+      <td >
+        {formatDurationVerbose(executionTime)}
+      </td>
+      <td >
+        {formatDurationVerbose(duration)}
+      </td>
+      <td>
+        <span class="description-input">
+          {info.statement}
+        </span>
+      </td>
+      <td>
+        {if (info.isExecutionActive) "RUNNING" else info.state}

Review Comment:
   LGTM, I think it makes more sense, thanks!





[GitHub] [spark] gengliangwang commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1656768413

   @jasonli-db This PR has a conflict with branch-3.5.
   Please cherry-pick it before the RC1 on Aug 1.




[GitHub] [spark] gengliangwang closed pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang closed pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect
URL: https://github.com/apache/spark/pull/41964




[GitHub] [spark] juliuszsompolski commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "juliuszsompolski (via GitHub)" <gi...@apache.org>.
juliuszsompolski commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1654655977

   OperationId could indeed be more useful instead. If a user uses a Connect API like SparkSession.interruptOperation, they pass just the operationId.
   The only time a user would want to use jobTag is if they wanted to call SparkContext.interruptJobsWithTag directly, which is a rather internal use.
   We could consider displaying SparkListenerConnectOperationStarted.sparkSessionTags, because these are user-settable (via Connect SparkSession.addTag / removeTag / clearTags), and they are what users pass to Connect SparkSession.interruptTag.
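   
   To make the flow concrete, a sketch using the client APIs named above (the tag string is just an example):
   
       spark.addTag("nightly-etl")       // tag operations started from this session
       val df = spark.sql("SELECT 1")
       // ... while df.collect() runs, another thread can do:
       spark.interruptTag("nightly-etl") // interrupts everything carrying that tag
       spark.clearTags()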
   




[GitHub] [spark] gengliangwang commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1264168872


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala:
##########
@@ -115,4 +115,17 @@ object Connect {
       .version("3.5.0")
       .booleanConf
       .createWithDefault(false)
+
+  val CONNECT_UI_STATEMENT_LIMIT =
+    ConfigBuilder("spark.sql.connect.ui.retainedStatements")
+      .doc("The number of statements kept in the Spark Connect UI history.")
+      .version("3.5.0")
+      .intConf
+      .createWithDefault(200)

Review Comment:
   Why is the default value 200? For SQL query retention, the default value is 1000.
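   
   (For anyone who needs more retention in the meantime: this is an ordinary conf, so it can be raised at launch, e.g. `--conf spark.sql.connect.ui.retainedStatements=1000` — the value here is just an example.)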





[GitHub] [spark] jasonli-db commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "jasonli-db (via GitHub)" <gi...@apache.org>.
jasonli-db commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1636442845

   @juliuszsompolski @grundprinzip Can you help review this PR? This builds off of https://github.com/apache/spark/pull/41443, so the commit to look at is https://github.com/apache/spark/pull/41964/commits/fcf0ca69ea30a71e20ec15673d62bbfe91f11a36. 
   
   Also cc @rednaxelafx @gengliangwang 




[GitHub] [spark] juliuszsompolski commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "juliuszsompolski (via GitHub)" <gi...@apache.org>.
juliuszsompolski commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1654657805

   > In the screenshot above, the close time and finish time column seems duplicated. And the execution time and duration also seems quite similar. Why do we have so many columns about the timings?
   
   @gengliangwang Finish time is when execution finished, close time is when all the results were transferred; execution time is the time the query was executing, while duration is end to end. The distinction is meaningful for queries with large results, or e.g. complex queries that spend a lot of time in the optimizer. It's the same as in the Thriftserver UI.
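   
   Roughly, per execution (a sketch consistent with `totalTime` in the status store):
   
       val executionTime = info.finishTimestamp - info.startTimestamp // query processing only
       val duration      = info.closeTimestamp  - info.startTimestamp // end to end, incl. result transfer
   
   so a cheap query with a large result can show a short execution time but a long duration.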




[GitHub] [spark] jasonli-db commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "jasonli-db (via GitHub)" <gi...@apache.org>.
jasonli-db commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1276887691


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerHistoryServerPlugin.scala:
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import org.apache.spark.SparkConf
+import org.apache.spark.scheduler.SparkListener
+import org.apache.spark.status.{AppHistoryServerPlugin, ElementTrackingStore}
+import org.apache.spark.ui.SparkUI
+
+class SparkConnectServerHistoryServerPlugin extends AppHistoryServerPlugin {
+
+  override def createListeners(
+      conf: SparkConf,
+      store: ElementTrackingStore): Seq[SparkListener] = {
+    Seq(new SparkConnectServerListener(store, conf))
+  }
+
+  override def setupUI(ui: SparkUI): Unit = {
+    val store = new SparkConnectServerAppStatusStore(ui.store.store)
+    if (store.getSessionCount > 0) {
+      new SparkConnectServerTab(store, ui)
+    }
+  }
+
+  override def displayOrder: Int = 1

Review Comment:
   Hm, I checked, and since this is only incorrect in one other place (StreamingQueryHistoryServerPlugin.scala), I just updated it in this PR.





[GitHub] [spark] hvanhovell commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "hvanhovell (via GitHub)" <gi...@apache.org>.
hvanhovell commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1653965374

   @jasonli-db can you update so we can merge this?




[GitHub] [spark] gengliangwang commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1656768165

   Thanks, merging to master/3.5




[GitHub] [spark] gengliangwang commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1276667031


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerHistoryServerPlugin.scala:
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import org.apache.spark.SparkConf
+import org.apache.spark.scheduler.SparkListener
+import org.apache.spark.status.{AppHistoryServerPlugin, ElementTrackingStore}
+import org.apache.spark.ui.SparkUI
+
+class SparkConnectServerHistoryServerPlugin extends AppHistoryServerPlugin {
+
+  override def createListeners(
+      conf: SparkConf,
+      store: ElementTrackingStore): Seq[SparkListener] = {
+    Seq(new SparkConnectServerListener(store, conf))
+  }
+
+  override def setupUI(ui: SparkUI): Unit = {
+    val store = new SparkConnectServerAppStatusStore(ui.store.store)
+    if (store.getSessionCount > 0) {
+      new SparkConnectServerTab(store, ui)
+    }
+  }
+
+  override def displayOrder: Int = 1

Review Comment:
   Increase the display order to avoid a conflict with StreamingQueryHistoryServerPlugin?
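   
   i.e. something like the following (the exact value is arbitrary, it just has to differ from the other plugin's):
   
       override def displayOrder: Int = 2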





[GitHub] [spark] gengliangwang commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1276826736


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerHistoryServerPlugin.scala:
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import org.apache.spark.SparkConf
+import org.apache.spark.scheduler.SparkListener
+import org.apache.spark.status.{AppHistoryServerPlugin, ElementTrackingStore}
+import org.apache.spark.ui.SparkUI
+
+class SparkConnectServerHistoryServerPlugin extends AppHistoryServerPlugin {
+
+  override def createListeners(
+      conf: SparkConf,
+      store: ElementTrackingStore): Seq[SparkListener] = {
+    Seq(new SparkConnectServerListener(store, conf))
+  }
+
+  override def setupUI(ui: SparkUI): Unit = {
+    val store = new SparkConnectServerAppStatusStore(ui.store.store)
+    if (store.getSessionCount > 0) {
+      new SparkConnectServerTab(store, ui)
+    }
+  }
+
+  override def displayOrder: Int = 1

Review Comment:
   @jasonli-db either way is fine





[GitHub] [spark] jasonli-db commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "jasonli-db (via GitHub)" <gi...@apache.org>.
jasonli-db commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1277726283


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerAppStatusStore.scala:
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import scala.collection.mutable
+import scala.collection.mutable.ArrayBuffer
+
+import com.fasterxml.jackson.annotation.JsonIgnore
+
+import org.apache.spark.status.KVUtils
+import org.apache.spark.status.KVUtils.KVIndexParam
+import org.apache.spark.util.kvstore.{KVIndex, KVStore}
+
+class SparkConnectServerAppStatusStore(store: KVStore) {
+  def getSessionList: Seq[SessionInfo] = {
+    KVUtils.viewToSeq(store.view(classOf[SessionInfo]))
+  }
+
+  def getExecutionList: Seq[ExecutionInfo] = {
+    KVUtils.viewToSeq(store.view(classOf[ExecutionInfo]))
+  }
+
+  def getOnlineSessionNum: Int = {
+    KVUtils.count(store.view(classOf[SessionInfo]))(_.finishTimestamp == 0)
+  }
+
+  def getSession(sessionId: String): Option[SessionInfo] = {
+    try {
+      Some(store.read(classOf[SessionInfo], sessionId))
+    } catch {
+      case _: NoSuchElementException => None
+    }
+  }
+
+  def getExecution(executionId: String): Option[ExecutionInfo] = {
+    try {
+      Some(store.read(classOf[ExecutionInfo], executionId))
+    } catch {
+      case _: NoSuchElementException => None
+    }
+  }
+
+  /**
+   * When an error or a cancellation occurs, we set the finishTimestamp of the statement.
+   * Therefore, when we count the number of running statements, we need to exclude errors and
+   * cancellations and count all statements that have not been closed so far.
+   */
+  def getTotalRunning: Int = {
+    KVUtils.count(store.view(classOf[ExecutionInfo]))(_.isExecutionActive)
+  }
+
+  def getSessionCount: Long = {
+    store.count(classOf[SessionInfo])
+  }
+
+  def getExecutionCount: Long = {
+    store.count(classOf[ExecutionInfo])
+  }
+}
+
+private[connect] class SessionInfo(
+    @KVIndexParam val sessionId: String,
+    val startTimestamp: Long,
+    val userId: String,
+    val finishTimestamp: Long,
+    val totalExecution: Long) {
+  @JsonIgnore @KVIndex("finishTime")
+  private def finishTimeIndex: Long = if (finishTimestamp > 0L) finishTimestamp else -1L
+  def totalTime: Long = {
+    if (finishTimestamp == 0L) {
+      System.currentTimeMillis - startTimestamp
+    } else {
+      finishTimestamp - startTimestamp
+    }
+  }
+}
+
+private[connect] class ExecutionInfo(
+    @KVIndexParam val jobTag: String,
+    val statement: String,
+    val sessionId: String,
+    val startTimestamp: Long,
+    val userId: String,
+    val operationId: String,
+    val sparkSessionTags: Set[String],
+    val finishTimestamp: Long,
+    val closeTimestamp: Long,
+    val detail: String,
+    val state: ExecutionState.Value,
+    val jobId: ArrayBuffer[String],
+    val sqlExecId: mutable.Set[String]) {
+  @JsonIgnore @KVIndex("finishTime")
+  private def finishTimeIndex: Long = if (finishTimestamp > 0L && !isExecutionActive) {
+    finishTimestamp
+  } else -1L
+
+  @JsonIgnore @KVIndex("isExecutionActive")
+  def isExecutionActive: Boolean = {
+    state == ExecutionState.STARTED ||
+    state == ExecutionState.COMPILED ||
+    state == ExecutionState.READY
+  }
+
+  def totalTime(endTime: Long): Long = {
+    if (endTime == 0L) {
+      System.currentTimeMillis - startTimestamp
+    } else {
+      endTime - startTimestamp
+    }
+  }
+}
+
+private[connect] object ExecutionState extends Enumeration {
+  val STARTED, COMPILED, READY, CANCELED, FAILED, FINISHED, CLOSED = Value
+  type ExecutionState = Value
+}

Review Comment:
   I have the UI display the state as RUNNING if isExecutionActive is true, which covers STARTED, COMPILED, READY: https://github.com/apache/spark/pull/41964/commits/8abdecdc7cedb6ff5a13090dd4bcae2ecf50d0d0#diff-ef2b5b937c2c5e83d6ca9804c9d402d263bf802550cc8daeead089498c123aa0L322.





[GitHub] [spark] jasonli-db commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "jasonli-db (via GitHub)" <gi...@apache.org>.
jasonli-db commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1276743396


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerAppStatusStore.scala:
##########
@@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import scala.collection.mutable.ArrayBuffer
+
+import com.fasterxml.jackson.annotation.JsonIgnore
+
+import org.apache.spark.status.KVUtils
+import org.apache.spark.status.KVUtils.KVIndexParam
+import org.apache.spark.util.kvstore.{KVIndex, KVStore}
+
+class SparkConnectServerAppStatusStore(store: KVStore) {
+  def getSessionList: Seq[SessionInfo] = {
+    KVUtils.viewToSeq(store.view(classOf[SessionInfo]))
+  }
+
+  def getExecutionList: Seq[ExecutionInfo] = {
+    KVUtils.viewToSeq(store.view(classOf[ExecutionInfo]))
+  }
+
+  def getOnlineSessionNum: Int = {
+    KVUtils.count(store.view(classOf[SessionInfo]))(_.finishTimestamp == 0)
+  }
+
+  def getSession(sessionId: String): Option[SessionInfo] = {
+    try {
+      Some(store.read(classOf[SessionInfo], sessionId))
+    } catch {
+      case _: NoSuchElementException => None
+    }
+  }
+
+  def getExecution(executionId: String): Option[ExecutionInfo] = {
+    try {
+      Some(store.read(classOf[ExecutionInfo], executionId))
+    } catch {
+      case _: NoSuchElementException => None
+    }
+  }
+
+  /**
+   * When an error or a cancellation occurs, we set the finishTimestamp of the statement.
+   * Therefore, when we count the number of running statements, we need to exclude errors and
+   * cancellations and count all statements that have not been closed so far.
+   */
+  def getTotalRunning: Int = {
+    KVUtils.count(store.view(classOf[ExecutionInfo]))(_.isExecutionActive)
+  }
+
+  def getSessionCount: Long = {
+    store.count(classOf[SessionInfo])
+  }
+
+  def getExecutionCount: Long = {
+    store.count(classOf[ExecutionInfo])
+  }
+}
+
+private[connect] class SessionInfo(
+    @KVIndexParam val sessionId: String,
+    val startTimestamp: Long,
+    val userId: String,
+    val finishTimestamp: Long,
+    val totalExecution: Long) {
+  @JsonIgnore @KVIndex("finishTime")
+  private def finishTimeIndex: Long = if (finishTimestamp > 0L) finishTimestamp else -1L
+  def totalTime: Long = {
+    if (finishTimestamp == 0L) {
+      System.currentTimeMillis - startTimestamp
+    } else {
+      finishTimestamp - startTimestamp
+    }
+  }
+}
+
+private[connect] class ExecutionInfo(
+    @KVIndexParam val jobTag: String,
+    val statement: String,
+    val sessionId: String,
+    val startTimestamp: Long,
+    val userId: String,
+    val finishTimestamp: Long,
+    val closeTimestamp: Long,
+    val executePlan: String,
+    val detail: String,
+    val state: ExecutionState.Value,
+    val jobId: ArrayBuffer[String],
+    val sqlExecId: ArrayBuffer[String]) {
+  @JsonIgnore @KVIndex("finishTime")
+  private def finishTimeIndex: Long = if (finishTimestamp > 0L && !isExecutionActive) {
+    finishTimestamp
+  } else -1L
+
+  @JsonIgnore @KVIndex("isExecutionActive")
+  def isExecutionActive: Boolean = {
+    !(state == ExecutionState.FAILED ||

Review Comment:
   Updated.





[GitHub] [spark] gengliangwang commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1654610665

   @jasonli-db Please try more tests. I did a quick test and found that the SQL IDs need deduplication.
   <img width="2526" alt="image" src="https://github.com/apache/spark/assets/1097932/975760a1-c60c-451c-b74d-2a36e7b4a7d4">
   




[GitHub] [spark] juliuszsompolski commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "juliuszsompolski (via GitHub)" <gi...@apache.org>.
juliuszsompolski commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1277593817


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerAppStatusStore.scala:
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import scala.collection.mutable
+import scala.collection.mutable.ArrayBuffer
+
+import com.fasterxml.jackson.annotation.JsonIgnore
+
+import org.apache.spark.status.KVUtils
+import org.apache.spark.status.KVUtils.KVIndexParam
+import org.apache.spark.util.kvstore.{KVIndex, KVStore}
+
+class SparkConnectServerAppStatusStore(store: KVStore) {
+  def getSessionList: Seq[SessionInfo] = {
+    KVUtils.viewToSeq(store.view(classOf[SessionInfo]))
+  }
+
+  def getExecutionList: Seq[ExecutionInfo] = {
+    KVUtils.viewToSeq(store.view(classOf[ExecutionInfo]))
+  }
+
+  def getOnlineSessionNum: Int = {
+    KVUtils.count(store.view(classOf[SessionInfo]))(_.finishTimestamp == 0)
+  }
+
+  def getSession(sessionId: String): Option[SessionInfo] = {
+    try {
+      Some(store.read(classOf[SessionInfo], sessionId))
+    } catch {
+      case _: NoSuchElementException => None
+    }
+  }
+
+  def getExecution(executionId: String): Option[ExecutionInfo] = {
+    try {
+      Some(store.read(classOf[ExecutionInfo], executionId))
+    } catch {
+      case _: NoSuchElementException => None
+    }
+  }
+
+  /**
+   * When an error or a cancellation occurs, we set the finishTimestamp of the statement.
+   * Therefore, when we count the number of running statements, we need to exclude errors and
+   * cancellations and count all statements that have not been closed so far.
+   */
+  def getTotalRunning: Int = {
+    KVUtils.count(store.view(classOf[ExecutionInfo]))(_.isExecutionActive)
+  }
+
+  def getSessionCount: Long = {
+    store.count(classOf[SessionInfo])
+  }
+
+  def getExecutionCount: Long = {
+    store.count(classOf[ExecutionInfo])
+  }
+}
+
+private[connect] class SessionInfo(
+    @KVIndexParam val sessionId: String,
+    val startTimestamp: Long,
+    val userId: String,
+    val finishTimestamp: Long,
+    val totalExecution: Long) {
+  @JsonIgnore @KVIndex("finishTime")
+  private def finishTimeIndex: Long = if (finishTimestamp > 0L) finishTimestamp else -1L
+  def totalTime: Long = {
+    if (finishTimestamp == 0L) {
+      System.currentTimeMillis - startTimestamp
+    } else {
+      finishTimestamp - startTimestamp
+    }
+  }
+}
+
+private[connect] class ExecutionInfo(
+    @KVIndexParam val jobTag: String,
+    val statement: String,
+    val sessionId: String,
+    val startTimestamp: Long,
+    val userId: String,
+    val operationId: String,
+    val sparkSessionTags: Set[String],
+    val finishTimestamp: Long,
+    val closeTimestamp: Long,
+    val detail: String,
+    val state: ExecutionState.Value,
+    val jobId: ArrayBuffer[String],
+    val sqlExecId: mutable.Set[String]) {
+  @JsonIgnore @KVIndex("finishTime")
+  private def finishTimeIndex: Long = if (finishTimestamp > 0L && !isExecutionActive) {
+    finishTimestamp
+  } else -1L
+
+  @JsonIgnore @KVIndex("isExecutionActive")
+  def isExecutionActive: Boolean = {
+    state == ExecutionState.STARTED ||
+    state == ExecutionState.COMPILED ||
+    state == ExecutionState.READY
+  }
+
+  def totalTime(endTime: Long): Long = {
+    if (endTime == 0L) {
+      System.currentTimeMillis - startTimestamp
+    } else {
+      endTime - startTimestamp
+    }
+  }
+}
+
+private[connect] object ExecutionState extends Enumeration {
+  val STARTED, COMPILED, READY, CANCELED, FAILED, FINISHED, CLOSED = Value
+  type ExecutionState = Value
+}

Review Comment:
   @bogdanghit see https://github.com/apache/spark/pull/41748 from @jdesjean. Jean did some work to make the "READY" state come before the start of eager commands. In Thriftserver there's a bit of an ugly issue where eager commands execute even before the COMPILED state, because they are eagerly executed straight in analysis.
   Thriftserver could use https://github.com/apache/spark/pull/41748 to display information about eager execution better as well.





[GitHub] [spark] juliuszsompolski commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "juliuszsompolski (via GitHub)" <gi...@apache.org>.
juliuszsompolski commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1264232249


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala:
##########
@@ -259,13 +281,15 @@ object SparkConnectService {
     if (debugMode) {
       sb.addService(ProtoReflectionService.newInstance())
     }
+
     server = sb.build
     server.start()
+    createListenerAndUI(sparkContextOpt.getOrElse(SparkSession.active.sparkContext))
   }
 
   // Starts the service
-  def start(): Unit = {
-    startGRPCService()
+  def start(sparkContextOpt: Option[SparkContext] = None): Unit = {

Review Comment:
   Make it `sparkContext: SparkContext`, not an Option, and in SparkConnectServer pass it explicitly from the session there. Better that than having to `.getOrElse(SparkSession.active.sparkContext)` without controlling whether there's an active session here.
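   
   i.e. roughly (a sketch of the suggested shape, mirroring the diff above; the revised `startGRPCService` signature is an assumption):
   
       def start(sc: SparkContext): Unit = {
         startGRPCService(sc) // which can then call createListenerAndUI(sc) directly
       }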





[GitHub] [spark] jasonli-db commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "jasonli-db (via GitHub)" <gi...@apache.org>.
jasonli-db commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1267386309


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectServer.scala:
##########
@@ -49,6 +49,7 @@ object SparkConnectServer extends Logging {
       SparkConnectService.server.awaitTermination()
     } finally {
       session.stop()
+      SparkConnectService.uiTab.foreach(_.detach())

Review Comment:
   Does it get called on this path? I only saw `SparkConnectService.server.awaitTermination()` on line 49. When I searched for SparkConnectService.stop(), I only saw two occurrences in [SimpleSparkConnectService](https://github.com/apache/spark/blob/master/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/SimpleSparkConnectService.scala#L55) and [SparkConnectPlugin](https://github.com/apache/spark/blob/master/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/SparkConnectPlugin.scala)





[GitHub] [spark] jasonli-db commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "jasonli-db (via GitHub)" <gi...@apache.org>.
jasonli-db commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1267381034


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala:
##########
@@ -259,13 +281,15 @@ object SparkConnectService {
     if (debugMode) {
       sb.addService(ProtoReflectionService.newInstance())
     }
+
     server = sb.build
     server.start()
+    createListenerAndUI(sc)

Review Comment:
   Updated.





[GitHub] [spark] gengliangwang commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1276661367


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerAppStatusStore.scala:
##########
@@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import scala.collection.mutable.ArrayBuffer
+
+import com.fasterxml.jackson.annotation.JsonIgnore
+
+import org.apache.spark.status.KVUtils
+import org.apache.spark.status.KVUtils.KVIndexParam
+import org.apache.spark.util.kvstore.{KVIndex, KVStore}
+
+class SparkConnectServerAppStatusStore(store: KVStore) {
+  def getSessionList: Seq[SessionInfo] = {
+    KVUtils.viewToSeq(store.view(classOf[SessionInfo]))
+  }
+
+  def getExecutionList: Seq[ExecutionInfo] = {
+    KVUtils.viewToSeq(store.view(classOf[ExecutionInfo]))
+  }
+
+  def getOnlineSessionNum: Int = {
+    KVUtils.count(store.view(classOf[SessionInfo]))(_.finishTimestamp == 0)
+  }
+
+  def getSession(sessionId: String): Option[SessionInfo] = {
+    try {
+      Some(store.read(classOf[SessionInfo], sessionId))
+    } catch {
+      case _: NoSuchElementException => None
+    }
+  }
+
+  def getExecution(executionId: String): Option[ExecutionInfo] = {
+    try {
+      Some(store.read(classOf[ExecutionInfo], executionId))
+    } catch {
+      case _: NoSuchElementException => None
+    }
+  }
+
+  /**
+   * When an error or a cancellation occurs, we set the finishTimestamp of the statement.
+   * Therefore, when we count the number of running statements, we need to exclude errors and
+   * cancellations and count all statements that have not been closed so far.
+   */
+  def getTotalRunning: Int = {
+    KVUtils.count(store.view(classOf[ExecutionInfo]))(_.isExecutionActive)
+  }
+
+  def getSessionCount: Long = {
+    store.count(classOf[SessionInfo])
+  }
+
+  def getExecutionCount: Long = {
+    store.count(classOf[ExecutionInfo])
+  }
+}
+
+private[connect] class SessionInfo(
+    @KVIndexParam val sessionId: String,
+    val startTimestamp: Long,
+    val userId: String,
+    val finishTimestamp: Long,
+    val totalExecution: Long) {
+  @JsonIgnore @KVIndex("finishTime")
+  private def finishTimeIndex: Long = if (finishTimestamp > 0L) finishTimestamp else -1L
+  def totalTime: Long = {
+    if (finishTimestamp == 0L) {
+      System.currentTimeMillis - startTimestamp
+    } else {
+      finishTimestamp - startTimestamp
+    }
+  }
+}
+
+private[connect] class ExecutionInfo(
+    @KVIndexParam val jobTag: String,
+    val statement: String,
+    val sessionId: String,
+    val startTimestamp: Long,
+    val userId: String,
+    val finishTimestamp: Long,
+    val closeTimestamp: Long,
+    val executePlan: String,
+    val detail: String,
+    val state: ExecutionState.Value,
+    val jobId: ArrayBuffer[String],
+    val sqlExecId: ArrayBuffer[String]) {
+  @JsonIgnore @KVIndex("finishTime")
+  private def finishTimeIndex: Long = if (finishTimestamp > 0L && !isExecutionActive) {
+    finishTimestamp
+  } else -1L
+
+  @JsonIgnore @KVIndex("isExecutionActive")
+  def isExecutionActive: Boolean = {
+    !(state == ExecutionState.FAILED ||

Review Comment:
   Shall we use an allow list to make it more robust?
   E.g., check that the state is among STARTED, COMPILED, or READY.
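   A minimal sketch of what the allow-list version could look like (it matches the updated code quoted later in this thread):

       @JsonIgnore @KVIndex("isExecutionActive")
       def isExecutionActive: Boolean = {
         // Allow list: only the pre-terminal states count as active.
         state == ExecutionState.STARTED ||
         state == ExecutionState.COMPILED ||
         state == ExecutionState.READY
       }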





[GitHub] [spark] rednaxelafx commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "rednaxelafx (via GitHub)" <gi...@apache.org>.
rednaxelafx commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1276702709


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerHistoryServerPlugin.scala:
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import org.apache.spark.SparkConf
+import org.apache.spark.scheduler.SparkListener
+import org.apache.spark.status.{AppHistoryServerPlugin, ElementTrackingStore}
+import org.apache.spark.ui.SparkUI
+
+class SparkConnectServerHistoryServerPlugin extends AppHistoryServerPlugin {
+
+  override def createListeners(
+      conf: SparkConf,
+      store: ElementTrackingStore): Seq[SparkListener] = {
+    Seq(new SparkConnectServerListener(store, conf))
+  }
+
+  override def setupUI(ui: SparkUI): Unit = {
+    val store = new SparkConnectServerAppStatusStore(ui.store.store)
+    if (store.getSessionCount > 0) {
+      new SparkConnectServerTab(store, ui)
+    }
+  }
+
+  override def displayOrder: Int = 1

Review Comment:
   It'd be good to have this, but since other SHS plugins are broken in the same way, it may be clearer to fix the displayOrder of all of the plugins in one go in a separate PR, so that the context for why each value was chosen is visible in a single PR.
   
   E.g., if we set this plugin's display order to `4`, it wouldn't be obvious from this PR alone why it should be `4` while many other plugins have a display order of `1`.
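   For illustration, `displayOrder` is just a sort key for plugin tabs, so the follow-up fix would presumably boil down to assigning distinct values that sort the tabs as intended, roughly like this sketch (not the actual SHS code):

       // Hypothetical: attach history-server plugin tabs in a stable, documented order.
       plugins.toSeq.sortBy(_.displayOrder).foreach(_.setupUI(ui))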





[GitHub] [spark] gengliangwang commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1276817771


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerPage.scala:
##########
@@ -0,0 +1,529 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import java.net.URLEncoder
+import java.nio.charset.StandardCharsets.UTF_8
+import javax.servlet.http.HttpServletRequest
+
+import scala.xml.Node
+
+import org.apache.commons.text.StringEscapeUtils
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.connect.ui.ToolTips._
+import org.apache.spark.ui._
+import org.apache.spark.ui.UIUtils._
+import org.apache.spark.util.Utils
+
+/** Page for Spark UI that shows statistics for a Spark Connect Server. */
+private[ui] class SparkConnectServerPage(parent: SparkConnectServerTab)
+    extends WebUIPage("")
+    with Logging {
+  private val store = parent.store
+  private val startTime = parent.startTime
+
+  /** Render the page */
+  def render(request: HttpServletRequest): Seq[Node] = {
+    val content = store.synchronized { // make sure all parts in this page are consistent
+      generateBasicStats() ++
+        <br/> ++
+        <h4>
+          {store.getOnlineSessionNum}
+          session(s) are online,
+          running
+          {store.getTotalRunning}
+          Request(s)
+        </h4> ++
+        generateSessionStatsTable(request) ++
+        generateSQLStatsTable(request)
+    }
+    UIUtils.headerSparkPage(request, "Spark Connect", content, parent)
+  }
+
+  /** Generate basic stats of the Spark Connect server */
+  private def generateBasicStats(): Seq[Node] = {
+    val timeSinceStart = System.currentTimeMillis() - startTime.getTime
+    <ul class ="list-unstyled">
+      <li>
+        <strong>Started at: </strong> {formatDate(startTime)}
+      </li>
+      <li>
+        <strong>Time since start: </strong>{formatDurationVerbose(timeSinceStart)}
+      </li>
+    </ul>
+  }
+
+  /** Generate stats of batch statements of the Spark Connect program */
+  private def generateSQLStatsTable(request: HttpServletRequest): Seq[Node] = {
+
+    val numStatement = store.getExecutionList.size
+
+    val table = if (numStatement > 0) {
+
+      val sqlTableTag = "sqlstat"
+
+      val sqlTablePage =
+        Option(request.getParameter(s"$sqlTableTag.page")).map(_.toInt).getOrElse(1)
+
+      try {
+        Some(
+          new SqlStatsPagedTable(
+            request,
+            parent,
+            store.getExecutionList,
+            "connect",
+            UIUtils.prependBaseUri(request, parent.basePath),
+            sqlTableTag,
+            showSessionLink = true).table(sqlTablePage))
+      } catch {
+        case e @ (_: IllegalArgumentException | _: IndexOutOfBoundsException) =>
+          Some(<div class="alert alert-error">
+            <p>Error while rendering job table:</p>
+            <pre>
+              {Utils.exceptionString(e)}
+            </pre>
+          </div>)
+      }
+    } else {
+      None
+    }
+    val content =
+      <span id="sqlstat" class="collapse-aggregated-sqlstat collapse-table"
+            onClick="collapseTable('collapse-aggregated-sqlstat',
+                'aggregated-sqlstat')">
+        <h4>
+          <span class="collapse-table-arrow arrow-open"></span>
+          <a>Request Statistics ({numStatement})</a>
+        </h4>
+      </span> ++
+        <div class="aggregated-sqlstat collapsible-table">
+          {table.getOrElse("No statistics have been generated yet.")}
+        </div>
+    content
+  }
+
+  /** Generate stats of batch sessions of the Spark Connect server */
+  private def generateSessionStatsTable(request: HttpServletRequest): Seq[Node] = {
+    val numSessions = store.getSessionList.size
+    val table = if (numSessions > 0) {
+
+      val sessionTableTag = "sessionstat"
+
+      val sessionTablePage =
+        Option(request.getParameter(s"$sessionTableTag.page")).map(_.toInt).getOrElse(1)
+
+      try {
+        Some(
+          new SessionStatsPagedTable(
+            request,
+            parent,
+            store.getSessionList,
+            "connect",
+            UIUtils.prependBaseUri(request, parent.basePath),
+            sessionTableTag).table(sessionTablePage))
+      } catch {
+        case e @ (_: IllegalArgumentException | _: IndexOutOfBoundsException) =>
+          Some(<div class="alert alert-error">
+            <p>Error while rendering job table:</p>
+            <pre>
+              {Utils.exceptionString(e)}
+            </pre>
+          </div>)
+      }
+    } else {
+      None
+    }
+
+    val content =
+      <span id="sessionstat" class="collapse-aggregated-sessionstat collapse-table"
+            onClick="collapseTable('collapse-aggregated-sessionstat',
+                'aggregated-sessionstat')">
+        <h4>
+          <span class="collapse-table-arrow arrow-open"></span>
+          <a>Session Statistics ({numSessions})</a>
+        </h4>
+      </span> ++
+        <div class="aggregated-sessionstat collapsible-table">
+          {table.getOrElse("No statistics have been generated yet.")}
+        </div>
+
+    content
+  }
+}
+
+private[ui] class SqlStatsPagedTable(
+    request: HttpServletRequest,
+    parent: SparkConnectServerTab,
+    data: Seq[ExecutionInfo],
+    subPath: String,
+    basePath: String,
+    sqlStatsTableTag: String,
+    showSessionLink: Boolean)
+    extends PagedTable[SqlStatsTableRow] {
+  private val (sortColumn, desc, pageSize) =
+    getTableParameters(request, sqlStatsTableTag, "Start Time")
+
+  private val encodedSortColumn = URLEncoder.encode(sortColumn, UTF_8.name())
+
+  private val parameterPath =
+    s"$basePath/$subPath/?${getParameterOtherTable(request, sqlStatsTableTag)}"
+
+  override val dataSource = new SqlStatsTableDataSource(data, pageSize, sortColumn, desc)
+
+  override def tableId: String = sqlStatsTableTag
+
+  override def tableCssClass: String =
+    "table table-bordered table-sm table-striped table-head-clickable table-cell-width-limited"
+
+  override def pageLink(page: Int): String = {
+    parameterPath +
+      s"&$pageNumberFormField=$page" +
+      s"&$sqlStatsTableTag.sort=$encodedSortColumn" +
+      s"&$sqlStatsTableTag.desc=$desc" +
+      s"&$pageSizeFormField=$pageSize" +
+      s"#$sqlStatsTableTag"
+  }
+
+  override def pageSizeFormField: String = s"$sqlStatsTableTag.pageSize"
+
+  override def pageNumberFormField: String = s"$sqlStatsTableTag.page"
+
+  override def goButtonFormPath: String =
+    s"$parameterPath&$sqlStatsTableTag.sort=$encodedSortColumn" +
+      s"&$sqlStatsTableTag.desc=$desc#$sqlStatsTableTag"
+
+  override def headers: Seq[Node] = {
+    val sqlTableHeadersAndTooltips: Seq[(String, Boolean, Option[String])] =
+      if (showSessionLink) {
+        Seq(
+          ("User", true, None),
+          ("Job Tag", true, None),
+          ("Job ID", true, None),
+          ("SQL Query ID", true, None),
+          ("Session ID", true, None),
+          ("Start Time", true, None),
+          ("Finish Time", true, Some(SPARK_CONNECT_SERVER_FINISH_TIME)),
+          ("Close Time", true, Some(SPARK_CONNECT_SERVER_CLOSE_TIME)),
+          ("Execution Time", true, Some(SPARK_CONNECT_SERVER_EXECUTION)),
+          ("Duration", true, Some(SPARK_CONNECT_SERVER_DURATION)),
+          ("Statement", true, None),
+          ("State", true, None),
+          ("Detail", true, None))
+      } else {
+        Seq(
+          ("User", true, None),
+          ("Job Tag", true, None),
+          ("Job ID", true, None),
+          ("SQL Query ID", true, None),
+          ("Start Time", true, None),
+          ("Finish Time", true, Some(SPARK_CONNECT_SERVER_FINISH_TIME)),
+          ("Close Time", true, Some(SPARK_CONNECT_SERVER_CLOSE_TIME)),
+          ("Execution Time", true, Some(SPARK_CONNECT_SERVER_EXECUTION)),
+          ("Duration", true, Some(SPARK_CONNECT_SERVER_DURATION)),
+          ("Statement", true, None),
+          ("State", true, None),
+          ("Detail", true, None))
+      }
+
+    isSortColumnValid(sqlTableHeadersAndTooltips, sortColumn)
+
+    headerRow(
+      sqlTableHeadersAndTooltips,
+      desc,
+      pageSize,
+      sortColumn,
+      parameterPath,
+      sqlStatsTableTag,
+      sqlStatsTableTag)
+  }
+
+  override def row(sqlStatsTableRow: SqlStatsTableRow): Seq[Node] = {
+    val info = sqlStatsTableRow.executionInfo
+    val startTime = info.startTimestamp
+    val executionTime = sqlStatsTableRow.executionTime
+    val duration = sqlStatsTableRow.duration
+
+    def jobLinks(jobData: Seq[String]): Seq[Node] = {
+      jobData.map { jobId =>
+        <a href={jobURL(request, jobId)}>[{jobId}]</a>
+      }
+    }
+    def sqlLinks(sqlData: Seq[String]): Seq[Node] = {
+      sqlData.map { sqlExecId =>
+        <a href={sqlURL(request, sqlExecId)}>[{sqlExecId}]</a>
+      }
+    }
+    val sessionLink = "%s/%s/session/?id=%s".format(
+      UIUtils.prependBaseUri(request, parent.basePath),
+      parent.prefix,
+      info.sessionId)
+
+    <tr>
+      <td>
+        {info.userId}
+      </td>
+      <td>
+        {info.jobTag}
+      </td>
+      <td>
+        {jobLinks(sqlStatsTableRow.jobId)}
+      </td>
+      <td>
+        {sqlLinks({ info.sqlExecId })}
+      </td>
+      {
+      if (showSessionLink) {
+        <td>
+          <a href={sessionLink}>{info.sessionId}</a>
+        </td>
+      }
+    }
+      <td>
+        {UIUtils.formatDate(startTime)}
+      </td>
+      <td>
+        {if (info.finishTimestamp > 0) formatDate(info.finishTimestamp)}
+      </td>
+      <td>
+        {if (info.closeTimestamp > 0) formatDate(info.closeTimestamp)}
+      </td>
+      <!-- Returns a human-readable string representing a duration such as "5 second 35 ms"-->
+      <td >
+        {formatDurationVerbose(executionTime)}
+      </td>
+      <td >
+        {formatDurationVerbose(duration)}
+      </td>
+      <td>
+        <span class="description-input">
+          {info.statement}
+        </span>
+      </td>
+      <td>
+        {info.state}
+      </td>
+      {errorMessageCell(Option(sqlStatsTableRow.detail))}
+    </tr>
+  }
+
+  private def errorMessageCell(errorMessageOption: Option[String]): Seq[Node] = {
+    val errorMessage = errorMessageOption.getOrElse("")
+    val isMultiline = errorMessage.indexOf('\n') >= 0
+    val errorSummary = StringEscapeUtils.escapeHtml4(if (isMultiline) {
+      errorMessage.substring(0, errorMessage.indexOf('\n'))
+    } else {
+      errorMessage
+    })
+    val details = detailsUINode(isMultiline, errorMessage)
+    <td>
+      {errorSummary}{details}
+    </td>
+  }
+
+  private def jobURL(request: HttpServletRequest, jobId: String): String =
+    "%s/jobs/job/?id=%s".format(UIUtils.prependBaseUri(request, parent.basePath), jobId)
+
+  private def sqlURL(request: HttpServletRequest, sqlExecId: String): String =
+    "%s/SQL/execution/?id=%s".format(UIUtils.prependBaseUri(request, parent.basePath), sqlExecId)
+}
+
+private[ui] class SessionStatsPagedTable(
+    request: HttpServletRequest,
+    parent: SparkConnectServerTab,
+    data: Seq[SessionInfo],
+    subPath: String,
+    basePath: String,
+    sessionStatsTableTag: String)
+    extends PagedTable[SessionInfo] {
+
+  private val (sortColumn, desc, pageSize) =
+    getTableParameters(request, sessionStatsTableTag, "Start Time")
+
+  private val encodedSortColumn = URLEncoder.encode(sortColumn, UTF_8.name())
+
+  private val parameterPath =
+    s"$basePath/$subPath/?${getParameterOtherTable(request, sessionStatsTableTag)}"
+
+  override val dataSource = new SessionStatsTableDataSource(data, pageSize, sortColumn, desc)
+
+  override def tableId: String = sessionStatsTableTag
+
+  override def tableCssClass: String =
+    "table table-bordered table-sm table-striped table-head-clickable table-cell-width-limited"
+
+  override def pageLink(page: Int): String = {
+    parameterPath +
+      s"&$pageNumberFormField=$page" +
+      s"&$sessionStatsTableTag.sort=$encodedSortColumn" +
+      s"&$sessionStatsTableTag.desc=$desc" +
+      s"&$pageSizeFormField=$pageSize" +
+      s"#$sessionStatsTableTag"
+  }
+
+  override def pageSizeFormField: String = s"$sessionStatsTableTag.pageSize"
+
+  override def pageNumberFormField: String = s"$sessionStatsTableTag.page"
+
+  override def goButtonFormPath: String =
+    s"$parameterPath&$sessionStatsTableTag.sort=$encodedSortColumn" +
+      s"&$sessionStatsTableTag.desc=$desc#$sessionStatsTableTag"
+
+  override def headers: Seq[Node] = {
+    val sessionTableHeadersAndTooltips: Seq[(String, Boolean, Option[String])] =
+      Seq(
+        ("User", true, None),
+        ("Session ID", true, None),
+        ("Start Time", true, None),
+        ("Finish Time", true, None),
+        ("Duration", true, Some(SPARK_CONNECT_SESSION_DURATION)),
+        ("Total Execute", true, Some(SPARK_CONNECT_SESSION_TOTAL_EXECUTE)))
+
+    isSortColumnValid(sessionTableHeadersAndTooltips, sortColumn)
+
+    headerRow(
+      sessionTableHeadersAndTooltips,
+      desc,
+      pageSize,
+      sortColumn,
+      parameterPath,
+      sessionStatsTableTag,
+      sessionStatsTableTag)
+  }
+
+  override def row(session: SessionInfo): Seq[Node] = {
+    val sessionLink = "%s/%s/session/?id=%s".format(
+      UIUtils.prependBaseUri(request, parent.basePath),
+      parent.prefix,
+      session.sessionId)
+    <tr>
+      <td> {session.userId} </td>
+      <td> <a href={sessionLink}> {session.sessionId} </a> </td>
+      <td> {formatDate(session.startTimestamp)} </td>
+      <td> {if (session.finishTimestamp > 0) formatDate(session.finishTimestamp)} </td>
+      <td> {formatDurationVerbose(session.totalTime)} </td>
+      <td> {session.totalExecution.toString} </td>
+    </tr>
+  }
+}
+
+private[ui] class SqlStatsTableRow(
+    val jobTag: String,
+    val jobId: Seq[String],
+    val sqlExecId: Seq[String],
+    val duration: Long,
+    val executionTime: Long,
+    val executionInfo: ExecutionInfo,
+    val detail: String)
+
+private[ui] class SqlStatsTableDataSource(
+    info: Seq[ExecutionInfo],
+    pageSize: Int,
+    sortColumn: String,
+    desc: Boolean)
+    extends PagedDataSource[SqlStatsTableRow](pageSize) {
+
+  // Convert ExecutionInfo to SqlStatsTableRow which contains the final contents to show in
+  // the table so that we can avoid creating duplicate contents during sorting the data
+  private val data = info.map(sqlStatsTableRow).sorted(ordering(sortColumn, desc))
+
+  override def dataSize: Int = data.size
+
+  override def sliceData(from: Int, to: Int): Seq[SqlStatsTableRow] = data.slice(from, to)
+
+  private def sqlStatsTableRow(executionInfo: ExecutionInfo): SqlStatsTableRow = {
+    val duration = executionInfo.totalTime(executionInfo.closeTimestamp)
+    val executionTime = executionInfo.totalTime(executionInfo.finishTimestamp)
+    val detail = Option(executionInfo.detail)
+      .filter(!_.isEmpty)
+      .getOrElse(executionInfo.executePlan)
+    val jobId = executionInfo.jobId.toSeq.sorted
+    val sqlExecId = executionInfo.sqlExecId.toSeq.sorted
+
+    new SqlStatsTableRow(
+      executionInfo.jobTag,
+      jobId,
+      sqlExecId,
+      duration,
+      executionTime,
+      executionInfo,
+      detail)
+  }
+
+  /**
+   * Return Ordering according to sortColumn and desc.
+   */
+  private def ordering(sortColumn: String, desc: Boolean): Ordering[SqlStatsTableRow] = {
+    val ordering: Ordering[SqlStatsTableRow] = sortColumn match {
+      case "User" => Ordering.by(_.executionInfo.userId)
+      case "Job Tag" => Ordering.by(_.executionInfo.jobTag)
+      case "Job ID" => Ordering by (_.jobId.headOption)

Review Comment:
   <img width="354" alt="image" src="https://github.com/apache/spark/assets/1097932/62c47d17-56b2-4187-ab59-f96a5af1b240">
   Is the ordering working?
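   (One guess, for illustration: `jobId` holds job IDs as strings, so `Ordering.by(_.jobId.headOption)` compares them lexicographically, e.g. "10" sorts before "9". A numeric comparison would look roughly like the following, assuming job IDs are always numeric strings:

       // Hypothetical fix: compare job IDs numerically rather than as strings.
       case "Job ID" => Ordering.by(_.jobId.headOption.map(_.toLong))

   Whether that explains the screenshot is unverified.)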





[GitHub] [spark] gengliangwang commented on pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on PR #41964:
URL: https://github.com/apache/spark/pull/41964#issuecomment-1654615213

   In the screenshot above, the Close Time and Finish Time columns seem duplicated, and the Execution Time and Duration columns also look quite similar. Why do we have so many timing columns?
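   (For context, from `SqlStatsTableDataSource` earlier in this thread, the two durations are measured against different end timestamps — this is a restatement of the quoted code, not new behavior:

       // Execution Time: start until finishTimestamp (execution finished).
       val executionTime = executionInfo.totalTime(executionInfo.finishTimestamp)
       // Duration: start until closeTimestamp (operation closed).
       val duration = executionInfo.totalTime(executionInfo.closeTimestamp)

   so the Finish Time/Execution Time pair ends when the execution finishes, while the Close Time/Duration pair presumably also covers the time until the client closes the operation.)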




[GitHub] [spark] juliuszsompolski commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "juliuszsompolski (via GitHub)" <gi...@apache.org>.
juliuszsompolski commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1277334310


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerAppStatusStore.scala:
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import scala.collection.mutable
+import scala.collection.mutable.ArrayBuffer
+
+import com.fasterxml.jackson.annotation.JsonIgnore
+
+import org.apache.spark.status.KVUtils
+import org.apache.spark.status.KVUtils.KVIndexParam
+import org.apache.spark.util.kvstore.{KVIndex, KVStore}
+
+class SparkConnectServerAppStatusStore(store: KVStore) {
+  def getSessionList: Seq[SessionInfo] = {
+    KVUtils.viewToSeq(store.view(classOf[SessionInfo]))
+  }
+
+  def getExecutionList: Seq[ExecutionInfo] = {
+    KVUtils.viewToSeq(store.view(classOf[ExecutionInfo]))
+  }
+
+  def getOnlineSessionNum: Int = {
+    KVUtils.count(store.view(classOf[SessionInfo]))(_.finishTimestamp == 0)
+  }
+
+  def getSession(sessionId: String): Option[SessionInfo] = {
+    try {
+      Some(store.read(classOf[SessionInfo], sessionId))
+    } catch {
+      case _: NoSuchElementException => None
+    }
+  }
+
+  def getExecution(executionId: String): Option[ExecutionInfo] = {
+    try {
+      Some(store.read(classOf[ExecutionInfo], executionId))
+    } catch {
+      case _: NoSuchElementException => None
+    }
+  }
+
+  /**
+   * When an error or a cancellation occurs, we set the finishTimestamp of the statement.
+   * Therefore, when we count the number of running statements, we need to exclude errors and
+   * cancellations and count all statements that have not been closed so far.
+   */
+  def getTotalRunning: Int = {
+    KVUtils.count(store.view(classOf[ExecutionInfo]))(_.isExecutionActive)
+  }
+
+  def getSessionCount: Long = {
+    store.count(classOf[SessionInfo])
+  }
+
+  def getExecutionCount: Long = {
+    store.count(classOf[ExecutionInfo])
+  }
+}
+
+private[connect] class SessionInfo(
+    @KVIndexParam val sessionId: String,
+    val startTimestamp: Long,
+    val userId: String,
+    val finishTimestamp: Long,
+    val totalExecution: Long) {
+  @JsonIgnore @KVIndex("finishTime")
+  private def finishTimeIndex: Long = if (finishTimestamp > 0L) finishTimestamp else -1L
+  def totalTime: Long = {
+    if (finishTimestamp == 0L) {
+      System.currentTimeMillis - startTimestamp
+    } else {
+      finishTimestamp - startTimestamp
+    }
+  }
+}
+
+private[connect] class ExecutionInfo(
+    @KVIndexParam val jobTag: String,
+    val statement: String,
+    val sessionId: String,
+    val startTimestamp: Long,
+    val userId: String,
+    val operationId: String,
+    val sparkSessionTags: Set[String],
+    val finishTimestamp: Long,
+    val closeTimestamp: Long,
+    val detail: String,
+    val state: ExecutionState.Value,
+    val jobId: ArrayBuffer[String],
+    val sqlExecId: mutable.Set[String]) {
+  @JsonIgnore @KVIndex("finishTime")
+  private def finishTimeIndex: Long = if (finishTimestamp > 0L && !isExecutionActive) {
+    finishTimestamp
+  } else -1L
+
+  @JsonIgnore @KVIndex("isExecutionActive")
+  def isExecutionActive: Boolean = {
+    state == ExecutionState.STARTED ||
+    state == ExecutionState.COMPILED ||
+    state == ExecutionState.READY
+  }
+
+  def totalTime(endTime: Long): Long = {
+    if (endTime == 0L) {
+      System.currentTimeMillis - startTimestamp
+    } else {
+      endTime - startTimestamp
+    }
+  }
+}
+
+private[connect] object ExecutionState extends Enumeration {
+  val STARTED, COMPILED, READY, CANCELED, FAILED, FINISHED, CLOSED = Value
+  type ExecutionState = Value
+}

Review Comment:
   For a potential followup:
   I find it a bit weird that it doesn't have a RUNNING execution state, but it's actually the same as in Thriftserver:
   https://github.com/apache/spark/blob/master/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2.scala#L119C58-L119C58
   
   Spark Connect records that it's "ready for execution" at the end of the planner (or before eager commands), and we record "finished" when the result stage completes... Should we have another "RUNNING" event that, for SparkConnectPlanExecution, would be posted right before `spark.sparkContext.submitJob`, and for commands would be determined on a per-command basis? WDYT @jdesjean; that would differ from Thriftserver, but it seems to me that the "RUNNING" state is missing from Thriftserver as well... cc @wangyum @yaooqinn @bogdanghit WDYT from the Thriftserver perspective?
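   To make the proposal concrete, a hypothetical shape of the enum (illustration only, not code from this PR):

       private[connect] object ExecutionState extends Enumeration {
         // RUNNING would be posted e.g. right before spark.sparkContext.submitJob
         // in SparkConnectPlanExecution, and on a per-command basis for commands.
         val STARTED, COMPILED, READY, RUNNING, CANCELED, FAILED, FINISHED, CLOSED = Value
         type ExecutionState = Value
       }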





[GitHub] [spark] bogdanghit commented on a diff in pull request #41964: [SPARK-44394][CONNECT][WEBUI] Add a Spark UI page for Spark Connect

Posted by "bogdanghit (via GitHub)" <gi...@apache.org>.
bogdanghit commented on code in PR #41964:
URL: https://github.com/apache/spark/pull/41964#discussion_r1277558257


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/ui/SparkConnectServerAppStatusStore.scala:
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connect.ui
+
+import scala.collection.mutable
+import scala.collection.mutable.ArrayBuffer
+
+import com.fasterxml.jackson.annotation.JsonIgnore
+
+import org.apache.spark.status.KVUtils
+import org.apache.spark.status.KVUtils.KVIndexParam
+import org.apache.spark.util.kvstore.{KVIndex, KVStore}
+
+class SparkConnectServerAppStatusStore(store: KVStore) {
+  def getSessionList: Seq[SessionInfo] = {
+    KVUtils.viewToSeq(store.view(classOf[SessionInfo]))
+  }
+
+  def getExecutionList: Seq[ExecutionInfo] = {
+    KVUtils.viewToSeq(store.view(classOf[ExecutionInfo]))
+  }
+
+  def getOnlineSessionNum: Int = {
+    KVUtils.count(store.view(classOf[SessionInfo]))(_.finishTimestamp == 0)
+  }
+
+  def getSession(sessionId: String): Option[SessionInfo] = {
+    try {
+      Some(store.read(classOf[SessionInfo], sessionId))
+    } catch {
+      case _: NoSuchElementException => None
+    }
+  }
+
+  def getExecution(executionId: String): Option[ExecutionInfo] = {
+    try {
+      Some(store.read(classOf[ExecutionInfo], executionId))
+    } catch {
+      case _: NoSuchElementException => None
+    }
+  }
+
+  /**
+   * When an error or a cancellation occurs, we set the finishTimestamp of the statement.
+   * Therefore, when we count the number of running statements, we need to exclude errors and
+   * cancellations and count all statements that have not been closed so far.
+   */
+  def getTotalRunning: Int = {
+    KVUtils.count(store.view(classOf[ExecutionInfo]))(_.isExecutionActive)
+  }
+
+  def getSessionCount: Long = {
+    store.count(classOf[SessionInfo])
+  }
+
+  def getExecutionCount: Long = {
+    store.count(classOf[ExecutionInfo])
+  }
+}
+
+private[connect] class SessionInfo(
+    @KVIndexParam val sessionId: String,
+    val startTimestamp: Long,
+    val userId: String,
+    val finishTimestamp: Long,
+    val totalExecution: Long) {
+  @JsonIgnore @KVIndex("finishTime")
+  private def finishTimeIndex: Long = if (finishTimestamp > 0L) finishTimestamp else -1L
+  def totalTime: Long = {
+    if (finishTimestamp == 0L) {
+      System.currentTimeMillis - startTimestamp
+    } else {
+      finishTimestamp - startTimestamp
+    }
+  }
+}
+
+private[connect] class ExecutionInfo(
+    @KVIndexParam val jobTag: String,
+    val statement: String,
+    val sessionId: String,
+    val startTimestamp: Long,
+    val userId: String,
+    val operationId: String,
+    val sparkSessionTags: Set[String],
+    val finishTimestamp: Long,
+    val closeTimestamp: Long,
+    val detail: String,
+    val state: ExecutionState.Value,
+    val jobId: ArrayBuffer[String],
+    val sqlExecId: mutable.Set[String]) {
+  @JsonIgnore @KVIndex("finishTime")
+  private def finishTimeIndex: Long = if (finishTimestamp > 0L && !isExecutionActive) {
+    finishTimestamp
+  } else -1L
+
+  @JsonIgnore @KVIndex("isExecutionActive")
+  def isExecutionActive: Boolean = {
+    state == ExecutionState.STARTED ||
+    state == ExecutionState.COMPILED ||
+    state == ExecutionState.READY
+  }
+
+  def totalTime(endTime: Long): Long = {
+    if (endTime == 0L) {
+      System.currentTimeMillis - startTimestamp
+    } else {
+      endTime - startTimestamp
+    }
+  }
+}
+
+private[connect] object ExecutionState extends Enumeration {
+  val STARTED, COMPILED, READY, CANCELED, FAILED, FINISHED, CLOSED = Value
+  type ExecutionState = Value
+}

Review Comment:
   I think a `RUNNING` state should cover the period from when an execution enters the `STARTED` state until it reaches one of the terminal states. Using this definition, in ThriftServer we can infer that an execution is running/active if it is not `FAILED`, `CANCELED`, `TIMEDOUT`, or `CLOSED`.
   
   Nit: it would be good to keep these states consistent. The Spark Connect execution state has a `READY` state which is missing in ThriftServer.
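   In code form, that inference would be roughly the following sketch (against ThriftServer's state machine, which has `TIMEDOUT` but no `READY`):

       // Deny list: an execution is active unless it reached a terminal state.
       def isExecutionActive(state: ExecutionState.Value): Boolean = {
         state != ExecutionState.FAILED &&
         state != ExecutionState.CANCELED &&
         state != ExecutionState.TIMEDOUT &&
         state != ExecutionState.CLOSED
       }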
   


