You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by cloud-fan <gi...@git.apache.org> on 2016/05/19 05:44:11 UTC

[GitHub] spark pull request: [SPARK-14670][SQL][WIP] allow updating driver ...

GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/13189

    [SPARK-14670][SQL][WIP] allow updating driver side sql metrics

    ## What changes were proposed in this pull request?
    
    On the SparkUI right now we have this SQLTab that displays accumulator values per operator. However, it only displays metrics updated on the executors, not on the driver. It is useful to also include driver metrics, e.g. broadcast time.
    
    This is a different version from https://github.com/apache/spark/pull/12427. This PR sends driver side accumulator updates right after the updating happens, not at the end of execution. But it has some drawback:
    
    1. If there is no update, we won't send zero value updates, and in web UI the operator will be empty, no metrics info in displayed.
    2. We need to trigger the event explicitly, not as simply as just update the accumulator.
    3. maybe hard to use it inside whole stage codegen.
    
    ## How was this patch tested?
    
    TODO
    
    
    (If this patch involves UI changes, please attach a screenshot; otherwise, remove this)
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark metrics

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13189.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13189
    
----
commit 8db358f801f3dbd9f5eacf20dc10ef773c0d7ccb
Author: Wenchen Fan <we...@databricks.com>
Date:   2016-05-19T05:36:34Z

    allow updating driver side sql metrics

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    **[Test build #60089 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60089/consoleFull)** for PR 13189 at commit [`3d1eb36`](https://github.com/apache/spark/commit/3d1eb3622aba30d13b7ed3692f0ac34d71d3dda3).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14670][SQL][WIP] allow updating driver ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13189#issuecomment-220238104
  
    **[Test build #58848 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58848/consoleFull)** for PR 13189 at commit [`8db358f`](https://github.com/apache/spark/commit/8db358f801f3dbd9f5eacf20dc10ef773c0d7ccb).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerDriverAccumUpdates(executionId: Long, accumUpdates: Seq[AccumulableInfo])`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60204/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13189: [SPARK-14670][SQL] allow updating driver side sql...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13189#discussion_r66315896
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala ---
    @@ -92,6 +93,14 @@ case class BroadcastExchangeExec(
     
             val broadcasted = sparkContext.broadcast(relation)
             longMetric("broadcastTime") += (System.nanoTime() - beforeBroadcast) / 1000000
    +
    +        // There are some cases we don't care about the metrics and call `SparkPlan.doExecute`
    +        // directly without setting an execution id. We should be tolerant to it.
    +        if (executionId != null) {
    --- End diff --
    
    nit: `executionId` should not be `null`. It's better to use `assert`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60155/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    **[Test build #60155 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60155/consoleFull)** for PR 13189 at commit [`bc8d102`](https://github.com/apache/spark/commit/bc8d10232c83828e68337297e880803611f8fec5).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    **[Test build #60089 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60089/consoleFull)** for PR 13189 at commit [`3d1eb36`](https://github.com/apache/spark/commit/3d1eb3622aba30d13b7ed3692f0ac34d71d3dda3).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60111/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14670][SQL][WIP] allow updating driver ...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13189#discussion_r63914100
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala ---
    @@ -66,25 +67,25 @@ case class BroadcastExchangeExec(
         // broadcastFuture is used in "doExecute". Therefore we can get the execution id correctly here.
         val executionId = sparkContext.getLocalProperty(SQLExecution.EXECUTION_ID_KEY)
         Future {
    -      // This will run in another thread. Set the execution id so that we can connect these jobs
    -      // with the correct execution.
    -      SQLExecution.withExecutionId(sparkContext, executionId) {
    --- End diff --
    
    Why remove this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    `QueryExecution.hiveResultString` will call `SparkPlan.executeCollect` without setting an execution id. This method is only used in test, should we just stop reporting metrics for this case, or create execution id in `hiveResultString`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14670][SQL][WIP] allow updating driver ...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13189#discussion_r63914891
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala ---
    @@ -360,17 +370,27 @@ private[spark] class SQLHistoryListener(conf: SparkConf, sparkUI: SparkUI)
     /**
      * Represent all necessary data for an execution that will be used in Web UI.
      */
    -private[ui] class SQLExecutionUIData(
    -    val executionId: Long,
    -    val description: String,
    -    val details: String,
    -    val physicalPlanDescription: String,
    -    val physicalPlanGraph: SparkPlanGraph,
    -    val accumulatorMetrics: Map[Long, SQLPlanMetric],
    -    val submissionTime: Long,
    -    var completionTime: Option[Long] = None,
    -    val jobs: mutable.HashMap[Long, JobExecutionStatus] = mutable.HashMap.empty,
    -    val stages: mutable.ArrayBuffer[Int] = mutable.ArrayBuffer()) {
    +private[ui] case class SQLExecutionUIData(
    +    executionId: Long,
    +    description: String,
    +    details: String,
    +    physicalPlanDescription: String,
    +    physicalPlanGraph: SparkPlanGraph,
    +    accumulatorMetrics: Map[Long, SQLPlanMetric],
    +    submissionTime: Long) {
    +
    +  var completionTime: Option[Long] = None
    +
    +  val jobs: mutable.HashMap[Long, JobExecutionStatus] = mutable.HashMap.empty
    +
    +  val stages: mutable.ArrayBuffer[Int] = mutable.ArrayBuffer()
    +
    +  // TODO (cloud-fan): Image that one operator sends driver accumulator updates more than once, we
    +  // should overwrite previous updates with the latest one. But different operators should have
    +  // separated updates, so ideally this should be a map instead of array, and use operator id as
    +  // key, accumulator updates as value. It's ok for now as we only have one operator and only send
    +  // updates once.
    +  val driverAccumUpdates: mutable.ArrayBuffer[AccumulableInfo] = mutable.ArrayBuffer()
    --- End diff --
    
    Could you do you now? We could add some metrics for subqueries.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    **[Test build #60204 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60204/consoleFull)** for PR 13189 at commit [`44cd5d1`](https://github.com/apache/spark/commit/44cd5d133a055eb33b89cd50fe3947bfb64bfbd5).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    **[Test build #60204 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60204/consoleFull)** for PR 13189 at commit [`44cd5d1`](https://github.com/apache/spark/commit/44cd5d133a055eb33b89cd50fe3947bfb64bfbd5).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13189: [SPARK-14670][SQL][WIP] allow updating driver sid...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13189#discussion_r65990142
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala ---
    @@ -361,17 +377,22 @@ private[spark] class SQLHistoryListener(conf: SparkConf, sparkUI: SparkUI)
     /**
      * Represent all necessary data for an execution that will be used in Web UI.
      */
    -private[ui] class SQLExecutionUIData(
    -    val executionId: Long,
    -    val description: String,
    -    val details: String,
    -    val physicalPlanDescription: String,
    -    val physicalPlanGraph: SparkPlanGraph,
    -    val accumulatorMetrics: Map[Long, SQLPlanMetric],
    -    val submissionTime: Long,
    -    var completionTime: Option[Long] = None,
    -    val jobs: mutable.HashMap[Long, JobExecutionStatus] = mutable.HashMap.empty,
    -    val stages: mutable.ArrayBuffer[Int] = mutable.ArrayBuffer()) {
    +private[ui] case class SQLExecutionUIData(
    +    executionId: Long,
    +    description: String,
    +    details: String,
    +    physicalPlanDescription: String,
    +    physicalPlanGraph: SparkPlanGraph,
    +    accumulatorMetrics: Map[Long, SQLPlanMetric],
    +    submissionTime: Long) {
    +
    +  var completionTime: Option[Long] = None
    +
    +  val jobs: mutable.HashMap[Long, JobExecutionStatus] = mutable.HashMap.empty
    +
    +  val stages: mutable.ArrayBuffer[Int] = mutable.ArrayBuffer()
    +
    +  val driverAccumUpdates: mutable.HashMap[Long, Long] = mutable.HashMap.empty
    --- End diff --
    
    [Long, String] ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13189: [SPARK-14670][SQL] allow updating driver side sql...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13189#discussion_r65990691
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala ---
    @@ -361,17 +377,22 @@ private[spark] class SQLHistoryListener(conf: SparkConf, sparkUI: SparkUI)
     /**
      * Represent all necessary data for an execution that will be used in Web UI.
      */
    -private[ui] class SQLExecutionUIData(
    -    val executionId: Long,
    -    val description: String,
    -    val details: String,
    -    val physicalPlanDescription: String,
    -    val physicalPlanGraph: SparkPlanGraph,
    -    val accumulatorMetrics: Map[Long, SQLPlanMetric],
    -    val submissionTime: Long,
    -    var completionTime: Option[Long] = None,
    -    val jobs: mutable.HashMap[Long, JobExecutionStatus] = mutable.HashMap.empty,
    -    val stages: mutable.ArrayBuffer[Int] = mutable.ArrayBuffer()) {
    +private[ui] case class SQLExecutionUIData(
    +    executionId: Long,
    +    description: String,
    +    details: String,
    +    physicalPlanDescription: String,
    +    physicalPlanGraph: SparkPlanGraph,
    +    accumulatorMetrics: Map[Long, SQLPlanMetric],
    +    submissionTime: Long) {
    +
    +  var completionTime: Option[Long] = None
    +
    +  val jobs: mutable.HashMap[Long, JobExecutionStatus] = mutable.HashMap.empty
    +
    +  val stages: mutable.ArrayBuffer[Int] = mutable.ArrayBuffer()
    +
    +  val driverAccumUpdates: mutable.HashMap[Long, Long] = mutable.HashMap.empty
    --- End diff --
    
    nvm


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14670][SQL][WIP] allow updating driver ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/13189#issuecomment-220232797
  
    cc @andrewor14 @davies 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13189: [SPARK-14670][SQL] allow updating driver side sql...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13189#discussion_r66345250
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala ---
    @@ -92,6 +93,14 @@ case class BroadcastExchangeExec(
     
             val broadcasted = sparkContext.broadcast(relation)
             longMetric("broadcastTime") += (System.nanoTime() - beforeBroadcast) / 1000000
    +
    +        // There are some cases we don't care about the metrics and call `SparkPlan.doExecute`
    +        // directly without setting an execution id. We should be tolerant to it.
    +        if (executionId != null) {
    --- End diff --
    
    it can be null for some cases, see https://github.com/apache/spark/pull/13189#issuecomment-224450749


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    **[Test build #60155 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60155/consoleFull)** for PR 13189 at commit [`bc8d102`](https://github.com/apache/spark/commit/bc8d10232c83828e68337297e880803611f8fec5).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60086/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    **[Test build #60111 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60111/consoleFull)** for PR 13189 at commit [`2a5b35a`](https://github.com/apache/spark/commit/2a5b35a0e483a8e9e4a2fb1ae7ea74bd5b32f1f5).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14670][SQL][WIP] allow updating driver ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13189#issuecomment-220238181
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/58848/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    **[Test build #60111 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60111/consoleFull)** for PR 13189 at commit [`2a5b35a`](https://github.com/apache/spark/commit/2a5b35a0e483a8e9e4a2fb1ae7ea74bd5b32f1f5).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    LGTM, merging this into master and 2.0, thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL][WIP] allow updating driver side sql m...

Posted by andrewor14 <gi...@git.apache.org>.
Github user andrewor14 commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    Also cc @zsxwing


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14670][SQL][WIP] allow updating driver ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/13189#issuecomment-220238179
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13189: [SPARK-14670][SQL] allow updating driver side sql...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13189#discussion_r66349221
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala ---
    @@ -92,6 +93,14 @@ case class BroadcastExchangeExec(
     
             val broadcasted = sparkContext.broadcast(relation)
             longMetric("broadcastTime") += (System.nanoTime() - beforeBroadcast) / 1000000
    +
    +        // There are some cases we don't care about the metrics and call `SparkPlan.doExecute`
    +        // directly without setting an execution id. We should be tolerant to it.
    +        if (executionId != null) {
    --- End diff --
    
    I see. It will be fixed in #13115. I'm fine with the check now and will ask the contributor to remove it in #13115 after this one is merged.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60089/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14670][SQL][WIP] allow updating driver ...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/13189#issuecomment-220233150
  
    **[Test build #58848 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/58848/consoleFull)** for PR 13189 at commit [`8db358f`](https://github.com/apache/spark/commit/8db358f801f3dbd9f5eacf20dc10ef773c0d7ccb).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13189: [SPARK-14670][SQL] allow updating driver side sql...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/13189


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    **[Test build #60086 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60086/consoleFull)** for PR 13189 at commit [`d950a76`](https://github.com/apache/spark/commit/d950a76049e6201fd30065ef29861949a4b6036b).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class SparkListenerDriverAccumUpdates(executionId: Long, accumUpdates: Seq[(Long, Long)])`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    **[Test build #60201 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60201/consoleFull)** for PR 13189 at commit [`3cf6790`](https://github.com/apache/spark/commit/3cf6790beca07992a5aed1b7752db775c499e210).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL][WIP] allow updating driver side sql m...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    **[Test build #60086 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60086/consoleFull)** for PR 13189 at commit [`d950a76`](https://github.com/apache/spark/commit/d950a76049e6201fd30065ef29861949a4b6036b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    LGTM except some nits


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-14670][SQL][WIP] allow updating driver ...

Posted by davies <gi...@git.apache.org>.
Github user davies commented on the pull request:

    https://github.com/apache/spark/pull/13189#issuecomment-220385998
  
    @cloud-fan Could you have a screen shot for metrics of BroadcastExchange ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    **[Test build #60201 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60201/consoleFull)** for PR 13189 at commit [`3cf6790`](https://github.com/apache/spark/commit/3cf6790beca07992a5aed1b7752db775c499e210).
     * This patch **fails to build**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #13189: [SPARK-14670][SQL] allow updating driver side sql...

Posted by zsxwing <gi...@git.apache.org>.
Github user zsxwing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13189#discussion_r66315667
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLListener.scala ---
    @@ -361,17 +374,22 @@ private[spark] class SQLHistoryListener(conf: SparkConf, sparkUI: SparkUI)
     /**
      * Represent all necessary data for an execution that will be used in Web UI.
      */
    -private[ui] class SQLExecutionUIData(
    -    val executionId: Long,
    -    val description: String,
    -    val details: String,
    -    val physicalPlanDescription: String,
    -    val physicalPlanGraph: SparkPlanGraph,
    -    val accumulatorMetrics: Map[Long, SQLPlanMetric],
    -    val submissionTime: Long,
    -    var completionTime: Option[Long] = None,
    -    val jobs: mutable.HashMap[Long, JobExecutionStatus] = mutable.HashMap.empty,
    -    val stages: mutable.ArrayBuffer[Int] = mutable.ArrayBuffer()) {
    +private[ui] case class SQLExecutionUIData(
    --- End diff --
    
    nit: Since you move some fields out of the constructor, it should not be a case class. Most of the generated methods, e.g., `toString`, `hashCode`, `equals` won't work.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60201/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL] allow updating driver side sql metric...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    Seems it is fine to not have metrics when we use hiveResultString.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #13189: [SPARK-14670][SQL][WIP] allow updating driver side sql m...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/13189
  
    cc @davies @andrewor14 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org