You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by cloud-fan <gi...@git.apache.org> on 2017/11/28 15:01:18 UTC

[GitHub] spark pull request #19833: [SPARK-22605][SQL] SQL write job should also set ...

GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/19833

    [SPARK-22605][SQL] SQL write job should also set Spark task output metrics

    ## What changes were proposed in this pull request?
    
    For SQL write jobs, we only set metrics for the SQL listener and display them in the SQL plan UI. We should also set metrics for Spark task output metrics, which will be shown in spark job UI.
    
    ## How was this patch tested?
    
    test it manually. For a simple write job
    ```
    spark.range(1000).write.parquet("/tmp/p1")
    ```
    now the spark job UI looks like
    ![ui](https://user-images.githubusercontent.com/3182036/33326478-05a25b7c-d490-11e7-96ef-806117774356.jpg)


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark ui

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19833.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19833
    
----
commit e135bceaefade62ca8354f15f2900e8db7783eaa
Author: Wenchen Fan <we...@databricks.com>
Date:   2017-11-28T14:39:30Z

    SQL write job should also set Spark task output metrics

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19833: [SPARK-22605][SQL] SQL write job should also set Spark t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19833
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19833: [SPARK-22605][SQL] SQL write job should also set Spark t...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19833
  
    thanks, merging to master!


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19833: [SPARK-22605][SQL] SQL write job should also set Spark t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19833
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84276/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19833: [SPARK-22605][SQL] SQL write job should also set Spark t...

Posted by JasonMWhite <gi...@git.apache.org>.
Github user JasonMWhite commented on the issue:

    https://github.com/apache/spark/pull/19833
  
    Thanks @cloud-fan !


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19833: [SPARK-22605][SQL] SQL write job should also set Spark t...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19833
  
    **[Test build #84261 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84261/testReport)** for PR 19833 at commit [`e135bce`](https://github.com/apache/spark/commit/e135bceaefade62ca8354f15f2900e8db7783eaa).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19833: [SPARK-22605][SQL] SQL write job should also set ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19833


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19833: [SPARK-22605][SQL] SQL write job should also set Spark t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19833
  
    Merged build finished. Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19833: [SPARK-22605][SQL] SQL write job should also set Spark t...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/19833
  
    cc @rxin @hvanhovell @gatorsmile 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19833: [SPARK-22605][SQL] SQL write job should also set Spark t...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19833
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84261/
    Test FAILed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19833: [SPARK-22605][SQL] SQL write job should also set Spark t...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19833
  
    **[Test build #84276 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84276/testReport)** for PR 19833 at commit [`be1c62e`](https://github.com/apache/spark/commit/be1c62e5b16a8b4a7bb177be5a5042c9006d2903).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19833: [SPARK-22605][SQL] SQL write job should also set ...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19833#discussion_r153687993
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/BasicWriteStatsTracker.scala ---
    @@ -106,6 +105,13 @@ class BasicWriteTaskStatsTracker(hadoopConf: Configuration)
     
       override def getFinalStats(): WriteTaskStats = {
         statCurrentFile()
    +
    +    // Reports bytesWritten and recordsWritten to the Spark output metrics.
    +    Option(TaskContext.get()).map(_.taskMetrics().outputMetrics).foreach { outputMetrics =>
    --- End diff --
    
    Which test suite covers this newly added logic?
    Previously, BasicWriteTaskStatsTrackerSuite seems to fail (8 of 10 test cases) because TaskContext is absent in the test suite.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19833: [SPARK-22605][SQL] SQL write job should also set Spark t...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the issue:

    https://github.com/apache/spark/pull/19833
  
    LGTM


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #19833: [SPARK-22605][SQL] SQL write job should also set ...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19833#discussion_r153758037
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/BasicWriteStatsTracker.scala ---
    @@ -106,6 +105,13 @@ class BasicWriteTaskStatsTracker(hadoopConf: Configuration)
     
       override def getFinalStats(): WriteTaskStats = {
         statCurrentFile()
    +
    +    // Reports bytesWritten and recordsWritten to the Spark output metrics.
    +    Option(TaskContext.get()).map(_.taskMetrics().outputMetrics).foreach { outputMetrics =>
    --- End diff --
    
    I don't know if there is any Spark job/stage UI test in the SQL module, and since this is a pretty trivial change, I just tested it manually. Previously, `BasicWriteTaskStatsTrackerSuite` failed with NPE because I didn't handle null TaskContext here. Now I wrap it with `Option` to make it null safe.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19833: [SPARK-22605][SQL] SQL write job should also set Spark t...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19833
  
    **[Test build #84276 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84276/testReport)** for PR 19833 at commit [`be1c62e`](https://github.com/apache/spark/commit/be1c62e5b16a8b4a7bb177be5a5042c9006d2903).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19833: [SPARK-22605][SQL] SQL write job should also set Spark t...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19833
  
    **[Test build #84261 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84261/testReport)** for PR 19833 at commit [`e135bce`](https://github.com/apache/spark/commit/e135bceaefade62ca8354f15f2900e8db7783eaa).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org